Find threats: Cloud credential theft on Linux endpoints

The Sumo Logic Threat Labs team previously outlined the risks associated with unprotected cloud credentials found on Windows endpoints. This article builds on that work by providing detection and hunting guidance in the context of endpoints that run the Linux operating system.

Although workloads that support business functionality are increasingly moving to the cloud, they are often managed from an endpoint located on premises.

Should they gain access to these on-premises endpoints, threat actors may be able to read and exfiltrate credential material, which is often stored unprotected on hosts and may, in turn, grant access to cloud resources.

To protect your organization, follow along as we highlight the telemetry, tooling, and hunting and alerting strategies aimed at detecting cloud credential theft from Linux endpoints.

Threat hunting tools you'll need to instrument your telemetry

To generate the required telemetry, we will be using the following tools:

An installed and configured instance of auditd
An auditd configuration file; we will be using Florian Roth’s auditd configuration as a base
TruffleHog, a tool designed to find keys and other credential material on endpoints or in CI/CD pipelines
A custom script provided by the Threat Labs team, which takes TruffleHog output and adds it to an existing auditd configuration
Laurel, which makes the auditd logs easier to work with

The Sumo Logic Threat Labs team recently released a new set of mappers and parsers for Laurel Linux Logs.

Cloud SIEM Enterprise users can take advantage of this new functionality, in addition to support for Sysmon for Linux telemetry.

Making sense of the generated telemetry

As the tooling outlined above suggests, there are a lot of moving pieces involved in detecting cloud credential theft on Linux endpoints.

In order to untangle these pieces and illustrate how they work together, let us look at an example.

Our example endpoint is used by a cloud administrator or cloud developer. This system accesses Amazon Web Services (AWS), Azure and Google Cloud resources, all through command line interface (CLI) tools.

On this endpoint, we want to find where these CLI tools store their authentication material.

Here, TruffleHog can help us, via the following command:

trufflehog filesystem --directory /home/ubuntu/ --json

This command assumes that the CLI tools outlined above were installed with the default options, into the target user’s home directory.

After running the command, we see that TruffleHog managed to find our AWS credential material.

We can save this TruffleHog output to a JSON file, and use a script to automate the addition of TruffleHog entries into our auditd configuration file, so that we can get telemetry generated when a particular file is accessed.

The script will parse the TruffleHog output by utilizing the jq utility and will add the entries to your existing auditd configuration.

Here is the script. It is provided to the community for free use and modification without any support.

#!/bin/bash

# Color variables
red='\033[0;31m'
green='\033[0;32m'
yellow='\033[0;33m'
blue='\033[0;34m'
magenta='\033[0;35m'
cyan='\033[0;36m'
# Clear the color after that
clear='\033[0m'

if [ $# -eq 0 ]
  then
    echo -e "${red}No arguments supplied, need to provide a path to the output of Trufflehog, in JSON format${clear}"
    echo -e "${red}To generate the JSON file, run: trufflehog filesystem -j --directory [directory path that you want to scan] > results.json${clear}"
    echo -e "${red}After the results.json file is generated, run: ./trufflehog2auditd.sh results.json${clear}"
  else
    # Package checking logic from here: https://askubuntu.com/question...
    REQUIRED_PKG="jq"
    PKG_OK=$(dpkg-query -W --showformat='${Status}\n' $REQUIRED_PKG|grep "install ok installed")

    echo -e "${magenta}Checking for $REQUIRED_PKG: $PKG_OK${clear}"

    if [ "" = "$PKG_OK" ]; then
      echo -e "${green}No $REQUIRED_PKG. Setting up $REQUIRED_PKG.${clear}"
      sudo apt-get --yes install $REQUIRED_PKG
    fi
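    # "// empty" skips records without a file path; sort -u also removes non-adjacent duplicates
    jq -r '.SourceMetadata.Data.Filesystem.file // empty' "$1" | sort -u | while IFS= read -r file
      do
        # Format we want: -w /file_path_with_creds.txt -p r -k file_with_creds
          echo -e "${yellow}Adding $file to Auditd Rules${clear}"
          # Append via sudo tee, since the rules file is normally writable only by root
          printf -- '-w %s -p r -k from_trufflehog_output\n' "$file" | sudo tee -a /etc/audit/rules.d/audit.rules > /dev/null
      done
    echo -e "${green}Done adding entries, reloading auditd rules...${clear}"
    sudo augenrules --load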
    echo -e "${cyan}All done..${clear}"
  fi
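
For readers who would rather preview the generated rules before appending anything to the auditd configuration, the same transformation can be sketched in Python. This is a minimal sketch and not part of the Threat Labs tooling: the function name rules_from_trufflehog is hypothetical, and it assumes TruffleHog writes one JSON object per line with the file path under SourceMetadata.Data.Filesystem.file, as the jq filter in the script expects.

```python
import json


def rules_from_trufflehog(lines, key="from_trufflehog_output"):
    """Return one auditd watch rule per unique file path found in TruffleHog JSON lines.

    Hypothetical helper: the JSON layout is assumed from the jq filter used in the script.
    """
    seen = set()
    rules = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore anything that is not a JSON record
        # Walk SourceMetadata -> Data -> Filesystem -> file, tolerating missing keys
        node = record
        for part in ("SourceMetadata", "Data", "Filesystem", "file"):
            node = node.get(part) if isinstance(node, dict) else None
        if isinstance(node, str) and node not in seen:
            seen.add(node)
            rules.append(f"-w {node} -p r -k {key}")
    return rules


if __name__ == "__main__":
    import sys

    # Usage: python3 preview_rules.py results.json
    with open(sys.argv[1], encoding="utf-8") as handle:
        for rule in rules_from_trufflehog(handle):
            print(rule)
```

Because the sketch only prints the rules, the output can be reviewed and then copied into a file under /etc/audit/rules.d/ by hand.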

As a test, we can try to cat or read the file containing our AWS credentials.

And if we check our Laurel logs, we should see an entry recording this read.

We can see from this event which file was read, what command was used to read the file, and which processes spawned the command.

These are all critical items that we will use to build our hunting hypothesis, baselining and alerting strategies.

It should be noted that in our testing, TruffleHog did not identify Google Cloud or Azure keys.

This table outlines where you can find these keys to help add these entries into your auditd configurations.

Service	Credential Material Location	Auditd Entry
Azure CLI	/home/ubuntu/.azure/msal_token_cache.json	-w /home/ubuntu/.azure/msal_token_cache.json -p r -k sensitive_cloud_cred
Azure CLI	/home/ubuntu/.azure/msal_token_cache.bin	-w /home/ubuntu/.azure/msal_token_cache.bin -p r -k sensitive_cloud_cred
Google Cloud	/home/ubuntu/.config/gcloud/access_tokens.db	-w /home/ubuntu/.config/gcloud/access_tokens.db -p r -k sensitive_cloud_cred
Google Cloud	/home/ubuntu/.config/gcloud/credentials.db	-w /home/ubuntu/.config/gcloud/credentials.db -p r -k sensitive_cloud_cred
Google Cloud	/home/ubuntu/.config/gcloud/legacy_credentials/{{Username}}/adc.json	-w /home/ubuntu/.config/gcloud/legacy_credentials/{{Username}}/adc.json -p r -k sensitive_cloud_cred
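
The entries in the table can also be generated per user rather than typed by hand. The following Python sketch is illustrative only: rules_for_home and CREDENTIAL_FILES are hypothetical names, the relative paths are taken from the table above, and the legacy_credentials entry is left out because it depends on the account name.

```python
from pathlib import PurePosixPath

# Relative credential locations taken from the table above (default CLI installs assumed)
CREDENTIAL_FILES = [
    ".azure/msal_token_cache.json",
    ".azure/msal_token_cache.bin",
    ".config/gcloud/access_tokens.db",
    ".config/gcloud/credentials.db",
]


def rules_for_home(home, key="sensitive_cloud_cred"):
    """Build auditd read-watch rules for one user's home directory."""
    base = PurePosixPath(home)
    return [f"-w {base / rel} -p r -k {key}" for rel in CREDENTIAL_FILES]


if __name__ == "__main__":
    for rule in rules_for_home("/home/ubuntu"):
        print(rule)
```

Calling the function once per home directory yields a rule set that can be reviewed before it is added to the auditd configuration.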

Baselining

At this point, we have instrumented our Linux host with the necessary telemetry and have identified where our cloud credentials are found on this system.

The next step is to follow the data to determine what Linux processes access our sensitive cloud credential material.

We can accomplish this by looking at the following Sumo Logic query:

_index=sec_record_audit metadata_product = "Laurel Linux Audit" metadata_deviceEventId = "System Call-257" // Looking at the normalized data index in Sumo CIP for Laurel logs and our specific Syscall
| %"fields.PATH.1.name" as file_accessed // Renaming a field for readability
| where file_accessed matches /(\.aws\/credentials|msal\_token\_cache|gcloud\/)/ // Regular expression match for file names that contain our cloud credentials only
| count(file_accessed) by baseImage,parentBaseImage // Count how many times a particular process accessed the files we filtered on

Looking at our results, by navigating to the “Aggregation” tab in Sumo CIP and clicking the “Donut” chart option, we see the following:

We can see some Python processes accessing our credential material, as well as some recognizable utilities like aws. We also see an interesting cat command, which was used while testing our telemetry pipeline.
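
For offline analysis, the aggregation performed by the query can be approximated in Python. This sketch assumes the events have been exported as dictionaries with file_accessed, baseImage and parentBaseImage keys; that export format and the function name count_accessors are assumptions made for illustration. The regular expression is applied as a substring search, which is how the query appears to use it.

```python
import re
from collections import Counter

# Same file-name filter as the query, applied as a substring search (assumption)
CRED_FILE = re.compile(r"\.aws/credentials|msal_token_cache|gcloud/")


def count_accessors(events):
    """Count credential-file reads per (baseImage, parentBaseImage) pair."""
    return Counter(
        (event["baseImage"], event["parentBaseImage"])
        for event in events
        if CRED_FILE.search(event["file_accessed"])
    )
```

The most frequent pairs in the returned Counter correspond to the largest slices of the chart and make a reasonable starting point for a baseline.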

All the pieces are now in place for us to look at some hunting strategies.

Hunting

To generate some malicious or abnormal data, we will be using the incredibly powerful Mythic C2 Framework, utilizing the Poseidon payload to emulate threat actor activity.

We can instruct our command and control agent to read files containing our sensitive cloud credentials.

Now that we have generated some malicious activity, let’s run the same baselining query outlined earlier and look at the result.

This time around, we notice a new process accessing our sensitive cloud credentials (highlighted in red).
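
Spotting the new process does not have to be done by eye. As a rough sketch, if the (baseImage, parentBaseImage) pairs from the baseline run and from a later run are available as two collections, the newcomers are simply a set difference. The function name new_accessors and the input shape are assumptions made for illustration.

```python
def new_accessors(baseline_pairs, current_pairs):
    """Return (baseImage, parentBaseImage) pairs seen now but absent from the baseline."""
    return sorted(set(current_pairs) - set(baseline_pairs))
```

Any pair returned here has not been observed reading the credential files before and is a natural candidate for closer review.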

We now have our telemetry set up, in addition to a baseline and some “malicious” activity.

As a next step, let’s bubble up this malicious activity using qualifier queries.

_index=sec_record_audit metadata_product = "Laurel Linux Audit" metadata_deviceEventId = "System Call-257" // Looking at the normalized data index in Sumo CIP for Laurel logs and our specific Syscall, for the x86/x64 architecture

// Initialize variables
| 0 as score
| "" as messageQualifiers
| "" as messageQualifiers1
| "" as messageQualifiers2
| "" as messageQualifiers3
| "" as messageQualifiers4


| %"fields.PATH.1.name" as file_accessed // Renaming a field for readability

| where file_accessed matches /(\.aws\/credentials|msal\_token\_cache|gcloud\/)/ // Regular expression match for file names that contain our cloud credentials only

// First qualifier, if the base image contains a legitimate Google Cloud tool
| if(baseImage matches /(google-cloud-sdk)/,concat(messageQualifiers, "Legit Gcloud CLI Use: ",file_accessed,"\nBy Base Image: " ,baseImage,"\n# score: 3\n"),"") as messageQualifiers

// Second qualifier, if the base image contains a legitimate Azure CLI tool
| if(baseImage matches /(opt\/az\/bin\/python*)/,concat(messageQualifiers1, "Legit Azure CLI Use: ",file_accessed,"\nBy Base Image: " ,baseImage,"\n# score: 3\n\n"),"") as messageQualifiers1

// Third qualifier, if the base image contains a legitimate AWS CLI tool
| if(baseImage matches /(\/usr\/local\/aws\-cli*)/,concat(messageQualifiers2, "Legit AWS CLI Use: ",file_accessed,"\nBy Base Image: " ,baseImage,"\n# score: 3\n\n"),"") as messageQualifiers2

// Fourth qualifier, if the base image contains the "cat" binary
| if(baseImage matches /(\/usr\/bin\/cat)/,concat(messageQualifiers3, "Manual Cat of Cloud Creds: ",file_accessed,"\nBy Base Image: " ,baseImage,"\n# score: 30\n\n"),"") as messageQualifiers3

// Final qualifier, if the process accessing our cloud credentials is in the home directory, label it suspicious and increase the score
| if (baseImage matches /(\/home\/)/,concat(messageQualifiers4, "Suspicious Cred Access: ",file_accessed,"\nBy Base Image: " ,baseImage,"\n# score: 60\n\n"),"") as messageQualifiers4

| concat(messageQualifiers,messageQualifiers1,messageQualifiers2,messageQualifiers3,messageQualifiers4) as q // Concatenate all the qualifiers together

| parse regex field=q "score:\s(?<score>-?\d+)" multi //Grab the score from the qualifiers

| where !isEmpty(q) //Only return results if there is a qualifier of some kind

| values(q) as qualifiers,sum(score) as score by _sourcehost //Return our full qualifiers and sum the score by host

// Can add a final line to filter results further
// | where score > {{Score Number}}

The returned results fall into three distinct categories: legitimate access of cloud credentials via known-good CLI tools, suspicious access via cat commands, and potentially malicious access of cloud credentials by processes that are not known to typically access such credential material.

The above query is designed to flag on exfiltration or other types of access of cloud credential material.

The query is not designed to flag on scenarios where a threat actor may be present on a system and is using legitimate CLI tools that have already performed authentication and authorization to cloud services.

The query can also be tweaked, with additional paths and qualifiers added and scores changed, depending on the type of cloud credential material found and how these hosts are used to administer cloud resources.
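
To experiment with different scores before changing the query, the qualifier logic can be approximated offline. The Python sketch below is an approximation, not the query itself: the patterns and scores are copied from the query, the event dictionaries with host, file_accessed and baseImage keys are an assumed export format, score_events is a hypothetical name, and each pattern is applied as a substring search.

```python
import re
from collections import defaultdict

# File-name filter and qualifiers copied from the query above
CRED_FILE = re.compile(r"\.aws/credentials|msal_token_cache|gcloud/")
QUALIFIERS = [
    (re.compile(r"google-cloud-sdk"), "Legit Gcloud CLI Use", 3),
    (re.compile(r"opt/az/bin/python"), "Legit Azure CLI Use", 3),
    (re.compile(r"/usr/local/aws-cli"), "Legit AWS CLI Use", 3),
    (re.compile(r"/usr/bin/cat"), "Manual Cat of Cloud Creds", 30),
    (re.compile(r"/home/"), "Suspicious Cred Access", 60),
]


def score_events(events):
    """Sum the score of every matching qualifier per host."""
    scores = defaultdict(int)
    for event in events:
        if not CRED_FILE.search(event["file_accessed"]):
            continue  # not one of the watched credential files
        for pattern, _label, points in QUALIFIERS:
            if pattern.search(event["baseImage"]):
                scores[event["host"]] += points
    return dict(scores)
```

One property worth noticing when tuning: a process image can match more than one qualifier. A Google Cloud SDK installed under a home directory, for example, would match both the first and the final pattern and receive both scores.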

If you followed along with this article, you instrumented a Linux host with telemetry that generates an audit event when a particular file containing cloud credential material is accessed. You also baselined these data in order to determine what normal activity looks like on this host. Finally, you generated malicious data in order to differentiate and separate this activity from normal administrative activity.

Final thoughts

As previously outlined, cloud credential material found unprotected on endpoints presents an attractive target for threat actors. Access to such cloud credential material is typically not monitored at the host level, particularly on Linux endpoints which are often not instrumented with comprehensive telemetry.

Now you know how to help narrow this visibility gap.

CSE rules

The Threat Labs team has developed and deployed the following rules for Cloud SIEM Enterprise customers.

Rule ID	Rule Name
MATCH-S00841	Suspicious AWS CLI Keys Access on Linux Host
MATCH-S00842	Suspicious Azure CLI Keys Access on Linux Host
MATCH-S00843	Suspicious GCP CLI Keys Access on Linux Host

MITRE ATT&CK mapping

Name	ID
Steal Application Access Token	T1528
Unsecured Credentials: Credentials In Files	T1552.001
Unsecured Credentials: Private Keys	T1552.004

Anton Ovrutsky
