[ENIAC Thesis Award] Evading AV using Code Injection

One of the main driving forces behind many cyber security incidents is the use of malware. Last year alone, the AV-TEST Institute recorded almost 100,000,000 new malware samples circulating on the internet, and in 2021, during the COVID-19 pandemic, this number went up as high as 150,000,000.

These kinds of statistics raise the question: How does this happen? Why is malware such a problem? Did we not solve this already with anti-virus software? Typically, it is fairly obvious that a system is infected from the moment the malware lands on it. For example, you may have clicked a link, downloaded some file, opened it, and suddenly all your files are deleted and replaced with encrypted versions. It is then not rocket science to conclude that the file you just downloaded is probably malware. The actions are very visible and direct (i.e., a mass deletion of files), and the timing is also very telling (i.e., right after opening the file). However, things become more difficult when we also try to defend ourselves against Advanced Persistent Threats (APTs). APTs are threat actors that try to stay unnoticed for as long as possible, and thus the malware they use to infiltrate systems is specifically crafted to stay undetected by anti-virus software. One of the techniques that many APTs use to avoid detection is known as code injection.

Code injection

The general idea of code injection is to somehow copy machine code instructions from one program into another, and then trick the other application into executing them. This way, the other application starts doing something it was not originally programmed to do. By extension, if a malicious program copies its malicious code into a legitimate application (such as a text editor or web browser), it is not the original malware itself that exhibits the malicious behavior, but rather the application that was previously considered benign. This has some interesting implications from a defender's perspective. All of a sudden, the task of defending a computer system against malware becomes significantly more difficult, as we can no longer limit our attention to unknown programs that we downloaded from the web. With code injection, any program can start exhibiting malicious behavior, including the ones developed by trusted vendors such as Microsoft and Google. As such, the entire system needs to be monitored, which can be quite computationally expensive. And it gets worse. Even if we are able to monitor everything and notice that our text editor starts doing something bad, should we kill the text editor? It may temporarily stop the malicious behavior, but it does not tackle the problem at its core: the malware that performed the injection is still there. Besides, the malware can simply inject itself again into some other program, and killing off all programs on a computer does not sound like a favorable choice either.

So what does such a code injection look like? It may sound like a complicated process, but it can be quite straightforward. In fact, most people use code injection on a daily basis without realizing it. For example, if you are using a web browser, chances are that at some point you installed some kind of plugin or extension (e.g., an ad blocker). This is actually a form of code injection, as it deliberately injects additional functionality into your browser by loading some extra code that runs on every page load. Similarly, most operating systems have built-in features that allow for comparable constructions, where additional code or entire modules are loaded into other processes to extend the functionality of the operating system itself. For instance, Windows defines a function called WriteProcessMemory, which allows a program to directly alter the memory (and thus also the code) of another process running on the system, provided it has sufficient access rights to that process. Furthermore, if you are using anti-virus software, chances are that it registers itself as a system module (similar to a plugin) such that small monitoring routines are injected into your running applications. These examples illustrate precisely why code injection is a problem, and also why we cannot simply abolish it. There are many different methods of injecting code into another application, and the line between legitimate and illegitimate code injection is significantly blurred.

Malware analysis

In our research, we set out to get a more fundamental understanding of code injection usage in malware. To do so, we collected 17 of the most commonly used techniques and tested whether they still work on a typical Windows 10 machine. We then compared the techniques with one another and identified recurring features and characteristics. We quickly realized that there are two main camps that a specific technique can fall into. The first category comprises techniques that actively interact with the target process, while the second category includes methods that leverage features of the operating system itself and passively wait for the injection to happen. The WriteProcessMemory example can be considered an active technique, while system modules are a type of code injection that falls into the passive category.
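The two camps can be captured in a small data structure. The following is a minimal sketch and not the classification scheme from the thesis: the `Technique` class, its `interacts_with_target` field, and the `categorize` function are hypothetical names introduced here for illustration, and only the two example techniques are taken from the text above.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    ACTIVE = "active"    # the malware operates on the target process itself
    PASSIVE = "passive"  # the malware sets up an OS feature and waits


@dataclass(frozen=True)
class Technique:
    name: str
    # Hypothetical feature (an assumption for this sketch): does the
    # injecting program interact with the target process directly,
    # e.g. by writing into its memory?
    interacts_with_target: bool


def categorize(technique: Technique) -> Category:
    """Map a technique to one of the two camps described in the text."""
    if technique.interacts_with_target:
        return Category.ACTIVE
    return Category.PASSIVE


# The two examples named in the article.
write_process_memory = Technique("WriteProcessMemory", interacts_with_target=True)
system_module = Technique("system module registration", interacts_with_target=False)

for t in (write_process_memory, system_module):
    print(f"{t.name}: {categorize(t).value}")
```

A real classification would rest on more than one feature; a single boolean is used here only to make the active/passive split explicit.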

Interestingly, during our research, we found that most malware analysis tools check for the presence of active techniques. This is likely because these are very straightforward, well-known, and widely adopted in hacker communities. Furthermore, they are easy to detect, because they translate to well-known functions that can be monitored. On the other hand, passive techniques are often left out, because it is significantly more difficult to determine whether the operating system itself is performing some operation indicative of code injection, as opposed to the malware explicitly calling some function to initiate the injection process.
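To illustrate why active techniques are comparatively easy to spot, consider the sketch below. It scans a recorded list of API call names for a well-known ordered sequence of Windows functions. This is an assumption-laden toy and not how any particular analysis tool works: the watchlist beyond WriteProcessMemory, the trace format, and the function names `contains_in_order` and `looks_like_active_injection` are all chosen here for illustration.

```python
from typing import Sequence

# Assumed, illustrative watchlist. Only WriteProcessMemory is named in the
# article; the other three are real Windows API functions that are often
# seen alongside it in classic remote-thread injection.
ACTIVE_INJECTION_PATTERN = (
    "OpenProcess",
    "VirtualAllocEx",
    "WriteProcessMemory",
    "CreateRemoteThread",
)


def contains_in_order(trace: Sequence[str], pattern: Sequence[str]) -> bool:
    """Return True if every call in `pattern` occurs in `trace` in order.

    Other calls may appear in between; only the relative order matters.
    """
    matched = 0
    for call in trace:
        if matched < len(pattern) and call == pattern[matched]:
            matched += 1
    return matched == len(pattern)


def looks_like_active_injection(trace: Sequence[str]) -> bool:
    """Flag traces that contain the watched API sequence."""
    return contains_in_order(trace, ACTIVE_INJECTION_PATTERN)


# Hypothetical trace of a sample that injects actively.
active_trace = [
    "CreateFileW", "OpenProcess", "ReadFile",
    "VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread",
]

# Hypothetical trace of a sample that only changes system configuration
# and leaves the actual loading of its code to the operating system.
passive_trace = ["RegOpenKeyExW", "RegSetValueExW", "RegCloseKey"]

print(looks_like_active_injection(active_trace))
print(looks_like_active_injection(passive_trace))
```

The second trace is not flagged, even though it may lead to an injection later on: the sample itself never calls any of the watched functions, which is exactly the blind spot described above.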

Figure 1: Hacker collective Anonymous uses malware to target governments

Conclusions

To see how much of a problem this really is, we continued with a prevalence assessment by analyzing 3,000 real-world malware samples from the years 2017 to 2020. We found that about 11.15% of these samples used some form of code injection, and that a growing share of those used a passive technique (rising from 40% to 48%). This indicates that malware developers have indeed realized that passive techniques are difficult to detect, and thus are an attractive option for them to further delay getting caught. Anti-virus developers and malware analysts should therefore take passive techniques into account when analyzing a program for malicious behavior.

Luckily, a lot of work is being put into this field of malware behavioral analysis, and I am currently continuing this project as a Ph.D. candidate at the University of Twente. However, what can the average Joe do in the meantime? Of course, one obvious answer is to get into the habit of not blindly trusting everything that you download, but the best thing to do is to accept that these kinds of things just happen. Malware developers are constantly trying to slip under the radar of anti-virus software and have become exceedingly efficient at it. Therefore, keep your anti-virus software up to date, regularly scan your device for malware, and make sure you have backups of your important files in case things go wrong.

About Jerre

Jerre is originally from a small farming village in the north of The Netherlands. When he was young, he quickly found a passion for mathematics, computers, and programming, and this prompted him to pursue a bachelor's and master's degree in Computer Science and Cyber Security in Enschede.

Currently, Jerre is doing research on the analysis of malicious software as a Ph.D. student at the University of Twente. In particular, his main interests lie in improving the automated extraction of malicious behaviors exhibited by computer viruses. He uses an approach that combines methods from the fields of software reverse engineering, formal software verification, and compiler theory, with the aim of making malware analysis a more automated process that is both more scalable and more manageable.

Jerre strongly believes in open source and open science, where researchers and developers can easily gain access to results of previously conducted research, and are welcome to contribute their own ideas and implementations to further improve the project.

References

Sources used to write this article can be found in Jerre’s thesis, published at: https://essay.utwente.nl/88617