Java Deserialization Security FAQ

by Christian Schneider / on 27 Apr 2016

Archived Blog Post from 2016

Deserialization Vulnerabilities

This FAQ covers some questions I’ve been asked after talking about Java deserialization vulnerabilities at conferences over the last few months.

After awareness rose sharply in 2015, the well-known problem of remote code execution (RCE) during deserialization of untrusted (Java) data has been explored from many new angles. This deeper research led to new findings (gadgets, endpoints, protection attempts, bypass techniques, etc.).

As this fast-paced development over the last few months might have left some people’s questions unanswered, I’ll try to shed some more light on the topic with this FAQ - mainly aimed at developers.

Since when is the risk of code execution during deserialization known?

Security research into remote code execution (RCE) via Java deserialization dates back to around 2010. The roots of the bug class reach back even further, to 2006, when denial-of-service style exploits laid the general groundwork for Java deserialization attack research. Between 2011 and 2015 more remote code execution findings were published, until in 2015 this research received a strong boost in attention and further exploration with the introduction of exploitation tools. The best matching CWE for this topic is CWE-502: Deserialization of Untrusted Data.

Where can I find details about the problem and its exploitation?

Many conference talks, videos and blog posts by several researchers cover the general problem of Java deserialization security. Here’s just a list of some samples:

Does this affect me only when I explicitly deserialize data in my code?

This directly affects you when you deserialize data to (Java) objects in your applications.

But this might also indirectly affect you when you use frameworks, components or products that use deserialization (mostly as a way to communicate remotely) under the hood. Just to mention a few technologies which to some extent use deserialization internally: RMI, JMX, JMS, Spring Service Invokers (like HTTP invoker etc.), management protocols of application servers, etc.

The main difference between these two ways of “being affected” is what you have to update: either your own code, or the frameworks and/or products you’re using.

How can I find out where I am deserializing data in my application?

Statically:

  • You can check your code (and the code you use) for calls to ObjectInputStream.readObject() and ObjectInputStream.readUnshared() where the InputStream is potentially untrusted (i.e. attacker controlled).
  • You should also check where you use XStream to unmarshal XML, since XStream invokes the Java deserialization “magic methods” internally as well (see other questions about XStream in this FAQ).
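
As a very rough first pass over your own sources, the static check above can be approximated with a plain text search. The following is only a sketch: the class name DeserializationCallFinder and its method are hypothetical, and a regex match merely yields candidates that still need manual review (it cannot tell whether the stream is untrusted, and it does not look into compiled third-party code).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: flags source lines that call readObject() or readUnshared().
final class DeserializationCallFinder {
    // Matches ".readObject(" or ".readUnshared(", tolerating whitespace around the name.
    private static final Pattern CALL =
            Pattern.compile("\\.\\s*(readObject|readUnshared)\\s*\\(");

    // Returns the 1-based line numbers of candidate deserialization calls.
    static List<Integer> findCandidates(List<String> sourceLines) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < sourceLines.size(); i++) {
            if (CALL.matcher(sourceLines.get(i)).find()) {
                hits.add(i + 1);
            }
        }
        return hits;
    }
}
```

Because the pattern requires a leading dot, it skips declarations such as "private void readObject(ObjectInputStream in)"; those are worth a separate search when you look for gadgets in your own code.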

Dynamically:

What kind of protections have been proclaimed?

Many - good and bad… Some examples include:

  • Removing gadget classes from ClassPath
  • Using defensive deserialization in the form of a Lookahead ObjectInputStream
    • with a blacklist of known gadget classes to prevent them from being deserialized
    • with a whitelist of only allowed (safe) classes to deserialize
  • Wrapping a strict ad-hoc SecurityManager around the code which performs deserialization
  • Switching to another (remoting) technology - effectively avoiding Java deserialization

Why does “Remove gadget from ClassPath” not help?

The problem is simply that way too many gadgets (as part of complex gadget chains) exist… Many have been found and released, and many more are still to be found. Also, some high-value gadgets exist in pure JRE library code, which means that removing them is simply not feasible.

Why does using a “Defensive Deserialization via a Lookahead ObjectInputStream” eventually not help?

Well, it’s at least an option…

Lookahead ObjectInputStream implementations override the “resolveClass()” method of Java’s ObjectInputStream to check the name of each class before it is deserialized.

Generally two styles of “Lookahead ObjectInputStream” implementations exist:

  • those which use a blacklist of forbidden gadget classes and block them from being deserialized and
  • those which use a whitelist of allowed (business) classes to deserialize, effectively disallowing the rest.

The blacklist-based implementations can often be bypassed, either by new (not yet publicly known) gadgets or explicitly by a new gadget type found in 2015 and introduced in 2016 as part of our RSA Conference talk, which effectively allows for nested (unprotected) deserializations.
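
To make the whitelist style more concrete, here is a minimal sketch of such a stream. The class name WhitelistObjectInputStream and the helper readWithWhitelist are made up for illustration and are not taken from any particular library; only resolveClass() is checked here, so dynamic proxy classes (which go through resolveProxyClass()) would need separate handling.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectStreamClass;
import java.util.Set;

// Hypothetical lookahead stream: rejects every class that is not explicitly allowed.
final class WhitelistObjectInputStream extends ObjectInputStream {
    private final Set<String> allowedClassNames;

    WhitelistObjectInputStream(InputStream in, Set<String> allowedClassNames)
            throws IOException {
        super(in);
        this.allowedClassNames = allowedClassNames;
    }

    // Called for each class descriptor in the stream, before the object is instantiated.
    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        if (!allowedClassNames.contains(desc.getName())) {
            throw new InvalidClassException(desc.getName(),
                    "class is not on the deserialization whitelist");
        }
        return super.resolveClass(desc);
    }

    // Convenience helper (an assumption of this sketch): read one object from a byte array.
    static Object readWithWhitelist(byte[] data, Set<String> allowed)
            throws IOException, ClassNotFoundException {
        try (WhitelistObjectInputStream in =
                new WhitelistObjectInputStream(new ByteArrayInputStream(data), allowed)) {
            return in.readObject();
        }
    }
}
```

Note that the whitelist has to name every class that appears in the stream, including serializable superclasses and array types (for example "[Ljava.lang.String;"), so a real whitelist is usually longer than expected.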

Why does using an “Ad-hoc SecurityManager around code deserializing data” not help?

Some “magic methods” exist that are executed not directly during deserialization, but indirectly at a later point in time. The “finalize()” method, called by the GC, is one way to achieve such deferred execution outside of the direct deserialization flow: as part of our RSA Conference 2016 talk, we showed a bypass gadget that uses it. This effectively means that the code execution flow triggers after the deserialization has already happened, at which point any ad-hoc SecurityManager has already been removed.

But don’t get me wrong: Using a SecurityManager as a defense-in-depth mechanism is still a good idea.

What really protects me?

Do not deserialize untrusted data - never!

It’s just that simple: avoid it.

As a pentester with a strong developer background I feel your pain when it comes to the big change in architecture of completely switching to another remoting technology… But that’s the only real solution. This might become clearer in the questions about “denial of service” attacks and a “migration path”…

What can I do to save time during the migration path towards another remoting technology?

You can (and probably should) apply a defensive deserialization with a “Lookahead ObjectInputStream” with a strict whitelist.

Try to use an agent-based “Lookahead ObjectInputStream” solution (Java agents like “NotSoSerial” and others) to apply the whitelist via instrumentation to all occurrences of ObjectInputStream, in order to also protect those instances not directly used by your code. This effectively also shields the ObjectInputStream instances used during nested deserializations, as in our bypass gadgets.

On the other hand, when you’re a platform vendor (like an app server provider, portal server provider, etc.) it might not be possible to use a JVM-wide agent, since that might catch too many instances of ObjectInputStream outside of your code (i.e. within your users’/customers’ code). That’s not an easy situation to be in… In that case, using an explicit subclass of ObjectInputStream only in your own usages of deserialization on your platform might be the better way to avoid breaking overall compatibility. But then please allow your users/customers to configure a whitelist matching their deserialization usages as well.

How can I build such a whitelist (for Java deserialization and XStream)?

The SWAT Java instrumentation agent (“Serial Whitelist Application Trainer”) logs all Java deserializations (including class names) and also XStream deserializations in separate log files. You can use these class names to build a whitelist and to configure both your Java deserialization protection and XStream’s whitelist.

But I’ve applied a “Lookahead ObjectInputStream” with a strict whitelist… Why isn’t that enough?

Well, it saves you time… But it’s still possible to use “denial of service” gadgets built from classes that even many strict whitelists allow (Arrays, nested HashMaps, HashSets, Strings, etc.), causing very high CPU load and/or memory consumption. So attackers could at least mount a denial-of-service attack against you - even when you’re using a strict whitelist…

What about patches of frameworks, libraries, products that I use?

Both vendors of products offering untrusted deserialization endpoints and maintainers of libraries offering abusable gadgets have issued, and will continue to issue, patches that either (more or less) protect the deserialization endpoint or remove the exploitability from the gadgets. So constantly watching out for patches and applying them makes sense - as always in infosec…

Should I check my own code for gadgets?

Probably yes. Especially when you are a framework / product / library vendor and your classes are used in many deployments. That way, users who have a deserialization endpoint in their products and your library on their ClassPath are protected from remote code execution and similarly severe exploits.

Use SAST to watch out for interesting method calls (reflection, I/O, execution, file handling, sockets, class loading, etc.) reached by “magic methods” invoked during or after deserialization like:

  • java.io.Externalizable.readExternal()
  • java.io.Serializable.readObject()
  • java.io.Serializable.readObjectNoData()
  • java.io.Serializable.readResolve()
  • java.io.ObjectInputValidation.validateObject()
  • java.lang.reflect.InvocationHandler.invoke()
  • javassist.util.proxy.MethodHandler.invoke()
  • org.jboss.weld.bean.proxy.MethodHandler.invoke()
  • java.lang.Object.finalize()
  • .toString(), .hashCode() and .equals()
  • (static initializer)
  • etc.

What can I do in pentests to detect these issues?

Watch out for any data (network traffic, web requests, import files, etc.) in which serialized Java data is exchanged - these are potential vectors to reach and exploit your endpoints. You can detect serialized Java data by its magic byte header 0xAC 0xED, or by the prefix rO0 when it is Base64 encoded (in web requests, for example). Be aware that compression could have been applied as well. Some nice Burp plugins already exist that passively scan web-related traffic for this and also actively try to exploit it.
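
If you want to script this check yourself, the header test boils down to a few lines. This is a minimal sketch with made-up names (SerializedJavaDetector and its two methods); it only looks at the very start of the data, so it will miss serialized objects that are compressed or embedded somewhere inside a larger payload.

```java
// Hypothetical helper for spotting Java serialization streams in captured data.
final class SerializedJavaDetector {
    // A Java serialization stream starts with the magic bytes 0xAC 0xED.
    static boolean looksLikeSerializedJava(byte[] data) {
        return data != null
                && data.length >= 2
                && (data[0] & 0xFF) == 0xAC
                && (data[1] & 0xFF) == 0xED;
    }

    // Base64 encoding of a stream with that header starts with "rO0".
    static boolean looksLikeBase64SerializedJava(String text) {
        return text != null && text.startsWith("rO0");
    }
}
```

The "rO0" prefix contains no "+" or "/" characters, so the same check also applies to the URL-safe Base64 alphabet.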

Also keep in mind to test any XML data you observe for exploitation via XStream and XMLDecoder based “deserialization” issues (see other questions in this FAQ regarding XStream).

What exploitation tools exist?

The best one is definitely ysoserial by Chris Frohoff and Gabriel Lawrence, which contains a great collection of gadgets and an easy-to-use CLI for gadget chain generation. Something similar also exists for .NET, called ysoserial.net, created by Alvaro Muñoz.

Are formats other than Java serialization affected?

Yes, especially XML when used with XStream (known since 2013), since XStream invokes the Java deserialization “magic methods” during its unmarshalling process - which makes it a Java deserialization endpoint as well. XStream even allows “deserializing” non-serializable types, broadening the attack surface to gadget classes that do not implement the Serializable interface. XStream has reacted to this side effect by introducing a blacklist and an optional whitelist capability. The whitelist part should (must!) be explicitly configured by developers who use XStream to read untrusted data. Unfortunately, many developers are not fully aware of this and still use unprotected or only blacklisted XStream instances, which has led to several CVEs in other projects that use XStream without a configured whitelist.

XMLDecoder has also had similar issues in this area.

Other serialization formats like Kryo have also been found to be somewhat vulnerable (at least in terms of denial of service).

Are other languages (on the JVM) also affected?

Yes: Groovy and Scala - when using the ObjectInputStream based deserialization.

Are languages other than Java affected?

Yes: PHP, Python and others have been found to be vulnerable to similar issues in the past, which relate to untrusted deserialization and/or object injection.

Where can I find further hardening tips?