Protect Your Java Code — Through Obfuscators And Beyond

Last update: 26-Dec-2013

By Dmitry LESKOV

Reverse engineering of your proprietary applications by unscrupulous competitors or malicious hackers may result in highly undesirable exposure of your algorithms and ideas, proprietary data formats, licensing and security mechanisms, and, most importantly, your customers' data. Here is why Java is particularly weak in this respect compared to C++:

Target Instruction Set

C++: Compiles to a low-level instruction set that operates on raw binary data and is specific to the target hardware, such as x86 or PowerPC

Java: Compiles to a higher-level portable bytecode that operates on classes and primitive types.

Compiler Optimizations

C++: Numerous code optimizations are performed at compile time. Inline substitution results in copies of the given (member) function being scattered around the binary image; use of the preprocessor combined with compile-time evaluation of expressions may leave no trace of the constants defined in the source code; and so on.

Java: Relies on dynamic (Just-In-Time) compilation for performance improvement. The standard javac compiler is straightforward: it performs hardly any of the compile-time optimizations commonly found in C++ compilers. The idea is to enable the JIT compiler to perform all optimizations at run time, taking the execution profile into account.

Linkage

C++: Programs are statically linked, and metaprogramming facilities (reflection) are absent in the core language. So the names of classes, members, and variables need not be present in the compiled and linked program, except for names exported from dynamic libraries (DLLs/shared objects).

Java: Dependencies are resolved at run time, when classes are loaded. So the name of the class and names of its methods and fields must be present in a class file, as well as names of all imported classes, called methods, and accessed fields.

Delivery Format

C++: An application is delivered as a monolithic executable (maybe with a few dynamic libraries), so it is not easy to identify all member functions of a given class or reconstruct the class hierarchy.

Java: An application is delivered as a set of jar files, which are just non-encrypted archives containing individual classes.

As a result, decompiling Java programs is a much simpler task than decompiling C++, and it can therefore be fully automated. Class hierarchy, high-level statements, names of classes, methods and fields - all this can be retrieved from class files emitted by the standard javac compiler. Anyone with ordinary programming skills can download a Java decompiler, run your program through it, and read the source code almost as if it were open source.
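
To get a feel for how much naming information a compiled class carries, here is a minimal sketch that lists member names through reflection. It is not how a decompiler works internally (decompilers parse the class file directly); it merely shows that the names survive compilation. The classes TradeSecurity and NameLister are hypothetical and exist only for this illustration.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class NameLister {
    // Collects the field and method names that the compiled class still carries.
    static List<String> memberNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            names.add("field " + f.getName());
        }
        for (Method m : type.getDeclaredMethods()) {
            names.add("method " + m.getName());
        }
        Collections.sort(names); // reflection does not guarantee any order
        return names;
    }

    public static void main(String[] args) {
        memberNames(TradeSecurity.class).forEach(System.out::println);
    }
}

// Hypothetical stand-in for a proprietary class.
final class TradeSecurity {
    private String licenseKey = "demo";

    boolean checkFingerprint(byte[] fingerprint) {
        return fingerprint.length > 0;
    }
}
```

Even the private field name is reported, because the class file has to describe every member so that the JVM can resolve references to it at run time.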

Let's see what can be done to prevent that.

Bytecode Encryption - Straightforward But Totally Flawed

The solution that first comes to mind is to encrypt the class files. Unfortunately, this approach is fundamentally flawed, because the JVM cannot load and execute encrypted classes, period. A class must be decrypted before the JVM can load it, and it is fairly easy to intercept the original non-encrypted bytecode at that point. This technique is described in detail in [1] and [2].

The introduction of the java.lang.instrument API back in Java 5 provided hackers with one more way to circumvent class file encryption mechanisms.

Note that reverse engineering of the routines implementing decryption logic is not required. As long as the user has both the encrypted application and the decryption key, he can obtain the original classes fairly easily regardless of how they were encrypted.
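
A minimal sketch of the java.lang.instrument route, under stated assumptions: the class name ClassDumpAgent and the default directory name "dump" are made up for this example, and a real protection tool may try to block agents, as discussed below. The transformer receives the bytes handed to the JVM for class definition, that is, after any custom class loader has already decrypted them, and simply writes them to disk.

```java
import java.io.IOException;
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.ProtectionDomain;

// Hypothetical agent; names are illustrative only.
final class ClassDumpAgent implements ClassFileTransformer {
    private final Path dumpDir;

    ClassDumpAgent(Path dumpDir) {
        this.dumpDir = dumpDir;
    }

    // Entry point the JVM calls when the agent is given via -javaagent.
    public static void premain(String args, Instrumentation inst) {
        inst.addTransformer(new ClassDumpAgent(Paths.get(args == null ? "dump" : args)));
    }

    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        if (className != null) {
            try {
                // className uses '/' separators, e.g. com/example/Secret
                Path target = dumpDir.resolve(className + ".class");
                Files.createDirectories(target.getParent());
                Files.write(target, classfileBuffer);
            } catch (IOException e) {
                // A dump failure must not disturb class loading; ignore it.
            }
        }
        return null; // null tells the JVM to keep the class unchanged
    }

    // Stand-alone demonstration: feeds dummy bytes through the transformer.
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("dump");
        new ClassDumpAgent(dir).transform(null, "com/example/Secret", null, null,
                                          new byte[] {1, 2, 3});
        System.out.println(Files.exists(dir.resolve("com/example/Secret.class")));
    }
}
```

To use it for real, the class would be packaged in a jar whose manifest names it in the Premain-Class attribute and passed to the JVM with the -javaagent option.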

Yes, the authors of contemporary bytecode encryptors know about those standard mechanisms and try to disable them. However, as Java is now open source, one may simply download the OpenJDK source code, patch it to dump loaded classes to disk and force the -XX:+CompileTheWorld option.

In fact, a security engineer frustrated by false claims of vendors whose tools implement bytecode encryption has put together an article [3] showing how easily OpenJDK can be modified to defeat any bytecode protection scheme.

Two refinements:

  1. Of course, strong encryption does its job when a malicious competitor or hacker gets hold of the encrypted class files only, so it may help reduce exposure of server-side application code running in a controlled environment to some extent. (That is, until the system administrator account gets hacked. :) ) But any code designed to run on third-party systems has to include, or be accompanied by, the respective decryption key and hence cannot be protected this way.
  2. The already oversized heading of this section should have read "Software-only Bytecode Encryption...", because with the help of a small piece of silicon Java bytecode may be encrypted very securely.

    Validy SoftNaOS

    The tradeoff? Performance impact several orders of magnitude deep…

Okay, so software-only bytecode encryption makes little sense, and the use of hardware has its own issues and is not always possible. How about making the bytecode less comprehensible? This is what obfuscation is all about: changing the program so that it produces the same results when run on the JVM, but its decompiled source is substantially harder to understand.

Name Obfuscation

Name obfuscation is the process of replacing the identifiers you have carefully chosen to your company's coding standards, such as com.mycompany.TradeSystem.Security.checkFingerprint(), with meaningless sequences of characters, e.g. a.a0(). The obfuscator must process the entire application to ensure consistency of name changes across all classes and jars.

The more advanced obfuscators go one step further. As you surely know, a Java class may have more than one method with the same name if their signatures are different. Utilizing that fact, an obfuscator can rename setPos(int x, int y) and setColor(int color) to, say, a(int a, int b) and a(int a).
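
The following sketch shows what the result of such overload-aware renaming might look like if it were written back as source. The class name ObfuscatedShape is kept readable on purpose and is hypothetical; the original names appear only in comments. Note that fields and methods live in separate namespaces, so they can share the name a as well.

```java
// Hypothetical output of overload-aware renaming, shown as source.
final class ObfuscatedShape {
    int a, b, c; // were: x, y, color

    // was: setPos(int x, int y)
    void a(int a, int b) {
        this.a = a;
        this.b = b;
    }

    // was: setColor(int color)
    void a(int a) {
        this.c = a;
    }

    public static void main(String[] args) {
        ObfuscatedShape s = new ObfuscatedShape();
        s.a(3, 4); // resolved by signature to the former setPos
        s.a(255);  // resolved by signature to the former setColor
        System.out.println(s.a + " " + s.b + " " + s.c);
    }
}
```

The compiler and the JVM tell the two methods apart by their parameter lists, while a human reader sees only a wall of identical names.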

A nice side effect of name obfuscation is the substantial reduction of class file size, which results in somewhat smaller downloads and faster cold starts of desktop Java applications, and lets your cellphone hold more of all those fancy Java ME apps and games. But just like any other technique, name obfuscation has its limitations and downsides:

  • You may not obfuscate the names of standard Java API classes that are part of the JRE, so all uses of those classes remain clearly seen in the decompiled code.
  • Entities accessed via reflection or JNI at run time may not be renamed. The problem is that you cannot say for sure whether a particular class or method may be accessed dynamically, especially if it belongs to a third-party library, component, or framework, or to a part of your application that was written by someone else.

    Many frameworks and tools rely heavily on reflection. One notable example is the JavaBeans component architecture and the respective visual programming tools. The EJB specification requires (and the container enforces) specific signatures of the callback methods such as ejbCreate.

  • Names of serializable classes may not be obfuscated. Most obfuscators would automatically exclude classes that implement the java.io.Serializable interface. Similarly, RMI suffixes _Stub and _Skel, and classes that extend java.rmi.Remote must trigger the name obfuscator's exclude mechanism.
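
The reflection limitation from the list above can be sketched as follows. The classes Original and Renamed are hypothetical: the second one stands for what a name obfuscator might emit for the first. A lookup by the source-level name, as a framework would perform it, succeeds before renaming and fails afterwards.

```java
final class ReflectionBreakage {
    // Looks up a no-argument method by name, the way a framework might.
    static boolean hasMethod(Class<?> type, String name) {
        try {
            type.getDeclaredMethod(name);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasMethod(Original.class, "checkFingerprint"));
        System.out.println(hasMethod(Renamed.class, "checkFingerprint"));
    }
}

// Hypothetical class before obfuscation.
final class Original {
    boolean checkFingerprint() { return true; }
}

// The same class as a name obfuscator might leave it.
final class Renamed {
    boolean a0() { return true; }
}
```

The name "checkFingerprint" is just a string as far as the obfuscator is concerned, which is why such entities have to be excluded from renaming by hand.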

String Encryption

String encryption is another feature commonly found in Java obfuscators. Replacing string literals with calls to a method that decrypts its parameter makes the hacker's life more interesting, but unfortunately not too much.

The problem is that the strings must be decrypted at run time, so the respective code must be included in the application. Moreover, in most tools string encryption is implemented in such a straightforward manner that the hacker does not even need to reverse-engineer that code! All he or she has to do is write a program that calls the decrypting method(s) for all the strings.
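
Here is a deliberately naive sketch of the idea. It assumes a toy XOR scheme with a fixed key; it is not the algorithm of any particular product, and the name ToyStringVault is made up. The point is that the decrypting routine ships with the application, so anyone can call it.

```java
final class ToyStringVault {
    private static final int KEY = 0x5A; // assumed fixed key, for illustration only

    // XOR is its own inverse, so one routine both encrypts and decrypts.
    static String xor(String s) {
        char[] chars = s.toCharArray();
        for (int i = 0; i < chars.length; i++) {
            chars[i] ^= KEY;
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        // What the obfuscator would store in the class file instead of the literal.
        String hidden = xor("Add-Some-Salt");
        // What the attacker does: call the shipped routine on the stored value.
        System.out.println(xor(hidden));
    }
}
```

No knowledge of the key or the algorithm is needed for the attack, only the ability to invoke the method.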

Code and Data Flow Obfuscation

Simply put, flow obfuscation is about modifying the program so that it yields the same result when run, but is impossible to decompile into a well-structured Java source and/or is more difficult to understand.

Most code obfuscators would replace instructions produced by a Java compiler with gotos and other instructions that may not be decompiled into valid Java source. A decompiler expecting conventional javac output would either fail or produce pseudocode with lots of labels and goto statements. However, not all decompilers are that dumb.

An interesting yet obscure offshoot of the Soot bytecode analysis and optimization framework, developed by the Sable group at McGill University, is the Dava decompiler project [4]. It aims at decompiling Java bytecode produced by any tool, not just the javac compiler, into readable source, so it is effectively an attempt to create a deobfuscator.

(The funny part is that other people in the same group are working on a Java bytecode obfuscator called JBCO. I wonder if they hold an internal "obfuscate-decompile" tournament.)

However, even if you use a code obfuscator that forces all decompilers to fail completely, a bytecode disassembler would still work. Remember that the JVM instruction set includes high-level instructions, as opposed to real CPUs such as x86 or ARM, so disassembled Java is easier to understand than disassembled C++. It would therefore make sense to also "distort" the overall structure of the program. The more advanced obfuscation techniques include class hierarchy changes, method inlining and outlining, loop unrolling, array folding/flattening, etc.
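
As an illustration of one of the structural transformations named above, here is a minimal sketch of array flattening written as plain Java source. A real obfuscator would perform the rewrite on bytecode; the class and method names are hypothetical, and the matrix is assumed to be square.

```java
final class ArrayFlatteningDemo {
    // Original form: a readable two-dimensional access pattern.
    static int traceOf(int[][] m) {
        int sum = 0;
        for (int i = 0; i < m.length; i++) {
            sum += m[i][i];
        }
        return sum;
    }

    // Flattened form: the same data in one dimension, indexed as row * size + column.
    static int traceOfFlat(int[] flat, int size) {
        int sum = 0;
        for (int i = 0; i < size; i++) {
            sum += flat[i * size + i];
        }
        return sum;
    }

    // Copies a square matrix into its one-dimensional representation.
    static int[] flatten(int[][] m) {
        int size = m.length;
        int[] flat = new int[size * size];
        for (int r = 0; r < size; r++) {
            for (int c = 0; c < size; c++) {
                flat[r * size + c] = m[r][c];
            }
        }
        return flat;
    }

    public static void main(String[] args) {
        int[][] m = { {1, 2}, {3, 4} };
        System.out.println(traceOf(m) + " " + traceOfFlat(flatten(m), 2));
    }
}
```

Both methods compute the same value, but the flattened one no longer reveals that the data was a matrix.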

Embedding a custom virtual machine into the application and translating the most sensitive methods to its instruction set is perhaps the most effective but at the same time one of the most expensive transformations.
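
To make the idea concrete, here is a toy sketch of such an embedded machine. The opcode set, the class name TinyVm, and the sample program are all invented for this example; a real protection tool would use a far richer and deliberately irregular instruction set. The "sensitive" computation x * 3 + 7 exists only as an array of numbers that the interpreter executes.

```java
final class TinyVm {
    // Hypothetical opcodes of the private instruction set.
    static final int PUSH = 0, LOAD_ARG = 1, ADD = 2, MUL = 3, RET = 4;

    // The sensitive method "x * 3 + 7", translated to the private instruction set.
    static final int[] PROGRAM = { LOAD_ARG, PUSH, 3, MUL, PUSH, 7, ADD, RET };

    // A minimal stack-based interpreter.
    static int run(int[] code, int arg) {
        int[] stack = new int[16];
        int sp = 0; // index of the next free stack slot
        int pc = 0; // index of the next instruction
        while (true) {
            switch (code[pc++]) {
                case PUSH:     stack[sp++] = code[pc++]; break;
                case LOAD_ARG: stack[sp++] = arg; break;
                case ADD:      sp--; stack[sp - 1] += stack[sp]; break;
                case MUL:      sp--; stack[sp - 1] *= stack[sp]; break;
                case RET:      return stack[sp - 1];
                default:       throw new IllegalStateException("bad opcode");
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(run(PROGRAM, 5));
    }
}
```

A decompiler recovers the interpreter loop without difficulty, but the protected logic is now data in a format the attacker has to reverse-engineer first. The interpretation overhead is the price.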

Finally, sophisticated algorithms may also be protected through mathematical transformations. The transformed code would compute the same results using different data types. However, such tools are much more expensive and often available only as part of a custom risk management solution. Another point is that such transformations may easily slow down the code by an order of magnitude or more, so you had better apply them only to the most sensitive pieces of code, provided they are not performance-critical.
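
One simple way to picture such a transformation, offered here only as an assumption about the general idea and not as a description of any commercial tool, is to keep values in an encoded representation and rewrite the arithmetic to work on the encoded form. The sketch below stores an int v as the long 3v + 5; the class name and the encoding constants are arbitrary.

```java
final class EncodedArithmetic {
    // Assumed affine encoding: e(v) = 3v + 5, stored in a wider type.
    static long encode(int v) {
        return 3L * v + 5;
    }

    static int decode(long e) {
        return (int) ((e - 5) / 3);
    }

    // Addition on encoded values: e(x + y) = 3(x + y) + 5 = e(x) + e(y) - 5.
    static long add(long ex, long ey) {
        return ex + ey - 5;
    }

    public static void main(String[] args) {
        long sum = add(encode(20), encode(22));
        System.out.println(decode(sum));
    }
}
```

The plain operands never appear between encode and decode, so an observer of intermediate values sees 65, 71, and 131 rather than 20, 22, and 42.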

Limitations:

  • Overobfuscated classes may not pass stricter bytecode verification anticipated in future JVM implementations.
  • Code flow obfuscation has a negative impact on performance.
  • Field engineering may be difficult.
  • As I mentioned above, the standard Java API classes may not be obfuscated, so references to them would be present in the decompiled/disassembled code. The obfuscator may however be capable of replacing some of the basic Java APIs with custom, obfuscated implementations.

Extra capabilities

A few things to keep in mind when choosing an obfuscator.

Incremental obfuscation

If you plan to issue updates to your obfuscated application, you have to ensure that the names of classes in the new version of your application are consistent with the version originally shipped to end users. When choosing an obfuscator make sure it can reproduce the renames made during the previous obfuscation session.
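
The requirement can be sketched as follows, assuming the obfuscator keeps its renames as a simple map from original to obfuscated names. Real tools use their own mapping file formats; the class IncrementalRenamer and the "n" + counter naming scheme are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class IncrementalRenamer {
    // Reuses the renames of the previous release and invents names only for new identifiers.
    static Map<String, String> rename(List<String> names, Map<String, String> previous) {
        Set<String> used = new HashSet<>(previous.values());
        Map<String, String> result = new LinkedHashMap<>();
        int counter = 0;
        for (String name : names) {
            String obfuscated = previous.get(name);
            if (obfuscated == null) {
                // Skip candidates already taken by the previous release.
                do {
                    obfuscated = "n" + counter++;
                } while (!used.add(obfuscated));
            }
            result.put(name, obfuscated);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> previous = new HashMap<>();
        previous.put("Trade", "n0");
        previous.put("Order", "n1");
        System.out.println(rename(Arrays.asList("Order", "Trade", "Invoice"), previous));
    }
}
```

Classes that existed in the shipped version keep their obfuscated names, so an update containing only the changed classes still links against the old ones.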

Class file optimizations

Many obfuscators can optionally optimize the class files for size by removing the unneeded elements such as unused methods, fields, and strings, design-time metadata, etc. However, care must be taken when using this feature, because a method or field may also be accessed using JNI or reflection, and it is not possible to reliably detect all such accesses even by analyzing the running program.

Bytecode optimizations supported by some obfuscators include constant expression evaluation, assignment of static and final attributes, inlining of simple methods such as getters and setters, peephole optimizations, and so on. However, the benefits of such optimizations are only substantial in constrained Java ME CLDC environments. The more sophisticated JVMs, such as Sun/Oracle HotSpot, IBM J9, and BEA (now also Oracle) JRockit, would apply these and many other optimizations during JIT compilation. You had better not stand in their way.

Debug info obfuscation

By default, the javac compiler writes source file names and line number information to the resulting class files; the -g option adds local variable information as well. Source file names and line numbers are required to get meaningful stack traces. An obfuscator may remove that information altogether, or change file names to meaningless strings. If you rely on stack traces when resolving customer issues, make sure your obfuscator comes with a reverse mapping utility that can reconstruct the original stack trace with unobfuscated names of classes and source files.

Note also that certain third-party libraries and frameworks require stack trace information to function properly. One example is Apache log4j.
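
A reverse mapping utility of the kind mentioned above could be sketched like this. The mapping format (obfuscated "class.method" to original "Class.method") and the class name StackTraceRestorer are assumptions made for this example; real tools ship their own utilities and formats, and they also need line number information to tell overloaded methods apart.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class StackTraceRestorer {
    // Matches the "at some.Class.method(" part of a stack trace frame.
    private static final Pattern FRAME =
            Pattern.compile("at ([\\w.$]+)\\.([\\w$<>]+)\\(");

    // Replaces an obfuscated class.method pair with the original one, if known.
    static String restore(String frame, Map<String, String> mapping) {
        Matcher m = FRAME.matcher(frame);
        if (!m.find()) {
            return frame;
        }
        String original = mapping.get(m.group(1) + "." + m.group(2));
        if (original == null) {
            return frame;
        }
        return frame.substring(0, m.start(1)) + original + frame.substring(m.end(2));
    }

    public static void main(String[] args) {
        Map<String, String> mapping = new HashMap<>();
        mapping.put("a.b", "Authentication.checkPassword");
        System.out.println(restore("\tat a.b(Unknown Source)", mapping));
    }
}
```

Frames that are not in the mapping, such as those of the standard Java API, pass through unchanged.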

Watermarking

Some obfuscators may embed a hidden customer or distributor ID into your class files, just like in digital media, enabling you to track down software pirates.

Source code obfuscation

Suppose your proprietary Java source triggers an annoying bug in your favorite IDE, and you have decided to reduce your source code to a test case. Before sending it to the IDE vendor, you may wish to run it through a source code obfuscator in order to replace identifiers with nonsense and remove comments.

Strengthening Protection with Ahead-Of-Time Compilation

As you can see, all three main approaches to Java code obfuscation have certain drawbacks and limitations, and don't solve the fundamental problems listed in the introduction. Fortunately, there exists a class of tools originally developed with the goal of improving the performance of Java applications. These tools are Ahead-Of-Time native code compilers, which take your jars and classes as input, compile them to optimized native code, and produce a conventional executable.

Remember the C++ to Java comparison at the top of the article? Most statements from the C++ column apply to AOT-compiled Java:

  • native, low-level instruction set
  • highly optimized code
  • static linking into a monolithic executable

This naturally leads us to the idea of a two-step approach to Java code protection:

  1. Obfuscate names and encrypt strings using tools that do not rely on the application being delivered in bytecode form. Make sure to disable control/data flow obfuscations.
  2. Compile the obfuscated application down to native code.

Refer to my other article for more information about AOT compilers.

Popular Obfuscators

An Internet search for "Java obfuscator" would return way too many results. For your convenience, I have reduced the list to a handful of actively maintained products, both commercial and free.

Product            License       Price
Allatori           Proprietary   $
Dash-O-Pro         Proprietary   $$$?
GuardIT for Java   Proprietary   $$$?
ProGuard           Open Source   -
Stringer           Proprietary   $
yGuard             Freeware      -
Zelix KlassMaster  Proprietary   $

Two products stand out from the crowd above:

Stringer does not only encrypt strings as its name might suggest. It can also encrypt resource files — images, media clips, etc. — found in your jar files, hide (standard) library calls, and inject integrity checks in your application, making sure that only the original, unmodified versions of your class files can execute properly.

GuardIT for Java does code/name obfuscation and string encryption, but then goes above and beyond to wrap your application code into an active protection system capable of detecting tampering in real time. These advanced techniques operate on bytecode, which makes GuardIT for Java non-interoperable with AOT compilers.

If you need to obfuscate your Java source, Semantic Designs' Thicket™ family of source code formatters includes a Java formatter with obfuscation capability. Another Java source obfuscator is Mangle-It from PC Sentinel Software.

Drop me an email if you know of a tool worth adding to this list.

As I have already mentioned above, the Sable group at McGill University has among its research projects a Java bytecode obfuscator called JBCO. It is not usable commercially, and will likely never be, but is worth looking at if you aim at Building a Better Obfuscator.

Further Reading

Books

The links to publishers and stores are not affiliate links.

Despite its title, Decompiling Java by Godfrey Nolan has a chapter on code protection, most of which is in turn devoted to obfuscation.

Update 22-May-2013: Godfrey has recently published a new book, Decompiling Android, which also has a section on protection.

Alex Kalinovsky in his Covert Java: Techniques for Decompiling, Patching, and Reverse Engineering again mostly covers the topics listed in the book title, but has also included a chapter on obfuscation and cracking obfuscated code. By coincidence, that particular chapter is available online, so I have just saved you twenty dollars. :)

Reversing: Secrets of Reverse Engineering by Eldad Eilam is not Java-specific, so it offers a broader view.

Popular articles

If you want to learn more about code and data flow obfuscation techniques and how they rank against each other in terms of potency, resilience and cost, the three-part series by Sonali Gupta, which appeared in Palisade Magazine in Aug-Oct 2005, would make a good start.

Research publications

A group of Rowan University researchers led by Prof. Ravi P. Ramachandran has recently studied commercial Java obfuscators. Their findings are summarized in two articles.

If you want some theory and controversy, the paper "On the (Im)possibility of Obfuscating Programs" by Boaz Barak et al., found in the 21st Annual International Cryptology Conference Proceedings, proves that obfuscation is impossible. It sparked quite some discussion and confusion, so Boaz later wrote an essay explaining what that result really means, in his opinion.

For more information about software protection in general, refer to Watermarking, Tamper-Proofing, and Obfuscation - Tools for Software Protection by C. Collberg and C. Thomborson, or the more recent Revisiting Software Protection by P.C. van Oorschot of Carleton University, Canada. Dozens of works listed in their References sections may keep you busy reading for many hours.

Vendor publications

PreEmptive Solutions, the maker of the Dash-O-Pro Java obfuscator, maintains a nice collection of whitepapers, presentations, articles, demos, reference information, and links related to obfuscation and software protection in general.

References

  1. A. Sundararajan. Retrieving .class files from a running app, 2007.
  2. V. Roubtsov. Cracking Java byte-code encryption, JavaWorld.com, 09-May-2003
  3. I. Kinash. A Gun is a Great Equalizer: OpenJDK Hack vs. Class Encryption, Java.DZone.com, 25-Jun-2012
  4. N. A. Naeem, L. Hendren. Programmer-friendly Decompiled Java, 2006 (PDF)

Obfuscation Examples

Let's consider a fictional application that stores user passwords as SHA digests:

import java.io.UnsupportedEncodingException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.*;

public class Authentication {

    public static byte[] encryptPassword(String password) 
        throws UnsupportedEncodingException, NoSuchAlgorithmException
    {
        String saltedPassword = password + "Add-Some-Salt";
        byte[] digestive = saltedPassword.getBytes("ISO-8859-1");
        MessageDigest md = MessageDigest.getInstance("SHA");
        md.update(digestive);
        return md.digest();
    }

    public static boolean checkPassword(String password, byte[] digest) 
        throws UnsupportedEncodingException, NoSuchAlgorithmException {
        
        if (Arrays.equals(encryptPassword(password),digest)) return true;
        System.out.println("Wrong password");
        return false;
    }
}

Compiling this class with the standard javac compiler and decompiling the resulting class file using one of the freely available decompilers yields:

import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

public class Authentication
{
    public static byte[] encryptPassword(String s)
        throws UnsupportedEncodingException, NoSuchAlgorithmException
    {
        String s1 = (new StringBuilder()).append(s).append("Add-Some-Salt").toString();
        byte abyte0[] = s1.getBytes("ISO-8859-1");
        MessageDigest messagedigest = MessageDigest.getInstance("SHA");
        messagedigest.update(abyte0);
        return messagedigest.digest();
    }

    public static boolean checkPassword(String s, byte abyte0[])
        throws UnsupportedEncodingException, NoSuchAlgorithmException
    {
        if(Arrays.equals(encryptPassword(s), abyte0))
        {
            return true;
        } else
        {
            System.out.println("Wrong password");
            return false;
        }
    }
}

As you may see, the only major differences in the decompiled source code are automatically generated names of parameters and local variables.

Let's run the above sample through a name obfuscator and decompile the resulting class:

import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

public class a
{
    public static byte[] a(String a)
        throws UnsupportedEncodingException, NoSuchAlgorithmException
    {
        String s = (new StringBuilder()).append(a).append("Add-Some-Salt").toString();
        byte abyte0[] = s.getBytes("ISO-8859-1");
        MessageDigest messagedigest = MessageDigest.getInstance("SHA");
        messagedigest.update(abyte0);
        return messagedigest.digest();
    }
    public static boolean a(String a, byte a[])
        throws UnsupportedEncodingException, NoSuchAlgorithmException
    {
        if(Arrays.equals(a(a), a))
        {
            return true;
        } else
        {
            System.out.println("Wrong password");
            return false;
        }
    }
}

Even though the obfuscator has replaced the public identifiers Authentication, encryptPassword() and checkPassword() with the meaningless, overloaded a, it is clear that these methods deal with the Security API and use the SHA algorithm. The salt string is also exposed.

Now, enabling string encryption makes the decompiled code a little bit more, well, cryptic:

import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

public class b
{
    public static byte[] a(String a)
        throws UnsupportedEncodingException, NoSuchAlgorithmException
    {
        String s = (new StringBuilder()).append(a).append(a.a("X|~4Nws|<Ksu!")).toString();
        byte abyte0[] = s.getBytes(a.a("]FX9(-&-1d"));
        MessageDigest messagedigest = MessageDigest.getInstance(a.a("E_\024"));
        messagedigest.update(abyte0);
        return messagedigest.digest();
    }
    public static boolean a(String a, byte a[])
        throws UnsupportedEncodingException, NoSuchAlgorithmException
    {
        if(Arrays.equals(a(a), a))
        {
            return true;
        } else
        {
            System.out.println(a.a("Cgxzw5cuofh{j1"));
            return false;
        }
    }
}

Okay, strings are now encrypted, but the import list is still there and it is perfectly clear that these methods use the Java Security API. So the hacker would still have little doubt over where to look for sensitive code, and s/he does not even need to reverse the encryption algorithm. All s/he needs is to extract the calls of the decryption method from the decompiled source:

// crack.java

public class crack {

    public static void main( String[] args ) {
        System.out.println(a.a("X|~4Nws|<Ksu!"));
        System.out.println(a.a("]FX9(-&-1d"));
        System.out.println(a.a("E_\024"));
        System.out.println(a.a("Cgxzw5cuofh{j1"));
    }
}

... and then compile and run the resulting code:

$ javac crack.java
$ java crack
Add-Some-Salt
ISO-8859-1
SHA
Wrong password
$

Voila!

That said, the more advanced tools such as Stringer take serious countermeasures against the above attack.

Let's try code flow obfuscation now. At first sight, there is not much code to obfuscate here, and code and data flows are pretty simple: just a bunch of standard API calls without any loops or exception handling. Indeed, the obfuscator I was using could only make one change after I enabled code obfuscation:

        // Code obfuscation disabled
        String s = (new StringBuilder()).append(a).append(a.a("X|~4Nws|<Ksu!")).toString();
        byte abyte0[] = s.getBytes(a.a("]FX9(-&-1d"));
        // Code obfuscation enabled
        byte abyte0[] = (new StringBuilder()).append(a).append(a.a("X|~4Nws|<Ksu!")).toString().getBytes(a.a("]FX9(-&-1d"));

Perhaps that is just a weakness of the code obfuscation features implemented in a particular product? Indeed, it is possible to make the result of decompilation much less readable with JBCO. But before I move on, a word of caution:

DO NOT TRY THIS AT HOME!

or, more seriously, do not try to use JBCO in a production environment. It is a research project, and as such is aimed at enabling researchers to try their ideas. It is not meant to be scalable, robust, or well documented.

Anyway, I was able to push JBCO to the limits on the original version of Authentication.class using the following command line:

java -Xmx256m -cp sootclasses-2.2.4.jar;polyglotclasses-1.3.4.jar;jasminclasses-2.2.4.jar;. soot.jbco.Main -cp .;scimark2lib.jar;"C:\Program Files\Java\jre1.5.0_13\lib\rt.jar";"C:\Program Files\Java\jre1.5.0_13\lib\jce.jar" -t:9:wjtp.jbco_cr -t:9:wjtp.jbco_mr -t:9:wjtp.jbco_fr -t:9:wjtp.jbco_bapibm -t:9:wjtp.jbco_blbc -t:9:jtp.jbco_gia -t:9:jtp.jbco_adss -t:9:jtp.jbco_cae2bo -t:9:bb.jbco_cb2ji -t:9:bb.jbco_dcc -t:9:bb.jbco_rds -t:9:bb.jbco_riitcb -t:9:bb.jbco_iii -t:9:bb.jbco_plvb -t:9:bb.jbco_rlaii -t:9:bb.jbco_ctbcb -t:9:bb.jbco_ecvf -t:9:bb.jbco_ptss Authentication

Here is the output produced by the decompiler:

import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

public class Authentication
{

// JavaClassFileOutputException: Stack underflow

    public static byte[] II1(String s)
        throws UnsupportedEncodingException, NoSuchAlgorithmException
    {
_L2:
        StringBuilder stringbuilder;
        S$$(stringbuilder, s);
        stringbuilder = S$$(s);
        return stringbuilder;
        stringbuilder = JVM INSTR new #121 <Class StringBuilder>;
        stringbuilder.StringBuilder();
        stringbuilder = S$$(s, stringbuilder);
        stringbuilder = S$$(S$$(ll1, stringbuilder));
        stringbuilder = S$$(S$$, stringbuilder);
        s = S$$(III);
        if(true) goto _L2; else goto _L1
_L1:
        throw ;
    }

    public static boolean S$$(String s, byte abyte0[])
        throws UnsupportedEncodingException, NoSuchAlgorithmException
    {
        if(!S$$(II1(s), abyte0)) goto _L2; else goto _L1
_L1:
        lll;
        lI1.booleanValue();
        JVM INSTR ifge 28;
           goto _L3 _L4
_L3:
        JVM INSTR pop ;
        break MISSING_BLOCK_LABEL_49;
_L5:
        throw ;
_L4:
        return;
_L2:
        JVM INSTR pop ;
        S$$(____, System.out);
        return II1;
        null;
          goto _L5
    }

    public static StringBuilder S$$(String s, StringBuilder stringbuilder)
    {
        return stringbuilder.append(s);
    }

    public static String S$$(StringBuilder stringbuilder)
    {
        return stringbuilder.toString();
    }

    public static byte[] S$$(String s, String s1)
    {
        return s1.getBytes(s);
    }

    public static MessageDigest S$$(String s)
    {
        return MessageDigest.getInstance(s);
    }

    public static void S$$(byte abyte0[], MessageDigest messagedigest)
    {
        // Placing a breakpoint here would reveal the salt string
        messagedigest.update(abyte0);  
    }

    public static byte[] S$$(MessageDigest messagedigest)
    {
        return messagedigest.digest();
    }

    public static boolean S$$(byte abyte0[], byte abyte1[])
    {
        return Arrays.equals(abyte0, abyte1);
    }

    public static void S$$(String s, PrintStream printstream)
    {
        printstream.println(s);
    }

    public static Boolean lI1;
    public static boolean ___;
    public static String ll1;
    public static String S$$;
    public static String III;
    public static String ____;
    public static int II1;
    public static int lll = 1;
        
    {
          goto _L1
_L3:
        Boolean boolean1;
        lI1 = boolean1;
        return;
_L1:
        ____ = "Wrong password";
        III = "SHA";
        S$$ = "ISO-8859-1";
        ll1 = "Add-Some-Salt";
        boolean1 = JVM INSTR new #15  <Class Boolean>>;
        boolean1.Boolean(true);
        if(true) goto _L3; else goto _L2
_L2:
        throw ;
    }
}

Reverting the above to a piece of Java source resembling the original Authentication.java is a very time-consuming task. But that is not necessarily what the attacker wants to achieve. Running the application under a debugger with a breakpoint set on MessageDigest.update() may reveal enough information about the password encryption scheme used in this fictional app.

Impact of Flow Obfuscation on Performance

You may now wonder how much such extensive transformations affect application performance. So did I, so my next step was to run a well-known benchmark suite through JBCO.

I have selected the SciMark 2.0a benchmark. It measures the performance of numerical computations typically found in scientific and engineering applications. These are the types of applications one may wish to protect against decompilation.

Another good thing about SciMark is that it validates the result of each test, which is useful for checking whether the transformations made by the obfuscator preserved the semantics of the original code. (Strictly speaking, I also had to disable obfuscation of the validation code so that it could serve as 100% proof.)

Here is the JBCO command line that I used:

java -Xmx384m -cp sootclasses-2.2.4.jar;polyglotclasses-1.3.4.jar;jasminclasses-2.2.4.jar;. soot.jbco.Main -cp .;scimark2lib.jar;"C:\Program Files\Java\jre1.5.0_13\lib\rt.jar";"C:\Program Files\Java\jre1.5.0_13\lib\jce.jar" -t:9:wjtp.jbco_cr -t:9:wjtp.jbco_mr -t:9:wjtp.jbco_fr -t:9:wjtp.jbco_bapibm -t:9:wjtp.jbco_blbc -t:9:jtp.jbco_gia -t:9:jtp.jbco_cae2bo -t:9:bb.jbco_cb2ji -t:9:bb.jbco_rds -t:9:bb.jbco_riitcb -t:9:bb.jbco_iii -app jnt.scimark2.commandline

(As you may see, I had to disable some of the transformations until finding a combination that does not cause JBCO to crash or hang on SciMark classes and results in emission of correct, verifiable bytecode. As I said above, JBCO is not meant to be production-ready.)

SciMark reports measurement results in terms of scores. A higher score is better. The original, non-obfuscated version produced the following output on Sun HotSpot 1.5.0_13:

SciMark 2.0a

Composite Score: 235.0564919826097
FFT (1024): 95.00221034941461
SOR (100x100):   457.27459020598775
Monte Carlo : 40.94500403786447
Sparse matmult (N=1000, nz=5000): 217.5829952655074
LU (100x100): 364.4776600542744
   .  .  .

Compared to that, the obfuscated version crawls like a worm:

SciMark 2.0a

Composite Score: 10.297235073178708
FFT (1024): 4.571244265099369
SOR (100x100):   14.39229057848735
Monte Carlo : 12.282002422350816
Sparse matmult (N=1000, nz=5000): 8.355773238345918
LU (100x100): 11.884864861610085
   .  .  .

The slowdown ranges from 3.3x for the Monte Carlo test to over 30x for SOR and LU. The composite score is 22.8x lower for the obfuscated version!
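
The slowdown factors are simply the ratios of the original scores to the obfuscated ones. The sketch below recomputes them from the two SciMark outputs listed above; the class name SlowdownTable is made up.

```java
import java.util.Locale;

final class SlowdownTable {
    // Slowdown factor: how many times lower the obfuscated score is.
    static double ratio(double original, double obfuscated) {
        return original / obfuscated;
    }

    public static void main(String[] args) {
        String[] tests = { "Composite", "FFT", "SOR", "Monte Carlo", "Sparse matmult", "LU" };
        double[] original = { 235.0564919826097, 95.00221034941461, 457.27459020598775,
                              40.94500403786447, 217.5829952655074, 364.4776600542744 };
        double[] obfuscated = { 10.297235073178708, 4.571244265099369, 14.39229057848735,
                                12.282002422350816, 8.355773238345918, 11.884864861610085 };
        for (int i = 0; i < tests.length; i++) {
            System.out.println(String.format(Locale.ROOT, "%-15s %.1fx",
                    tests[i], ratio(original[i], obfuscated[i])));
        }
    }
}
```

The FFT and sparse matrix multiplication tests, not quoted above, fall in between at roughly 21x and 26x.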

On the one hand, this means you have to be careful when obfuscating performance-sensitive code. On the other hand, if flow obfuscation has little or no impact on performance, it may be an indicator of obfuscation weakness. What an optimizing compiler such as Sun HotSpot can figure out may also be figured out by a person of ordinary programming skills, especially if equipped with something like Understand for Java.

All that being said, I claim again that ahead-of-time compilation to native code is way better than flow obfuscation, and invite you to try Excelsior JET, a Java SE 7 compliant JVM with an AOT compiler that my company makes. What else would you have expected to find at the end of an article on a vendor site, anyway?

An open-source alternative to Excelsior JET is the GNU Compiler for Java (GCJ). GCJ supports more platforms, but is way behind in terms of standards compliance, and has enjoyed little activity since the OpenJDK announcement. Please refer to my other article for a head-to-head comparison.

Let me reiterate that AOT compilers are interoperable with Java code protection tools that do not rely on the protected application remaining in bytecode form. For maximum protection, use such tools to obfuscate names and encrypt strings before native compilation.

I update this article regularly, so if you have any comments or questions, or know of resource/tool URLs which I should have added, please send them to me.
