Techsupport Fairy Tales: Main Thread Lives Two Lives

A number of old tests get a new life every time we add support for another target architecture or platform. Here comes a fresh record from our labs:

Environment

OS X

Patient

Popular free desktop Java Swing app that has been part of the Excelsior JET test suite for many years.

Symptoms

Clicking the right mouse button on the application’s tools panel results in:

  • expected behavior on Apple Java SE 6

  • NoClassDefFoundError on Oracle HotSpot 1.7.0_55, thrown because the class apple.laf.AquaPopupMenuUI is absent in the Oracle JRE for OS X. The application catches it and displays a dialog inviting the user to try the latest version or report the problem.

The latter is what we expected the natively compiled application to do, because the Excelsior JET Runtime includes exactly the same classes as the respective version of the Oracle JRE for the same platform.

However, the app did not show that dialog if natively built. Launched from a terminal, it would dump the call stack to that terminal, but without a terminal it appeared to the user that some functionality had been silently disabled for no apparent reason.

Examination

The execution paths diverge in the code that handles the initial NoClassDefFoundError: in the native build, that code itself throws a NullPointerException, effectively canceling the creation of the error dialog.

Here is a small excerpt:

Thread[] ts = new Thread[Thread.activeCount()];
Thread.enumerate(ts);

As you might have guessed, the code that follows iterates over the ts array, assuming that it has been filled with references to all active threads. Indeed, on HotSpot the entire array is full of valid references to Thread instances. But in the native build, the last element of the array is null, and the absence of a guard against that leads to an NPE.

It is easy to reproduce this problem on a small sample by having the above code executed from SwingUtilities.invokeLater().
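A minimal sketch of such a sample follows. The class name EnumerateRepro and the helper countNullSlots are hypothetical and not taken from the application. Whether any null slots show up depends on the runtime and on whether the main thread has already terminated by the time the event dispatch thread runs the code, so no particular output is guaranteed.

```java
import javax.swing.SwingUtilities;

public class EnumerateRepro {
    // Counts the array slots that Thread.enumerate() left unfilled.
    static int countNullSlots(Thread[] ts) {
        int nulls = 0;
        for (Thread t : ts) {
            if (t == null) {
                nulls++;
            }
        }
        return nulls;
    }

    public static void main(String[] args) {
        SwingUtilities.invokeLater(new Runnable() {
            public void run() {
                // Same two calls as in the excerpt above.
                Thread[] ts = new Thread[Thread.activeCount()];
                Thread.enumerate(ts);
                System.out.println("array length: " + ts.length
                        + ", null slots: " + countNullSlots(ts));
            }
        });
        // main() returns right away; the Runnable executes later
        // on the event dispatch thread.
    }
}
```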

Diagnosis

Fact is, activeCount() returns the number of all threads, both alive and dead, whereas enumerate() only goes over the threads for which isAlive() returns true. So in the general case, one must check for null references when iterating over an array filled by Thread.enumerate().
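One way to guard the iteration, sketched here with illustrative class and method names, is to rely on the return value of Thread.enumerate(), which is the number of threads actually put into the array, and to keep a null check as a second line of defense.

```java
import java.util.ArrayList;
import java.util.List;

public class LiveThreads {
    // Returns only the threads that enumerate() actually copied.
    static List<Thread> snapshot() {
        Thread[] ts = new Thread[Thread.activeCount()];
        int copied = Thread.enumerate(ts);   // number of filled slots
        List<Thread> result = new ArrayList<Thread>(copied);
        for (int i = 0; i < copied; i++) {
            Thread t = ts[i];
            if (t != null) {                 // defensive; filled slots should be non-null
                result.add(t);
            }
        }
        return result;
    }
}
```

Code that consumes the returned list never sees a null element, regardless of how many slots of the array were left empty.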

But why does the app work on HotSpot, whereas the native build fails? Here one subtle difference between the two implementations of the JVM specification comes into play:

Upon termination of the main thread, the HotSpot VM "reincarnates" it as a new, DestroyJavaVM()-thread, which waits for all other threads to terminate. (Actually, those two are the same thread, but from the JVM’s point of view they are different, hence me quoting the word "reincarnates".)

In the Excelsior JET Runtime, however, the main thread is marked as dead upon termination, and then "waits" for other threads to die. As a result, when called after the termination of the main thread, activeCount() returns a number that is greater than the number of threads that a subsequent call of enumerate() would, well, enumerate, greater by at least one.
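The difference can, at least in principle, be observed with a sketch like the one below: an observer thread waits for the main thread to terminate and then compares the estimate with the number of threads actually enumerated. The class name AfterMainExit, the thread name "observer", and the helper names() are hypothetical. On HotSpot one would expect a thread named DestroyJavaVM to appear in the list, but the exact output is runtime-specific.

```java
public class AfterMainExit {
    // Joins the names of the first 'copied' threads in the array.
    static String names(Thread[] ts, int copied) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < copied; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(ts[i].getName());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        final Thread mainThread = Thread.currentThread();
        Thread observer = new Thread(new Runnable() {
            public void run() {
                try {
                    mainThread.join();       // wait until main has terminated
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                int estimate = Thread.activeCount();
                Thread[] ts = new Thread[estimate + 4];   // some headroom
                int copied = Thread.enumerate(ts);
                System.out.println("activeCount() = " + estimate
                        + ", enumerate() copied " + copied
                        + ": " + names(ts, copied));
            }
        }, "observer");
        observer.start();
        // main() returns here, so the main thread terminates.
    }
}
```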

Treatment

We could possibly change the Excelsior JET Runtime so that it behaves identically to HotSpot in the above scenario. However, the specification does not guarantee that the result of an activeCount() call is at all times equal to the number of array elements that a subsequent call of enumerate() would fill. In fact, that cannot be guaranteed in the presence of dead threads, so the said code only works on HotSpot if there are none yet. A close inspection of the Thread.activeCount() javadoc would confirm that:

Returns an estimate of the number of active threads…

Therefore, the code of the application is incorrect in two respects. With respect to the Java specification, it relies on the behavior of a specific implementation, HotSpot. And in general, checking for null array elements would have prevented the problem.
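For code that really needs every live thread, the javadoc of ThreadGroup.enumerate() advises verifying that the returned count is strictly less than the array length, because threads that do not fit are silently ignored. A sketch of that idiom, with hypothetical class and method names, might look like this:

```java
import java.util.Arrays;

public class AllThreads {
    // Retries with a larger array until enumerate() leaves spare room,
    // which indicates that no thread was dropped for lack of space.
    static Thread[] enumerateAll() {
        int size = Math.max(Thread.activeCount(), 1) + 1;
        while (true) {
            Thread[] ts = new Thread[size];
            int copied = Thread.enumerate(ts);
            if (copied < size) {
                // Trim the unused tail so callers never see a null.
                return Arrays.copyOf(ts, copied);
            }
            size *= 2;   // array may have been too small; try again
        }
    }
}
```

The result is still only a snapshot: threads may start or die right after the call returns.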

Verdict: no treatment necessary (on our side, anyway).

Categories: Excelsior JET, Java
