I’ve known for a long time that the Java Just-In-Time compiler produces some very fast code but I didn’t realize how fast until I went one-on-one with it and lost.
At work, my manager sponsored a little programming contest. It’s the first of its kind in our department and the goal is to foster creativity and get developers talking with each other about problems solving. The problem itself is trivial. Given the Java method signature
public void reverseSent(char[] sentence)
reverse the order of the words contained in the char array. So, a sentence like, “Mike Heath is cool” would be morphed to “cool is Heath Mike”.
I wrote the following Java code to solve the problem:
public void reverseSent(char[] sentence) {
int i = 0;
int j = sentence.length - 1;
// Reverse the array
while (i < j) {
char tmp = sentence[i];
sentence[i] = sentence[j];
sentence[j] = tmp;
i++; j--;
}
// Reverse each word
i = 0;
for (int k = 0; k < sentence.length; k++) {
if (sentence[k] == ' ') {
j = k - 1;
while (i < j) {
char tmp = sentence[i];
sentence[i] = sentence[j];
sentence[j] = tmp;
i++; j--;
}
i = k + 1;
}
}
j = sentence.length - 1;
while (i < j) {
char tmp = sentence[i];
sentence[i] = sentence[j];
sentence[j] = tmp;
i++; j--;
}
}
My solution reverses the entire array and then re-reverses each word. This is so that I don’t have to allocate a second array and copy data back and forth between two arrays.
After writing it, I started thinking that this would be the perfect problem for pointer arithmetic in C or C++. So I decided to cheat in the contest. I would provide a Java method that would invoke some C++ code using Java Native Interface (JNI).
The C++ code I wrote is found below:
JNIEXPORT void JNICALL Java_NativeReverseSentence_reverseSent
(JNIEnv *env, jobject self, jcharArray sentence)
{
int length = env->GetArrayLength(sentence);
jchar *sentencePtr = (jchar *)env->GetPrimitiveArrayCritical(sentence, NULL);
// Reverse the sentence
jchar *i = sentencePtr;
jchar *j = sentencePtr + length - 1;
while (i < j) {
swap(i, j);
i++; j--;
}
// Reverse each word
i = sentencePtr;
for (jchar* k = sentencePtr; k < sentencePtr + length; k++) {
if (*k == ' ') {
j = k - 1;
while (i < j) {
swap(i, j);
i++; j--;
}
i = k + 1;
}
}
// Swap last word
j = sentencePtr + length - 1;
while (i < j) {
swap(i, j);
i++; j--;
}
env->ReleasePrimitiveArrayCritical(sentence, sentencePtr, 0);
}
The algorithms are identical but the in the C++ example, I’m not indexing each array; I’m using pointer arithmetic. Since Java doesn’t let you use pointers, I figured the C++ code must be faster. Array indexing requires multiplication and my pointer arithmetic is only doing addition and subtraction. A slight advantage but an advantage nonetheless.
To compare the performance of the two methods, I ran the Java version 1 million times to make sure the JIT had optimized as much is possible and then ran it a 1 million times again. I ran it 1 million times a third time just for good measure. I measured how much time it took to run each batch of 1 million. I did the same with the native code. The second and third Java measurements always had very similar run times. The first Java measurement was always slower but this was expected since the JIT needs time to go in and optimize the code.
The native code always ran at the same speed and had very little variance. This is expected because the C++ code is compiled once and doesn’t have any dynamic run-time compiling as is the case with the Java code.
The results of my experiment shocked me. Java won by a large margin. I was mad. How could Java beat my finely crafted C code?
I run Linux so, of course, I used GCC to compile my C code. I went back and recompiled it with -O3. This tells GCC to optimize as much as possible. After turning on the optimizations, my C code was faster. I smiled and thought to myself, “Should I have them put ‘Mike Heath’ or ‘Michael Heath’ on my programming contest trophy.” I didn’t think any of my coworkers would use JNI.
The contest was going to be run on Windows so I pulled out my Dell Latitude D810, booted it to Windows and compiled my code using GCC under Windows. I would have used the Visual C++ compiler but I could not create a .dll project with Visual Studio Express. Microsoft apparently doesn’t believe in allowing their free tools to be used for much more than a trivial project. That’s a rant for another day. Apparently it can be done with the command-line tools but I was too lazy to figure that out. Suffice it to say, I hate Visual Studio more now than ever.
It didn’t take long at all to create a .dll using GCC on Cygwin. Honestly, Cygwin is the only thing that makes Windows usable for a Unix geek like myself.
I ran through all my tests and sure enough. My C++ code still ran faster than the Java equivalent on Windows.
After getting everything working under Windows, I decided I had better try Java 6 to see how it compares. Java 6 has a number of significant performance improvements over Java 5. Java 6 was significantly faster than Java 5 but my C code was still faster. I patted myself on the back and decided to run my code on my Dell Lattitude D820. I wanted to make sure that everything would run fine without Cygwin being installed. I’ve only used Windows on my D820 to play Age of Empires III. Windows might not be hacker friendly but it is gamer friendly.
I ran the code on my D820 and much to my surprise, the Java code ran faster than the C++ code. I was shocked. How could this be? The C++ code was always faster on the D810. The only explanation I can think of is that the Java just-in-time compiler is producing code that takes advantage of my D820’s Intel Core 2 Duo processor. I’m sure it’s not using both cores (that would require threading) but the Core 2 Duo does have a 64-bit instruction set that the JIT may be using to push more data through more quickly. Further investigation is required.
So, I decided to submit my native code to the contest anyway because it lets my true geek shine through and I’m relatively certain the code will be run on a D810.
I will post a matrix in the near future that compares various JVM’s (Sun Java 5, Sun Java 6, IBM Java 5, JRocket, etc.) to my native code. I’ll run the experiments on both my D810 and my D820 and do both Windows and Linux on my D820. It will take some time to compile all the results but they should be interesting. I’ll also trying running the experiment on my single core 64-bit Athlon processor.
8 Responses
RR
February 26th, 2007 at 10:47 am
1I wonder if you had coded this in Emacs if your results would differ.
Totally Nerdy Information » Java JIT vs Native Code Performance Matrix
February 28th, 2007 at 9:51 pm
2[...] The performance results of our programming contest at work are listed below: Dell Lattitude D810 Pentium M 2.13 GHz [...]
Dave Wellman
March 1st, 2007 at 12:52 pm
3Nice Mike,
So did your C++ solution win the programming contest?
Dave
John
March 1st, 2007 at 2:29 pm
4This doesn’t surprise me. Your c code most likely IS faster, however there is a significant amount of overhead involved in JNI calls which probably accounts for the slowdown.
David Burggraaf
March 2nd, 2007 at 12:37 pm
5Comparing compilers (C++, Java Jit, etc.) on microbenchmarks (small peices of code) is generally not that useful. Optimizations for this sort of code have been around for a long time and any average and above compiler should do fine here (the differences will only be where a specific compiler was tuned for a specific peice of hardware). Where it gets interesting is when you look at large code bases. Here a just-in-time compiler (jitter), trades off more complex optimizations, for runtime jit performance; where as the heavy-weight, build-lab C++ compiler can perform very expensive operations.
It is very common for Java and .NET folks to take a tiny sample like this and say “look it’s faster”:).
Regarding you Visual C++ comments, here is a URL where you can download the express SKU. It contains all the tools necessary to build a DLL, or really, any compent that can be built with the full version - though there are some highed optimizations that have been removed, that are very valuable on itanium, and for any large performance sensitive applications.
It has been a fun contest:).
-David
David Burggraaf
March 2nd, 2007 at 12:38 pm
6I forget to give the URL for the Visual C++ express product: http://msdn.microsoft.com/vstudio/express/downloads/
mheath
March 2nd, 2007 at 1:39 pm
7David,
I downloaded the Visual C++ Express product but the option to build a DLL was disabled. Perhaps there a way to do it with the command-line compiler and linker but I didn’t have the time to dig that deep. Should I be doing something differently?
I agree about the microbenchmarks. They’re really not that useful but it’s still fun to see how different JIT compilers optimized differently. It would be really fun to analyze the assembly produced by each compiler and see what they’re doing differently.
Programming Contest Results by Totally Nerdy Information
April 6th, 2007 at 11:43 am
8[...] programming contest results are finally in. My C code prevailed and I am the biggest geek in the [...]
RSS feed for comments on this post · TrackBack URI
Leave a reply
Categories
Archives
Links
Meta
Calendar
Recent Entries
Recent Comments
Most Commented
Totally Nerdy Information is proudly powered by WordPress - BloggingPro theme by: Design Disease