Urban legends are kind of like mind viruses; even though we know they are probably not true, we often can’t resist the urge to retell them (and thus infect other gullible “hosts”) because they make for such good storytelling. Most urban legends have some basis in fact, which only makes them harder to stamp out. Unfortunately, many pointers and tips about Java performance tuning are a lot like urban legends — someone, somewhere, passes on a “tip” that has (or had) some basis in fact, but through its continued retelling, has lost what truth it once contained.
Eugenia, while I really do love your site, is there any reason why we are being informed about articles from early 2003?
Java’s performance is a myth
Actually, for an environment that defaults to turtle speed, Java has become quite fast… but there is still no way I’m going to use Java programs 🙁
Maybe I’m a bit biased, but hey, I prefer being biased, using C programs, and being able to run old hardware at amazing speeds 🙂
“Write once, run everywhere.”
Even Sun has admitted to this.
I have been making a living from C# & .NET for years but recently had a client requirement for an app that ran on MacOS and Windows.
Well, a Windows Forms app in C# would run great on Win 2000 or XP but *probably* not on MacOS.
So, I built the app w/ Java/SWT and I can honestly say I’m QUITE impressed!
I was worried about speed and resource consumption but I did not have a single problem. The app was blazingly fast and was the perfect solution for the job.
The native look & feel thanks to SWT was a nice bonus, but Swing would have been plenty fast according to my own tests; I opted for the native widgets for the “wow” factor and the ease of coding in SWT.
Startup times were quite minimal (a few seconds, even w/ Swing), and if it hadn’t required the JRE and launched from a JAR file, no one would have known it was a Java app at all.
1) Synchronization isn’t going to look slow in a tiny microbenchmark like that. One of the reasons synchronized methods are seen as slow is that they lock the entire object: once one thread enters a synchronized method, no other thread can enter any synchronized method on that object until the first has finished. This is why synchronization is seen as slow and why many people have come up with the ‘double-checked locking’ approach.
2) Agreed. It’s surprising how frequently this one comes up. My guess is that C++ programmers coming to Java see that all Java methods are virtual by default and get their knickers in a bunch, unaware that the JVM does inlining at runtime.
3) Immutable objects can be bad for performance. The reason his example code didn’t produce any noticeable difference is that string concatenation in Java creates multiple String objects anyway. So in his code:
stringHolder.setString("/" + stringHolder.getString() + "/");
the expression "/" + stringHolder.getString() + "/" will create four String objects (three input strings plus one resulting string). Notice that the same line appears in his example of immutable code, rendering his test completely useless, since he isn’t benchmarking what he thinks he’s benchmarking.
For a concrete example of why Strings and immutable objects can be slow, try implementing a simple program that counts the words in a file. If you work with Strings, the number of objects created explodes and your performance is going to be toast. Rewrite that code to use char arrays and you’ll see something like a 10x increase in speed (see the sketch below).
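A minimal sketch of that word-count comparison (my own illustrative code, not the article’s benchmark; the exact speedup will vary by VM and input):

public class WordCount {
    // Allocation-heavy version: split() creates a new String object for
    // every word in the input, plus the array that holds them.
    static int countWithStrings(String text) {
        String trimmed = text.trim();
        if (trimmed.length() == 0) return 0;
        return trimmed.split("\\s+").length;
    }

    // Allocation-free version: scan the characters directly, creating no
    // objects at all.
    static int countWithChars(char[] text) {
        int count = 0;
        boolean inWord = false;
        for (int i = 0; i < text.length; i++) {
            boolean isSpace = Character.isWhitespace(text[i]);
            if (!isSpace && !inWord) count++; // a new word starts here
            inWord = !isSpace;
        }
        return count;
    }

    public static void main(String[] args) {
        String sample = "the quick brown fox jumps over the lazy dog";
        System.out.println(countWithStrings(sample));             // 9
        System.out.println(countWithChars(sample.toCharArray())); // 9
    }
}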
All in all, the article is a nice attempt at debunking some of Java’s performance myths. Sadly, it leaves much to be desired.
If you’re programming Java on OS X, Swing is a very good choice since it is well integrated into the system. In fact, you can hardly tell a Swing app apart from a native app on OS X.
Works fine on my machine, an iBook G3 800 MHz. I’ve run jEdit on a Dell Inspiron 3700 (Celeron 433) and it works fine too.
Your machine has got issues.
Whenever Moore’s law has made the performance of Java programs somewhat tolerable, the people at Sun invent new features to reduce it again. For example, Java generics: unlike the .NET version, they do not eliminate boxing/unboxing, they just hide it.
Just run the following two benchmarks to see what I mean:
http://www.lambda-computing.com/publications/rants/csvsjava/javagenerics
http://www.lambda-computing.com/publications/rants/csvsjava/csgenerics
The Java version is 20 times slower than the C# version, and it uses 20 bytes per int where the C# version uses 4 bytes per int.
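For anyone who doesn’t want to follow the links, here is a rough sketch of the kind of code such a benchmark compares (my own illustration, not the linked benchmark; absolute numbers will vary by VM):

import java.util.ArrayList;
import java.util.List;

public class BoxingCost {
    public static void main(String[] args) {
        final int N = 1000000;

        // Generic list: each add() autoboxes the int into a heap-allocated
        // Integer object; the list itself stores only references.
        List<Integer> boxed = new ArrayList<Integer>();
        long t0 = System.currentTimeMillis();
        for (int i = 0; i < N; i++) {
            boxed.add(i); // autoboxing: Integer.valueOf(i) behind the scenes
        }
        long boxedMs = System.currentTimeMillis() - t0;

        // Plain array: 4 bytes per int, no object headers, no indirection.
        int[] plain = new int[N];
        t0 = System.currentTimeMillis();
        for (int i = 0; i < N; i++) {
            plain[i] = i;
        }
        long plainMs = System.currentTimeMillis() - t0;

        System.out.println("boxed list: " + boxedMs + " ms");
        System.out.println("int array:  " + plainMs + " ms");
    }
}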
Conclusion:
Java combines the speed of Python with the flexibility of C.
Same here: jEdit on my iBook G3 800 with 384 MB of RAM runs just fine. Not really a speed demon, but very workable.
Although I’ve ditched jEdit since TextMate came out. It has a better OS X feel to it, something Java applications never seem to get. They may look native, but they generally deviate more from the Apple HIG. TextMate is a real OS X app and has nice features, but somewhat limited syntax highlighting support. It supports Ruby though, and that’s enough for me 🙂
1- There are no generics in C# yet. But hey, you can already use them in Java.
2- For a real comparison of Java and C# generics, check here: http://www.jot.fm/issues/issue_2004_08/column8 . Indeed, Java is slow when using generics with value types (primitives), but there is not much glory for C# when the subject is references.
3- Generics and collections of primitive types are not the only part of an application, maybe only a fraction, and Java is still way faster than .NET (C#) in most other areas.
4- This is just the beginning for Java. In terms of usage there is almost no difference between Java and C# generics (Java is a little better in usage), and under the hood the VM dynamics and performance can be improved further. Overall, I am pretty sure Java gives better performance in most situations, with TRUE platform independence, true cross-platform IDEs, and true open source applications…
Quote from the article:
“For integer types (primitive type in Java and value type in C#), generic Java was 2.29 times slower than generic C#. That is significant. For reference types, generic Java was 1.20 times slower. That is not significant and is in part due to the overall efficiency of the Java JIT compared with the C# JIT (other experiments have suggested that the C# JIT is slightly more efficient than the Java 1.5 JIT).
In conclusion, the design decision to use code specialization for value types in C# has paid off in performance benefits. The wildcard semantics in generic Java appear to provide a more straight-forward mechanism for expressing complex generic constructions than the equivalent C# implementation.”
Have you run my benchmark?
I am doing nothing esoteric. Just a list with 1000000 ints. That should be a no-op for modern processors. Still, the Java version is *much* slower and consumes 5 times more memory than the C# version.
I know that microbenchmarks are not that significant. If there was a factor of 2 difference between the two implementations, I would have dismissed it as unimportant.
But a factor of 20 in performance and a factor of 5 in memory overhead cannot be dismissed. The Java generics implementation is fundamentally broken, and not just performance-wise.
Take a look at the weblog of Bruce Eckel (Thinking in Java) <http://mindview.net/WebLog/log-0050> . He thinks that the Java generics implementation is broken from a semantics point of view and does not deserve to be called generics.
He advocates calling it autocasting, since the only thing it does is insert the casts for you.
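A minimal sketch of what “autocasting” means in practice (my own example, not Eckel’s): after type erasure, the generic version compiles to essentially the same bytecode as the raw pre-generics version, with the compiler inserting the cast for you.

import java.util.ArrayList;
import java.util.List;

public class ErasureDemo {
    public static void main(String[] args) {
        // Generic version: no cast in the source...
        List<String> names = new ArrayList<String>();
        names.add("hello");
        String s = names.get(0); // ...but javac emits a checkcast to String here

        // Raw pre-generics version: the same checkcast, written by hand.
        List raw = new ArrayList();
        raw.add("hello");
        String t = (String) raw.get(0);

        System.out.println(s + " " + t);
    }
}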
I must admit that boxing objects for collections hasn’t been a big worry for me. I understand C# has excellent optimisations for native types, but a quick review of my code didn’t turn up many places where I would take advantage of them. (Most of my lists contain objects.)
I certainly wouldn’t give up Java’s platform support for that tiny improvement in the applications I write. If I needed collections for native types I could always write them. It’s not rocket science to write basic data types like a tree, list or hashtable. Every computer science lamer should be able to do at least that much.
I understand that C# does the writing bit for you (which is cool), but if autoboxed collections are really slowing me down, I know how to get around it; a sketch of the kind of thing I mean follows.
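For instance, a bare-bones growable list of unboxed ints might look something like this (a hypothetical hand-rolled class; the name and API are made up for illustration):

public class IntList {
    private int[] data = new int[16];
    private int size = 0;

    public void add(int value) {
        if (size == data.length) {
            // Grow by 2x: the usual amortized-constant-time strategy.
            int[] bigger = new int[data.length * 2];
            System.arraycopy(data, 0, bigger, 0, size);
            data = bigger;
        }
        data[size++] = value;
    }

    public int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return data[index];
    }

    public int size() {
        return size;
    }
}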
I sometimes wonder how many serious programmers are on here if they are pushing language X because it has smarter list objects. Not knowing how to roll your own linked list, array-backed list, LIFO stack, FIFO queue, hashtable, and various tree structures isn’t a good sign.
Do you people never wonder how kernel programming is done in C, which doesn’t come with any of this stuff by default?
As a matter of fact, I *do* Linux kernel programming <http://www.lambda-computing.com/projects/dnotify/> , so I am quite capable of writing my own collections, thank you very much.
But the point of a high level language should be to reduce code duplication and let one concentrate on the problem instead of reimplementing basic data structures again and again.
Java fails miserably in this respect. That is why abominations like this exist <http://www.osnews.com/story.php?news_id=8459> . C# is somewhat better, but not perfect.
I never tried Lisp, but in my favorite language, Clean <http://www.cs.kun.nl/~clean/> , the level of code reusability is outstanding. Unfortunately I have to work with C# for a living, but it beats Java any day.
Java collections’ support for primitive types is horribly broken. Generics add nothing apart from syntactic sugar. While I understand the decision to implement autoboxing, I think it’s a bad choice.
Java applications that need high-speed collections of primitives don’t use the built-in collection classes. Instead, most people I know use Colt, which is a very fast set of collections for primitive datatypes. If speed is important to you, don’t bother with the built-in collections and use something like Colt: http://dsd.lbl.gov/~hoschek/colt/
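For a feel of what that looks like, here is a minimal sketch (assuming Colt’s cern.colt.list.IntArrayList and its add/get/size methods, which keep unboxed ints in a plain backing array):

import cern.colt.list.IntArrayList;

public class ColtDemo {
    public static void main(String[] args) {
        IntArrayList list = new IntArrayList();
        for (int i = 0; i < 1000000; i++) {
            list.add(i); // no Integer objects allocated
        }
        long sum = 0;
        for (int i = 0; i < list.size(); i++) {
            sum += list.get(i); // returns a plain int
        }
        System.out.println(sum);
    }
}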
Sorry, wasn’t meaning anything. I haven’t done any C for years (can’t say I’m particularly favourable to any particular language anymore either).
I haven’t used the Trove stuff. I’d generally just roll my own if really needed. Most of the time the boxing of primitives isn’t a big deal, as it’s only a tiny part of the programming I do. If I’m ever dealing with massive collections of native types I guess I’ll think about it. (Hasn’t happened yet, though.)
There are some things I must say I’m not sure I like about .NET. I see the performance advantages of using a lot of native libraries (Perl does a lot of that too), whereas Java definitely tries to do most things in bytecode. It’s a bit of a double-edged sword in terms of performance versus sandboxing, tracing and platform independence.
Despite what everyone says, I see little advantage in a virtual machine that is tied to a limited set of platforms, whether by support for the virtual machine itself or by support for the binary libraries that everything is sure to be using. Surely compiling C# to native code would be better again? OCaml is apparently a language that takes a higher-level approach and produces native binaries.
I see a lot of people doing microbenchmarks (i.e. some test program running in a tight loop) and getting good results.
But that’s not the whole picture; there are other real issues with Java, like the startup time of the VM or memory consumption in a long-running app. The same goes for C# (the Beagle guys have on their TODO list to find some way to restart the daemon because of the large amount of memory Mono takes).
The more I think about it, the more I think that “virtual machines” are the wrong answer to today’s problems. There’s already a real machine; running software on a virtual one on top of it makes no sense. Security, portability… these are issues that could be solved with traditional methods without hitting the possible overhead/overdesign issues of running a program in a virtual machine which runs on a real machine. Virtual machines are just a quick and ugly hack for solving those problems, IMHO, and they discourage the efforts necessary to research solutions to the *real* problems.
I agree that many .NET libraries rely way too much on Windows native libraries (e.g. System.Windows.Forms). But you can write quite well-performing code in pure, safe C#, and there is a large repository of good C# libraries on the net.
So you don’t have to use native libraries if you don’t want to. Of course writing pure managed code takes a little more effort, but IMHO it is worth it.
Speed is important, but the actual speed gain for me would be minuscule. It’s handy to know that those libraries are out there if I ever need them, though.
Most of my work is server-side Java (“programming with strings for dummies”). I use a few Maps with effectively primitive keys, but not enough to replace the standard collections.
I can still write some horrid stuff in Java that will run like a one-legged turtle, but I can do that in nearly any language.
Performance is always a consideration and I have various levels of caching depending on the task at hand. However Java performance in recent years has improved significantly.
I can’t see myself switching to .Net anytime soon. I am keeping an eye on Parrot though. Perl is a real productivity language, especially for smaller projects.
“The same goes for C# (the Beagle guys have on their TODO list to find some way to restart the daemon because of the large amount of memory Mono takes)”
The Mono garbage collector is not exactly state of the art, so something like this is to be expected. There is nothing wrong with the concept of virtual machines, and many virtual machines already get close to native performance.
With features like native code caching, virtual machines have all the advantages of native code without the downsides, like having to recompile for each architecture.
Before you dismiss virtual machines, please note that every modern x86 processor is a superscalar RISC processor emulating an ancient CISC instruction set.
P.S. I am writing this from a Transmeta Crusoe processor, which internally uses a VLIW core and a JIT compiler.
I fully understand that, which is why high-performance primitive collections aren’t really a priority for Sun. Sure, they would be nice to have, especially if you’re writing a lot of numerical code, but how many people do that in Java?
Java is very strong on the server and most server apps don’t need fast primitive collections. Strings aren’t primitives so the existing collection classes work fine (enough).
But for those who do write numerical and scientific code in Java, it’s good to know that you don’t have to write your own classes since alternative collection classes like Colt do exist to make your life a lot easier.
Microbenchmarks are nothing but nonsense. I still say that real applications matter, and putting a meaningless 1000000 ints into a collection is not a real application. Plus, when you go to reference types there is almost no advantage after all, and honestly, in most cases I use objects in my collections, not primitives (value types). How frequently do you use a Map for storing integer values, or a Set for doubles? I might prefer using arrays for those purposes, if they fit, of course. If I see that performance “really” hurts when putting primitives into collections, I use one of the many third-party primitive collection libraries. There are plenty of them, but as I noted, I have never really needed one. Java performance has never failed me.
So, make a real application benchmark: let’s say a word processor, a BitTorrent client, a DB application, an image processing library, a spell checker… whatever, a real solid application. Apply it to both Java and C#, compare the development and running speeds (of course, after C# has generics), and then we may talk.
How’s the performance of the Transmeta Crusoe? From all the benchmarks I’ve seen, it seems to be a very poorly performing chip. Just curious as to how you find its performance, since you come across as a knowledgeable developer.
The performance varies. When you start up Windows XP it is quite slow, since all the code has to be JITed. However, I rarely do that; I hibernate it instead.
Some applications that perform repetitive tasks (e.g. DivX movie playback or encoding) perform remarkably well. Other tasks that require a larger working set are not so hot.
However, I bought this notebook (a Compaq TC1000 tablet pc) primarily for developing tablet applications and for its long runtime (5.5 hours with the background light set to low, 3 hours with full brightness).
Running Visual Studio .NET on the machine is not exactly blazingly fast, but it is certainly usable. That is the biggest application I routinely use on this machine. Everything else is quite fast, so overall I am quite happy with the machine.
With regard to the “final” myth: if it doesn’t result in higher performance, then the Java compiler is doing a poor job of type inference. The reason declaring classes final should help performance is this:
Consider the class hierarchy A -> B -> C.
Now consider the fragment “t.doFoo()”, where doFoo() is defined in A and t is known to be some derivative of type A.
If the compiler can infer that ‘t’ must be derived from type C, and C is declared final (which means that nothing in another compilation unit could have inherited from C), then ‘t’ must be exactly of type C, and the call to t.doFoo() can be inlined.
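In code, the situation being described might look like this (A, B, C and doFoo are the commenter’s placeholders; the inlining itself happens inside the compiler, the source only shows why it would be legal):

class A {
    void doFoo() { System.out.println("A"); }
}

class B extends A {
    void doFoo() { System.out.println("B"); }
}

// final: no other compilation unit can subclass C, so a reference that is
// statically known to be a C can only ever dispatch to C.doFoo().
final class C extends B {
    void doFoo() { System.out.println("C"); }
}

public class FinalDemo {
    public static void main(String[] args) {
        C t = new C();
        // Exactly one possible receiver type, so a compiler (static or JIT)
        // may devirtualize this call and inline C.doFoo().
        t.doFoo();
    }
}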
The Java JIT does a remarkably bad job of optimizing this kind of thing. For example, all the primitive type wrappers, such as Integer, Long, etc., are immutable.
As fans of functional programming know, as long as a value is immutable there is no observable difference between value semantics and reference semantics. The problem only shows up if you change a value that is referenced from somewhere else, which is of course impossible with immutable objects.
So in theory the Java JIT could optimize the primitive type wrappers so that they live on the stack and not on the heap. But it does not do that.
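A sketch of the kind of code this is about (my own example): the wrapper below never escapes the loop, so in principle it could live in a register or on the stack, yet the JIT of that era always performed a real heap allocation. (Later HotSpot releases did add escape analysis for cases like this.)

public class BoxedLoop {
    public static void main(String[] args) {
        long sum = 0;
        for (int i = 0; i < 1000000; i++) {
            // Allocates an immutable Integer on the heap (small values come
            // from a shared cache), only to unwrap it again immediately.
            Integer boxed = Integer.valueOf(i);
            sum += boxed.intValue();
        }
        System.out.println(sum);
    }
}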
Ran this on an IBM JVM (JIT enabled) 1.4.1 and the trivial synchronized test took ~10x more time to run than the trivial unsynchronized test. Kind of blows his argument out of the water…
Hello, I was the one who tried the mini-benchmark with the IBM VM (on Linux PPC, kernel 2.4.22g).
I also just tried this on a powerbook using Apple’s implementation of Sun’s VM 1.4.1_01-14 and the results were an order of magnitude worse.
As to the age of the articles: a funny thing I’ve noticed over the last few years is that an interesting article appears here, and then a few days, a week, or even a month(!) later it sometimes appears on /. So where does everyone else go for relevant news any more? /. used to be good, and OSNews seems to be mostly good, but I wonder, is it the dearth of technology news (real news, not news covering rehashes of old technology hacked up to do something slightly different, or combined with some other technology to make a “new” product)? It seems like most of the “real” news recently has been related to things like the X Prize, biotechnology, and quantum mechanics…
You’re assuming that the JIT doesn’t do the inlining at runtime, which it does. Declaring doFoo() final might result in a speed increase for a while, while HotSpot is still analyzing the code, but once HotSpot is finished the effect should be the same as if the method had been inlined.
Note that HotSpot has a warm-up time that can take up to a few minutes, so if your program runs for shorter lengths of time, it might be beneficial to make your method final, since the program terminates before HotSpot can kick in.
I remember Sun pushing Java performance tuning at the JavaOne conference in 1998. They still haven’t managed to improve the performance of the (poorly designed) Java virtual machine.
Architecturally, Java can’t be fast. Sun needs to re-architect the virtual machine to truly enable Java’s performance.
Use .NET if you want speed.
LOL. Your reasoning is perfect. I am sure you are a respected engineer in virtual machine engineering(!). Everybody knows the Java VM design is superior. HotSpot Server, anyone?
And what part of the JVM is poorly designed? Why would Java programs be inherently slow? How is the .NET VM better? No details?
I smell trolls….
The problem is not so much with the JVM as with the bytecode.
For example, there are no value types, and it is impossible to return anything other than a primitive or a reference from a method. So you need objects on the heap for almost everything; see the sketch below.
In the cases where you can avoid using objects, HotSpot does an outstanding job. But even the best VM can’t eliminate the overhead created by the fact that you need to allocate a full-fledged object on the heap for the most trivial stuff.
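To make that concrete, here is a small sketch (the Point class is made up for illustration): a method that wants to return an (x, y) pair has no value type available, so every call allocates a heap object, where C# could return a struct by value.

public class NoValueTypes {
    static final class Point {
        final double x, y;
        Point(double x, double y) { this.x = x; this.y = y; }
    }

    // Bytecode can only return a primitive or a reference, so a fresh
    // Point is allocated on the heap for every single call.
    static Point midpoint(Point a, Point b) {
        return new Point((a.x + b.x) / 2, (a.y + b.y) / 2);
    }

    public static void main(String[] args) {
        Point m = midpoint(new Point(0, 0), new Point(2, 4));
        System.out.println(m.x + ", " + m.y);
    }
}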
Value types are not a real performance issue unless you do high-speed computing stuff.
Having programmed in various VM languages and having done bigger projects over the last 7-8 years, I can say that for almost 90% of apps the VM speed is not a big issue; the biggest problems arise from the wrong algorithms being used for certain tasks.
I am sure that around 95% of all programs out there could gain a significant speed boost just by locating hotspots with a good profiler and redesigning the central algorithms.
Having value types, and collections that work on native datatypes, does not make that much of a difference until those other means of optimization have been mostly exhausted.
I think the major speed difference between the two concepts in real-world usage comes from the compile philosophy. The JVM uses a write-once, run-anywhere mentality. The .NET CLR is similar, but since code only gets translated to machine code once and is then cached indefinitely, it turns into a write-once, compile-anywhere mentality. If I run the same app day after day, in Java I’m re-running the conversion every time I start the app; in .NET I only do it one time. While the JIT is good at caching the frequently called functions, it would be better if it just dumped out a local executable once, kept until the local cache is cleared, and thus only had to do this at install time or first usage. Does JDK 1.5 allow something like this?
Actually, JDK 1.5 does a couple of things to speed up the compile process.
1. The bytecode of the system classes is memory-mapped into a memory space shared by all JVMs.
This helps because the system classes cannot change; once loaded by one VM, they can be shared by all.
2. Bytecode is also pre-compiled into a halfway step, ready for inlining and optimization. Rather than compiling statically, Java HotSpot tries to dynamically inline as much code as it can. This is one major cost in ALL object-oriented code: simply using a get/set method instead of accessing the variable directly can cause a large overhead. With inlining, however, this goes away: HotSpot can dynamically inline many methods to increase speed. 1.5 shares the precompiled bytecode for the system classes. The next step might be to precompile every class, but I don’t think that is possible with the current JVM APIs, thanks to dynamic class paths. If you use a dynamic class path, you cannot pre-compile code; they would have to make a new class loader that statically determines class paths and pre-compiles and stores code on disk, maybe.
Dynamic code generation via BCEL (for XSLT) and other bytecode generation libraries also makes statically compiling code hard. There are also neat constructs like the Proxy interface in Java that allow for a dynamic implementation of an interface, created totally at runtime. These features are more like Smalltalk, or like scripting languages.
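Here is a minimal sketch of that Proxy feature (my own example): an implementation of an interface conjured up entirely at runtime, with no corresponding class file anywhere, which is exactly what makes fully static ahead-of-time compilation hard.

import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

public class ProxyDemo {
    interface Greeter {
        String greet(String name);
    }

    public static void main(String[] args) {
        Greeter g = (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class[] { Greeter.class },
                new InvocationHandler() {
                    public Object invoke(Object proxy, Method method, Object[] a) {
                        // No Greeter implementation exists in any .class
                        // file; the behavior is decided here, at call time.
                        return "Hello, " + a[0] + " (via " + method.getName() + ")";
                    }
                });
        System.out.println(g.greet("world"));
    }
}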
Blaming the JVM or language for poor performance is akin to blaming cars for reckless driving. In my career doing Java development for the past 8 years, I’ve observed again and again that people’s ingenuity in contriving poorly-architected/designed/written systems will ALWAYS trump any systemic performance limitations. And usually by several orders of magnitude.
Gosh, what could possibly be the weaker link–the JVM with hundreds of millions of dollars of R&D behind it or the indentured H1-B fresh off the boat after a couple months of classroom training?
Why would you want to recompile a program every time you run it?
“If the compiler can infer that ‘t’ must be derived from type C, and C is declared final (which means that nothing in another compilation unit could have inherited from C), then ‘t’ must be exactly of type C, and the call to t.doFoo() can be inlined.”
IIRC, HotSpot can inline virtual methods if no overrides are currently loaded, and then un-inline in case a class is later loaded that breaks that assumption.
So, unless things have changed since that was (IIRC) true, then no, final is not needed: Java can inline that call anyway.
Final classes were hints for the JIT compilers but HotSpot takes a different approach.
.NET is better than the JVM? LOL. You clearly haven’t looked at either one.
Example: the JVM’s garbage collector is superior to .NET’s. Its dynamic compilation is also better.
“Why would you want to recompile a program every time you run it?”
Exactly. This is entirely impractical. Last time I did a full build on my most complicated C++ application, it took 3 hours to build. I don’t know about anyone else, but I don’t feel like starting my application and having to wait three hours before I can use it.
Upon your suggestion, I thought I’d take a look at Colt. However, the download page gives a 403 Forbidden. Do you know where else I might find it?
It works fine for me at the moment. But if that site fails, you can get an older version of Colt from http://hoschek.home.cern.ch/hoschek/colt/
“There are also neat constructs like the Proxy interface in Java that allow for a dynamic implementation of an interface, created totally at runtime. These features are more like Smalltalk, or like scripting languages.”
Yet those features never stopped Smalltalk from having good static compilers.
The dynamic compilation is not a recompilation per se; as far as I know, 1.5 (and probably older versions) has a binary code cache which caches the compiled code.
But the main thing, from a C++ point of view, is that static compilation in C++ takes ages if you don’t have precompiled headers and other tricks, and then you have the link process on top of that.
Java has none of this: there are no headers, the compilers themselves are blazingly fast, and the bytecode is then compiled on the fly to native machine code as the program runs.
Next, statistics are constantly gathered to find performance hot spots, which are then treated on the fly with optimization patterns; a speed gain of more than 100% within the first half minute of a Java application is not uncommon.
So what we have here is a kind of extra thread which just tries to optimize while the application runs.
Given modern systems this is not a drag; the performance penalties are covered by the improved speed the constant optimization brings.
The program isn’t profiled constantly!! That would result in horrible performance. There is a warm-up period during which HotSpot profiles the code to determine where the program spends most of its time. The optimizations are all done in this stage. Once the warm-up period is over, no more profiling is done and the program just runs. No more optimization.