One of the components of the GNU Compiler Collection (GCC) is GCJ, the GNU Compiler for the Java programming language. GCJ is a compiler that can generate both native code and bytecode from Java source files. GCJ includes a runtime library (libgcj) that provides all runtime support, the core class libraries, a garbage collector, and a bytecode interpreter. Programs created by gcj can dynamically load and interpret class files or native shared libraries, resulting in purely native or mixed native/interpreted applications.
I work as a Java programmer for a big IT company in Germany and recently had the chance to work with GCJ on Windows XP (the MinGW version from http://www.thisiscool.com/gcc_mingw.htm). I’d like to share a few impressions of GCJ 4.0.2.
When I was new at work, I brought up the possibility of compiling Java code natively for maximum performance, since that way you get rid of the VM and could benefit from much shorter execution times. I mentioned GCJ as an example. Coming from an ASM and C background, I knew what I was talking about.
At first my co-workers did not believe me that Java code can be compiled to native code, but then one day I had the chance to convince them that it is possible.
My manager asked me whether I could compile his lexical analyzer code using “that software package which I once mentioned”. He initially assumed that it would gain 2-5% in speed at best, but the compiled version turned out to be far faster: his code needed 25 seconds to parse some input on Java and 5 seconds once compiled with GCJ (these are the values as I recall them).
Sadly, another example from one of my co-workers took more than double the execution time with GCJ compared to running on the regular Java VM: his code took 1 second on Java and nearly 3 with GCJ. That earned me some jokes that native code isn’t that fast, or that Java is already as fast as native code. I am still convinced that this is not the case; I believe the difference is due to the fact that GCJ still needs a lot of work on optimization.
What I want to say, though, is that GCJ is definitely something great, and I would really like to see it generate much faster code.
Another bad thing besides the execution time was compile time: one of the big Parser.java files (800 KB) took nearly 1 hour to compile into the matching Parser.o file, with nearly 27 MB of ASM output.
In the end I do keep a MinGW system on my Windows XP box to run open source software and to have a “Unixish” environment around, and I keep the MinGW version of GCC too for various other tasks, such as compiling some *.dll files that we may require for other products.
So my final words to the GCC people: please make things faster. Faster execution of code and faster compilation of code.
The thing is that Java’s runtime optimizer is already excellent. With a native compile you will get faster startup times, because the core libs are statically linked instead of everything being pushed through a classloader/class verifier. When it comes to raw execution speed, though, you won’t get a significant speed boost; in fact, some of the statically compiled code will even run slower than the VM once the runtime optimizer has fully kicked in.
I came to realize that gcj was very usable when I started Eclipse as usual, but noticed after a few hours that I was running under gcj instead of the Sun JVM!
The combination of native compilation and the free libraries from the Classpath project makes it possible to develop software for platforms without any decent JVM. A boon for embedded/onboard projects!
As for speed: where a good JIT exists, native compilation does not bring much of a speed advantage. Otherwise, it’s the only game in town.
The one thing that really gets my goat about Java and discussions about performance is when people harp on and on about the initial start-up time. A lot of the time on short benchmarks the start-up time tends to dominate the benchmark, which makes it artificial. After all, if you use a benchmark as a general indicator of performance, you shouldn’t include something that is essentially a one-time hit (per run).
The only time Java’s start-up became an issue for me was when I tried replacing Notepad with jEdit as my quick-and-dirty editor, where I might open a file for only 45 seconds at a time and then not use the app again for another 5 minutes, so I close it.
Definitely a rant, but at least on topic!
Latencies, especially for computationally insignificant desktop applications, form a large part of the impression of performance to the user. This includes, naturally, the time it takes for a process to enter a usable state upon initial invocation.
If you are only concerned with throughput for comparison of compute-intensive problems, then this startup cost is insignificant. The startup cost isn’t artificially significant in microbenchmarks with small test sets; the tests simply don’t demonstrate the properties that you’re interested in. Both latency and throughput results are important, and which of the two, if either, is more important depends on the requirements of the user. It’s interesting, though, that various latencies can be hidden at the cost of memory, while throughput isn’t as trivial to improve.
Those of you who have not yet tried the 4.1 version of GIJ/GCJ should first give it a shot before painting it with the same brush as the older versions. From experience and from reading about the features of 4.1, there is a very significant improvement vs. 4.0 and a massive improvement over 3.x (the same is somewhat true of C++ compilation, FYI).
I can only hope that more of the Java community figures out that GCJ/GIJ could take the community to a new level on desktop computers (at the least) and dives in to make a good product into a great one.
Howdy
So you’re a “Java Programmer” for a big software house and you cannot understand that JVM startup time would swamp your examples?
This does not bode well for your employer. Static compilation in general WILL result in faster execution times, but at bloated app sizes (this is excluding the Java runtime), which may or may not balance out if the Java runtime is included in the calculations.
As for performance, this is where Java irks me, along with the zealots who spew crap about how Java must only perform optimisations at run time.
I’ve had this debate on this site before, and even when shown code examples of simple code bloat, one “Java Programmer” ignored the facts and basically said it should 100% be done at runtime!
There is NO reason at all not to lightly optimise (static optimisation) some aspects of a Java program at compile time (javac) when it makes sense to do so. No, I’m not talking about inlining, bounds checks etc., but simple things such as dead code removal (yes, it does some now, but it probably misses about the same amount because it does not recognise non-final components).
Having said all this, I really hope the developers at Sun have a look at GCJ and cherry-pick some ideas. Java really needs optimisation work on both runtime and memory performance, not on productivity or new APIs.
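To make the dead code point above a bit more concrete, here is a minimal sketch of the final vs. non-final difference. The class, field, and method names are made up for illustration. The assumption is that javac treats an `if` on a `static final boolean` constant as conditional compilation and leaves the guarded statement out of the bytecode, while it has to keep the same statement when the flag is not final.

```java
class DeadCodeDemo {
    // Compile-time constant: javac can fold `if (TRACE_FINAL)` and is
    // expected to leave the guarded println out of the class file.
    static final boolean TRACE_FINAL = false;

    // Not final: the value may change at run time, so javac has to emit
    // the test and the println, leaving any cleanup to the runtime optimizer.
    static boolean traceMutable = false;

    static int sumWithFinalFlag(int[] values) {
        int sum = 0;
        for (int v : values) {
            if (TRACE_FINAL) {
                System.out.println("adding " + v);
            }
            sum += v;
        }
        return sum;
    }

    static int sumWithMutableFlag(int[] values) {
        int sum = 0;
        for (int v : values) {
            if (traceMutable) {
                System.out.println("adding " + v);
            }
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4};
        // Both methods compute the same sum; only the emitted bytecode differs.
        System.out.println(sumWithFinalFlag(data));
        System.out.println(sumWithMutableFlag(data));
    }
}
```

Comparing the two methods in the output of `javap -c DeadCodeDemo` should show whether the println call survived compilation in each case.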
Thanks for your reply, but while your objection is generally correct, it doesn’t fit what I was trying to say.
I was talking about execution time, not startup time, and with as many optimizations applied as possible (e.g. moving calls like text.length() out of ‘for’ and ‘while’ loops when the length won’t change), plus -O2 -funroll-loops etc. at compile time.
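As a small illustration of the text.length() case, the following sketch shows the same loop before and after moving the call out of the loop condition. The class and method names are hypothetical, and whether the manual version is actually faster depends on the compiler or VM, which may well do this by itself.

```java
class LoopHoisting {
    // Calls text.length() every time the loop condition is evaluated.
    static int countSpacesNaive(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == ' ') {
                count++;
            }
        }
        return count;
    }

    // Reads the length once before the loop. This is safe because a
    // String is immutable, so its length cannot change inside the loop.
    static int countSpacesHoisted(String text) {
        int count = 0;
        final int n = text.length();
        for (int i = 0; i < n; i++) {
            if (text.charAt(i) == ' ') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String sample = "a b c d";
        // Both variants must agree on the result.
        System.out.println(countSpacesNaive(sample));
        System.out.println(countSpacesHoisted(sample));
    }
}
```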
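Here is an example of the kind of measurement I was talking about:
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ParserTiming
{
    public static void main(String[] args) throws IOException
    {
        // Load the input first so that file I/O is not part of the measurement.
        String text = someFileLoader("example.txt");
        long start = System.currentTimeMillis();
        myParser(text); // measure time of this method.
        long end = System.currentTimeMillis();
        System.out.println("execution time: " + (end - start) + " ms.");
    }
    // Reads the whole file into one String (platform default charset).
    static String someFileLoader(String name) throws IOException
    {
        return new String(Files.readAllBytes(Paths.get(name)));
    }
    static void myParser(String text)
    {
        // really complicated lexical parser code
    }
}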
For example, this part of the code took ~1000 ms on Java but ~3000 ms when compiled with GCJ. This is measured after class initialization, after the file loading etc.
It is just the pure execution of the method, measured by time. I did expect GCJ to generate code of similar speed, if not faster. I only have the 4.0.2 version, since that was the last one available on the MinGW site I linked above. I can’t test with 4.1 or later, but I do of course expect huge improvements there, which is why I mentioned the GCJ version I used.
So please don’t mix up startup time and execution time, since these are two totally different things. This is the same way we measured the execution time of C code (of course with different functions for time measurement) or of ASM (where we used to measure by raster lines and clock cycles).
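One way to keep both startup and the runtime optimizer’s warm-up phase out of such a measurement is to run the method a few times untimed and only then time it. The following is a rough sketch under that assumption, not the code we used at work: countTokens is a hypothetical stand-in for the real parser, and the warm-up and run counts are arbitrary.

```java
class SteadyStateTimer {
    // Written to after every run so that the calls cannot be optimised away.
    static volatile int sink;

    // Hypothetical stand-in for the parser: counts whitespace-separated tokens.
    static int countTokens(String text) {
        int tokens = 0;
        boolean inToken = false;
        for (int i = 0; i < text.length(); i++) {
            boolean ws = Character.isWhitespace(text.charAt(i));
            if (!ws && !inToken) {
                tokens++;
            }
            inToken = !ws;
        }
        return tokens;
    }

    // Runs the workload `warmups` times untimed, then returns the average
    // time in nanoseconds over `runs` timed runs. `runs` must be positive.
    static long averageNanos(String text, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            sink = countTokens(text);
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            sink = countTokens(text);
        }
        return (System.nanoTime() - start) / runs;
    }

    public static void main(String[] args) {
        // Build the input in memory so that file loading is not measured.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 100_000; i++) {
            sb.append("token").append(i).append(' ');
        }
        long nanos = averageNanos(sb.toString(), 20, 20);
        System.out.println("average time per run: " + (nanos / 1_000) + " us");
    }
}
```

On a VM with a JIT the first few runs are typically slower than the later ones, so the warm-up runs matter there. For natively compiled code they should make little difference, which is part of what makes a direct comparison tricky.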
I’ve been using FC5 for a while, and the GCJ 4.1 build of Azureus they package still has serious performance/memory/network issues. I can’t really use it for long, so I downloaded Sun’s VM and that’s running very happily.