Linked by Peter Gerdes on Mon 10th Jan 2005 17:35 UTC
Editorial As a recent ACM Queue article observes, the evolution of computer languages is toward later and later binding and evaluation. So while one might quibble about the virtues of Java or the CLI, it seems inevitable that more and more software will be written for, or at least compiled to, virtual machines. While this trend has many virtues, not the least of which is compatibility, current implementations have several drawbacks. However, by cleverly incorporating these features into the OS, or at least including support for them, we can overcome these limitations and in some cases even turn them into strengths.

Sorry to plug publications, but you may wish to look at:
The Jamaica project has developed an OS (based on JNode) built around the Jikes RVM. We have also developed the PearColator DBT, which can be incorporated into it. This is all written in Java and downloadable; however, it's not yet fully featured.

You seriously need to check out LLVM
by Chris Lattner on Mon 10th Jan 2005 18:09 UTC

LLVM provides many of the capabilities that you want without the drawbacks you describe. In particular, it gives you portability, performance, and the ability to adapt to changing hardware. Its compile-time costs are very low: it provides a CFG, SSA form, and many other things directly in the representation. It also allows for compile-time, link-time, install-time, run-time and off-line ("optimizing screensaver") optimization.

If you're interested in this, check out these papers:


Slim binaries
by Erik Terpstra on Mon 10th Jan 2005 18:57 UTC
kernel a bad idea?
by mattb on Mon 10th Jan 2005 19:00 UTC

I'm just an ignorant Java developer who has an idea of what's being talked about, but at a self-taught/hobby level.

I would like to say this is one of the most interesting articles I've read here in a while. Managed languages are definitely the way of the future. But if they are destined to become such an integral part of the operating systems of the future, wouldn't it make more sense to put the low-level stuff into the kernel? Wouldn't that allow for things like load handling and process optimization? Maybe even a sandbox for the sandbox: have a generic, VM-independent layer that would let the kernel know more about what's going on with the VM. I would think that the fewer levels of abstraction between the VM and the CPU, the better. Would you mind explaining why it would be a bad idea?

One of the big problems with Java, at any rate, is that although platform independence sounds real nice, you still have platform-dependent bugs and platform-dependent optimization. Profiling a Java app on Windows can give wildly different results than on Linux. And if we take the cross-platform benefits off the table for a second, don't we just end up with a slower implementation of the kernel?

VM sharing
by Ian Burrell on Mon 10th Jan 2005 19:00 UTC

There is no need for special hooks in the OS for sharing the code of the JIT between processes. Most modern OSes keep a single copy of the read-only code segment in memory and share it between processes. The JVM and its libraries will be shared.

This doesn't apply to the code produced by the JIT. Instead of having special handling for code produced by the JIT, one solution is to use the existing shared library mechanism. This is done by gcj; it incrementally compiles Java bytecode into .so files. These can then be loaded and shared normally.

caching is good
by Yamin on Mon 10th Jan 2005 19:08 UTC

I think in general the idea is there. Caching native code is the major, easily doable part.

I wouldn't be so eager to jump in and try to force VMs (Java...) to be regular applications, though. To a certain extent, we 'trust' our OS and our hardware to be perfect while programming. In a similar sense, you should eventually be able to 'trust' your JVM. Things like memory protection aren't really relevant in a VM like Java's.

You don't have random access to memory, so what is there to protect? Why bother with a context switch between Java applications? Also, additional VM extensions to security...

Now, moving some JVM stuff into the OS also makes some sense. But now you lose some of the isolation between the OS and the VM.

RE: Chris Lattner
by A stranger on Mon 10th Jan 2005 19:43 UTC

Interesting, but I'm not certain how well it would work with, say, LISP or Scheme.

I do see the advantage of the above, and other such efforts, as great for those not so enamored with porting systems code and libraries around.

by Rich Massena on Mon 10th Jan 2005 21:11 UTC

Microsoft .NET has the CLR, for Common Language Runtime; CLI is usually command-line interface.

Why VM?
by Mike Hearn on Mon 10th Jan 2005 22:01 UTC

I'm not really sure that a VM is worth the overhead, given that you can compile to native code using tools like gcj these days. Given the dominance of x86, I'm not even sure abstracting the CPU is worth it either...

Re: caching is good
by Nicolai on Mon 10th Jan 2005 22:39 UTC

I wouldn't be so eager to jump in and try to force VMs (Java...) to be regular applications, though. To a certain extent, we 'trust' our OS and our hardware to be perfect while programming. In a similar sense, you should eventually be able to 'trust' your JVM. Things like memory protection aren't really relevant in a VM like Java's.

Oh, but if there is one thing that history (and security researchers) tells us, it's that programs have bugs. They always do, and they always will, and nothing you do can change that.

Putting the VM into its own context is an additional security measure that will greatly reduce the significance of any kind of exploit (or instability) that can be achieved via bugs in the VM.

re: RE: CLI
by jb on Mon 10th Jan 2005 22:55 UTC

CLI is, in .NET land, the Common Language Infrastructure.
Essentially all the Java-style "built-in" classes.

JIT is faster than static compilation
by Slava on Mon 10th Jan 2005 23:34 UTC

Theoretically a VM can achieve higher performance than static compilation. A VM can use runtime type feedback -- that is, it can look at what parameter types functions are being called with and compile specialized versions of those functions for those parameter types, thus avoiding runtime type dispatch. A static compiler can also compile specialized versions of functions, but it does not have the luxury of runtime type feedback, so it doesn't really know how many of which specializations to compile. Before you say that this only applies to dynamically-typed languages, even some languages that we think of as statically-typed, such as C++ and Java, have runtime type dispatch, for example in the implementation of virtual methods, and thus they can benefit from runtime type feedback.

Sun's implementation of Java does this, but Java has a reputation of being slow so it is not a good example. Other languages that use this technique include Self and VisualWorks Smalltalk.
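To make the idea concrete, here is a toy sketch of runtime type feedback (all names hypothetical, and Python is used purely for illustration, not any real JIT): a call site records which receiver types it actually observes, and while it remains monomorphic it can take a guarded "inlined" fast path instead of a full dynamic dispatch.

```python
import math

class Circle:
    def __init__(self, r):
        self.r = r
    def area(self):
        return math.pi * self.r * self.r

class Square:
    def __init__(self, s):
        self.s = s
    def area(self):
        return self.s * self.s

class CallSite:
    """Per-call-site type profile, like the one a JIT keeps for a virtual call."""
    def __init__(self):
        self.seen = set()

    def area(self, shape):
        self.seen.add(type(shape))
        if self.seen == {Circle}:
            # Monomorphic so far: guarded fast path with Circle.area "inlined",
            # skipping the general virtual dispatch entirely.
            return math.pi * shape.r * shape.r
        return shape.area()  # polymorphic fallback: full dynamic dispatch

site = CallSite()
total = sum(site.area(Circle(1.0)) for _ in range(1000))
total += site.area(Square(2.0))  # a second type invalidates the fast path
```

A real VM does this at the machine-code level, recompiling the call site when the guard fails; the profile-then-specialize structure is the same.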

Meaning of slow
by eric martin on Tue 11th Jan 2005 00:20 UTC

Java slowness = the virtual machine being a memory hog, if you have an old computer and don't have 256-500 MB of RAM.

by Rich Massena on Tue 11th Jan 2005 01:06 UTC

Ahh. I should have RTFA, my mistake.

Re: JIT is faster than static compilation
by Anonymous on Tue 11th Jan 2005 01:13 UTC

The JIT and its runtime will always know far, far more about the executing state of the machine than a static compiler and can adapt accordingly.

In a similar way, assembly actually runs slower than C in a lot of cases because the C compiler can produce code that places the processor in a specific state, and the processor can easily predict what goes on from there and execute ahead.

You can do more work AND run faster just by knowing more about the executing states of the machine. This isn't 1990 anymore.

Re: Meaning of slow
by bleyz on Tue 11th Jan 2005 03:06 UTC

Java slowness = Java being slower and its GC performing worse than similar VM languages. Not to bash Java, but to point out that the JVM can probably be vastly improved.

OTOH you hit the nail on the head: when people claim Java has gotten faster, they overlook that it's their hardware that's gotten better.

Re: JIT is faster than static compilation
by Deletomn on Tue 11th Jan 2005 03:38 UTC

(Note: I decided to address the C vs assembly comment first)

Anonymous: In a similar way, assembly actually runs slower than C in a lot of cases because the C compiler can produce code that places the processor in a specific state, and the processor can easily predict what goes on from there and execute ahead.
A human can do by hand the same thing as the compiler. The only difference is that it's easier to let the compiler do it. This is especially true if you are developing software for more than one platform, or are new to the platform, or the platform has some recent changes, or the software you're developing is especially complicated, or you don't have all the necessary documentation for the platform, etc.

Anonymous: The JIT and its runtime will always know far, far more about the executing state of the machine than a static compiler and can adapt accordingly.
Wrong. You can write a "static" program manually that will know everything about the computer it's running on and adapt; all you need to know are the appropriate instructions and techniques. (In fact, there are some optimizations that can be done manually that to my knowledge are still not done by any JIT.) And if you can do it by hand, then it's quite possible that a "static" compiler can be developed to do the same (or similar) thing.

The real advantages of a JIT are: 1) A program can be written a long, long time ago and the JIT can still apply the latest optimization techniques to it, whereas if it was compiled with a "static" compiler, it's stuck with whatever optimization techniques were in use at the time. 2) A statically compiled program needs to have all of its optimizations "built in", and that could take up a lot of space. 3) If the optimizations are done by hand, that's going to require a fair amount (to a lot) of effort, knowledge, and time just for the optimization, whereas with a JIT it's all done for you, so you can focus on simply making the program work.

However, I can say that I feel doing some optimizations by hand can be a big help. I've known of situations where the programmers found tons of bugs (and higher-level optimizations which the compiler couldn't do) simply by converting the program to assembly language by hand. The reason this helped is really quite simple: it provided them with a totally different perspective on the program, and suddenly hard-to-find bugs (including ones which the programmers and debuggers had no idea existed) "suddenly appeared" and were "easily" eliminated.

Re: JIT is faster than static compilation
by Deletomn on Tue 11th Jan 2005 03:44 UTC

I forgot to add something...

Anything you (or a JIT) might do to optimize a program while it is running is going to require some overhead of its own. Whether this overhead will outweigh the optimizations is a good question which you need to ask.

Personally, I don't know if it does or not with the current JITs. Perhaps someone else does.

With such optimizations done by hand though, they are "easily" checked and eliminated through a sufficient level of testing. (I'd imagine that a JIT should be capable of doing the same thing automatically.)

Thanks and more explanation
by logicnazi on Tue 11th Jan 2005 04:28 UTC

First of all, I wanted to thank everyone for the thoughtful consideration and interesting responses. In particular I found the LLVM stuff fascinating. While it doesn't quite address all the things I had in mind, it does go a long way there.

Now a few comments in response to what people have said.

<h2>Why Use VMs</h2>

First of all, there are several good reasons to use virtual machines rather than static compilation. As several people here have accurately pointed out, there are some performance benefits to doing things at runtime. Additionally, there are many garbage collection benefits to working in a virtual environment, as additional information is available, letting one avoid the drawbacks of conservative GC. In particular, I think it would be very difficult to provide guaranteed finalization in a statically compiled environment.

Moreover, static compilation doesn't provide for binary compatibility. While theoretically one could simply provide guaranteed source compatibility, the pragmatics of software development make it quite unlikely that this would really be effective. Even pure ANSI C programs usually aren't write-once-run-anywhere. Quite simply, as long as the development environment is focused around the execution of native binaries, the temptation for developers to take advantage of binary features incompatible across platforms is simply too great. Furthermore, without fat binaries or a solution like I am suggesting, it seems difficult to provide transparent binary copying between platforms and architectures.

Still, these issues may not in themselves provide a compelling justification for such a major change, and some hack like fat binaries or automatic recompilation might offer the user the appearance of perfect binary compatibility. However, VMs provide several features that simply can't be provided in statically compiled code.

Foremost are fine-grained permissions/sandbox features. By only allowing the JIT compiler to cache/create 'safe' code, we can force all sensitive operations to be performed virtually. While we might introduce coarse-grained permission features using ACLs or binary scanning, these simply can't provide the level of protection and the fine-grained distinctions a virtual environment can provide.

For instance, suppose you download a program from the internet which edits/updates your bootloader. This program needs direct access to your disk, but you don't want to allow an error to overwrite all your data or a trojan to maliciously modify other executables on your system. In a virtual machine, all calls to the direct disk system call would be sensitive and pass through the emulator portion, which can enforce restrictions like requiring all reads and writes to be in a certain range. Since the sector being accessed may be determined by a complicated algorithm, one simply can't guarantee these restrictions at compile time.
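A minimal sketch of that mediation (hypothetical names, Python standing in for the emulator layer): every "direct disk" call from the sandboxed program passes through a policy object, so even a sector number computed at run time by an arbitrary algorithm stays confined to the permitted range.

```python
class SandboxViolation(Exception):
    pass

class DiskSandbox:
    """Mediates 'direct disk' calls: only sectors in [low, high] are allowed."""
    def __init__(self, disk, low, high):
        self.disk = disk          # backing store: sector -> bytes
        self.low, self.high = low, high

    def _check(self, sector):
        if not (self.low <= sector <= self.high):
            raise SandboxViolation(
                f"sector {sector} outside [{self.low}, {self.high}]")

    def write(self, sector, data):
        self._check(sector)       # enforced at run time, per call
        self.disk[sector] = data

    def read(self, sector):
        self._check(sector)
        return self.disk.get(sector, b"\x00")

disk = {}
boot = DiskSandbox(disk, low=0, high=63)  # bootloader area only
boot.write(0, b"MBR")                     # allowed
try:
    boot.write(2048, b"evil")             # blocked, however it was computed
    blocked = False
except SandboxViolation:
    blocked = True
```

The point is that the check runs on the actual run-time value, which is exactly what no static analysis of the binary could guarantee.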

So while grsecurity does demonstrate that we can add access controls one at a time to certain system calls, it requires specifically dealing by hand with each function one wants to restrict. A virtual environment provides a general solution where *any* system call can be made subject to near-arbitrary restrictions. One might specify that a given program is only to send UDP packets to a particular IP address, or may not start IPC with a particular process, or any restriction imaginable, not only those which the security people thought about. You can also guarantee the program does not see information that it must still access: for instance, the program might need to call uname but you don't want it to read a particular uname field, or the program might need the result of one syscall to feed to another while the program itself should not be allowed to see the information. Finally, you can implement positive security, giving a particular list of all and only the calls the program is allowed to make, rather than the negative security which is mostly what binary security can offer.

While some of these features might be possible to implement for native binaries with clever hacks, the performance hit would be unacceptable. If we want to block IPC to a process with a particular name in one program, a binary solution would require an authorization check for *every* program seeking to do IPC, while what we really want is for system-level programs to have fast unrestricted access and for sandboxed programs to go through the security checks. So if we want these completely general security restrictions for binaries, we must either accept the overhead of every syscall checking for authorization or write a wrapper function for every system call. If we want the ability to replace arbitrary syscalls with our own code, perhaps because all programs in a particular sandbox need to be given a modified list of running processes, the difficulty becomes even greater. Not to mention the inherent superiority of virtual security over binary security: since a pre-compiled binary is directly running on system hardware, it is much easier for the slightest error in your security model to allow an arbitrary exploit.

Finally, using a virtual machine allows trusted computing and contract-type programming difficult to implement in pure binary. At heart this is similar to the security issue but different in intent. For instance, a particular program/plugin may need both to access the internet, say to check for updates or gather data, and to handle personal information, and a VM-based system can track references to allow both but guarantee that the personal information can't exit the local machine (yes, this is hard and would have to be conservative). While I don't necessarily like the idea, this could also work to protect copyrighted content while allowing the user to load their own tools to search or format the information. It also has the potential to improve grid computing by providing better guarantees that it is really the distributed code which was executed. Finally, it offers the possibility of function libraries of unknown origin with enforced contracts.

The rest of this message will be in the next post.

by logicnazi on Tue 11th Jan 2005 05:07 UTC

So in the previous post I hope I clarified the reasons I think the use of virtual machines in programming offers very compelling advantages. Now many of you seemed to accept this proposition but be unconvinced that this needed to be moved into the operating system.

My first reason is that if we don't move it into the operating system, the benefits of these new coding methodologies and languages will be forever locked in the ghetto of application-level programming. If we want to be able to use these fancy new languages in the OS itself or in device drivers, there simply isn't any way around moving things into the OS. While I realize some of you may balk at the idea of using things like Java in the guts of an OS, I would remind you that people once felt the same way about using things like C or C++. So while these may not move into the kernel itself, I can certainly see them in device drivers and system calls.

In particular, the safety features of virtualization are especially attractive in device drivers. Since device drivers are often downloaded from the internet from untrusted sources but given hardware-level access, they are particularly in need of security features. Furthermore, they are ill-suited to the all-or-nothing type of authorization present in binary-only security implementations, as it is not uncommon that we would want to give a driver the ability to do things like directly interface with a system bus, but only construct messages with a certain prefix (I'm out of my element here, so I apologize if the example is incorrect, but we do often want to give device drivers access to certain arbitrarily defined subsets of a device). While we could implement these protections by hand, the evolution of hardware makes it difficult to keep adding protections that guard everything one needs to access, allow new device drivers enough access to do what they need, and avoid unduly slowing down trusted system drivers which we want to bypass any security checks. A VM system allows the possibility of a general solution which lets device drivers ship with a complete, enforced specification of exactly what access they need.

Also, some people have suggested that we need not add OS support to gain many of the process protection and optimization features for our virtual machines. While it is certainly true that we can load much of the JIT compiler into a shared library and implement a small interface for each instance, this doesn't solve the thread-level protection problems (which is why I mentioned threads in my article instead of just processes). In particular, since threads share much of their context/environment, it is not sufficient to simply handle each thread as its own OS process. So unless we want to just reimplement all the thread protections and scheduling in each virtual environment, this suggests OS-level support.

This support doesn't need to be very complex. I am thinking of something as simple as entering them as threads in the scheduler, but instead of waking and sleeping them normally, just calling wake and sleep functions in the virtual machine code. Actually, now that I think about it, this may very well be possible in some of the cooperative thread-scheduling implementations which allow the kernel to do scheduling and then just expose data to allow a user-space process to manage the threads.

Other than a few small kernel modifications like this, the rest of what I mean by OS integration is basically putting support for a common multi-level binary format in the binary loader, the same way the binary loader supports shared libraries. In other words, provide a common infrastructure to support the binary caching and multi-level features I suggest for JIT compilers, which apparently are even implemented in some situations. So rather than each virtual system doing these things itself, there would be uniform system libraries that provide functions like "get SSA representation" or "find code associated with SSA subtree". This is needed, as I mentioned in my article, so that executables written for several virtual machines might be executed without the overhead of CORBA or the like.

Now I agree that many of these things are forward looking and somewhat speculative. It probably isn't time to start throwing virtual code in device drivers and the like. However, since I think it is inevitable that more and more of our applications will be written in high level languages with virtual execution it doesn't hurt to start thinking about it now.

Oh, and also one of the other advantages of virtual machines is reflection-type services (maybe I got the term wrong, but I mean things like creating functions on the fly). For instance, it is nearly impossible to compile good Lisp to a native binary (and I don't count putting the interpreter in the binary). For the person who asked: the virtual machine in question here is basically a machine which implements car, cdr, and a few basic operations and has two stacks, similar to a Lisp machine(tm).

VM Based Development
by David Rollins on Tue 11th Jan 2005 06:16 UTC

There are some other nice benefits of VM-based architectures. Generally better integrated security. VMs make it much more difficult to create buffer overflows. And, while it's still possible for coding errors to lead to privilege escalation within a VM (for example, when somebody misses a security demand prior to executing sensitive code), the sandboxed nature of a VM can provide yet another layer of security around an app (aka shell) to prevent it from horking your machine. Better portability. Easier coding and maintenance. Reflection (which makes late-binding a lot easier than using RTTI in C++).

Re: JIT is faster than static compilation
by Anonymous on Tue 11th Jan 2005 09:01 UTC

It's true that you can manually create by hand all the instructions C compilers produce, but these 'optimizations' are actually extra instructions that are seen as 'useless' by many assembly programmers, so they choose to shortcut them. After all, why access that register, etc., when you don't have to? However, it's this particular order of instructions that actually places the processor in the right state such that it can accurately predict what to do next.

You're right that it needs in-depth knowledge to do assembly optimizations; however, the overwhelming majority that use assembly to 'optimize' don't have that knowledge and do it "because it must be faster."

And again I will say, the JIT and runtime will always know more about the executing state of the machine. Not just its configuration, but all the data and instructions issued, and dynamically recompile select sections of code to adapt. Static compilers and runtimes have to rely on alternate code paths. Dynamic compilers and runtimes can 'see' what's going on at the moment and adapt the generated machine instructions. Static compilers need to guess what will happen next and select a path. We're not talking about self-tuning of the program's internal variables and what-not, IIS and SQLServer have been doing that for years.

People like to cite the overhead of dynamic compilers and runtimes as to affecting performance in a negative way. However, this 'overhead' actually makes it faster. A lot of people find this very hard to accept as it seemingly goes against all conventional logic on the surface. It makes more sense when you go deeper and look at things like speculative execution and probabilities.

There is one very important trade-off with automatic compiler optimization, though. Source-level debugging becomes useless after the compiler's had its way, and you need to drop into the lower levels. Compiler and runtime writers dedicate a lot of effort to trying to make sure the optimizer doesn't introduce deterministic bugs.

.Net already has a mechanism for this
by Sukru on Tue 11th Jan 2005 10:29 UTC

As far as I know, you can have your code AOT compiled when installing a .Net executable. Mono runtime also supports this capability.

I cannot exactly remember the extension (but let's call it .so). If you want to load Windows.Forms.Dll in Mono, it first checks for the precompiled version, and loads the normal version for the JIT if it cannot find the precompiled one.

(For mono check: mono --aot, however I do not know about how the MS runtime does it)

Re: JIT is faster than static compilation
by Deletomn on Tue 11th Jan 2005 11:00 UTC

Anonymous: (assembly vs. C comment)
Yes... Yes... I know all that. However, there's one other thing to take into account. A human understands what the purpose (among other things) of the program (or function) is. This sometimes becomes important, because some optimizations can be "obvious" to some humans and yet evade the best efforts of the compiler.

Granted, that doesn't always happen (perhaps not even often, I wouldn't have any statistics) but it is a possibility. And certainly it still requires a human who actually has some sort of skill at this.

Personally, in my opinion, the nice thing about optimizations done by the compiler is that they are nice and easy and usually very good, so that you (the programmer) can focus on making the program work rather than worrying about every little point that might be inefficient. Optimizing a program properly and thoroughly can take quite some time. And then if you move it to another platform (or something else happens), you don't have to worry about rewriting all of your optimizations. The compiler will do it for you.

In addition, (as stated before) you need quite a bit of expertise in order to optimize a program well, and this requires research and experience. Once again, more time saved.

Anonymous: And again I will say, the JIT and runtime will always know more about the executing state of the machine.
And once again, I will say no, it won't. Not versus a properly written program. All options available to a JIT and runtime are available to a statically compiled program. ALL OPTIONS. There is not a single one which cannot be implemented with "some" effort. All techniques, all instructions, all data, everything is available. It could (possibly) be built into a static compiler. Certainly it can be done by hand. The advantage over doing it by hand is obvious: time, effort, and expertise. The advantage over static programs in general (including a compiler) is, as I stated before, that the static program will need to lug around everything it needs with it, and of course it can't include optimizations which haven't been invented yet.

For example, a (simple) statically compiled program can check its status as it runs and see how some or all functions are used by keeping statistics. It can then modify function calls (or, for really simple programs, function pointers) and switch to different optimized variants of a function (you could have 100,000 different variants of one function, though this would be excessively impractical).
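A sketch of what that might look like (entirely hypothetical, with Python standing in for a statically compiled program): the program keeps its own call statistics and, once they justify it, rebinds a call site through a "function pointer" to a specialized variant, with a guard to fall back if the assumption ever breaks.

```python
def generic_scale(xs, k):
    """General version: works for any multiplier k."""
    return [x * k for x in xs]

def specialized_double(xs, k):
    """Variant specialized for the observed common case k == 2."""
    return [x + x for x in xs]

calls = {"n": 0, "always_two": True}
impl = [generic_scale]  # mutable slot acting as a function pointer

def scale(xs, k):
    calls["n"] += 1
    calls["always_two"] &= (k == 2)
    if calls["n"] > 100 and calls["always_two"]:
        impl[0] = specialized_double      # profile says: specialize
    if impl[0] is specialized_double and k != 2:
        impl[0] = generic_scale           # guard: assumption broken, revert
    return impl[0](xs, k)

for _ in range(200):
    out = scale([1, 2, 3], 2)
```

This is exactly the self-adaptation the poster describes: possible statically, but all the bookkeeping that a JIT would do for you has to be written and shipped by hand.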

More "advanced" static programs could actually go through and modify individual instructions. At this point, however, it becomes very impractical to do by hand (and thus things begin to fall apart). The reasons are obvious: it will be different for each processor, it requires a high level of expertise, it is difficult to write, difficult to test, etc.

And so on...

As I stated, the advantage over doing it by hand is time, effort, and expertise (and, obviously, after a certain point, practicality). However, it is possible, which was my point, and for all I know some innovative programmers out there might have resolved these things statically and might not be talking (because it's some "secret").

One thing they can't do is make the program add optimizations to itself which it didn't know about at runtime. And as I said, it's going to have to haul all of its stuff around with it, and that's going to take a fair bit of space.

Anonymous: We're not talking about self-tuning of the program's internal variables and what-not, IIS and SQLServer have been doing that for years.
Hmmm... As far as I know, modifying individual instructions is quite a bit different from self-tuning a program's internal variables. (BTW, I've done some of these things to a small extent, and I also know what I could have done but didn't have time to do.)

Anonymous: People like to cite the overhead of dynamic compilers and runtimes as to affecting performance in a negative way. However, this 'overhead' actually makes it faster.
Overhead does not make things go faster. That's like saying that by dropping money on the ground I've become richer. Dropping money on the ground clearly makes me poorer. However, you can still have a net gain, because the action that caused the overhead allows you to do things you wouldn't otherwise be able to do. In the case of dropping money on the ground, I might now be able to pick up that wad of money "over there". Whether I actually made a net gain or not is completely dependent on how much that wad is worth versus the wad I used to have. (Of course, we also need to take into account the walk over there, the amount of energy it took to drop the first wad, and the amount of energy it took to pick up the second wad.)

Anonymous: It makes more sense when you go deeper and look at things like speculative execution and probabilities.
I know that; I just don't know what the actual probabilities are, because I've never done extensive real-world testing with it myself. Things can be different between theory and reality, you know. And things can vary from one system to another.

And I'd want extensive real-world tests; I've seen too many big announcements from local "Java supporters" which actually didn't even apply to their own clients/company. (One comes to mind: someone was talking about how vastly superior Java is to C/C++ as far as portability goes. His latest project was implemented in Java, and I had the "pleasure" of seeing the results: about half of the workstations this program was supposed to go on didn't have a compatible JVM at the time, but they all had C/C++ compilers. Later (as in a couple of years later) those workstations were upgraded to machines that did have JVMs. Note the "later" part.)

Note: I've got nothing against VMs and Java in general; I think in the long run (if nothing else) they will be great. But in my opinion a lot of people are overly excited.

by Deletomn on Tue 11th Jan 2005 11:47 UTC

Overall I thought your article was interesting. However, I believe you missed what I consider the most important point in joining the VM with the OS: True plug-and-play for hardware.

Let me explain: if a standard were developed through which devices could store their own device drivers, and the OS could easily retrieve them (via the standard), then you could simply plug a device into the computer and the OS would "instantly" understand how to use it. The drivers would also be cross-platform, so if, say, Linux distributions and the Mac OS implemented the VM, then the device would automatically work with those OSes.

I haven't done much hardware design work, so I don't really know how much this would add to the cost, complexity, etc. of the devices, but my feeling is that it wouldn't be too hard to add, and that it would be well worth the effort.
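To make the idea concrete, here is a minimal sketch of what the OS side of such a standard might look like. Everything here is invented for illustration (the `Device` interface, the method names, the `DriverManager`); no such standard exists:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical standard: a device can hand the OS its own portable
// (VM bytecode) driver image, stored on the device itself.
interface Device {
    String id();
    byte[] portableDriverImage();
}

// The OS side: on hotplug, fetch the image over the standard interface
// and hand it to the VM for verification and loading. No per-platform
// driver disks needed, since the image is VM bytecode.
class DriverManager {
    private final Map<String, byte[]> loaded = new HashMap<>();

    void onDevicePlugged(Device d) {
        byte[] image = d.portableDriverImage();
        // A real system would have the VM verify and JIT the image here.
        loaded.put(d.id(), image);
    }

    boolean isReady(String deviceId) {
        return loaded.containsKey(deviceId);
    }
}
```

Any OS that implements the VM and this retrieval standard would get working drivers for free, which is the cross-platform point made above.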

JIT is *not* faster
by mikeyd on Tue 11th Jan 2005 12:18 UTC

There is no way JIT can ever be faster, because whatever you do it's all opcodes in the end - and with JIT you have the overhead of doing the compiling at runtime. It's revealing that I have yet to see a major Java app without a splash screen, and in my experience performance while running is also far lower, at least in a GUI. Anyway, while it's true that JIT code running on an Athlon can be faster than arch=i386 compiled code on the same machine, code natively compiled for your machine will always be faster than JIT.

Short point on JIT fastness
by logicnazi on Tue 11th Jan 2005 14:52 UTC

Alright so there has been much debate over whether JIT compilation can be faster than ahead of time compilation.

Now, in some *very theoretical* sense AOT compilation can match anything JIT compilation can accomplish. After all, one could regard the entire JIT compiler plus the instructions it executes as one AOT-compiled program. In general, no matter what compilation technique you use, some sequence of machine instructions ends up being executed, and that sequence could be produced by hand or by a sufficiently good AOT compiler. So in theory AOT always has the advantage over JIT: whatever optimizations the JIT compiler produces from run-time data could be hardcoded into the program. In the worst case, you might write a program that uses self-modifying code to duplicate whatever run-time optimizations the JIT makes, while avoiding some of the JIT overhead.

However, whether or not some ideal AOT compiler could do a better job really isn't the question. Producing a perfect compiler is actually mathematically impossible (it would require solving the halting problem), so the correct question is whether a JIT compiler has practical advantages that make optimization easier than in an AOT compiler. Indeed, I think it does, for a couple of reasons.

First of all, profiling is simply easier in a JIT system: it happens automatically, without a separate step of collecting real-world data and recompiling. Moreover, unless one expects users to recompile all their own binaries with their own profiling info, a JIT system has access to profiling data for a particular user's usage pattern, which AOT compilers do not. This can make a real difference: a user who calls a function on large data sets may benefit greatly from loop unrolling, while another who calls it often on small data sets may not. Similar considerations apply to optimizations for the processor the user is currently running.
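As a toy illustration of the kind of decision a JIT can make from observed data (this is not how any real JIT is implemented; the threshold and both strategies are invented), here is a Java sketch that picks a summation strategy based on the array sizes it has actually seen:

```java
// Toy illustration of profile-guided optimization: after watching real
// inputs, switch to a manually unrolled loop only if inputs tend to be
// large. A JIT can make this decision automatically at run time; an
// AOT compiler would have to guess ahead of time.
class AdaptiveSum {
    private long observedCalls = 0;
    private long observedLargeInputs = 0;
    private static final int LARGE = 1024; // invented threshold

    long sum(int[] data) {
        observedCalls++;
        if (data.length >= LARGE) observedLargeInputs++;
        // "Recompile" decision: unroll only once profiling shows that
        // most inputs are large enough to benefit.
        if (observedCalls >= 10 && observedLargeInputs * 2 > observedCalls) {
            return unrolledSum(data);
        }
        return simpleSum(data);
    }

    private long simpleSum(int[] data) {
        long s = 0;
        for (int x : data) s += x;
        return s;
    }

    private long unrolledSum(int[] data) {
        long s = 0;
        int i = 0;
        for (; i + 4 <= data.length; i += 4) {
            s += data[i] + data[i + 1] + data[i + 2] + data[i + 3];
        }
        for (; i < data.length; i++) s += data[i];
        return s;
    }
}
```

Both strategies compute the same result; only the speed differs, which is exactly why the choice can safely be driven by each user's observed workload.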

I won't continue listing run-time optimizations that are *easy* to make with a JIT compiler, but suffice it to say they are there. Those of you insisting that an AOT compiler could do all this are correct in principle: one might build a profiling feature into the compiled program, plus a function that modifies the code in response. However, we simply don't have good AOT algorithms for this sort of thing, while it is easy to do in JIT code. Moreover, at this level the distinction between JIT and AOT code starts to disappear, as one might reasonably allege you are just incorporating the JIT compiler into your binary.

So I think the performance advantages of JIT compilation are clear; the question is just whether they overcome the overhead of JIT compilation itself. I think the answer is clearly yes, if we make good use of instruction caching.

Compile Time
by kramii on Tue 11th Jan 2005 15:06 UTC

I guess I must be misunderstanding something. As I understand it, there are two alternative models being discussed here.

1) Static Compilation *once* and for all in the development environment.

2) JIT Compilation *each time* the application is run, so that it can be optimized for a specific environment.

Surely the ideal is to compile an application *once* on a particular hardware configuration? So why not employ a third model:

3) Compile the whole application *once* when the application is installed. Recompile only when significant changes are made to the execution environment.

Or am I being too simplistic?
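Model 3 amounts to caching native code keyed by a fingerprint of the execution environment, and recompiling only when that fingerprint changes. A rough sketch of the caching logic, with all class and method names invented for illustration (the "native code" here is just a placeholder string):

```java
import java.util.Objects;

// Sketch of model 3: recompile only when the environment fingerprint
// (CPU arch, OS, VM version, ...) differs from the one the cached
// native code was compiled for.
class InstallTimeCache {
    private String cachedFingerprint = null;
    private String cachedNativeCode = null; // stand-in for machine code
    int compilations = 0;

    static String currentFingerprint() {
        return System.getProperty("os.arch") + "/"
             + System.getProperty("os.name") + "/"
             + System.getProperty("java.version");
    }

    String nativeCodeFor(String bytecode) {
        String fp = currentFingerprint();
        if (!Objects.equals(fp, cachedFingerprint)) {
            // Environment changed (or first install): compile once.
            cachedNativeCode = "compiled(" + bytecode + " for " + fp + ")";
            cachedFingerprint = fp;
            compilations++;
        }
        return cachedNativeCode;
    }
}
```

Every run after the first just reuses the cache, so the per-run JIT overhead of model 2 disappears while the hardware-specific optimization is kept.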


So some people have been praising AOT C-style compilers because they always put the processor in a known state. I think this is a mistake, and that it is actually a great deficiency in their operation, tolerated only because it is too hard to do anything else.

In any program the processor is always in a known state, i.e., the processor state is completely determined by the input and the prior instructions. What we really mean by "known state" in this case is that the compiler has a convention about what the processor state must look like at particular points in the program. This means extra instructions are spent bringing the state into accordance with this convention even when it is not needed.

For instance, consider the convention that after a function call the return value is stored in some particular register. This often makes sense, but the return value may be entirely ignored by the calling code on every call, or always immediately stored back to memory and not used for some time, in which case it would make much more sense for the function to skip the return value, or place it in a temporary memory location itself, so as to avoid spilling a register.

In fact, much of compiler optimization is about violating these conventions. The state of instruction selection theory today (I hope someday it will change to a more general algorithm) seems to be to start with strict conventions that are known to produce the correct result, and then apply optimizations which abridge those conventions in ways known not to break the program.

Unfortunately, most of these processor-level optimizations are performed using peephole analysis after code has been generated, i.e., the generated code is scanned with a relatively small window, and if the optimizer recognizes a series of instructions that can be replaced with a faster version, it makes the replacement. As I understand it, a BURS system is an advanced way to accomplish this task: basically, it organizes instructions into dependency trees and then scans for matching sets of instructions, which it transforms via rewrite rules.

As should be apparent, such a strategy depends heavily on finding efficient rewrite rules, and the longer the instruction sequences considered, the more optimizations become possible. As an example (which may or may not be real-world reasonable), imagine a stretch of code between a call saving the DS register and the one restoring it, which makes no use of the data segment in between those points. If this entire sequence is subjected to a rewrite at once, the optimizer may be able to store a variable in DS, while if it only considers shorter sequences it can never know it is safe to overwrite DS. However, one can hardly enumerate all instruction sequences of even length 20, so if one wants this optimization to have a large window, it depends on successfully identifying commonly used code blocks and appropriate optimizations for them.
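The peephole idea described above can be sketched in a few lines. The instruction mnemonics and the single rewrite rule below are invented for illustration, not taken from any real backend:

```java
import java.util.ArrayList;
import java.util.List;

// Toy peephole optimizer: slide a two-instruction window over the code
// and apply one invented rewrite rule -- a push immediately followed by
// a pop of the same register is a no-op and can be removed.
class Peephole {
    static List<String> optimize(List<String> code) {
        List<String> out = new ArrayList<>(code);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (int i = 0; i + 1 < out.size(); i++) {
                String a = out.get(i), b = out.get(i + 1);
                if (a.startsWith("push ") && b.startsWith("pop ")
                        && a.substring(5).equals(b.substring(4))) {
                    out.remove(i + 1);
                    out.remove(i);
                    changed = true;
                    break;
                }
            }
        }
        return out;
    }
}
```

The small window is exactly the limitation described above: a rule like this can never prove that a long stretch of code leaves DS free, because it only ever sees two instructions at a time.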

It was exactly this understanding, and this problem, which led me to suggest OS updates improving JIT compilation. As coders identify commonly used code segments and an optimization for such a segment, the rewrite rule can be sent to JIT (or even AOT) users. Since it is easy to verify that two code sequences are equivalent, users can benefit from this optimization in all of their programs without any security risk.

Managed Kernel
by PlatformAgnostic on Tue 11th Jan 2005 23:34 UTC

JITs and VMs are system-level software. There's no getting around that. Application programmers have to rely on them doing their job accurately. Given the stability requirements for VMs, one has to assume that their code is very good and their bugs are few and far between. If so, why not push all of this code into the kernel? Eliminating user-mode entirely can speed up programs because all hardware accesses can be direct calls without the costly stack switch and access checks involved in a kernel-user transition.

Compile Time
by logicnazi on Wed 12th Jan 2005 03:51 UTC

In answer to your question, Kramii: I am proposing a combination of 2 and 3. Instead of fully JIT-compiling the application each time, cached code snippets would be used, but the application would still execute in a JIT/emulator-style environment so that sensitive calls can be emulated for security and other reasons. Furthermore, I am suggesting that the compilation process be continuous and run in the background, so that the processor is always sitting around optimizing applications, or at least applying whatever small improvements in optimization it finds.
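A rough sketch of this cached-snippet-plus-background-optimizer combination (all names invented; a real system would operate on machine code, not strings, and the background pass would run in an idle-priority thread):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the proposed combination of models 2 and 3: execution always
// goes through the cache (so sensitive calls can still be intercepted),
// while a background pass keeps improving the cached translations.
class SnippetCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    // Fast path: use the cached translation if present, else make a
    // quick, unoptimized one so execution never has to wait.
    String execute(String snippet) {
        return cache.computeIfAbsent(snippet, s -> "quick(" + s + ")");
    }

    // Background pass: re-optimize everything already in the cache.
    void backgroundOptimizePass() {
        cache.replaceAll((snippet, code) ->
            code.startsWith("opt(") ? code : "opt(" + snippet + ")");
    }
}
```

The point of the split is that the fast path never blocks on optimization: the first execution gets quick code immediately, and later executions pick up whatever the background pass has improved in the meantime.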

RE: Managed Kernel
by Morin on Wed 12th Jan 2005 14:15 UTC

That's exactly what completely VM-based OSes (e.g. JNode) do. Such OSes can run device drivers, applications, and even most if not all of the VM code itself inside the VM, and thus in kernel mode.

(The trick of running the VM in itself is a bit more complex to explain and I don't have the time now, but JNode is doing that, or will do it in the future, AFAIK, so you can read up there.)