[ What are the advantages of just-in-time compilation versus ahead-of-time compilation? ]
I've been thinking about it lately, and it seems to me that most advantages given to JIT compilation should more or less be attributed to the intermediate format instead, and that jitting in itself is not much of a good way to generate code.
So these are the main pro-JIT compilation arguments I usually hear:
- Just-in-time compilation allows for greater portability. Isn't that attributable to the intermediate format? I mean, nothing keeps you from compiling your virtual bytecode into native bytecode once you've got it on your machine. Portability is an issue in the 'distribution' phase, not during the 'running' phase.
- Okay, then what about generating code at runtime? Well, the same applies. Nothing keeps you from integrating a just-in-time compiler for a real just-in-time need into your native program.
- But the runtime compiles it to native code just once anyways, and stores the resulting executable in some sort of cache somewhere on your hard drive. Yeah, sure. But it's optimized your program under time constraints, and it's not making it better from there on. See the next paragraph.
It's not like ahead-of-time compilation had no advantages either. Just-in-time compilation has time constraints: you can't keep the end user waiting forever while your program launches, so it has a tradeoff to do somewhere. Most of the time they just optimize less. A friend of mine had profiling evidence that inlining functions and unrolling loops "manually" (obfuscating source code in the process) had a positive impact on performance on his C# number-crunching program; doing the same on my side, with my C program filling the same task, yielded no positive results, and I believe this is due to the extensive transformations my compiler was allowed to make.
And yet we're surrounded by jitted programs. C# and Java are everywhere, Python scripts can compile to some sort of bytecode, and I'm sure a whole bunch of other programming languages do the same. There must be a good reason that I'm missing. So what makes just-in-time compilation so superior to ahead-of-time compilation?
EDIT To clear some confusion, maybe it would be important to state that I'm all for an intermediate representation of executables. This has a lot of advantages (and really, most arguments for just-in-time compilation are actually arguments for an intermediate representation). My question is about how they should be compiled to native code.
Most runtimes (or compilers for that matter) will prefer to either compile them just-in-time or ahead-of-time. As ahead-of-time compilation looks like a better alternative to me because the compiler has more time to perform optimizations, I'm wondering why Microsoft, Sun and all the others are going the other way around. I'm kind of dubious about profiling-related optimizations, as my experience with just-in-time compiled programs displayed poor basic optimizations.
I used an example with C code only because I needed an example of ahead-of-time compilation versus just-in-time compilation. The fact that C code wasn't emitted to an intermediate representation is irrelevant to the situation, as I just needed to show that ahead-of-time compilation can yield better immediate results.
Greater portability: The deliverable (byte-code) stays portable
At the same time, more platform-specific: Because the JIT-compilation takes place on the same system that the code runs, it can be very, very fine-tuned for that particular system. If you do ahead-of-time compilation (and still want to ship the same package to everyone), you have to compromise.
Improvements in compiler technology can have an impact on existing programs. A better C compiler does not help you at all with programs already deployed. A better JIT-compiler will improve the performance of existing programs. The Java code you wrote ten years ago will run faster today.
Adapting to run-time metrics. A JIT-compiler can not only look at the code and the target system, but also at how the code is used. It can instrument the running code, and make decisions about how to optimize according to, for example, what values the method parameters usually happen to have.
You are right that JIT adds to start-up cost, and so there is a time-constraint for it, whereas ahead-of-time compilation can take all the time that it wants. This makes it more appropriate for server-type applications, where start-up time is not so important and a "warm-up phase" before the code gets really fast is acceptable.
I suppose it would be possible to store the result of a JIT compilation somewhere, so that it could be re-used the next time. That would give you "ahead-of-time" compilation for the second program run. Maybe the clever folks at Sun and Microsoft are of the opinion that a fresh JIT is already good enough and the extra complexity is not worth the trouble.
The ngen tool page spilled the beans (or at least provided a good comparison of native images versus JIT-compiled images). Here is a list of advantages of executables that are compiled ahead-of-time:
- Native images load faster because they don't have much startup activities, and require a static amount of fewer memory (the memory required by the JIT compiler);
- Native images can share library code, while JIT-compiled images cannot.
And here is a list of advantages of just-in-time compiled executables:
- Native images are larger than their bytecode counterpart;
- Native images must be regenerated whenever the original assembly or one of its dependencies is modified (which makes sense, since it could screw up virtual tables and stuff like that).
And the general considerations of Microsoft on the matter:
- Large applications generally benefit from being compiled ahead-of-time, and small ones generally don't;
- Any call to a function loaded from a dynamic library needs the overhead of one additional jump instruction for fixups.
The need to regenerate an image that is ahead-of-time compiled every time one of its components is a huge disadvantage for native images. This is the root of the fragile base class problem. In C++, for instance, if the layout of a class from a DLL you use with your native app changes, you're screwed. If you program against interfaces instead, you're still screwed if the interface changes. If you use a more dynamic language instead (say, Objective-C), you're fine, but this comes with a performance hit.
On the other hand, bytecode images don't suffer from this issue and do it without the performance hit. This, in itself, is a very good reason to design a system with an intermediate representation that can easily be regenerated.
Simple logic tell us that compiling huge MS Office size program even from byte-codes will simply take too much time. You'll end up with huge starting time and that will scare anyone off your product. Sure, you can precompile during installation but this also has consequences.
Another reason is that not all parts of application will be used. JIT will compile only those parts that user care about, leaving potentially 80% of code untouched, saving time and memory.
And finally, JIT compilation can apply optimizations that normal compilators can't. Like inlining virtual methods or parts of the methods with trace trees. Which, in theory, can make them faster.
Better reflection support. This could be done in principle in an ahead-of-time compiled program, but it almost never seems to happen in practice.
Optimizations that can often only be figured out by observing the program dynamically. For example, inlining virtual functions, escape analysis to turn stack allocations into heap allocations, and lock coarsening.
Maybe it has to do with the modern approach to programming. You know, many years ago you would write your program on a sheet of paper, some other people would transform it into a stack of punched cards and feed into THE computer, and tomorrow morning you would get a crash dump on a roll of paper weighing half a pound. All that forced you to think a lot before writing the first line of code.
But, there is no such thing as JIT-only languages. Ahead-of-time compilers have been available for Java for quite some time, and more recently Mono introduced it to CLR. In fact, MonoTouch is possible at all because of AOT compilation, as non-native apps are prohibited in Apple's app store.
I have been trying to understand this as well because I saw that Google is moving towards replacing their Dalvik Virtual Machine (essentially another Java Virtual Machine like HotSpot) with Android Run Time (ART), which is a AOT compiler, but Java usually uses HotSpot, which is a JIT compiler. Apparently, ARM is ~ 2x faster than Dalvik... so I thought to myself "why doesn't Java use AOT as well?".
Anyways, from what I can gather, the main difference is that JIT uses adaptive optimization during run time, which (for example) allows ONLY those parts of the bytecode that are being executed frequently to be compiled into native code; whereas AOT compiles the entire source code into native code, and code of a lesser amount runs faster than code of a greater amount.
I have to imagine that most Android apps are composed of a small amount of code, so on average it makes more sense to compile the entire source code to native code AOT and avoid the overhead associated from interpretation / optimization.