Optimising JIT'ing

AndrewAPrice
Member
Posts: 2303
Joined: Mon Jun 05, 2006 11:00 pm
Location: USA (and Australia)

Optimising JIT'ing

Post by AndrewAPrice »

I've been thinking about an optimizing JIT implementation.

Each executable file includes two copies of the program: one in bytecode and (optionally) one in native code. When a program is compiled, the compiler generates bytecode and also attaches a list of CPU optimisations that the program can take advantage of.

When the program is run, the OS will look at the list of optimisations that can be applied to the program, see which ones the processor/system supports (certain registers, instruction sets, ways to rearrange code to run faster), and then optimise the bytecode into native code for that system. This will probably take a while for large executables, which is why the resulting native program could also be stored back in the executable.

When a native program is built and stored back into the executable, it will also carry a list of the processor optimisations that have been applied to it (along with the architecture of the system, etc.).

The next time you run an executable, the OS will see that a native version exists. It will look at the list of system-specific optimisations that have been applied to the native version, and if one of them isn't supported on the system, or an optimisation is available that hasn't yet been applied, the native image will be rebuilt. This solves compatibility when transferring programs between systems.

Of course, there will be ways to run a program without optimising it (direct bytecode to native) and without storing the native version back into the program (the default when running from a read-only or network file system).
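The load-time behaviour described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `Executable`, `NativeImage`, `compile_native` and `launch` are hypothetical names, and the "compiler" is a stand-in that just tags the bytecode.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NativeImage:
    arch: str
    used_features: frozenset  # optimisations actually applied to this build
    code: bytes


@dataclass
class Executable:
    bytecode: bytes
    wanted_features: frozenset           # optimisations the compiler says would help
    native: Optional[NativeImage] = None
    read_only: bool = False              # e.g. on a read-only or network file system


def compile_native(exe, arch, features):
    # Hypothetical stand-in for the real bytecode-to-native translator.
    return NativeImage(arch, features, b"native:" + exe.bytecode)


def launch(exe, arch, cpu_features, optimise=True, store_back=True):
    """Return a native image suitable for this system, rebuilding if needed."""
    # Only optimisations that are both wanted and supported can be applied.
    usable = exe.wanted_features & cpu_features if optimise else frozenset()
    image = exe.native
    if image is None or image.arch != arch or image.used_features != usable:
        image = compile_native(exe, arch, usable)
        # Unoptimised builds and read-only files never overwrite the stored copy.
        if optimise and store_back and not exe.read_only:
            exe.native = image
    return image
```

In this sketch an unoptimised run never replaces a stored optimised image, which is one possible policy and not something the post specifies.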

Why do I think this is worthwhile? Because instruction sets like SSE2/3/4/5 are constantly being developed, and not all processors support them. So if a program was optimised on each system to take advantage of whatever instruction sets were available, you could increase program performance on new processors while maintaining compatibility with older systems.
My OS is Perception.
lukem95
Member
Posts: 536
Joined: Fri Aug 03, 2007 6:03 am
Location: Cambridge, UK

Post by lukem95 »

Sounds like a pretty cool concept, but surely this will add a significant amount of load time or lag while the program is running? I know this will only happen the first time on each system, but it could be a significant period of time for large programs.

Also, this will increase the size of the file by a fair amount. That may not be an issue, as HDD prices are constantly coming down, but taking the case of large programs again, a 500 MB binary could now be 800 or 900 MB, or even a gigabyte.
~ Lukem95 [ Cake ]
Release: 0.08b
AndrewAPrice
Member
Posts: 2303
Joined: Mon Jun 05, 2006 11:00 pm
Location: USA (and Australia)

Post by AndrewAPrice »

The list will be simple. For example, attached to the bytecode:
- This program will benefit from hyperthreading.
- This program will benefit from SSE.

And attached to the native code:
- This program has been compiled for an x86 system.
- This program utilises SSE.

And the OS will compare what features are/are-not available on that system and rebuild the program as necessary.
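As a rough sketch of that comparison, the two lists can be treated as sets and checked against what the system reports. The function name `rebuild_reasons` and the lowercase feature strings are assumptions made for illustration; an empty result means the stored native code can be used as-is.

```python
def rebuild_reasons(wanted, native_arch, native_used, system_arch, system_features):
    """Return the reasons (if any) why the native image must be rebuilt."""
    reasons = []
    if native_arch != system_arch:
        reasons.append(f"built for {native_arch}, running on {system_arch}")
    # Features baked into the native code that this CPU lacks.
    for feature in sorted(native_used - system_features):
        reasons.append(f"uses unsupported feature: {feature}")
    # Features the program would benefit from that are available but unused.
    for feature in sorted((wanted & system_features) - native_used):
        reasons.append(f"could now use: {feature}")
    return reasons


# The lists from the post: bytecode wants hyperthreading and SSE,
# the stored native code was built for x86 and utilises SSE.
wanted = {"hyperthreading", "sse"}
print(rebuild_reasons(wanted, "x86", {"sse"}, "x86", {"sse", "hyperthreading"}))
print(rebuild_reasons(wanted, "x86", {"sse"}, "x86", set()))
```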

A 500MB binary would probably have barely 10KB of executable code; the remaining 499.9MB or so would be embedded resources that wouldn't even need to be touched by the kernel.
lukem95
Member
Posts: 536
Joined: Fri Aug 03, 2007 6:03 am
Location: Cambridge, UK

Post by lukem95 »

Ah, that makes sense about the sizes of executable code vs. included resources. Although the size argument is still valid, it will have much less impact.

In that case, I think it's an even cooler concept, and if you can cut down the recompile (conversion to native code) time, then I think you're onto a winner ;)
SpooK
Member
Posts: 260
Joined: Sun Jun 18, 2006 7:21 pm

Re: Optimising JIT'ing

Post by SpooK »

MessiahAndrw wrote: Why do I think this is worthwhile?
Probably for the same reason I do. This has been a design plan of mine for many years.

However, I will not be focusing entirely on actual JIT, as that will be reserved for sandboxed (unsafe) apps.

Instead, I intend to focus on native compilation. I tentatively call this "optimizing installation": any program that is installed from a bytecode installation source is assembled/compiled/linked/converted to run directly on the system architecture. Stand-alone programs will be forced to run in the sandbox, as their native versions must be accounted for in some manner after installation/optimization occurs.

When CPU upgrades/downgrades occur and new optimizations are available/needed, this can trigger applications to "re-optimize" from the program's bytecode source, which will be cached somewhere on the hard drive.
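One way such a trigger might work, sketched under assumptions: the system remembers the CPU feature set seen at the last optimization pass, compares it with the current one, and re-optimizes only the installed programs that care about a feature that was gained or lost. The name `affected_apps` and the feature strings are hypothetical.

```python
def affected_apps(app_wanted, old_cpu, new_cpu):
    """Return the installed apps that should be re-optimized after a CPU change.

    app_wanted maps an app name to the set of features its bytecode could use.
    """
    changed = old_cpu ^ new_cpu  # features gained by an upgrade or lost by a downgrade
    return sorted(name for name, wanted in app_wanted.items() if wanted & changed)


installed = {
    "editor": {"sse2"},
    "player": {"sse2", "sse3"},
    "shell": set(),
}

# Upgrade: the new CPU adds SSE3, so only the program that can use it is rebuilt.
print(affected_apps(installed, {"sse", "sse2"}, {"sse", "sse2", "sse3"}))
# Downgrade: SSE2 disappears, so everything built around it must be rebuilt.
print(affected_apps(installed, {"sse", "sse2"}, {"sse"}))
```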

There are many more details, of course, but I did not want to hijack this thread too much :P

Overall, it is a massive undertaking, but I can always shoot for the bare-minimum CPU in an architecture line (e.g. i386) and work my way up the optimization chain as time permits ;)
User avatar
Colonel Kernel
Member
Posts: 1437
Joined: Tue Oct 17, 2006 6:06 pm
Location: Vancouver, BC, Canada

Re: Optimising JIT'ing

Post by Colonel Kernel »

MessiahAndrw wrote: I've been thinking about an optimizing JIT implementation.
I don't think what you've described could be called JIT -- it's more like the install-time pre-compilation done by Singularity.

In a JIT system, the executable that's loaded is nearly all bytecode, with only a small bootstrap portion that calls the run-time environment. The v-table for each class in such a system is initially set to point to JIT-compiler routines, so that the first time a method is called, it is compiled to native code and the corresponding v-table entries are patched. At least that's the general idea...
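A toy model of that stub-and-patch scheme is sketched below. It is an illustration only: `JitClass` is a hypothetical name, the "bytecode" is a Python expression string over `args`, and the built-in `compile()` stands in for real native code generation.

```python
class JitClass:
    def __init__(self, bytecode_methods):
        self.bytecode = dict(bytecode_methods)
        self.compile_count = 0
        # Every v-table slot starts out pointing at a compile-on-first-call stub.
        self.vtable = {name: self._make_stub(name) for name in self.bytecode}

    def _make_stub(self, name):
        def stub(*args):
            native = self._compile(name)
            self.vtable[name] = native  # patch the slot so later calls skip the stub
            return native(*args)
        return stub

    def _compile(self, name):
        # Stand-in for the JIT compiler: turn the expression into a callable.
        self.compile_count += 1
        code = compile(self.bytecode[name], name, "eval")
        return lambda *args: eval(code, {"args": args})

    def call(self, name, *args):
        return self.vtable[name](*args)


cls = JitClass({"double": "args[0] * 2", "add": "args[0] + args[1]"})
print(cls.call("double", 21), cls.compile_count)  # first call compiles the method
print(cls.call("double", 5), cls.compile_count)   # second call uses the patched slot
```

Methods that are never called are never compiled, which is the main difference from the install-time approach.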

Whether you need JIT or just pre-compilation depends a lot on whether you allow dynamic loading. How would that work in your system?
Top three reasons why my OS project died:
  1. Too much overtime at work
  2. Got married
  3. My brain got stuck in an infinite loop while trying to design the memory manager
Don't let this happen to you!
SpooK
Member
Posts: 260
Joined: Sun Jun 18, 2006 7:21 pm

Re: Optimising JIT'ing

Post by SpooK »

Colonel Kernel wrote:
MessiahAndrw wrote: I've been thinking about an optimizing JIT implementation.
I don't think what you've described could be called JIT -- it's more like the install-time pre-compilation done by Singularity.
This is true.

However, and believe it or not, my design goals pre-date the announcement of Singularity by at least a year. I think C# was out already, but as soon as I heard and saw what Singularity was, I dropped my design in utter frustration. I knew they could fully release in 1 year (not including the many patches/updates that will follow) what it would take me another 5 just to prototype. I have an overbearing habit of not wanting to waste my minuscule amount of free time in reinventing the wheel.

Thankfully, C#/CIL/.Net have ultimately panned out to be quite different than my designs. So after a few years of neglect, I am back to developing DynatOS and the Dynatos VM based upon my original plans ;)
Tyler
Member
Posts: 514
Joined: Tue Nov 07, 2006 7:37 am
Location: York, England

Re: Optimising JIT'ing

Post by Tyler »

Colonel Kernel wrote:
MessiahAndrw wrote: I've been thinking about an optimizing JIT implementation.
I don't think what you've described could be called JIT -- it's more like the install-time pre-compilation done by Singularity.

In a JIT system, the executable that's loaded is nearly all bytecode, with only a small bootstrap portion that calls the run-time environment. The v-table for each class in such a system is initially set to point to JIT-compiler routines, so that the first time a method is called, it is compiled to native code and the corresponding v-table entries are patched. At least that's the general idea...

Whether you need JIT or just pre-compilation depends a lot on whether you allow dynamic loading. How would that work in your system?
Would you call .NET JIT? I would, and it does the above with a few changes.

Firstly, native code is not placed back into the original file; it is stored separately. Secondly, not all code is compiled to native for the long term, only that which is specified. Lastly, the compiler runs constantly in the background whenever it has assemblies to compile, instead of waiting for them to be run.
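The general shape of "native images stored separately and compiled in the background" might look something like the sketch below. It does not reproduce how .NET actually does this; `NativeImageCache`, its methods, and the target strings are hypothetical, and the background pass is run synchronously to keep the example deterministic.

```python
import hashlib
from collections import deque


class NativeImageCache:
    def __init__(self):
        self.images = {}        # key -> native code, kept apart from the bytecode files
        self.pending = deque()  # assemblies waiting for a background compile

    @staticmethod
    def key(bytecode, target):
        # Identify an image by its bytecode contents and the target it was built for.
        return hashlib.sha256(target.encode() + b"\0" + bytecode).hexdigest()

    def request(self, bytecode, target):
        """Queue an assembly for compilation unless it is cached or already queued."""
        if self.key(bytecode, target) in self.images:
            return
        if (bytecode, target) not in self.pending:
            self.pending.append((bytecode, target))

    def run_background_pass(self, compile_fn):
        """Drain the queue; a real system would do this on an idle-priority thread."""
        while self.pending:
            bytecode, target = self.pending.popleft()
            self.images[self.key(bytecode, target)] = compile_fn(bytecode, target)

    def lookup(self, bytecode, target):
        return self.images.get(self.key(bytecode, target))
```

Keying on the bytecode hash means a modified program never picks up a stale native image, and the original file is left untouched.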
exkor
Member
Posts: 111
Joined: Wed May 23, 2007 9:38 pm

Post by exkor »

Compilers themselves have a hard time deciding when to apply the right optimization, which jump is more likely to be taken, or whether a program needs SSE or not. In many cases the benefit of using multiple threads can only be verified by testing (timing) the algorithm. I would probably start by writing the compiler first.
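Verifying a candidate optimization by measurement could be as simple as timing each variant on representative input and keeping the fastest. The sketch below assumes a hypothetical helper `fastest_variant`; the two variants are arbitrary stand-ins for, say, a single-threaded and a multi-threaded build of the same routine.

```python
import time


def fastest_variant(variants, data, repeats=5):
    """Time each variant on the same data and return (winner, timings)."""
    timings = {}
    for name, fn in variants.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            fn(data)
            # Keep the best run to reduce noise from other system activity.
            best = min(best, time.perf_counter() - start)
        timings[name] = best
    return min(timings, key=timings.get), timings


def loop_sum(values):
    total = 0
    for v in values:
        total += v
    return total


winner, timings = fastest_variant({"builtin": sum, "loop": loop_sum}, list(range(100_000)))
print(winner, timings)
```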

You'll need to put a lot of time into implementing that before I would expect to see a decent (5%+) speed increase.

If you plan to store the optimized version, I would run the optimization process on all files on the hard drive only once, during program installation.
Meor
Posts: 13
Joined: Fri Mar 14, 2008 11:29 am

Post by Meor »

I think a simpler solution would be for the program writers to cross compile their program to a lot of different platforms. If someone has a significantly different system, they could just compile from source.

There's an option in the Intel compiler to do a runtime analysis of a program where it will insert the correct branch hints based on actual program execution examples.
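The idea behind that kind of profile-guided analysis can be sketched as follows: record which way each branch went during sample runs, then mark the heavily biased ones. This is not the Intel compiler's actual mechanism; `branch_hints`, the branch names, and the 90% threshold are all assumptions for illustration.

```python
from collections import Counter


def branch_hints(trace, threshold=0.9):
    """Derive likely/unlikely hints from (branch_id, was_taken) records."""
    taken = Counter()
    total = Counter()
    for branch_id, was_taken in trace:
        total[branch_id] += 1
        taken[branch_id] += int(was_taken)

    hints = {}
    for branch_id, count in total.items():
        ratio = taken[branch_id] / count
        if ratio >= threshold:
            hints[branch_id] = "likely"
        elif ratio <= 1 - threshold:
            hints[branch_id] = "unlikely"
        # Branches without a strong bias get no hint at all.
    return hints


trace = (
    [("loop_back", True)] * 19 + [("loop_back", False)]
    + [("error_path", False)] * 10
    + [("mixed", True), ("mixed", False)]
)
print(branch_hints(trace))
```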