The High Level Virtual Machine aims to build a common infrastructure for the development of dynamic languages (Ruby, Python, Haskell, etc.) targeting the Low Level Virtual Machine. It provides a platform-agnostic virtual machine runtime, on top of LLVM, which is able to interpret, JIT-compile, or statically compile any supported language. Since all languages use the same underlying VM, it’s easy to achieve code and data interoperability between different languages. Version 0.1 was released June 13.
High Level Virtual Machine, v0.1
2006-06-18 1:28 am ReidSpencer
Re: Green –
I don’t expect HLVM to take all that long to develop. It took two months to get “Hello, World” working. The main reason for this is that it is based on LLVM. All the complex code generation, platform support, optimization, etc. that LLVM provides is not being re-invented. Neither is the runtime (APR). All we’re doing is focusing on the language integration issues and creating a higher level virtual machine.
Re: parrot –
HLVM fits in the same category of software as Parrot, CLR (.NET), Java, Perl, etc. However, it has a few unique differences. First, being based on LLVM makes it possible for code to be interpreted, JITed, or statically compiled. Second, we happen to think the programming model is a little more powerful than any of the predecessors, but only time will tell.
Re: I hope this one succeeds –
HLVM won’t be limited to bytecodes. You’ll be able to execute directly from source, XML representation, bytecode equivalent, or equivalent static compilation. Chris Lattner is a genius. Without his 5-year effort on LLVM, HLVM would not exist. I hope someone helps me with a Smalltalk front end too, because I like Smalltalk as well!
Re: let me understand this –
Exactly! The point of HLVM is to make language writing a breeze. I envision a future where programmers design smallish languages on the fly to make it possible to program in the natural constructs of the problem domain, not the solution domain.
Re: Haskell –
I suppose it depends on what you mean by dynamic. In any event, Haskell is on the list of target languages for HLVM.
Thank you all for your kind remarks and curiosity. HLVM is a long way from being complete, but it’s early comments like these that inspire me.
2006-06-18 4:04 am BigZaphod
Just FYI: The “Developers” link on the left leads to a 404. And the “User Documentation” link in the “More Info” section of the front page leads to http://docs/index.html (at least in Safari).
What’s the best way to get started? I haven’t downloaded/tried it yet (looks like quite a lot of dependency packages to get right first). Does it work on OSX? Is there a sample language implemented that I could look at online somewhere? (I was trying to find one, but it’s late and I might just be missing it.)
2006-06-18 4:52 am ReidSpencer
Thanks for noticing the broken links. Both fixed now.
Unless you’re planning on using or developing HLVM, I’d suggest you don’t try getting started with it just yet. Wait for the 0.3 or 0.4 release when an actual language is implemented with HLVM. Although released, HLVM is at a very early stage of its development. I’m releasing under Linus Torvalds’ philosophy of “Release Early, Release Often”.
The dependency packages do need to be built. Most notably, a release version of LLVM won’t work; you need the CVS head version. I’m working right now to make the HLVM build system utilize the various *-config tools of the dependent packages to make configuring HLVM a no-brainer.
Yes, HLVM works on OSX. One of our developers builds it on that platform.
There are no languages implemented at this point other than the AST representation in XML. You can get a feel for the AST by looking at the XML test cases. There is also a full Relax/NG grammar that specifies the XML structure of the AST. You can get this at:
2006-06-18 4:55 am BigZaphod
Thanks for the info and fast reply! I wish you much luck and fun with this.
looks a bit like parrot or .net
I’m currently working on a related project, actually a RISC-like, register-oriented intermediate language instead of the usual bytecoded, stack-oriented intermediate languages like the ones used on the JVM and MS IL (.Net). I’m targeting gcc for the moment, but LLVM is definitely on the radar.
It’s kinda disappointing that HLVM uses bytecodes as well, but that might just be me and my bias towards my own project. Very exciting, still.
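To make the stack-oriented versus register-oriented distinction concrete, here is a minimal sketch in Python that evaluates (2 + 3) * 4 both ways. The instruction names (push, li, add, mul) and register names are hypothetical, invented for illustration; they are not the JVM's, MS IL's, LLVM's, or the intermediate language described above.

```python
# Hypothetical toy instruction sets, invented for illustration only.

def run_stack(program):
    """Stack-oriented: operands are implicit, taken from the top of a stack."""
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(args[0])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown stack op: {op}")
    return stack.pop()


def run_register(program, result_reg):
    """Register-oriented: every instruction names its destination and sources."""
    regs = {}
    for op, dst, *src in program:
        if op == "li":  # load immediate value into dst
            regs[dst] = src[0]
        elif op == "add":
            regs[dst] = regs[src[0]] + regs[src[1]]
        elif op == "mul":
            regs[dst] = regs[src[0]] * regs[src[1]]
        else:
            raise ValueError(f"unknown register op: {op}")
    return regs[result_reg]


# (2 + 3) * 4 in both styles
STACK_PROGRAM = [("push", 2), ("push", 3), ("add",), ("push", 4), ("mul",)]
REGISTER_PROGRAM = [
    ("li", "r1", 2),
    ("li", "r2", 3),
    ("add", "r3", "r1", "r2"),
    ("li", "r4", 4),
    ("mul", "r5", "r3", "r4"),
]

if __name__ == "__main__":
    print(run_stack(STACK_PROGRAM))
    print(run_register(REGISTER_PROGRAM, "r5"))
```

In the stack form the data flow lives in the order of pushes and pops, while the register form spells out the source and destination of every operation.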
I hope Chris Lattner’s job at Apple will make it easier to use LLVM as a backend, and hopefully increase people’s awareness of his project, ’cause it’s really, really, seriously cool. I actually envy him.
And I hope someone gives Reid Spencer a hand to build a Smalltalk frontend (myself, on a somewhat distant future, perhaps?). I happen to love Smalltalk, you know.
2006-06-17 8:45 pm SamuraiCrow
I’m involved in a project of trying to make old Amos Basic programs from the Amiga generate code for multiple modern platforms. We’ve been planning on using the intermediate code of GCC as an optimizer and code generator and SDL as the graphics and audio layer for the sake of portability.
I also hope this one succeeds since that could mean a platform agnostic packager for us and anyone else trying to write a language that works on nearly any operating system.
Would that mean that languages like Ruby, Python, Haskell, etc. could someday basically become grammars on top of a shared VM?
Haskell is a dynamic language? I think it’s just the opposite of dynamic: it has a VERY strict type system.
it will be interesting to see prolog calling lisp…
I envision a future where programmers design smallish languages on the fly to make it possible to program in the natural constructs of the problem domain
It is a great goal. It will require genetically engineering programmers capable of language design, though.
2006-06-18 8:05 am ReidSpencer
It will be interesting to see prolog calling lisp
While I’m not delusional about the difficulties inherent in cross-language interoperation, I believe it is a worthy goal and if we only get part way there, it will still be something useful. Besides that, HLVM will be useful even if none of the languages can interoperate.
One of the reasons programmers don’t create their own languages is because it generally means creating a compiler and runtime and optimization and … None of those are trivial problems. Still, there’s over 2000 languages out there by one survey I saw. The point of HLVM is to make it easy to design languages and thus bring out the hidden language designer that lurks in the heart of every programmer.
2006-06-18 6:20 pm Cloudy
Actually, cross-language interoperation is a solved problem. Systems capable of doing it have been around since DEC introduced VMS 30 years ago. It works well, so long as the languages are capable of interoperation. It fails only when the language semantics are sufficiently incompatible. Prolog and Lisp are an example of that. (Exercise for the reader: Why?)
One of the reasons programmers don’t create their own languages is because it generally means creating a compiler and runtime and optimization and … None of those are trivial problems.
Of these, adding a new kind of vm solves exactly: none. The compiler problem is really one of implementing a front end, as portable compilers with separate front ends have been around since YACC. The runtime involves a collection of libraries unique to the language. . .
Still, there’s over 2000 languages out there by one survey I saw.
Perhaps as many as 8000, according to another. 8000 languages is roughly four new languages a week for nearly 40 years. Sounds to me like it’s already pretty easy to create new languages, in that case.
There are still plenty of good reasons for what you’re doing. A single VM to run all of these VM based languages is a reasonable goal (ignoring that both Smalltalk and Java define their VM to the byte code level, so you won’t be able to implement “pure” versions of those languages.)
But if your point is to make it easy to design languages, rather than easy to implement them, you’re working in the wrong space. (Yes, yes, I know you meant design easier by reducing implementation burden. I’m focusing on design itself now.)
Language design is hard. It is about abstraction, expressiveness, and semantics. Very few people have the combination of training and experience to be very good at this.
one great use i see for this platform is “subclassing” existing languages. for example, i’m a python programmer, and like the language a lot, but i’d like to add features that i think are just, while the rest of python-dev and the BDFL don’t.
once a language becomes just a grammar, i could make the changes i want in no time. that way i could directly address the deficiencies (IMO) of the language, that make my programming better, and perhaps the main branch will accept some of them back.
of course there’s a long way still, but i sure wish to see production-ready hlvm-python
2006-06-18 7:09 pm ReidSpencer
That’s a wonderful idea, ganges master. The nice thing about the HLVM approach is that you can be reasonably certain your language “subclass” would work correctly because it is simply a different translation to the (hopefully stable/mature) AST. This is exactly the kind of thing I would like to encourage with HLVM.
Yes, there’s still a long way to go, but fortunately for you, python is one of the languages we’ll tackle early in HLVM’s life.
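As a rough illustration of a language “subclass” being just a different translation to a shared AST, here is a minimal Python sketch. Everything in it is hypothetical: the tuple-based node shapes, the function names, and the toy `plus` keyword are invented for the example and are not HLVM’s AST or API.

```python
# Hypothetical sketch: node shapes and names are invented for illustration
# and do not reflect HLVM's actual AST.

def evaluate(node):
    """Shared back end: evaluates the common AST, whichever front end built it."""
    kind = node[0]
    if kind == "int":
        return node[1]
    if kind == "add":
        return evaluate(node[1]) + evaluate(node[2])
    raise ValueError(f"unknown node kind: {kind}")


def parse(source, add_tokens=("+",)):
    """Base front end: translate 'INT (+ INT)*' into the shared AST."""
    tokens = source.split()
    if len(tokens) % 2 == 0:  # also rejects empty input
        raise SyntaxError("expected: INT (OP INT)*")
    tree = ("int", int(tokens[0]))
    for i in range(1, len(tokens), 2):
        if tokens[i] not in add_tokens:
            raise SyntaxError(f"unexpected token: {tokens[i]}")
        tree = ("add", tree, ("int", int(tokens[i + 1])))
    return tree


def parse_dialect(source):
    """'Subclassed' front end: also accepts the word 'plus', same AST out."""
    return parse(source, add_tokens=("+", "plus"))


if __name__ == "__main__":
    print(evaluate(parse("1 + 2 + 3")))
    print(evaluate(parse_dialect("1 plus 2 + 3")))
```

The dialect changes only the translation step; the evaluator, standing in for the shared VM, is untouched, which is why the modified language can be expected to behave like the original wherever the two produce the same tree.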
2006-06-19 2:32 am Cloudy
once a language becomes just a grammar,
And that, in a phrase, sums up why attempts like this have always fallen over in the past.
Language design, even language feature design, is not a matter of annotating an AST with a grammar. (It’s not a matter of finding a neutral, er, I’m sorry, these days we say “agnostic” IL representation either; although that has always turned out to be a holy grail, as well.)
It would be nice to see projects like this cover new ground, but so far the only evidence is that the developers are going over old ground in the same way again.
2006-06-19 7:07 am ReidSpencer
I’m curious, what new ground would you have HLVM cover?
The point of HLVM (for me, not necessarily for others) is to do two things: (1) provide a VM that takes advantage of LLVM, and (2) provide an interface to programming in XML as a stepping-stone to another project I’m working on. If we get a decent virtual machine that covers “old ground”, I’m not bothered by it.
But, I’m still curious what new ground you’d like to see covered.
2006-06-20 3:43 am Cloudy
It’s not the ground, so much, as your expectation of what you’re going to find going over it.
LLVM/HLVM will give you the ability to do language-interoperability in the subset of procedural languages which are VM based but without prescribed VMs, by sharing a VM. That’s a good thing, and by all means you should go for it.
But using XML to represent the annotated grammars you want to generate compilers for isn’t going to give you the silver bullet that the project literature seems to imply.
Language design is hard, and changing from the YACC/LEX representation of 30 years ago (Bison/Flex, these days) to XML for generating the front end is going to make it different, but not easier.
The new ground I would like to see covered in language design probably doesn’t fit your research interests, so I’d hesitate to recommend that you personally work on it, but it is this: Why do languages with rich well specified semantics fare so poorly among programmers while languages with vague poorly thought out semantics end up being so popular?
The answer to that question, in my opinion, would take us a long way towards being able to design good languages, something we’re not good at, even given the 8000 or so attempts made so far.
Languages become popular because they are
b) fulfill a perceived need
c) are easy to get into
Compare a language with “rich well specified semantics” such as Lisp, OCaml, Haskell, <insert whatever language you were thinking of> to, say, Ruby. I have studied many languages and the time it takes to get up to speed with most of the aforementioned langs is frightening compared to Python/Perl/Ruby/PHP. Those languages may not be the best specified, or even specified formally at all, but they are available, easy to slide into and they fill a need. Personally I don’t understand how academics can overlook such an important thing as user experience and some basic aesthetics when designing the REPL environment. Once you get people in there and learning, your task of turning them into converts is almost done.
Forgot one thing.
d) scary syntax
Don’t underestimate how discouraging it is to look at some code, have a good idea what it’s doing but still not be able to reason out how it’s getting there. A language that adopts some of the “normal” (that is existing in major languages) idioms goes a long way to easing a developer into that new language. Hand a C++ guy Haskell and measure how long it takes for him to write something small. Now hand him Python. It could be argued that in the long run Haskell is a better choice but that doesn’t matter as the developer will never stick around to get there. Sometimes languages are too clever for their own good.
But I hope it will evolve. It just proves that open source will always reinvent itself: after Parrot and NekoVM, now this one. Eventually, someone will get it right. But ideally, I think it needs to be practical from the get-go. It would be really good to get something like this working for several languages in one year or so, because the sooner the mismatches between the languages and the VM are found, the sooner the design issues can be solved.