OCaml, an Introduction
Eugenia Loli 2004-02-10 General Development 35 Comments

Object Caml is an ML type of language. For the non-gurus: it’s a functional language that can also be programmed in a non-functional and object-oriented way. Read about it in the latest issue of the LinuxGazette online magazine.

2004-02-10 6:26 am
There is a nifty side-benefit to functional programming that I think will become more important as we hit frequency ceilings for CPUs. Consider this: why do two 1.5GHz processors not perform similarly to a single 3.0GHz processor? Or, equivalently, why is a CPU with four functional units not twice as fast as a CPU with two? The answer is a lack of parallelism in the running code.

This lack of parallelism is precisely why Intel designed the P4 the way it did: fast and narrow. With a clock-speed of 3+ GHz, the P4 is good at handling the sequential code engendered by C. Of course, this approach is hitting limits. To push clock-speed higher and higher, Intel has done extreme things like make a CPU with a 31-stage pipeline.

Functional programs have the nice property that they have a lot of inherent parallelism. Consider the following C code:

    a = calc_foo();
    b = calc_bar();

In C, calc_foo() and calc_bar() must be executed sequentially, because there is a sequence point after each function call. In a functional language, calc_foo() and calc_bar() usually do not have side-effects (Ocaml idiom discourages them, some languages like Haskell and Clean outright disallow them) and can thus be executed in parallel.

2004-02-10 7:17 am
I had to use OCaml in a few of my classes. OCaml gives absolutely terrible error messages. When writing OCaml code, you spend more time trying to figure out what the error message means than anything else.
The big problem with functional languages is that it gets really ugly when you try to interface them with non-functional code (i.e. most libraries on the system). Functional programming would be great for implementing mathematical algorithms; however, I’ve always decided against using it because the difficulties in interfacing it with things that don’t fit the functional way would outweigh the benefits.

2004-02-10 7:37 am
> This lack of parallelism is precisely why Intel designed the P4 the way it did: fast and narrow. With a clock-speed of 3+ GHz, the P4 is good at handling the sequential code engendered by C. Of course, this approach is hitting limits. To push clock-speed higher and higher, Intel has done extreme things like make a CPU with a 31-stage pipeline.
I don’t think this was Intel’s reasoning behind the P4 at all. The P4 requires the use of its vector units to achieve a reasonable degree of performance. I think the reasoning behind the P4’s engineering is simple: clock speed sells, regardless of how the chip performs. The P4 wastes a percentage of its total cycles approximately equal to the percentage of branch instructions in a particular piece of code by clearing pipelines after it mispredicts a branch. Clearly efficiency was not one of the design requirements of the P4…

2004-02-10 7:40 am
The error messages make a whole lot more sense if you are familiar with the lingo of the statically-typed FP world. Interfacing with non-functional code is a lot easier in languages, like Ocaml, that don’t rigorously enforce referential transparency. I haven’t done any work with GTK+ and Ocaml, but from the docs, it seems pretty similar to doing it in C, except the LablGTK API is somewhat object-oriented (in the Ocaml sense).

2004-02-10 7:41 am
Just for anyone wondering if Ocaml is worth anything at all: the awesome Linux port of the eDonkey2000 P2P software (mldonkey) is developed with Ocaml. The ./configure script detects whether it is installed and downloads a local copy if you so request.
2004-02-10 7:52 am
That’s an extremely cynical way to look at things. Very large and lucrative portions of the market do actually read the benchmark numbers for CPUs. Certainly, gamers and workstation users read benchmarks. Even IT people read benchmarks, as the mainstream ZDNet rags are full of them. Joe Sixpack might not read the benchmark numbers, but chip manufacturers don’t make a whole lot of money from the low-end home market…

Plus, the P4 does definitely demonstrate good performance for 3D graphics, video compression, and other multimedia programs. Not surprisingly, these sorts of programs have low branch-density, and are extremely amenable to the P4’s long but fast pipeline. These sorts of programs are very important in some of the choice market segments, notably gamers. It is hardly a stretch to believe that this was an engineering decision on the part of Intel’s development team.

Also, what does requiring the use of vector instructions have to do with what I’m talking about? Intel saved transistor space by giving the P4 a weak x87 FPU, making people depend more on SSE. As the P4’s die was huge for the time (north of 45 million transistors as I remember) this was again an engineering decision.

2004-02-10 8:00 am
Maybe OCaml has improved recently, but when I last used it 2-3 years ago, the errors didn’t say much more than that something was wrong. It rarely gave you any indication of what the problem was. When it did, the error message was usually wrong. If your syntax was a little off, it would parse things really oddly and think you were trying to do something completely different, making the error message useless.

It also didn’t help that the interpreter liked to give you the character number an error occurred at. “Error near character 1123” isn’t fun to track down. Context would be useful, or at least line numbers instead…

2004-02-10 8:07 am
Interfacing with C in any language, well apart from C++, is probably never any fun.
I’ve never done it in OCaml, but I’ve done it in Java and Python, and it’s pretty finicky in those. SWIG has support for OCaml, which might be useful.

2004-02-10 8:18 am
I use OCaml every day in my research and I was a bit disappointed by this article because it didn’t seem like the author did a fantastic job of describing the language. It’s not a perfect language but I find it’s extremely elegant for expressing computations. You can do functional or imperative coding, object-oriented coding, whatever suits your fancy. I think my ideal language would be something like ocaml but with required type annotations on function definitions.

It’s true that the error messages are confusing until you understand the type system, but I’ve never yet seen one of them be wrong. They’re usually correct if you assume that what you typed was what you meant, but that’s usually not the case when you’ve made a mistake. 🙂

The nice thing about OCaml is that once you clear up the error messages the code is quite often correct! It’s really easy to avoid the majority of problems that you encounter in C or C++.

2004-02-10 9:01 am
So, apart from parallel computation as opposed to sequential, what are the practical benefits of using OCaml, say over C? The author in my opinion failed to give compelling reasons to use the language. Does it produce tighter or more compact code? Does it produce faster code? Does it reduce development time? Does it have efficient, stable, well researched, well tested compilers? Does it have standard APIs/libraries/tools? Are there any large scale successful projects written in the language (e.g. operating systems, databases, web browsers, desktop applications etc.)? These are some of the questions coders have in mind before experimenting with a new language, or even considering the said language for their next project.
2004-02-10 9:23 am
Pros:
  speed of implementation
  correctness of code
  minimal bugs (in theory and in practice)
  really beautiful code: really, it is sooooo nice
Cons:
  you need to put in some initial effort to get the most out of it. Most people with bad things to say about functional languages have never put in the (quite significant) effort to switch their brains from imperative thinking
  poor library support (.net/mono might change this)

2004-02-10 9:28 am
I find an ML language to be very useful in rapidly implementing almost any algorithm. I can write the implementation faster, with less code, and with fewer bugs.

The down side is that functional languages are usually compiled into bytecode, so the performance is rarely on par with C++. SML/NJ can be compiled into native code that needs to be appended (literally) to the end of a run time executable. (At least last time I used it.)

OCaml compiles into very fast native executables. For many things the performance is almost as good as C/C++. Adding in the reduced development time makes OCaml very compelling. Though I don’t have any idea about differences in executable size.

Now, there are down sides. GUI programming is a pain. SML/NJ’s type system picks between the float and int operators based on type, whereas OCaml uses different syntax for float and int operations, which is less friendly.

Though I haven’t used JNI or done more than simple tests, OCaml seems to have an OK system for interfacing with C. There is a special datatype in OCaml that matches the memory configuration of C or Fortran arrays.

2004-02-10 9:42 am
Yes, it’s not a very good article. He has tried to pack way too much into it and doesn’t have a clear message. There is no way his target audience could follow it.

I’m not impressed with his code either. For instance, his converttime function, which takes a date string and returns a numerical time, is a little strange.
If the date string is bogus, it will either return a bogus time or it will print an error message to stdout (!) and return some special number (max_float). That is, it is inconsistent. And returning a special value is not the right way to do things in languages with exceptions. The function should either return a bogus value for a bogus input, or it should throw an exception for a bogus input, but not some weird mix of the two. His function is cluttered up with exception handling code that makes it hard to follow and doesn’t do anything useful except to generate bogus output.

2004-02-10 9:57 am
I said: “The function should either return a bogus value for a bogus input, or it should throw an exception for a bogus input”

I didn’t quite mean that. What I meant to say was that the function should probably throw an exception for a malformed input rather than produce a bogus output. That way, his code would be simpler, easier to read, more consistent, and more useful.

2004-02-10 11:10 am
I’ve found that it’s quite easy to use most languages in a non-functional way. As a matter of fact, the majority of code I produce turns out to be non-functional. ; )

2004-02-10 11:44 am
If your code is non-functional perhaps you should use a more clean way to program :) Well, actually I know little of Clean, but I couldn’t resist.

2004-02-10 1:15 pm
Another big advantage of referential transparency is that you can always cache function return values, since you know that a function will always have the same return value if you give it the same parameters. This could even be done by the compiler.

What I do not like about OCAML is the lack of operator overloading. For example, if you want a floating point division you need to use the /. operator instead of the / operator. Of course this simplifies type inference, but what am I supposed to do if I write other numeric types such as complex numbers and vectors?
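On the operator question, one common workaround (a sketch, not the only answer) is to define the arithmetic operators inside a module for the new type and open that module locally where the maths happens. The Vec2 module and the *| scaling operator below are made up for illustration; only the float operators +. and *. come from the standard library.

```ocaml
(* Hypothetical 2D vector module, invented for this example. *)
module Vec2 = struct
  type t = { x : float; y : float }

  let make x y = { x; y }

  (* Inside this module, + means vector addition.
     The float operator +. is untouched and still available. *)
  let ( + ) a b = { x = a.x +. b.x; y = a.y +. b.y }

  (* Scalar-times-vector gets its own operator name, *|, so that
     the ordinary float multiplication *. is not shadowed. *)
  let ( *| ) k v = { x = k *. v.x; y = k *. v.y }
end

let () =
  let a = Vec2.make 1.0 2.0 and b = Vec2.make 0.5 0.5 in
  (* Local open: within the parentheses, + and *| are the Vec2 versions. *)
  let c = Vec2.(a + 2.0 *| b) in
  Printf.printf "(%g, %g)\n" c.Vec2.x c.Vec2.y
```

This is not overloading in the C++ sense: the compiler still resolves each operator to exactly one function, and the local open only changes which definition is in scope. Outside the parentheses, + is integer addition again.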
Non-strict functional languages such as OCAML make it very easy to interact with non-functional libraries. For example F# (CAML for .NET) makes it possible to use all .NET class libraries without any glue code. Here is how you open a file in F# and get a .NET FileStream. Straightforward, isn’t it?

    open System.IO
    …
    let x : FileStream = File.Open("myfile", FileMode.Open)

But I am still hoping for Clean for .NET 🙂

2004-02-10 2:00 pm
> Non-strict functional languages such as OCAML
Just FYI, OCaml is strict. If you’re looking for a lazy functional language, take a look at Haskell ( http://www.haskell.org )
> What I do not like about OCAML is the lack of operator overloading.
(again take a look at Haskell)

2004-02-10 4:11 pm
What I meant with “non-strict functional” was that OCAML allows mutable data structures and therefore does not adhere strictly to the functional paradigm. A bit misleading, since strict has a special meaning in the context of FP. Sometimes it does show that English is not my native language.

I know Haskell, but I find the concept of Monads a bit too abstract for a general purpose programming language. I prefer the uniqueness typing concept of the Clean language. But other than that, Haskell is really nice.

2004-02-10 4:24 pm
> “non-strict functional”
Oh, you meant it’s functional in a non-strict way… Sorry.

2004-02-10 6:54 pm
*cough* Lisp *cough*

Actually, if you do not like the syntax of Lisp or Scheme, you can try Dylan. Fun-O has a good production-quality compiler for Winders with some nice library support. Gwydion Dylan is a good compiler, and runs on a number of platforms, but it’s kinda slow and doesn’t have as good library support.

2004-02-10 7:12 pm
> Does it produce tighter or more compact code?
Rarely.
> Does it produce faster code?
Even more rarely. It’s pretty close, though.
> Does it reduce development time?
Significantly.
> Does it have efficient, stable, well researched, well tested compilers?
The Ocaml compiler is famous for generating extremely good x86 code. It’s a very high-quality implementation. In terms of research, far more of it has gone into ML (of which Ocaml is a derivative) and Lisp compilers than has gone into C or C++ compilers. C/C++/Java/C# compilers are actually rather primitive compared to ML or Lisp compilers.
> Does it have standard APIs/libraries/tools?
Lots of them. Especially for scientific/mathematical work. The standard GTK+ bindings are provided courtesy of LablGTK. OpenGL bindings are provided by LablGL. I don’t know if there is a central listing of all the libraries, but you can try browsing around http://www.ocaml.org.
> Are there any large scale successful projects written in the language
INRIA (the French institute for computer science) invented Ocaml and uses it for a lot of their work. Ocaml is taught to every CS student going through the French university system. Most programs written in Ocaml have a very scientific bent, but significant work is being done in the language. A (kinda incomplete and out-of-date) list is available: http://caml.inria.fr/users_programs-eng.html

2004-02-10 7:33 pm
My impression of functional languages is that they are very good for mathematical and scientific computing. But are they as good for, say, a business application, or a web app (server-side)? Many functional languages describe themselves as general purpose languages, but they always seem to fall into the mathematical/scientific/AI category.

I’m thinking aloud here, but it seems those with a strong mathematical background (or those who find mathematics easy) seem to adapt well to the functional way of thinking. Perhaps this is also why, for many others, the functional languages are not appealing and even seem cryptic (yes, I’m also thinking of *cough* Lisp *cough*).

2004-02-10 8:01 pm
I don’t think that functional languages are inherently hard to learn.
There are a couple of reasons why they may seem that way, though:

1) Previous experience. People are almost always taught imperative languages first. Moving from an imperative mode of thinking to a functional one is much more difficult than learning in a functional mode from the beginning.

2) Ways of teaching. Functional languages tend to be taught in a rigorously mathematical manner. The terminology is often very mathematical (i.e. subtyping and inheritance are similar concepts, but the former has a strict mathematical definition that makes it subtly different). It doesn’t have to be taught this way, but traditionally it has been.

A loose way of stating the difference between imperative and functional languages is to say that it is the difference between specification and description. In an imperative language, you specify the steps needed to solve the problem. In a functional language, you describe the problem in detail, and a solution emerges from that description. Humans are not necessarily better at specification than description, though the latter is probably more abstract. But programmers should have no problem with abstraction.

As for Lisp, you can’t pigeon-hole it into the functional language category. It’s a multi-paradigm language like C++. You can very easily treat it as an imperative language and still have access to a lot of its power. Especially if you are familiar with modern C++ and some of its functional constructs (e.g. function objects, generic algorithms), a lot of Lisp stuff will seem familiar. From there, you can go and take advantage of functional constructs as you get comfortable with them. Also, Lisp’s type system is not mathematically rigorous like ML’s, so a lot of the scary terminology you get with ML-derived languages does not exist in Lisp.

While Lisp has a history of being used for certain types of problems, that’s just historical.
Indeed, languages like Objective C (which is related to Lisp by way of Smalltalk) have been used very successfully for very different types of problems. Common Lisp, especially, is very much designed for general-purpose programming, and has been used successfully for non-scientific work, e.g. the Orbitz scheduling software, as well as popular games like Jak and Daxter.

2004-02-10 9:04 pm
Something important hasn’t been mentioned: do camls have one or two humps? From the picture on http://www.ocaml.org, I believe it’s a dromedary…

2004-02-10 11:07 pm
Lisp and Scheme don’t really fit the same niche as OCaml. Strong static typing is one of the great strengths of OCaml, and both Lisp and Scheme are dynamically typed. Plus, I much prefer the syntax in OCaml (the only downside IMHO is the lack of operator overloading), and OCaml compiles to code whose speed is comparable to C and C++. Lisp and Scheme are more in the niche of Python. They’re good for small utilities or scripting applications but not so great for writing the applications themselves.

Dylan, OTOH, is a really interesting language, possibly the most interesting language around. Unfortunately it just doesn’t seem to be able to attract enough of a following to take off. Plus it seems to be really bloody difficult to compile.

2004-02-10 11:15 pm
Well, your original comment was that you didn’t want mandatory type declarations in function prototypes. The only way to really do that is with dynamic typing plus type annotations, which is what Lisp does. I’m curious: is there another way to dispense with the type annotations without going into dynamic typing? IIRC, both Haskell and Clean require type annotations too.

Lisp and Python are great for writing applications, and lots of significant applications have been written with them. Common Lisp compiles to code comparable in performance to native C++ as well. Of course, I can certainly see why people might like Ocaml better; the styles are really quite different.
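On the question of dropping type annotations without going dynamic: ML-style type inference is one answer, and it is what OCaml itself does. A minimal sketch follows; the function names sum_list and map_list are invented for the example, and nothing in it carries a type annotation, yet everything is checked at compile time.

```ocaml
(* No annotations: the compiler infers int list -> int,
   because the result is combined with the integer operator +. *)
let rec sum_list = function
  | [] -> 0
  | x :: rest -> x + sum_list rest

(* Inferred as the polymorphic type ('a -> 'b) -> 'a list -> 'b list,
   since nothing here constrains the element types. *)
let rec map_list f = function
  | [] -> []
  | x :: rest -> f x :: map_list f rest

let () =
  let lengths = map_list String.length [ "caml"; "dromedary" ] in
  Printf.printf "total length: %d\n" (sum_list lengths)
```

Passing a list of strings to sum_list is rejected by the compiler rather than failing at run time, which is the difference from the dynamic-typing route.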
2004-02-10 11:29 pm
I agree 100%. OCaml is very nice as a programming language for numerical stuff. But it lacks operator overloading.

Why is it that every single language has a major drawback? Haskell has no simple way to interact with non-functional code. OCaml has no operator overloading. Clean is not ported to .NET. Python, Lisp and Scheme are dynamically typed and thus too slow for demanding tasks.

So I guess I will have to wait until somebody either ports Clean to .NET or adds operator overloading to F#.

2004-02-11 12:11 am
Actually, Lisp and Scheme are very fast assuming a good compiler. That’d be CMUCL/SBCL for Common Lisp and Bigloo or Stalin for Scheme. Between type inference and some well-chosen type declarations, you can get arbitrarily close to the performance of C.

However, Lisp does have certain drawbacks, namely that it doesn’t take advantage of some of the type-system advancements that have made their way into languages like Clean or Cecil. Goo is an interesting project ( http://www.googoogaga.org ) precisely because it takes a cleaned-up Lisp base and is trying to apply some new research to it. It’s far from production-quality, though.

2004-02-11 1:00 am
“I agree 100%. OCaml is very nice as a programming language for numerical stuff. But it lacks operator overloading.”

Why is this such a big deal? The only serious annoyance I had when programming in OCaml was how byte-code and compiled applications interacted, specifically how once it’s compiled you can’t dynamically extend it easily.

2004-02-11 1:06 am
> Why is this such a big deal?
If you work a lot with complex numbers and linear algebra, operator overloading is very useful. And if you don’t, the performance of dynamically typed languages such as Python is sufficient.
> The only serious annoyance I had when programming in OCaml
> was how byte-code and compiled applications interacted,
> specifically how once it’s compiled you can’t dynamically
> extend it easily.
You mean a lack of runtime type information and reflection capabilities? That should be solved by the various ML for .NET projects such as SML.NET or MS F#. AFAIK you can even write stuff in F# and extend it from an imperative language.

2004-02-11 2:58 am
One nice thing about OCaml is that you can use any text editor to write it, whereas with Lisp and Scheme you are pretty much forced (that’s the impression I get at least) to use Emacs.

2004-02-11 3:10 am
Lisp is a pain to write in any text editor that does not have good parens matching. However, most editors have good parens matching. Kate, Vim, and Emacs (the editors I normally use) all work fine. Emacs is extra-nice because there is an extension called SLIME (also ILISP) that lets you work on the program while you are running it.

2004-02-11 1:06 pm
I never managed to get into the functional way of thinking. In theory, Ocaml supports non-functional style but the book tries too hard to make you use functional style. Too functional and not very readable: that made me reject Ocaml.

I’m currently learning Ruby and I find it much easier to learn and much more readable, which is a big plus! There are drawbacks to Ruby: it is not very fast and is a bit unfinished, but I still prefer Ruby: doing Ocaml is more like doing math than doing programming (and yes, there is a difference).

2004-02-11 1:37 pm
“You mean a lack of runtime type information and reflection capabilities? That should be solved by the various ML for .NET projects such as SML.NET or MS F#. AFAIK you can even write stuff in F# and extend it from an imperative language.”

That’d be great if I only programmed for .NET. The main problem is I got spoiled by Common Lisp and a good REPL. If I remember right there was a module that quasi-let you do what I talked about; it was a very slow Ocaml interpreter implemented in Ocaml as a loadable library.
However, since ML in general is so nice for implementing interpreters and compilers, it’s usually easier to implement a domain specific language.

“In theory, Ocaml supports non-functionnal style but the book tries too much to makes you use functionnal style..”

It does support non-functional style but there isn’t much of a point in using a functional language to just program imperatively (or OO for that matter). The functional style gives so many advantages it’s worth the transition time.

“Too functionnal and not very readability made me reject Ocaml.”

Readability in ML is always a problem (I mean, look at the problems that Emacs syntax coloring/indentation has with it: you have to parse the entire program up to that point to know what a specific expression’s purpose is). However, because of static typing I found that if I wrote small non-imperative functions I didn’t need to go back to read them; they were relatively bug-free, so minimal documentation was needed (e.g. what the args were, what the return value is, and side effects if any).

Honestly, all this reminiscing about Ocaml makes me want to go back and use it on my next project.
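To make the point above about interpreters and small domain specific languages a bit more concrete, here is a minimal sketch of a toy arithmetic language. The expr type and the eval and to_string functions are invented for illustration; they show the variant-plus-pattern-matching style that ML makes cheap, not any particular real project.

```ocaml
(* Abstract syntax of a toy expression language (hypothetical). *)
type expr =
  | Num of int
  | Add of expr * expr
  | Mul of expr * expr
  | Neg of expr

(* The interpreter is one small recursive function, one case per constructor.
   The compiler warns if a constructor is left unhandled. *)
let rec eval = function
  | Num n -> n
  | Add (a, b) -> eval a + eval b
  | Mul (a, b) -> eval a * eval b
  | Neg e -> - (eval e)

(* A second traversal over the same type: a fully parenthesised printer. *)
let rec to_string = function
  | Num n -> string_of_int n
  | Add (a, b) -> "(" ^ to_string a ^ " + " ^ to_string b ^ ")"
  | Mul (a, b) -> "(" ^ to_string a ^ " * " ^ to_string b ^ ")"
  | Neg e -> "-" ^ to_string e

let () =
  let e = Add (Num 2, Mul (Num 3, Neg (Num 4))) in
  Printf.printf "%s = %d\n" (to_string e) (eval e)
```

Both functions are small and non-imperative, which is also the kind of code the comment says rarely needs rereading once it type-checks.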