This series is aimed at programming language aficionados. There’s a lot of very abstract writing about programming languages, and a lot of simple-minded “language X sux!” blog posts by people who know next to nothing about programming. What I feel is sorely missing is the kind of article that deliberately sacrifices the last 10% of precision that makes the theoretical articles dry and long-winded, but still makes a point and discusses the trade-offs involved. This series is meant to fill that void and hopefully start discussions that are more enlightening than the articles themselves. I will point out some parallels between different parts of computing that I haven’t seen mentioned elsewhere, analyze some well-known rules of thumb, and link to interesting blogs and articles.
Printing “Hello world!” to the screen takes only a few lines of code in just about any language, which is why it is the first program most beginners write.
Of course it doesn’t teach you much about programming, and proponents of language A, in which hello world is cumbersome, will tell you that their language of choice “scales better”, i.e. that it makes writing huge, complex programs easier than language B, which makes hello world easy. In other words: the pain of coding in B grows asymptotically faster than the pain of coding in A as applications get large.
Another concept concerning asymptotic behavior is big O notation.
It is a way of describing how much resources (CPU cycles, memory) an algorithm uses for some very big input size “n”.
For example if the algorithm has to sort a list of names alphabetically “n” would be the number of names on that list.
The figure shows O(n^2), O(n) and O(log n).
It is no accident that the curve of O(log n) starts higher:
often more complex algorithms are required to achieve better asymptotic behavior, and the additional bookkeeping leads to a slowdown for simple cases (small n).
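That crossover can be sketched in a few lines of Python: an insertion sort (O(n^2), almost no overhead) against a merge sort (O(n log n), more bookkeeping), counting comparisons as a stand-in for running time. The code and numbers are my own, purely illustrative:

```python
def insertion_sort(xs):
    """O(n^2) worst case, but with almost no bookkeeping overhead."""
    xs, comps = list(xs), 0
    for i in range(1, len(xs)):
        key, j = xs[i], i - 1
        while j >= 0:
            comps += 1
            if xs[j] <= key:
                break
            xs[j + 1] = xs[j]
            j -= 1
        xs[j + 1] = key
    return xs, comps

def merge_sort(xs):
    """O(n log n), but pays for recursion and merge bookkeeping."""
    comps = 0
    def sort(a):
        nonlocal comps
        if len(a) <= 1:
            return a
        left, right = sort(a[:len(a) // 2]), sort(a[len(a) // 2:])
        merged, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            comps += 1
            if left[i] <= right[j]:
                merged.append(left[i]); i += 1
            else:
                merged.append(right[j]); j += 1
        return merged + left[i:] + right[j:]
    return sort(list(xs)), comps

# On a small, already-sorted input the "worse" algorithm does less work:
_, small_ins = insertion_sort(range(8))          # 7 comparisons
_, small_mrg = merge_sort(range(8))              # 12 comparisons
# On a large, reversed input the asymptotics take over:
_, big_ins = insertion_sort(range(100, 0, -1))   # 4950 comparisons
_, big_mrg = merge_sort(range(100, 0, -1))       # well under 1000
```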
Connecting the dots, language A would be the O(log n) curve that starts high but grows slowly, while language B would be O(n^2), which starts low but grows fast. I’m not claiming that there actually are languages for which “the pain of developing the app” grows exactly like O(log n) or O(n^2) with the “size of the app” – that wouldn’t make much sense anyway, since I haven’t given a precise definition of “pain” and “size”.
You can choose the algorithm based on the size of the problem, combining several algorithms into one program and using each where it works best.
One example of this is GMP, which switches between several multiplication algorithms (schoolbook, Karatsuba, Toom–Cook, FFT-based) depending on the size of the operands.
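A toy version of the same idea in Python: a Karatsuba multiplication that falls back to the builtin operator below a size cutoff, the way GMP falls back to schoolbook multiplication for small operands. The cutoff value and the function name are my own, illustrative choices, and it assumes non-negative integers:

```python
def karatsuba(x, y, cutoff=64):
    """Multiply non-negative ints in O(n^1.585) digit operations;
    below `cutoff` bits, fall back to the builtin operator
    (standing in for schoolbook multiplication)."""
    n = max(x.bit_length(), y.bit_length())
    if n <= cutoff:
        return x * y
    half = n // 2
    mask = (1 << half) - 1
    x0, x1 = x & mask, x >> half
    y0, y1 = y & mask, y >> half
    z0 = karatsuba(x0, y0, cutoff)                       # low parts
    z2 = karatsuba(x1, y1, cutoff)                       # high parts
    z1 = karatsuba(x0 + x1, y0 + y1, cutoff) - z0 - z2   # cross terms
    return (z2 << (2 * half)) + (z1 << half) + z0
```

The point is the dispatch on `n`, not the multiplication itself: one function, two algorithms, each used where it wins.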
The programming language analog of this is using more than one language, e.g. one for exploratory programming and a different one for implementing the parts that are already well understood and unlikely to change. The right tool for the job and all that. There is however the added cost of having to switch data structures/algorithms or, in the case of programming languages, having to learn and integrate more than one language.
If you want to avoid this cost there are, of course, languages that are not aimed at a specific problem size or domain and are moderately good (or bad) at everything, corresponding to the straight O(n) line. Whether or not this makes sense depends on who you ask; see for example Stallman’s “Why you should not use Tcl” and Ousterhout’s reply.
Assuming that it is possible to adapt the language to the problem, e.g. make the language stricter or less strict depending on the size of the problem, what’s the right default?
One of Larry Wall’s principles of language design is that “easy things should be easy and hard things should be possible”. I think the “hard things should be possible” part means that you should not cripple the language so severely that you’re not able to adapt it to the problem. So why should easy things be easy? Well, first of all programs start out small. But more important is the relative cost of adapting the language to the problem:
A couple of lines telling the compiler to be stricter are not even noticeable if your program is a million lines long. On the other hand, if you’re writing a one-liner and need ten lines to tell the anal-retentive compiler to shut the fuck up, it’s obviously no longer a one-liner. Since the relative cost of adapting the language is far higher for small problems, a programming language should be optimized for solving small problems while still being able to adapt to larger code sizes.
There’s also a psychological component to it.
It just feels wrong to spend great effort on a minuscule problem, even if it makes writing programs easier on average.
One example is static typing vs duck typing.
If your program is written in a dynamic language and grows beyond a certain point, you will probably need to add explicit checks that the types of various variables are correct. The program could then end up even longer than the same program written in a language with concise static typing.
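A sketch of the point in Python (the function names and the `prices` example are mine, purely illustrative):

```python
# The same function three ways; illustrative only.

def total(prices):
    # Duck-typed one-liner: fine while the program is small.
    return sum(prices)

def total_checked(prices):
    # As a dynamic program grows, type checks creep in by hand ...
    if not isinstance(prices, (list, tuple)):
        raise TypeError("prices must be a list or tuple")
    for p in prices:
        if not isinstance(p, (int, float)):
            raise TypeError("every price must be a number")
    return sum(prices)

def total_typed(prices: list[float]) -> float:
    # ... while a concise static annotation (checked ahead of time by a
    # tool such as mypy, not at runtime) keeps the one-liner a one-liner.
    return sum(prices)
```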
Still it doesn’t feel nearly as wrong as writing templates in C++:
<oh compiler <please forgive me> but I <your puny human user> am unfortunately unable <to tell you <the exact type of this> > >
I’m sure a lot of people will disagree, but to me all this fuss just to express the type equivalent of a shrug is ridiculous. The effort should be roughly proportional to the amount of information you want to convey, and if hello world needs more than one line, the language designer probably didn’t understand this or designed the language with only big programs in mind.
Why languages that are well suited for big programs suck for hello world and vice versa will be examined in the next article.
About the author
I’m Benjamin Schuenemann, a physics student from Germany. Programming languages are one of my hobbies. My main programming language is Python because I don’t forget it quite as fast as C, C++, Matlab or Perl.
I’ll start simply: what?
First, this article’s title is horribly misleading. ‘Hello World’ Considered Harmful – okay, it started out alright, but only 20% of the article is actually about why the author thinks hello world programs are harmful. And that 20% is not very clear; it wanders into a complex math discussion that seems to rest on poor assumptions to begin with.
By the way, Big O notation is not “a way of describing how much resources (CPU cycles, memory) an algorithm uses for some very big input size n.”. It is used to describe the complexity of an algorithm. So don’t invoke computational complexity theory if you don’t really understand it.
The author’s discussion on asymptotic behavior came out of left field and as far as I’m concerned had nothing to do with the point of the article.
Then, finally, the discussion goes into static vs duck typing – WTF? Why? It in no way clearly supports the title of this article.
This article is a mess. It’s amazing it’s on the front page.