Recursion is a tool not often used by imperative language developers, because it is thought to be slow and to waste space, but as the author demonstrates, there are several techniques that can be used to minimize or eliminate these problems. He introduces the concept of recursion and tackles recursive programming patterns, examining how they can be used to write provably correct programs.
Actually, I think most programmers start out in imperative languages for other reasons:
1.) They’re more popular.
2.) They’re more popular.
3.) The machine is imperative.
4.) Imperative languages tend to be more compact (like C; let's not talk about C++, hehe).
The author seems to have failed to deliver on the promise, made near the beginning, to fix the speed problems.
I don't think he addressed recursion limitations in C either. Linux, and probably most kernels, limit how deeply you can nest function calls (I think it's 10,000 by default). So many things are great for recursion, but with that limitation a number of things won't work.
Semi-interesting. I'll give it a better read later. I wish he'd put all his examples in C, though; I don't speak Scheme.
Lisp is probably one of the most misunderstood languages around: a functional language built on recursion, and object-oriented too.
Actually, I think most programmers start out in imperative languages for other reasons:
1.) They’re more popular.
2.) They’re more popular.
3.) The machine is imperative.
4.) Imperative languages tend to be more compact (like C; let's not talk about C++, hehe).
I agree with 1 and 2, but 3 and 4 are flat-out wrong. With regard to 3, modern machines are sort of in between imperative and functional. At the large scale, yes, the machine is imperative. But consider what the modern processor looks like: it's deeply pipelined and superscalar. That means that for short blocks of code, the parallelism and reduction in data dependencies inherent in functional languages is a good fit for the processor. As machines become more parallel and memory architectures become NUMA, the "C machine model" will resemble the actual machine less and less.
I don't think he addressed recursion limitations in C either. Linux, and probably most kernels, limit how deeply you can nest function calls.
If you write your functions as properly tail-recursive, this isn’t a problem.
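To make that concrete, here's a hedged C sketch (my example, not the article's): rewrite the recursion so the recursive call is the very last thing the function does, by threading an accumulator through the calls. An optimizing compiler (gcc -O2, for instance) can then compile the call as a jump that reuses the current stack frame, so the depth limit never comes into play. Note that, unlike Scheme implementations, C compilers aren't required to do this.

#include <stdio.h>

/* Naive version: the multiply happens AFTER the recursive call
   returns, so every level needs its own stack frame. */
long factorial(long n)
{
    if (n <= 1)
        return 1;
    return n * factorial(n - 1);
}

/* Tail-recursive version: the recursive call is in tail position,
   so an optimizing compiler can turn it into a jump. Call it as
   factorial_acc(n, 1). */
long factorial_acc(long n, long acc)
{
    if (n <= 1)
        return acc;
    return factorial_acc(n - 1, n * acc);
}

int main(void)
{
    printf("%ld %ld\n", factorial(10), factorial_acc(10, 1));  /* 3628800 3628800 */
    return 0;
}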
Semi-interesting. I'll give it a better read later. I wish he'd put all his examples in C, though; I don't speak Scheme.
Why would you put it into anything *BUT* Scheme? There are a few languages which reasonably could have been used; C is not one of them.
Scheme is a simplified version of Lisp, which is the granddaddy of functional programming, whose central tenet is recursive programming. It is also one of the most commonly used and widely available functional languages.
Other options include Haskell (purely functional) or Lisp itself.
Doing it in C would be like writing an article about OOP and writing your examples in C.
A good environment for learning Scheme is http://drscheme.org/
PS: I've never been convinced that computers themselves are imperative. It seems that at the lowest level they are beyond classification. Kinda like looking at an individual atom and trying to determine what state of matter it is in.
>Actually, I think most programmers start out in imperative languages for other reasons:
>1.) They’re more popular.
>2.) They’re more popular.
>3.) The machine is imperative.
>4.) Imperative languages tend to be more compact (like C; let's not talk about C++, hehe).
Probably should have stopped after 2. The fact that the machine is imperative has very little to do with it. Very few people learn assembly before C++/Java, so the underlying architecture has nothing to do with it. I would argue they are introduced to imperative/OO languages first because they are more marketable. People are more likely to know them going in, and want to keep using them in industry.
Number 4 is just silly. More compact? There is plenty of recursive code that is shorter and more concise than the alternative. This argument just doesn't hold water. Not to mention that many common procedures, such as traversing trees and lists, are done recursively for this very reason (see the sketch below).
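A hedged illustration of that conciseness (my sketch, not from the article): the depth of a binary tree is a single conditional when written recursively, while an iterative version needs an explicit stack or queue.

#include <stddef.h>

struct tree {
    int value;
    struct tree *left, *right;
};

/* Depth of a binary tree: the whole algorithm is one conditional. */
int depth(const struct tree *t)
{
    if (t == NULL)
        return 0;
    int l = depth(t->left);
    int r = depth(t->right);
    return 1 + (l > r ? l : r);
}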
Lisp is object-oriented? Not so much. You can bring up CLOS, but Lisp is about as OO as C or Perl: you can use it as such if you want, but it is most certainly not OO by design.
For the article itself, it is a good intro to how to write good recursive procedures. I’m sure a lot of folks write off recursion as the devil, but there is a certain something about the simplicity it brings.
Recursion is a tool not often used by imperative language developers
Um, blah. It's perhaps "not often used" by novice programmers, but as experience grows one learns to search for the best tool to apply to a particular challenge, and if recursion is appropriate, any programmer (who may happen to use an imperative language) will use it.
Or maybe computers just aren't software paradigms? Kinda like an apple is not an orange.
Yet the author is right when he says that, though scary, it can be a great solution for a whole class of problems.
Moreover, writing a recursive function requires a bit more paperwork, and maybe some developers prefer to get down to code as soon as possible.
However, recursion has saved me a lot of work in many cases. Plus, I agree it tends to produce very compact functions, which is good, especially if you find smart names for those functions.
I think recursion got a bad name on old computers, where memory was scarce and one could easily get into trouble with recursion because of the limited memory. Modern computers usually have large quantities of memory, so recursion can show its full power.
My .02
The code in Listing 2 does not check its preconditions (a C language int value might be negative) and disregards the fact that 0! is well defined.
What good is "provably correct code" if it is wrong?
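For reference, a sketch of what the commenter is asking for (mine, not the article's Listing 2): check the precondition and handle 0! explicitly.

/* Factorial with explicit preconditions: rejects negative input
   and treats the well-defined 0! = 1. A sketch, not the article's
   listing. */
int factorial(int n)
{
    if (n < 0)
        return -1;  /* domain error: n! is undefined for negative n */
    if (n <= 1)
        return 1;   /* covers 0! = 1 and 1! = 1 */
    return n * factorial(n - 1);
}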
a tool not often used by imperative language developers, because it is thought to be slow and to waste space
It’s sometimes funny to see when ignora^H^H^H^H^H^Hnovice coders preach about how some coding style, technique or algorithm is not widely used, hard, bad, shouldn’t be used, etc.
For one, I could come up with a dozen problems in a sec which are most easily and effectively solved by recursive algorithms. Not having knowledge about such areas doesn’t justify the above reasoning.
As always, all I have to say is: if you’re coding, know the theory, know your job, know your tools and always try to make the right decision regarding which and when to use. Sometimes it’s a recursive algorithm. Sometimes it’s not. Preaching about it not being often used won’t really help you in any way.
Actually 3 is not flat out wrong, you didn’t even say it was :-p.
The reality is that the machines are still imperative, but designed to be forgiving (and efficient) with functional things. You still have to make a new stack frame each time, but this usually isn't a big deal.
Number 4: You don't know what compact means. Compact means that one can hold within short-term memory the majority of the constructs of something. So a notoriously compact language is C; a notoriously non-compact one is Perl. I have no actual experience with Scheme or Lisp, but I'm guessing they're fairly complex in their constructs?
OOOO, I see what he’s saying about tail calls. A little dependent on the optimization of your compiler, but nice.
Because he’s not talking to functional programmers, he’s talking to imperative programmers. Would you speak to a Mexican in Greek to convince him that Greece rocks?
“For the article itself, it is a good intro to how to write good recursive procedures. I’m sure a lot of folks write off recursion as the devil, but there is a certain something about the simplicity it brings.”
I can agree with that.
Number 3 has more to do with history than with today, btw. People tend to take some time to switch the languages they teach, so what's taught today may very well be the craze of a decade ago.
There's also an argument about maintaining existing code, which I forgot to mention, for why they don't start with Scheme/Lisp/etc.
Some colleges try to require exposure to only one language.
I use recursion every day, even though I write code mostly in PHP scripts.
Sometimes PHP just dies silently (on reaching the default 8 MB memory limit): there is no error message in the web page, only in the web server's error log 🙂 A very hard thing to catch if you forget about the web server logs and use your browser as your PHP debugger.
If your web site has a site map with many levels, you can display the whole structure by calling one recursive function at the root node, and it will scan all the levels by itself.
Listing 5… shouldn't it be

int sum_list(struct list_node *l)
{
    ...
    return l->data + sum_list(l->next);
}

since l is a pointer to list_node?
@Timpothy: if your PHP script's function consumes 8 MB of RAM, you probably did something wrong.
Hint: endless recursion.
You are lucky if the script exits at 8 MB; otherwise the script could consume the whole of RAM plus the rest of the free swap, which is even worse than mere 100% CPU load.
Something I’ve wondered for a while is how often can recursive functions be tail-call optimized in practice? Even the most common example of recursion, the factorial, cannot be tail-called (well, if it can it won’t be pretty). Would anyone happen to have a link to any statistics on this?
“I have no actual experience with Scheme or Lisp, but I'm guessing they're fairly complex in their constructs?”
brilliant!
Obviously you've never even *looked* at Scheme or Lisp. Why do you then claim C is more compact?
Btw. the Scheme standard says that the ‘compiler’ has to do tail optimization (for recursion, not necessarily general tail call optimization).
“What good is "provably correct code" if it is wrong?”
The factorial example is a bit simplistic, true, but if it had been written in OCaml, the compiler would have caught the ‘error’.
Not every recursive function can be turned into a tail-recursive function, but many can.
I found the following link via google, and it gives some simple guidelines how to do so: http://www.owlnet.rice.edu/~comp210/96spring/Labs/lab09.html
I found these blog entries after some more googling:
problems with tail-call optimization in imperative languages:
http://tratt.net/laurie/blog/entries/tail_call_optimization
and one about problems with FP for practical problems:
http://tratt.net/laurie/blog/entries/why_dont_we_use_functional_pro…
Not that I necessarily agree, but interesting.
a little vbscript
response.write factorial(170)
function factorial(n)
    if (n = 1) then
        factorial = 1
    else
        factorial = n * factorial(n - 1)
    end if
end function
On my machine, if I ask for factorial(171), it crashes.
Also, in your example: going thousands of layers deep is not likely. Maybe when people no longer make most directory structures by hand we'll see thousand-layer-deep structures.
“a little vbscript”
IIRC, variants in VBScript are 16 bits (actually, only 14 are used for the data itself; the first 2 describe what data type it is), so 171! is probably exceeding this limit. Someone correct me if I'm wrong.
Another place where recursion is absolutely necessary: it's virtually impossible to do any programming against directory/file structures without using recursion.
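For instance, a minimal POSIX sketch (my code, nothing from the article) of the usual recursive directory walk with opendir()/readdir():

#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Recursively print every entry under `path`. A sketch: no error
   reporting, no symlink-loop detection. */
void walk(const char *path)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return;  /* not a directory, or unreadable */

    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        /* Skip "." and ".." or the recursion never terminates. */
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;

        char child[4096];
        snprintf(child, sizeof child, "%s/%s", path, entry->d_name);
        printf("%s\n", child);
        walk(child);  /* harmless on plain files: opendir() just fails */
    }
    closedir(dir);
}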
The reality is that the machines are still imperative, but designed to be forgiving (and efficient) with functional things.
That isn't really true. Machines are what they are: superscalar and pipelined. At the black-box level, the CPU can be modeled as imperative, but inside the box the situation is much more complicated. When you move to something like a large NUMA SMP machine, things become even more complicated. The imperative paradigm has no conception of parallelism, aside from trying to model a parallel machine as running multiple serial processes. When you take serialization into account, it becomes clear that the imperative paradigm is not only not a good description of such a machine, it's not even the ideal way to program such a machine.
I have no actual experience with Scheme or Lisp, but I'm guessing they're fairly complex in their constructs?
Scheme and Lisp are notoriously simple in their constructs. The Scheme language standard is like 50 pages, and the Scheme language has a lot more features than C. You can write a basic Scheme interpreter in about 300 lines of code. The Common Lisp standard is much bigger (about the size of the C++ standard), but that's because Common Lisp has an enormous standard library. Common Lisp doesn't define a boundary between what's in the core language and what's in the library (it leaves that up to the compiler), but in most good implementations the core language is tiny. For example, the object system is usually implemented through macros in the core language. Indeed, some don't even have loops in the core language. They have macros that expand to in-place tail calls, and let the optimizer take care of the rest.
Tail-recursive factorial in Scheme:
(define (factorial n)
  (define (iterate n acc)
    (if (<= n 1)
        acc
        (iterate (- n 1) (* acc n))))
  (iterate n 1))
From: http://www.answers.com/topic/tail-recursion
Many (most) recursive functions can be expressed as tail-recursion. Certainly, anything that can be represented as a loop can be represented as a tail-recursive function.
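A hedged C sketch of that correspondence (my example, not from the linked page): the loop variables become parameters of the tail-recursive function, and the loop test becomes the base case.

#include <stdio.h>

/* Summing an array with a plain loop... */
int sum_loop(const int *a, int n)
{
    int acc = 0;
    for (int i = 0; i < n; i++)
        acc += a[i];
    return acc;
}

/* ...and the same computation as a tail-recursive function: i and
   acc become parameters, the loop test becomes the base case, and
   the recursive call is in tail position. */
int sum_rec(const int *a, int n, int i, int acc)
{
    if (i == n)
        return acc;
    return sum_rec(a, n, i + 1, acc + a[i]);
}

int main(void)
{
    int a[] = { 1, 2, 3, 4 };
    printf("%d %d\n", sum_loop(a, 4), sum_rec(a, 4, 0, 0));  /* 10 10 */
    return 0;
}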
When I only knew Java, recursion was a bit weird, but when learning Prolog we did loads of recursive programming. Now it's second nature.
Recursion rocks! It's so satisfying to look at one or two lines of recursive code and then compare it to iterative code that does the same thing in over 20 lines.
Number 4: You don't know what compact means. Compact means that one can hold within short-term memory the majority of the constructs of something.
Compact can mean many things. Was the responder supposed to read your mind to figure out which one you meant?
So a notoriously compact language is C; a notoriously non-compact one is Perl.
In case you hadn’t noticed, C supports recursion.
I have no actual experience with Scheme or Lisp, but I'm guessing they're fairly complex in their constructs?
That’s great. “I don’t know Lisp or Scheme, so let’s just assume that they’re complex.” You should probably avoid calling C “notoriously compact” around people who have used Scheme.
Horrible syntax, no static type checking, and lots of legacy (car/cdr anyone?). Why do people still put up with that? If you’re gonna use a functional language, do yourself a favour and pick Haskell or OCaml.
“Another place where recursion is absolutely necessary: it's virtually impossible to do any programming against directory/file structures without using recursion.”
Tree structures can be traversed without recursion by using, for example, a stack to keep track of your tree nodes (see the sketch below); this should be a lot faster than using recursion in languages like C, although less intuitive, I guess.
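A minimal C sketch of that idea (mine, not from the article): a pre-order traversal where an explicit stack replaces the call stack. The fixed capacity is a hypothetical bound for the sketch.

#include <stdio.h>

struct node {
    int value;
    struct node *left, *right;
};

#define STACK_MAX 1024  /* hypothetical fixed bound for the sketch */

/* Pre-order traversal with an explicit stack instead of recursion. */
void traverse(struct node *root)
{
    struct node *stack[STACK_MAX];
    int top = 0;

    if (root != NULL)
        stack[top++] = root;

    while (top > 0) {
        struct node *n = stack[--top];
        printf("%d\n", n->value);

        /* Push right first so the left subtree is visited first. */
        if (n->right != NULL && top < STACK_MAX) stack[top++] = n->right;
        if (n->left  != NULL && top < STACK_MAX) stack[top++] = n->left;
    }
}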
As for Lisp and Scheme, they do have their cons, the biggest of which is that they are very, very hard to read. Unless, of course, you've been doing Lisp for a long time.
Many (most) recursive functions can be expressed as tail-recursion. Certainly, anything that can be represented as a loop can be represented as a tail-recursive function.
Yes, but tail-recursive functions are also just as difficult to write and read as the equivalent imperative loop. And first you have to get your head around the whole idea of passing partial results as arguments.
“Compact can mean many things. Was the responder supposed to read your mind to figure out which one you meant?”
No, just the context of my speech.
“In case you hadn’t noticed, C supports recursion.”
In case you didn’t notice, I didn’t say recursion wasn’t compact.
“That’s great. “I don’t know Lisp or Scheme, so let’s just assume that they’re complex.” You should probably avoid calling C “notoriously compact” around people who have used Scheme.”
That was an invitation for correction, jerk. Calm down and get off my case; we've all said something wrong.
When functions don’t have stacks you’ll be right. Maybe I’m speaking of the OS/ASM model that I’m accustomed to and the machine is really something different; but since probably 99.95% of programmers write for OS/ASM or higher (or device drivers)…
Ok, sounds to me like Scheme is really compact. Thanks for informing me. I've always planned to do something in Lisp one of these days, but I've never got around to it.
I'm sorry, but I really don't see the connection of pipelining to functional as opposed to imperative. How is processing multiple instructions at the same time, in line, not imperative? Superscalar, maybe; I have an idea where that could go. But I really don't see the pipelining connection…
“Another place where recursion is absolutely necessary: it's virtually impossible to do any programming against directory/file structures without using recursion.”
I’ll also comment on this.
Back in the old days, I had more than one case where a recursive function bit me by blowing out the stack space. Since then I've solved recursion problems by implementing explicit stacks. A huge bonus of this method is that it then allows the problem to be easily solved using a pool of threads, and there you go.
Recursion works great for writing algorithms and doing classwork but it ends up causing more headaches in the end than it solves. Avoidance is key.
As an aside, I find the hash table suffers from the same problem: overflow it and everything grinds to a halt.
…is most useful of course for naturally recursive data structures (i.e. trees and graphs).
“ML for the Working Programmer” covers this topic quite well, especially how and when to convert a function to its tail-recursive (i.e. iterative) form if possible.
Over and out.
Can anyone point me to good resources so I too can be enlightened and appreciate Lisp/Scheme? I'm obviously failing to see its great beauty.
Umm… I find prefix functional syntax to be very nice. And unless you are doing a lot of deep list manipulation, you can use first, second, third, last, etc., rather than car, cdr, cdar, cadr, caar, etc.
Umm… I learned Lisp in my AI class and I could read it with proficiency within 3 weeks. I am double-majoring in mathematics as well, though, so I have a natural propensity for reading functional structures… prefix notation is not hard to get.
Even recursing through file/directory structures? Like, within the last decade?
the book store 🙂
I spent 3 hours trying to figure out post/pre/infix notation for an exam once; then, of course, there were no questions on it in the exam…
It wasn't so much that I couldn't understand them as that I didn't feel comfortable converting them by hand, with tons of parentheses thrown in haphazardly.
When functions don’t have stacks you’ll be right. Maybe I’m speaking of the OS/ASM model that I’m accustomed to and the machine is really something different;
Stacks are another thing in imperative languages that doesn't model the computer. Modern compilers actually need promotion passes to turn stack-oriented C code into register-oriented machine code. Of course, functional languages are this way too, but that doesn't change the fact that the imperative model isn't great for describing the machine.
The OS exposes an imperative model because most existing code assumes the C machine model. There are multiple layers of abstraction to make the machine look like that. You have virtual memory that hides the NUMA nature of memory from programs, you've got MMUs that hide the non-contiguous nature of memory from programs, and you've got CPU front-ends that hide the non-sequential nature of the processor.
I'm sorry, but I really don't see the connection of pipelining to functional as opposed to imperative. How is processing multiple instructions at the same time, in line, not imperative?
I didn't say pipelining makes the CPU functional; I said it makes it non-imperative. In imperative code, instructions are sequential. It is assumed that each instruction completes before another is started. Consider this code:
1: a = 0;
2: b = a + 1;
3: c = b + 1;
4: d = c + 1;
In the imperative model, this looks like straight-line code. However, in reality, you’ve got a data dependency delay on each line. This means that when the second instruction is issued right after the first, it’ll stall at a certain pipeline stage (creating a pipeline bubble) waiting for the results from the previous instruction to become available. When they are, the forwarding network will send the result and allow it to continue, but then the third instruction will stall waiting for the second. Same thing with the fourth instruction. In the CPU, the flow is more like:
a = 0;
wait_for(1);
b = forward_result(1) + 1;
wait_for(2);
c = forward_result(2) + 1;
wait_for(3);
d = forward_result(3) + 1;
Now, your compiler’s optimizer hides all this from you. A good compiler might realize that it can transform the above code to:
a = 0;
b = 1;
c = 2;
d = 3;
When the optimization is applicable, it will get rid of the pipeline bubbles. However, consider what's happening here: the optimizer is hiding the true nature of the machine from the programmer. Nothing in C exposes the pipelined nature of the machine to the programmer, so the optimizer makes a pipelined machine appear to be a non-pipelined one. A more accurate model of the machine would simply have a note saying "results of a calculation will not be available for reads until 10 cycles later."
That's just pipelining. Consider what speculative execution or branch prediction does to your "machine is imperative" statement! Ultimately, saying "the machine is imperative" is a cop-out. Even if it were true that programming "to the metal" was better, you're not doing that in C anyway!
“Yes, but tail-recursive functions are also just as difficult to write and read as the equivalent imperative loop.”
Iteration is hard to /prove/ right. Which is why most imperative programmers don't bother with an axiomatic approach or assertions. That is to say, they don't bother to prove correctness /at all/. Or they do extreme programming, which catches at most 80% of bugs (according to my expert teacher).
If you prefer to debug instead of writing correct code, you do not qualify as a bright person, I'm sorry, not in places with stringent requirements. Of course, for general users you can push trash and software loaded with bugs. Look at Microsoft.
But I digress… The point is that recursion is amenable to a mathematical proof of correctness. However, there are different types of recursion.
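As a sketch of what such a proof looks like for the recursive factorial (my example, the standard induction argument):

Base case: fact(0) = 1 = 0!.
Inductive step: assume fact(n-1) = (n-1)! for some n >= 1; then the recursive case gives fact(n) = n * fact(n-1) = n * (n-1)! = n!.
By induction, fact(n) = n! for every n >= 0; the function's structure is the proof's structure.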
However, if your language only has iterative looping instead of, say, map, you’re handicapped. If you’re an Algol-family programmer (C/C++/Java/Delphi/Pascal/C#, etc), you’re handicapped but aren’t aware of it.
Any decent implementation of Common Lisp will do recursion in /linear/ time. Any Scheme implementation will too (as it's required by the R5RS standard).
Scheme and Common Lisp are real programming languages. No matter what you may think, real systems are written in them. Some of the sites you visit have Lisp behind them without you knowing, because it doesn't matter; it's not a language war. Lispers and Schemers don't feel they have to prove their worth to /you/…
So each instruction is dependent on the next (btw, thanks for the CPU 101); sounds imperative… Pipelines are generally processed in order, and bubble avoidance is generally done at compile time (compiled code is still code). I'm sure there's probably some exception to this; Intel has probably done it, hehe.
Even out-of-order execution has to keep the logic the same as the in-order way it was programmed, so it has to be careful.
Anyway this is going to become a really stupid argument over semantics and levels of abstraction.
You bring up a lot of interesting abstractions, like MMUs, but memory accessing doesn't relate in my mind to functional versus imperative either.
In my mind a processor still, today, executes a set of commands (instructions) in logical order, making it imperative.
Maybe there's a good article somewhere you can point me to that clarifies this?
Well, once you get *fix notation it is really easy.
Why wouldn't it do it in linear time? I don't think anyone's worried about the time constraint. Of course, I don't think this is so much an argument over whether recursion is a great and viable tool (we all know it is); it's an argument over whether you'd iterate over an array with recursion (I never would, but maybe that's not a lot of extra work in Lisp?).
People get funny looks when someone proposes something they already knew about as the solution to all their problems. The article was actually pretty good.
I think it's also an argument between "C guys" and "Lisp guys" that dates back a generation!
Back to linear time. If time's a concern, we aren't worried about complexity (linear, log, exponential); we're worried about the constant. It's all gonna be linear. I think more people worry about stack overruns though. That, and sometimes recursive functions can be a pill to figure out if they aren't well documented, especially if it wasn't a recursive problem.
Maybe I’ve lost everyone by now?
Most of you cannot even spell "imperative" but ramble on about what is right or wrong for recursive programming. He should have used a more popular language such as C; with this I agree. Every other comment is drivel.
It's an argument over whether you'd iterate over an array with recursion (I never would, but maybe that's not a lot of extra work in Lisp?).
It depends on how complicated your algorithm is. If it's a simple iteration over the array, then recursion doesn't buy you much. But as the problem becomes more complicated, recursion becomes much more useful in expressing the desired behavior. Quicksort is the pedagogical example for this, but many other algorithms are the same way.
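For reference, a hedged recursive quicksort sketch (mine, not Sedgewick's or the article's), using Lomuto partitioning with the last element as the pivot; simple to read, though, as later comments point out, a poor pivot choice on already-sorted input. Call it as quicksort(a, 0, n - 1) for n >= 1.

#include <stddef.h>

static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Sort a[lo..hi] inclusive. The recursive structure mirrors the
   algorithm: partition, then sort each half. */
void quicksort(int *a, size_t lo, size_t hi)
{
    if (lo >= hi)
        return;

    int pivot = a[hi];              /* last element as pivot (Lomuto) */
    size_t i = lo;
    for (size_t j = lo; j < hi; j++)
        if (a[j] < pivot)
            swap(&a[i++], &a[j]);
    swap(&a[i], &a[hi]);            /* pivot lands at its final slot i */

    if (i > lo)                     /* guard: i - 1 would wrap below 0 */
        quicksort(a, lo, i - 1);
    quicksort(a, i + 1, hi);
}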
I think more people worry about stack overruns though.
In general, the memory usage of recursive functions and imperative functions is of the same order. If your machine has a sane calling convention (i.e., any RISC, or x86-64) and you've got an aggressive compiler, then the constant difference isn't really anything to worry about. Avoiding recursion because you're worried about blowing the stack is a silly premature optimization. Most likely, your algorithm can be written tail-recursively, and if it can't, you're going to need an extra stack in the iterative version anyway. And that extra stack can be expensive. Where does that stack come from? malloc() is an expensive function to call. Worse, it creates contention for the heap lock (a big deal for parallel machines). So malloc()'ing a temporary stack might blow away any performance improvement you get from iteration. You could make the temp stack persistent, but now you have to worry about locking it, etc. Of course, people doing premature optimizations never worry about higher-order effects like that…
I spend a lot of my software development time on translators, usually related to hardware design, and currently on a C compiler; but whatever the input language is, it is almost always defined by an EBNF (Extended Backus-Naur Form), which nicely describes the syntax as very simple recursive productions.
Each EBNF rule maps directly onto a func(), with the semantic stuff added in. The output is usually a tree, but that can also be regarded as another language with a very simple EBNF of its own. The C expression grammar is a good example of recursion (a sketch follows).
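A hedged miniature of that mapping (my sketch, not the poster's translator): each rule of a toy expression EBNF becomes one C function, and the grammar's recursion becomes function recursion.

#include <stdio.h>

/* Toy EBNF, one function per rule:
     expr   = term   { ("+" | "-") term }
     term   = factor { ("*" | "/") factor }
     factor = digit | "(" expr ")"
   A sketch: single-digit numbers, no error handling. */

static const char *p;       /* cursor into the input string */
static int expr(void);      /* forward declaration: the grammar is recursive */

static int factor(void)
{
    if (*p == '(') {
        p++;                /* consume "(" */
        int v = expr();
        p++;                /* consume ")" */
        return v;
    }
    return *p++ - '0';      /* single digit */
}

static int term(void)
{
    int v = factor();
    while (*p == '*' || *p == '/') {
        char op = *p++;
        int r = factor();
        v = (op == '*') ? v * r : v / r;
    }
    return v;
}

static int expr(void)
{
    int v = term();
    while (*p == '+' || *p == '-') {
        char op = *p++;
        int r = term();
        v = (op == '+') ? v + r : v - r;
    }
    return v;
}

int main(void)
{
    p = "2*(3+4)";
    printf("%d\n", expr());  /* prints 14 */
    return 0;
}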
Removing recursion would be as pointless as it gets.
On occasion I do get stack overflows, but that is not because the stack isn't big enough or the data is too large; it is usually because one of the EBNF rules has degenerated or has not been correctly translated into the equivalent C func(). Some of the EBNF rules have to be tried in several ways to see which one applies; a good example is C assignment, which allows multiple assigns. Usually stack overflows are a strong indicator of a buffer overflow much earlier in the crash history.
Anyone who is trying hard to avoid recursion is wasting their life; you are ultimately reinventing the same thing with your own stack representation. Sometimes that can be justified, mostly not. This was more reasonable 20 years ago, when CPUs performed as users expected; today's x86s don't like to be understood. The best way to speed up programs is to understand the memory wall and locality and caching issues.
I do urge all newish programmers to try to understand their preferred language from the point of view of having to compile it themselves to asm. You will quickly find how horrid the EBNFs for C-like languages really are; they often do not have a precise EBNF defined for them, with lots of exceptions to the rules. To get an accurate EBNF for C you need to get a compiler-writing text, yet for Pascal and hundreds of passed-over languages it can be found in the index.
It's funny that the less popular a language is in the general culture, the simpler its description generally gets, which ironically often makes it more powerful, as it can get closer to being able to manipulate itself. The ultimate simple language is Lisp itself.
If, as Rayner says, it takes 300 lines of Lisp/Scheme to write an interpreter, it usually takes 10x as much in C-like languages. There's a saying in the Lisp world that all other languages are doomed to reinvent it. It's probably a good job most languages don't try that, especially executing their own data.
“recursion” is no “tool”
Sorry for the provocative title, but whenever I hear Lisp people ramble on about the advantages of recursion, they miss half of the picture.
Recursion is a great tool for some jobs but not for all of them.
Recursion can mainly be applied in a very compact way whenever you have to deal with graph- or tree-like structures (no matter whether the structures are in the program or in the data), but with the linear structures you deal with 99% of the time you can forget it; loops are the better way to handle them.
One of the reasons pure Lisp basically never really caught on was the "we will slay everything which is closely related to a graph with recursion" thinking, which probably came out of the thought that the Lisp VMs were stack machines anyway, so why hassle with the implementation of loops.
But loops are normally tighter at the assembler level (a simple jump, while a recursion carries the full burden of a function call, with all the stack operations of saving the registers and parameters involved).
The code overhead of a straight loop applied to a linear problem is basically equal to that of a recursion applied to a linear problem; but never, if you cannot afford the time, apply a loop to a non-linear graph-like problem, where recursions are much tighter.
But the above poster gives the impression that recursion can slay everything, and then gives the classical example of tree operations (in his case a syntax tree). This is a false impression, because he basically pulled out exactly the one case where recursion is really strong: non-linear graphs. Meanwhile, 90% of the problems of modern computing are linear or parallel-linear.
It usually comes down to the fact that you just have to apply the right tool for the job. Linear problem: use loops, unless you want to stress-test your stack (which is always smaller than the heap).
Non-linear graph problem: use recursion, unless you are in for a sado-maso coding session, or you are forced out of it by stack boundaries (because the graph is so huge).
This really can be a problem in systems without runtime optimization; you really cannot rely too much on the compiler.
I will give a pseudo-code example. A recursion is basically translated to the following:
push register a
push register b
push register c
move variable 1 into register a
move variable 2 into register b
etc….
check for the end of the recursion
do something
push a
push b
pull c via stackpointer
pull b via stackpointer
pull a via stackpointer
jump to the beginning
A simple linear loop does the following ops:
check for the end of loop
if reached jump out
do something
increase register a which holds the loop counter
jump to the beginning of the loop
Every compiler basically does that; some unroll the loops so that the processor pipes can be more effective.
I think you get the picture. Sure, some compilers can probably turn some of the recursive functions into loops, and some can probably save some register pushes and pulls, but can you really rely on that if you follow a "slay everything with recursion" approach? I would not. I would apply the right tool for the job, which can sometimes be recursion.
“I think you get the picture. Sure, some compilers can probably turn some of the recursive functions into loops, and some can probably save some register pushes and pulls, but can you really rely on that if you follow a "slay everything with recursion" approach? I would not.”
Yes, you can; that's the WHOLE point of tail-call optimization, as specified in (e.g.) the Scheme STANDARD.
That is, EVERY Scheme implementation has to run your tail-recursive functions in constant space.
I'm amazed that people seem to find recursion so alien. I guess that shows what background computer people are coming from at present. In mathematics, recursion and the principle of proof by induction are absolutely core. In present-day programming, how often is blowing the stack really a problem? If you are averse to even considering a recursive solution, you are guilty of premature optimisation.
It's possible (though occasionally awkward) to re-write every recursive algorithm in an iterative fashion, where you no longer have to worry about excessive memory use (recursion isn't nice on the cache) or blowing the stack.
Recursion is an excellent method to consider when creating an algorithm, but once you’ve got it working, it is often worthwhile to convert the algorithm to an iterative one.
See this article on an iterative implementation of binary search (http://www.tbray.org/ongoing/When/200x/2003/03/22/Binary) or have a look at qsort in your C compiler. Both of these were originally designed as recursive algorithms, but were then implemented in an iterative fashion, with an improvement in performance, noticeably so in the case of qsort. A sketch of the iterative binary search appears below the aside.
As an aside, for anyone who's confused: an "iterative" algorithm means one using constructs like "for" or "while" loops, and this is separate from the terms used to describe imperative or functional languages.
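A hedged sketch of the iterative form (mine, not Tim Bray's code):

#include <stddef.h>

/* Iterative binary search over sorted a[0..n-1]; returns the index
   of key, or -1 if absent. The recursion on half-ranges becomes a
   loop that narrows the half-open window [lo, hi). */
ptrdiff_t binary_search(const int *a, size_t n, int key)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;  /* avoids (lo + hi) overflow */
        if (a[mid] < key)
            lo = mid + 1;
        else if (a[mid] > key)
            hi = mid;
        else
            return (ptrdiff_t)mid;
    }
    return -1;
}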
I agree with Jon. Memory is usually not a problem anymore, except for well-identified conditions like mobile systems and some other legacy systems.
In general computing, memory is very cheap, so that deep-seated worry is misplaced, and performance usually depends on a much broader range of things than millisecond-fast code. I find using recursive functions very handy and I think coders should at least consider them as an option. Recursion produces very compact code which is easy to understand, and it usually is (at least in my experience) less prone to bugs than iteration, though it is generally harder to set up.
I'm not a recursion evangelist. I just find that it solves a certain set of problems very well, the same way I think that functional programming is powerful but OO programming works better for me, because objects map onto the way ordinary people think.
It is interesting to point out that compiler optimizations can make recursive qsort as fast as iterative qsort. http://big-oh.cs.hamilton.edu/~bailey/pubs/techreps/TR-2001-2.pdf
Quicksort might not be that great an example at all.
It is probably true that the iterative version will run a little faster than the recursive version on unsorted data, but it is in no way as easy to understand, looking at the Sedgewick book.
The real problem with quicksort is that if the data is already mostly sorted, the execution time goes up far faster than any small savings from removing the recursion: O(N lg N) can degenerate to O(N·N) time.
I just instrumented the quicksort code to count the number of times the scan-loop tests, swaps, and recursions occur dynamically, and I was testing the biggest possible sorts for a CPU-to-CPU benchmark. It broke at about 64K inputs (or thereabouts) if the data was already sorted, since the quicksort algorithm then went into its bad behaviour of recursing all the time, and that did overflow the stack, probably a few hundred or a few thousand calls deep. Now if the code had been iterative, the problem would have just moved to the manual stack; after all, it will still degenerate, only it's easier to upsize that hand-rolled stack.
The simple solution is to guarantee the inputs are not sorted, perhaps by randomising them in the first place; now the sort is back to O(N lg N) time, but that sounds a bit silly.
Still, recursion and looping both have their place, and it's usually obvious when each is right. Now how about asking for decent concurrency support for CSP in the CPU hardware? That would let you explore the parallel approach instead.
Err, what’s syntax got to do with code generation? C may be ugly and horrible to parse, but that doesn’t stop a compiler from generating decent machine code.
I agree with Jon. Memory is usually not a problem anymore, except for well-identified conditions like mobile systems and some other legacy systems.
True, memory size isn’t as much of a problem as it used to be, but memory speed is. The more memory you use the less of it fits into fast caches.
Iteration is hard to /prove/ right. Which is why most imperative programmers don’t bother with an axiomatic approach or assertions. That is to say, they don’t bother to prove correctness /at all/.
I don’t see why proving the correctness of an iteration is supposed to be any harder than the equivalent tail-recursive function.
In any case, the really hard part is writing a complete and correct formal specification. In effect, you’re moving the testing and debugging from the implementation to the specification.
Which is why functional programmers don’t bother much with correctness proofs either.
Here is a Python example to compare with the VB recursive factorial function:
>>> def fact(n):
...     if n > 1: return n * fact(n-1)
...     return 1
>>> fact(6)
720
(sanity check, I know this one from memory.)
>>> fact(170)
72574156153079989673967282111292631147169916812964513765435
77798900561843401706157852350749242617459511490991237838520
77666602256544275302532890077320751090240043028005829560396
66125996582571043985582942575689663134396122625710949468067
11205568880457193340212661452800000000000000000000000000000
000000000000L
>>> fact(171)
12410180702176678234248405241031039926166055775016931853889
51803611996075221691752992751978120487585576464959501670387
05280988985869071076733124203221848436431047357788996854827
82907545415619648521534683180442932395981736968996572359039
47616152278558180061176365108428800000000000000000000000000
000000000000000L
>>> import math
>>> math.log(fact(171), 10)
309.09377810522045
We are WAY past a reasonable size for long integers; this is 309 digits (343 bytes are needed). Unlike Python, VB doesn't automatically roll over to arbitrary-precision arithmetic, so it can't handle this sort of number.
You can't write a proper qsort, then.
Pick a decent pivot to avoid O(N^2) worst-case time, like a truly random one (the Mersenne Twister is fast and uniform enough) or the median of three.
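A hedged sketch of the median-of-three choice (mine), written to feed a partition step that takes its pivot from a[hi], like the quicksort sketch earlier in the thread:

#include <stddef.h>

static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Order a[lo], a[mid], a[hi], then move the median to a[hi], where
   a Lomuto-style partition expects to find its pivot. */
static void median_of_three(int *a, size_t lo, size_t hi)
{
    size_t mid = lo + (hi - lo) / 2;
    if (a[mid] < a[lo]) swap(&a[mid], &a[lo]);
    if (a[hi]  < a[lo]) swap(&a[hi],  &a[lo]);
    if (a[hi]  < a[mid]) swap(&a[hi], &a[mid]);
    /* now a[lo] <= a[mid] <= a[hi]; put the median in the pivot slot */
    swap(&a[mid], &a[hi]);
}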
Recursion is wonderful imo. Just not always useful in practice.
Can't see the point, sorry… any arbitrary-precision library in any language will do that. Python has it included by default, nice.
But my real point is about log(): log() is obviously irrational for all prime bases, so it's waaaay a bad idea to use it to show how good your (language's) arbitrary precision is ))
I'm an evil C++/Ruby programmer. When I see a recursive problem, many times I will write the recursive code first, because I think it's easier to code it clearly and correctly.
If I feel I need to, I will recode an iterative solution and leave the recursive one there as a correct baseline solution and for commenting.
C++ has even more function-call overhead than C, especially with exception handling and virtual class member functions. For recursive functions I've implemented a static function inside the implementation file only, passing in references to the minimum data needed to run the recursion (see the sketch below).
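A hedged C sketch of that gateway pattern (hypothetical names, mine): the public entry point does the setup once, and a file-local static helper carries only the data the recursion actually touches, keeping each frame small.

/* Invariant data, packed once by the gateway. */
struct ctx {
    const int *data;
    int target;
};

/* File-local recursive worker: takes only the cursor and the shared
   context. */
static int count_from(const struct ctx *c, int i, int n)
{
    if (i == n)
        return 0;
    return (c->data[i] == c->target) + count_from(c, i + 1, n);
}

/* Public gateway. */
int count_matches(const int *data, int n, int target)
{
    struct ctx c = { data, target };
    return count_from(&c, 0, n);
}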
I think it was a year ago that we worked on a minesweeper game for class; we were only responsible for a recursive function that calculates what to do when you click a cell.
Probably 45% of the people in the class blew the stack when their recursive function went into infinity. Of course everyone ran out going "why did this crash?! I know I didn't cross any memory bounds," and the professor responded that they're only allowed so many stack frames.
1.) Gateway functions + recursion functions can be a barrier to maintenance, especially if you are solving a simple problem with recursion. Recursion tends to break down complexities (quicksort is a great example, or maybe I'm thinking of merge sort, I forget); and when programmers encounter recursion used to do something simple and linear, they often go "wtf?"
2.) Like iteration, it has its useful places; so please, for the love of God, stop acting like you use it for everything; this is really bothering me. I'm almost tempted to call y'all zealots for recursion; but I'm hoping you're on the opposite side of an ill-placed fence, thinking I'm saying "recursion is useless." I'm not.
I can’t believe we’re arguing about this.
Any loop can be translated into a tail recursive function. This is just the Y operator from lambda calculus. In fact it is likely that a good compiler will do this internally so that it can optimize both forms the same way.
Tail calls are more efficient than non-tail calls. This is because you can skip the stack manipulation and just do a jump for many function calls. For this reason, doing tail calls is a common compiler optimization, regardless of the recursion aspects.
Functions without side effects are easier for the compiler to optimize. For instance, duplicate function calls can be eliminated (because you can always tell that they will return the same result). Consider the following code snippet:
(+ (factorial 710) (factorial 710))
If the compiler can tell that factorial doesn’t have any side effects then it can translate this code into something more efficient:
(let ((tmp (factorial 710))) (+ tmp tmp))
@nimble: C makes it harder to do good code generation than one would think. Pointers make life hard for the optimizer, because the semantics of pointer operations are extremely general. Basically, something as simple as *ptr = 0 can alter any value in memory. There is a reason people still write scientific code in FORTRAN, and there are even Scheme compilers that can beat C for certain numeric tasks.
As for proving correctness: it's much harder to prove the correctness of iteration. There is a huge amount of existing theory applicable to functional code, while there is very little theory applicable to imperative code. The reason is simple: having mutable values makes the math really hard.
On the point of provable correctness: I find it a bit irritating that programmers aren't required to prove their core algorithms correct. In my aerospace program, we had to take a really hairy class that taught us to use differential equations to prove that a control mechanism was well-behaved, e.g. that it couldn't get into feedback loops that caused outputs to go to infinity or something ridiculous like that. Obviously, this isn't something you can do for the airplane as a whole, or a program as a whole, but it is something that ought to be done for the really critical stuff.
Probably 45% of the people in the class blew the stack when their recursive function went into infinity.
If the assignment had been to write an imperative version, the same 45% of people would have been wondering why their algorithm didn’t terminate.
@ozn:
“But my real point is about log(): log() is obviously irrational for all prime bases, so it's waaaay a bad idea to use it to show how good your (language's) arbitrary precision is ))”
Right… you have no idea why he used log, right?
He used log to tell the number of DIGITS in that number… nothing fancy… and SURELY nothing about the math library.
You're completely right on this. But my primary concern was about memory size. It was not nice to think that your recursive function was good and elegant only as long as you had fewer than X nodes (and X was not that big…). However, you're completely right on that.
C makes it harder to do good code generation than one would think. Pointers make life hard for the optimizer, because the semantics of pointer operations are extremely general.
Agreed, but as you say, that's due to semantics, not syntax, which is what JJ was saying.
As for proving correctness: it's much harder to prove the correctness of iteration.
Like ‘no thanks’ was saying, iteration and tail-recursive functions can be straightforwardly translated into one another. What’s an assignment in the loop turns into an intermediate-result argument in the function.
Straightforward (but inefficient) non-tail recursion usually is easier to prove, but tail recursion really is just another way to write loops.
If you want to know why some (usually quite clever – see http://www.paulgraham.com) people are so fond of LISP-like languages then read this book:
Harold Abelson, Gerald Jay Sussman, Julie Sussman: Structure and Interpretation of Computer Programs (full text can be found on-line at http://mitpress.mit.edu/sicp/)
A whole recorded course based on the first edition of this book can be downloaded from here (in AVI format):
http://swiss.csail.mit.edu/classes/6.001/abelson-sussman-lectures/
Based on what I've read about LISP, it seems that LISP provides an experience of depth in programming which is missing when one uses imperative languages like C, C++ or Java. In other words: if the programmer is deep (can think deeply, has far-reaching intuitive capabilities and such "super" powers of the mind), then LISP-like languages are a much more flexible vehicle for self-expression than traditional imperative languages (C, C++, Java and the like).
The problem with that may be that future readers of the code will likely not be in the same league as the original artist, so even the sight of the oh-so-beautiful constructs may be overwhelming to them, poor simpletons. This is the single main reason why LISP-like languages are still not in widespread use. You have to be a genius to use them according to their purpose.
Until the capabilities of the human mind can be reliably expanded via meditation, psychedelic drugs or other well-understood mind-enhancing technologies, we had better stick to C++, Java and the simplest programming constructs we have, so that our contributions to the world of software don't fade fast into oblivion.
(That’s why Yahoo rewrote the first version of Yahoo Stores – coded by Paul Graham and associates in LISP – in C++. So that their programmers can understand it.)
Paul Graham: On LISP
http://www.paulgraham.com/paulgraham/onlisp.html
A note on my previous post: I suspect that the beauty of LISP is much easier to appreciate for those who have spent several years programming in imperative languages, came to recognize how they limit self-expression via their rigid constructs, and thought about the possibility of a language which did not have such constraints.
LISP has a certain "metaphysical" quality to it. If you have always had a feel for the "spiritual" aspects of software design (and computer science in general), then you might enjoy enormously the vistas opened up by the study of these "ancient" languages (as far as I know, LISP was the second high-level language, after FORTRAN).