posted by Yamin on Wed 9th Sep 2009 16:17 UTC

The Problem with Design and Implementation, 2/3

Source code is a valid specification on its own.

Yet we seem to cling to this notion of the specification that 'just needs to be implemented'. Let us expand this little example a bit. Pardon the extreme simplification here. Far more likely than the full specification you see above is something like the following, or even no specification at all:

Program Inputs:   A,  B
Program Outputs: A/B

This leaves the 'code-monkey' needing to fill in the blanks, as the specification is not complete. The programmer might choose to return -1 instead of throwing an exception if B is 0. He might not even bother to do error checking and just rely on the language itself to handle the issue. He might choose to use long, double, or float data types instead of integer. No one knows. And this is for a very simple function. Imagine the blanks the programmer has to fill in for anything more complex.

Never is this more obvious than in an application that needs a graphical user interface (GUI), where the impact is immediately seen by users. Don't blame the programmer here. I highly doubt the GUI was specified completely before coding started. Then, when things go wrong, you have people wondering what happened. Why couldn't that programmer just implement what we wrote in the spec? The programmers say the spec was incomplete. Again, the only complete specification is the source code itself. Everything else is just an incomplete specification. No one is really to blame except the broken process itself.
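To make the point concrete, here is a minimal Python sketch of three functions that all arguably satisfy the two-line specification above. The function names are hypothetical, and the use of integer division in the first two is an assumption, since the specification does not say which number type is intended.

```python
def divide_raising(a: int, b: int) -> int:
    """Integer division that rejects a zero divisor with an exception."""
    if b == 0:
        raise ValueError("b must not be zero")
    return a // b


def divide_sentinel(a: int, b: int) -> int:
    """Integer division that returns -1 as an error code when b is zero."""
    if b == 0:
        return -1
    return a // b


def divide_unchecked(a: float, b: float) -> float:
    """Floating-point division with no check of its own.

    If b is zero, Python itself raises ZeroDivisionError.
    """
    return a / b
```

All three take A and B and return A/B, yet callers see different behavior: `divide_raising(7, 2)` gives 3 while `divide_unchecked(7, 2)` gives 3.5, and `divide_sentinel` returns -1 both for `(7, 0)` and for the perfectly valid input `(-1, 1)`. None of these choices can be read off the specification.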


The Academic Problem

This is not just a problem in industry with the stereotyped MBAs not understanding software. Often, you hear from engineers that universities should not be teaching software in a specific language, but should instead be teaching abstract notions of algorithms and data structures. I agree, but how do you propose students express algorithms or data structures? This is exactly why programming languages were invented: to allow us to express algorithms and data structures in a human-readable format! You have to learn the language to express your ideas. How do you test your algorithms and data structures? By writing them in a specific language and running them. The power of a good programming language is essential to your learning. You can set breakpoints, inspect variables, make changes and see the results immediately.

Indeed, by insisting that specific programming languages are not 'valid' ways to specify something, academia only reinforces this notion of designing something first and then implementing it. While this might hold in some abstract sense within academia, it has disastrous effects when carried over into the rest of the world.

Let's have a look at a simple algorithm one might encounter in academia. Here is Wikipedia's description of the Euclidean algorithm for finding the greatest common divisor (GCD).

The Euclidean algorithm is iterative, meaning that the answer is found in a series of steps; the output of each step is used as an input for the next step. Let k be an integer that counts the steps of the algorithm, starting with zero. Thus, the initial step corresponds to k = 0, the next step corresponds to k = 1, and so on.

Each step begins with two nonnegative remainders $r_{k-1}$ and $r_{k-2}$. Since the algorithm ensures that the remainders decrease steadily with every step, $r_{k-1}$ is less than its predecessor $r_{k-2}$. The goal of the kth step is to find a quotient $q_k$ and remainder $r_k$ such that the equation

$r_{k-2} = q_k r_{k-1} + r_k$

is satisfied, where $r_k < r_{k-1}$. In other words, multiples of the smaller number $r_{k-1}$ are subtracted from the larger number $r_{k-2}$ until the remainder is smaller than $r_{k-1}$.

In the initial step (k = 0), the remainders $r_{-2}$ and $r_{-1}$ equal a and b, the numbers for which the GCD is sought. In the next step (k = 1), the remainders equal b and the remainder $r_0$ of the initial step, and so on. Thus, the algorithm can be written as a sequence of equations

$a = q_0 b + r_0$
$b = q_1 r_0 + r_1$
$r_0 = q_2 r_1 + r_2$
$r_1 = q_3 r_2 + r_3$
...

If a is smaller than b, the first step of the algorithm swaps the numbers. For example, if a < b, the initial quotient $q_0$ equals zero, and the remainder $r_0$ is a. Thus, $r_k$ is smaller than its predecessor $r_{k-1}$ for all k ≥ 0.

Since the remainders decrease with every step but can never be negative, a remainder $r_N$ must eventually equal zero, at which point the algorithm stops. The final nonzero remainder $r_{N-1}$ is the greatest common divisor of a and b. The number N cannot be infinite because there are only a finite number of nonnegative integers between the initial remainder $r_0$ and zero.

Brings back memories of university, doesn't it? Now, you read that and yes, it can be implemented in any programming language. Here is one of the implementations given in the same Wikipedia article.

function gcd(a, b)
    if a = 0
        return b
    while b ≠ 0
        if a > b
            a := a − b
        else
            b := b − a
    return a

Granted, it is in pseudo-code, but it wouldn't take much effort to convert it to C# or any other language. If I were writing a program, what would be the point of spending several paragraphs describing the algorithm in a mix of mathematical notation and English when I could just express it directly as source code? Can you imagine handing off that written mathematical specification to a 'code monkey' to just implement? He'd have more trouble understanding what you wrote than if you had just written the code yourself.
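As a rough illustration of how small that conversion is, here is a sketch in Python rather than C#. The first function is a direct translation of the pseudo-code above; the second follows the quotient-and-remainder form that the quoted prose describes. The function names and the check for negative input are my own additions, and both functions assume nonnegative integers.

```python
def gcd_subtraction(a: int, b: int) -> int:
    """Direct translation of the pseudo-code: repeated subtraction."""
    if a < 0 or b < 0:
        raise ValueError("a and b must be nonnegative")  # assumption, not in the pseudo-code
    if a == 0:
        return b
    while b != 0:
        if a > b:
            a = a - b
        else:
            b = b - a
    return a


def gcd_remainder(a: int, b: int) -> int:
    """Remainder form from the prose: r_{k-2} = q_k * r_{k-1} + r_k."""
    if a < 0 or b < 0:
        raise ValueError("a and b must be nonnegative")  # assumption, not in the prose
    while b != 0:
        # The pair (a, b) plays the role of (r_{k-2}, r_{k-1});
        # a % b is the next remainder r_k.
        a, b = b, a % b
    return a
```

For example, both `gcd_subtraction(48, 18)` and `gcd_remainder(48, 18)` return 6. Note that the pseudo-code and the prose are two different variants of the algorithm, which only becomes obvious once both are written down as code.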

The same problem shows up in every other realm, from network protocol specifications to HTML standards. Every specification that is complete is better off written directly as source code. I can almost guarantee you that it will be more understandable as well.
