Software Maintenance and Prototype Based Languages

Guest post by Trent Waddington 2004-03-29 General Development 20 Comments

Have you ever been using an Open Source application and noticed something horribly wrong? I have and as a skilled maintenance programmer it really tickles my fix-it bone. I know I could fix it if I wanted to but it’s just so much effort. Usually it’s only when a bug really annoys the hell out of me that I’ll even go to the trouble of downloading the source code (or even finding out where I can download the source code from). In the rare moments that I have taken on the feat of fixing someone else’s code I’ve found myself exercising my most mad maintenance programmer skills and I decided to make a little list.

1. Identify the problem.
2. Try to reproduce the problem.
3. Identify what component has the problem.
4. Locate the source code for the component.
5. Figure out how to build the source code you’ve found.
6. Ensure that the source code you’ve found compiles into a component that has the problem you have identified.
7. Locate where in the source code the problem lies.
8. Read and understand how the source code produces the problem.
9. Build a plan for how to change the source code so it doesn’t have the problem anymore.
10. Make sure you’ve made a backup of the source code somewhere.
11. Make the necessary changes ensuring you don’t break anything else.
12. Re-compile the component.
13. Try to reproduce the problem with the new component. If you can, go back to step 8.
14. Make sure you fixed the problem in the best way possible.
15. Try to tell someone what you fixed and give them your changes.

Did I mention how much effort this is? What are we going to do about this? The usual solution is simply not to fix other people’s code. We can report the bug as best we can, wait around for it to be fixed and then go and grab the next release when it is available. Usually this is a lot less effort but it sort of defeats the purpose of having the source and it really doesn’t make good use of my mad maintenance programmer skills (but I suppose it does make use of my mad QA skills).

When people fix their own code they don’t need to go through steps 3 through 8, or at least not to the same degree as someone who has never looked at their code before. In particular, it’s steps 7 and 8 that take the most time (and really flex the muscles of a veteran maintenance programmer). In good object oriented code step 8 is usually shorter than step 7, simply because such code has short methods in concise objects that relate almost directly to the domain objects (which are the terms people think in when they identify problems in software). As such, it’s a heck of a lot faster to fix bugs in good object oriented software than it is to fix them in procedural code.

However, that step 7 is still a killer. It takes so long to search through megs of source code to find the particular object that has the problem. Trying to guess the vocabulary of the programmer and look for similar files (or using grep) is the most common practice. That’s pretty sad. Ultimately we’d like to be able to get a visual overview of the objects in the system and easily map these objects to the kinds of domain objects we might be using to define the problem. What would also be good is if we could narrow down that visual overview to just what is relevant to the current state of the running program. Then one could run the offending software, stop when the problem is manifesting itself and look at what objects are active at that point in the execution. That way we can quickly narrow down what objects are involved and get on to step 8, actually figuring out what is causing the problem, without having to learn how the whole beast works.
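As a rough illustration of "look at what objects are active at that point in the execution", here is a minimal Python sketch. It is an assumption-laden stand-in, not the visual tool described above: the helper `active_objects` and the `Document`, `Toolbar` and `Button` classes are hypothetical names invented for this example, and the helper only sees objects whose methods are currently on the call stack.

```python
import inspect


def active_objects():
    """Return (class name, method name) for every frame on the current call
    stack whose receiver is named `self`, outermost call first."""
    found = []
    for frame_info in inspect.stack()[1:]:  # skip this helper's own frame
        receiver = frame_info.frame.f_locals.get("self")
        if receiver is not None:
            found.append((type(receiver).__name__, frame_info.function))
    return list(reversed(found))


# Hypothetical application objects, only here to give the stack some depth.
class Button:
    def __init__(self, name):
        self.name = name

    def click(self):
        # Pretend the bug shows up here: snapshot who is involved right now.
        return active_objects()


class Toolbar:
    def press(self, name):
        return Button(name).click()


class Document:
    def __init__(self):
        self.toolbar = Toolbar()

    def save(self):
        return self.toolbar.press("save")


snapshot = Document().save()
print(snapshot[-3:])
```

The last three entries of the snapshot name the `Document`, `Toolbar` and `Button` objects involved in the call, which is the kind of narrowing-down that lets you skip straight to step 8.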

Ironically, steps 12 and 13 take the most time in maintenance programming. This is the dreaded edit-compile-test cycle and it’s not getting any shorter. If you use a language like Java over a language like C++ you have some appreciation of how much faster compile times are. On the other hand, startup times are ridiculously long for Java apps compared to native C++ apps (not to start any flame wars).

But ultimately, they’re both too damn slow for maintenance work. Almost all problems which involve UI issues will take hours to debug using these two. Why? Because you’re spending 90% of your time edit+compiling and 10% testing. By the time you’ve fixed the first part of the problem you’ve forgotten about the other 3 parts and you’re ready to say “maybe it wasn’t so bad after all”. Resource editors and UI builders cut down on this a little, but they’re only a half-way solution. It’s so hard to see what the problem really is if you can’t click on a button and get an instant, live, response from the application, just as the user is going to see it.

Prototype based object oriented languages like Self and Io address both problems: they shorten the edit-compile-test cycle and make the software easier to understand (steps 12, 13 and 8). When you’re using a prototype based language you can edit objects as they’re running. There is no separate compilation step, and your program is always running. The software is easier to understand because you’re not looking at it from above, you’re in there with the objects telling them how they should behave. In a class based object oriented language like Java or C++ you’re telling classes how they should tell objects to behave. If, at run time, you decide that an object should have had different behavior or should have been of a different class, too bad. Even if you can edit the class at run time (as some experienced Java programmers do) you’re going to have to kill the existing objects of that class and make new ones. This means the time spent planning your solution (step 9) had better be spent well or you’re going to spend more time in edit-compile-test than you did understanding the software. With a prototype based language you can explore possible solutions at run time until you find one that does everything you want. You’re also more likely to find one that satisfies step 14, fixing the problem in the best way possible.
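To make "telling the objects how they should behave" a little more concrete, here is a minimal sketch that emulates prototype-style delegation in Python. Python is itself class based, so this is only an approximation under stated assumptions: the `Proto` class, its `parent` attribute, `set_slot`, and the button objects are all hypothetical names invented for illustration, and nothing here reflects how Self or Io are actually implemented.

```python
import types


class Proto:
    """A tiny prototype-style object: slots live on the object itself, and
    lookups that miss are delegated to an optional parent object."""

    def __init__(self, parent=None, **slots):
        self.parent = parent
        for name, value in slots.items():
            self.set_slot(name, value)

    def set_slot(self, name, value):
        # Plain functions become methods bound to this particular object.
        if isinstance(value, types.FunctionType):
            value = types.MethodType(value, self)
        setattr(self, name, value)

    def __getattr__(self, name):
        # Called only when normal lookup fails: delegate to the parent.
        parent = self.__dict__.get("parent")
        if parent is None:
            raise AttributeError(name)
        value = getattr(parent, name)
        # Rebind inherited methods so the receiver is this object, not the parent.
        if isinstance(value, types.MethodType):
            value = types.MethodType(value.__func__, self)
        return value


def describe(self):
    return f"[{self.label}]"


def shout(self):
    return f"[{self.label.upper()}!]"


button_traits = Proto(label="OK", describe=describe)
ok_button = Proto(parent=button_traits)
cancel_button = Proto(parent=button_traits, label="Cancel")

before = cancel_button.describe()
# Change the behavior of one live object; no class is edited, nothing restarts.
cancel_button.set_slot("describe", shout)
after = cancel_button.describe()
untouched = ok_button.describe()
print(before, after, untouched)
```

Only `cancel_button` changes; `ok_button` keeps delegating to the shared parent. In a class based design the equivalent change would normally mean editing the class and recreating its instances.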

Of course, when you get to the stage where you can effortlessly fix a problem on a running program you may just forget to tell anyone that you did. Worse yet, you might forget to update the source code for the application and the next time you run it you’ll have to put up with that problem you’ve already fixed or fix it again. Thankfully prototype based languages recognized this problem long ago and provide a solution in the form of persistence. When an object is modified in the exploratory development environment of a prototype based language it is almost always copied to a backing store where it is retrievable at a later date. I say almost because to a certain degree programmers in prototype based languages still cling on to the class/object distinction. In Self the “classes” typically have names that end in the word “traits” (and the objects typically don’t have any names at all!). So if you’re tacking new behavior onto the running object you’re going to lose it once the garbage collector comes chomping along. Typically this is bad form anyway. If you’re worried about messing up the traits object then make a clone of it and redirect the object’s parent* slot to the clone (yes, you can change an object’s “class” at run time in prototype based languages). Remember, if you do this in Io you’ll have to copy the slots you want to change over to the clone yourself. This is essentially the prototype way of doing step 10: you back up the objects you need to, when you need to.
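The clone-and-redirect move can be sketched in a few lines of Python, using plain dictionaries as stand-ins for objects. This is an illustration under assumptions, not Self or Io code: the `"parent"` key plays the role of the parent* slot, and `lookup`, `send`, `point_traits` and the point objects are hypothetical names made up for the example.

```python
def lookup(obj, name):
    """Find a slot on obj, following the chain of 'parent' links."""
    while obj is not None:
        if name in obj:
            return obj[name]
        obj = obj.get("parent")
    raise KeyError(name)


def send(obj, name, *args):
    """Look up a slot and call it with obj as the receiver."""
    return lookup(obj, name)(obj, *args)


def describe(self):
    return f"({lookup(self, 'x')}, {lookup(self, 'y')})"


def describe_verbose(self):
    return f"<x={lookup(self, 'x')} y={lookup(self, 'y')}>"


# Shared behavior, playing the role of a "traits" object.
point_traits = {"parent": None, "describe": describe}
p1 = {"parent": point_traits, "x": 1, "y": 2}
p2 = {"parent": point_traits, "x": 3, "y": 4}

# Back up by cloning: copy the traits slots, then point only p1 at the copy.
experimental_traits = dict(point_traits)
p1["parent"] = experimental_traits

# Experiment on the clone; the original traits object and p2 are unaffected.
experimental_traits["describe"] = describe_verbose

r1 = send(p1, "describe")
r2 = send(p2, "describe")
print(r1, r2)
```

Because `p1` now delegates to the clone, the experiment cannot damage the shared traits object, and undoing it is a matter of pointing `p1["parent"]` back at `point_traits`.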

As prototype based programming languages become fashionable again we may yet see some very interesting integration of revision control systems with exploratory development environments. Prototype based programming is about rapid persistent development and revision control is still stuck in the world of files and source code merging. The ability to track changes in a development environment changes the kinds of things that can be done automatically with revision control, like maintaining history when moving methods from one object to another and other big refactorings. Try doing this with Java files in the best revision control systems around (like Perforce and SVN) and then try merging from another branch.

When it comes to prototype based programming, the best is yet to come.

About the Author
Trent Waddington is a professional software developer and maintenance programmer associated with the Centre for Software Maintenance at the University of Queensland, Australia. His interest in prototype based programming languages began in 1997 when he reviewed the work of researchers at Sun Labs working on the Self project. He has written compiler backends for the Java virtual machine and is an active contributor to open source projects including the Boomerang open source decompiler.


20 Comments

2004-03-29 8:52 pm
Anonymous
“As prototype based programming languages become fashionable again we may yet see some very interesting integration of revision control systems with exploratory development environments. ”
Not just that, also how well it will be able to fit into the rest of the software development process. Software development is still wrestling with the issues between “I have an idea” and “here, code this up”.
2004-03-29 9:10 pm
Anonymous
The benefits of fixing the program while it’s running are not limited to prototype-based languages. Class-based languages like Lisp and Smalltalk have been doing these things for years. What’s really needed for this sort of thing is not a prototype-based language, but a language runtime that is tightly integrated with both the source code and the compiler. That way, when the source code is edited, the compiler is invoked to generate new machine code, and the runtime makes sure that the appropriate references are updated.
2004-03-29 9:20 pm
Anonymous
4,5,6,10 and 12 can be solved on debian thusly:
apt-get source <packagename>
dpkg-buildpackage
dpkg -i <.deb file>
I’m sure you can use srpms to do the same thing.
-Mark
2004-03-29 9:52 pm
Anonymous
Not if you have Woody in your source list. You’ll be waiting a looong time
2004-03-29 10:38 pm
Anonymous
As much as I hate those “our goober solves that problem”-type posts, our goober — DTrace — does solve this particular problem, or at least Step 7 of it. DTrace can quickly point you to the errant problem in a running app (or collection of apps) without recompiling, relinking or restarting it — usually so quickly and concisely that fixing it becomes the more difficult part. We have used DTrace with great success to find problems in gigantic software systems. More details on DTrace are here: http://www.sun.com/bigadmin/content/dtrace
Alternatively, you can look at the OSNews story on DTrace: http://www.osnews.com/story.php?news_id=5160
– Bryan
Bryan Cantrill, Solaris Kernel Development. (650) 786-3652
2004-03-29 11:17 pm
Anonymous
Umm… Lisp Machine anyone?
2004-03-29 11:55 pm
Anonymous
I would have to disagree here. What you propose would be handy, but it still wouldn’t fix the heart of the problem – The old fashioned try, test, debug, rewrite loop wouldn’t be broken. The importance of prototype based languages over what you suggest would be the Dijkstra inspired top-down, post-condition, to specification, & {via recursive step-wise refinement} to eventual code generation. Even this though wouldn’t suffice. Instead, it would lay the ground work for a tight coupling between prototyping & formal methods. What we need is a language in which we can write a specification & via stepwise refinement simultaneously generate the code & a proof of correctness. As long as we are stuck in the test & debug loop we can never guarantee that our code will always work. We need a mathematical approach that can prove the code is correct. Formal methods formal methods formal methods!
2004-03-30 12:13 am
Anonymous
I really don’t see how prototype-based languages address your concerns about top-down design. My comment was in response to the fact that the author attributed runtime modification of code to prototype-based programming languages, while in reality it is attributable to any of the languages that have more highly-evolved runtimes.
As for your comments about formal methods, I (along with many others) am not a big believer in them. Alan Kay once said:
“Until real software engineering is developed, the next best practice is to develop with a dynamic system that has extreme late binding in all aspects.”
I don’t believe we have gotten to a point where we can reliably and easily treat complex software with a mathematical level of rigour. The tools we do have today often sacrifice expressiveness for precision, which I consider an unacceptable trade-off. In the end, it comes down to the fact that the science of software development (as opposed to computational theory) is underdeveloped. Engineers in other fields have all sorts of formal methods for treating large, complex systems with a mathematical level of precision, and good tools to automate that process, but software developers do not. Until such tools are invented, I agree with Alan Kay that the next best thing is an extremely dynamic environment that allows for iterative, incremental development.
2004-03-30 1:48 am
Anonymous
but a language runtime that is tightly integrated with both the source code and the compiler
Maybe some ultra-JIT system could do that, but don’t we really want some sort of uber-repository that can do a lot more than just slow down our execution?
Maybe IBM’s integration of Eclipse with UML will be a step in the right direction…
2004-03-30 1:57 am
Anonymous
ReactoGraph is an example of another visual prototype-based language that also uses message-passing and visual data flow programming…
http://www.cs.dal.ca/~gauvins
Enjoy!
2004-03-30 2:17 am
Anonymous
This is why open source will never work. No one wants to spend time to make a serious, high performance program for free.
2004-03-30 2:34 am
Anonymous
Eh? You don’t need a JIT. Most such systems have native-code compilers. You do pay a 10-20MB memory overhead, for having the compiler integrated into the runtime, but a good implementation will stuff that into a shared library so you only have to pay it once.
To get a taste of what such development environments can do, there are a number of products you can check out. There is:
IBM’s Visual Age Smalltalk: http://www-306.ibm.com/software/awdtools/smalltalk/
Dolphin Smalltalk: http://www.object-arts.com/
Cincom’s VisualWorks: http://smalltalk.cincom.com/index.ssp
All of these have free demos that you can play with.
Lately, I’ve been playing with Functional Developer, an interactive IDE for Dylan. They’ve got a free basic edition IDE for Windows.
http://www.functionalobjects.com
Check out their debugging manual available:
http://www.functionalobjects.com/products/doc/env/env_100.htm
It does some really nifty stuff. You can do things like catch an exception, edit the offending methods or classes, and restart from where the exception was thrown. You can also run more than one program at the same time under the debugger, to make it easier to debug synchronization issues between client/server programs.
2004-03-30 4:20 am
Anonymous
The window of innovation only opens every few years.
Java crept in just as it was clear C++ was too complex. The window only stayed open for a year or so. Python was too late. Try again in five years.
Linux waltzed through the door to innovation just as the old unixes died and Windows looked around for any competition. SkyOS was too late. Maybe in a decade the door for a new OS will open again.
I’m not trying to be a troll – I’m serious in stating that you cannot hope to re-seed the marketplace with a new tool or method when that market has already picked the next model and has not found fault with it (yet).
The window for a new methodology of programming languages is not open. The Java/C# model, as repugnant as it might be, will need to run its course for another few years before people seriously start looking for alternatives. Frankly trying to push an alternative at this point is a waste of time, as is trying to push a linux alternative.
2004-03-30 8:14 am
Anonymous
Rayiner Hashem quoted Alan Kay:
“Until real software engineering is developed, the next best practice is to develop with a dynamic system that has extreme late binding in all aspects.”
I find this strange. In my experience, the earlier the binding happens, the fewer chances for errors.
Furthermore, you can get a long way towards more maintainability and fewer bugs by adopting a good coding style. I’m not talking about rules about where to put your braces and spaces, but start by coding what you want done, and defer coding how to as late as possible. Generic programming helps here, and you can start working in this way today with C++.
2004-03-30 8:53 am
Anonymous
I’ve explored a couple of packages using netbeans for java and the debugging tools address some of the problems, notably understanding what code produces what behaviour….
The packages had visibly been coded using netbeans so that did help..
2004-03-30 11:26 am
Anonymous
When you’re using a prototype based language you can edit objects as they’re running.
mh.. you mean something like:
if var
  def object.method
    # ...
  end
else
  def obj2.meth()
    # ...
  end
end
or you mean Object.define_method(code)? Or similar mechanism in ruby, smalltalk, python, CLOS..
There is no separate compilation step,
Oh so you mean an interpreted language..
and your program is always running.
..or an *image based* language. Smalltalk has had this for a looong time; it is not related to prototypes in any way.
As prototype based programming languages become fashionable again we may yet see some very interesting integration of revision control systems with exploratory development environments.
You never took a look at Squeak, right? Never heard of Monticello? Guess what? Squeak has open/always running/editable objects. And it has a versioning system built on its architecture. Like any other Smalltalk system.
I can’t understand how prototypes have a role in this article.
2004-03-30 3:59 pm
Anonymous
“””but start by coding what you want done, and defer coding how to as late as possible. Generic programming helps here, and you can start working in this way today with C++. “””
I’ve found that only works when you can design the whole system top to bottom and you have relatively static requirements. In most cases, an overall design, a data structure design, and a bottom up design seem to give a better success rate (anecdotally). As for generic programming in C++, yuck, ML derivatives offer a much better alternative.
2004-03-30 4:34 pm
Anonymous
As a hobbyist hacker, I shudder to even consider looking at C/C++ or even Java source code of a large project to find a particular bug. What would significantly help me overcome this hurdle would be if there were some good, documented UML for the software in question (in particular, class diagrams and maybe some sequence diagrams, and use case diagrams).
People are good at ‘talking’ the benefits of modelling, best practises development processes and design patterns. It’s time people start ‘walking’ what they’re ‘talking’. Let’s see some UML for Apache, Mozilla Firebird, MySQL, and even parts of the Linux kernel, Xfree86, bash, and such, with an emphasis on those areas most likely to be contributed to and/or bug-fixed by the community.
2004-03-30 4:56 pm
Anonymous
There is a saying in the Lisp community — languages don’t just determine how you express solutions to your problems, but how you think about your problems. The whole idea of late-bound languages, as Alan Kay refers to, is that they make it easy to think about your problem with code.
Solving “hard” problems in static languages is usually a chore. You have to deal with all sorts of requirements that have nothing to do with solving the problem. You have to specify types, for example, even though they just get in the way of refactoring your code. Thus, you have to think independently of the language, and once you think you’ve got a solution, implement it in the language.
However, many problems are much easier to think about when you have the computer to help. You can ‘think’ by prototyping in a late-bound language. You can toy with your ideas immediately, and quickly build working prototypes. Because of the ease with which dynamic code is refactored, that prototype can be easily turned into production code, so that effort is not wasted, and no bugs are introduced in rewriting the prototype.
It’s a matter of top-down vs bottom-up development. In the former model, the computer doesn’t help you solve your problem, it just offers a way to express the solution. In the latter model, the computer is an active partner in the problem-solving process. In the former model, you often have to write a significant amount of code to the spec before you can get any of it to work. In the latter, you start with a tiny core program, and gradually refactor and build it up, keeping it working the whole time. It’s the whole idea of iterative development, which is taking off in extreme programming circles these days. There is a nice article about the history of ID here:
http://www2.umassd.edu/SWPI/xp/articles/r6047.pdf
2004-03-30 11:20 pm
Anonymous
I’ve seen a few comments thus far that have basically amounted to “Smalltalk can do that!!” Clearly Smalltalk (and LISP) is the grandfather of runtime editable objects, but in the same breath that I mentioned runtime editability I also mentioned program understanding. In particular:
The software is easier to understand because you’re not looking at it from above, you’re in there with the objects telling them how they should behave. In a class based object oriented language like Java or C++ you’re telling classes how they should tell objects to behave.
That applies to Smalltalk also. The level of indirection from programming a class and then instantiating that class is simply too great for runtime editing.