In recent years, “scripting languages” have become the default route to rapid application development. The open source community has produced many scripting language implementations; Perl and Python are among the most popular and capable.
I recently had a good discussion about this with one of my friends, Mary, over chat. She was trying to learn about type systems in general and about how languages are classified by the type system they use. The following is a summary of what we discussed and what we concluded.
The major reason scripting languages are so widespread these days is that they are easy to develop in. Most scripting languages don’t require type declarations for variables, which relieves the developer from supplying a type every time a variable is declared. Scripting languages often provide a rich set of built-in data structures that can be manipulated with very concise syntax. They also tend to come with a large library repository, which cuts down development time.
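As a small illustration of the point about built-in data structures and concise syntax, here is a minimal Python sketch. The sample text and the variable names are made up for the example; nothing here is specific to any one scripting language.

```python
# Count word frequencies using only built-in data structures.
text = "the quick brown fox jumps over the lazy dog the end"

counts = {}                      # a dictionary; no type declaration is needed
for word in text.split():
    counts[word] = counts.get(word, 0) + 1

# Words that occur more than once, collected with a one-line comprehension.
repeated = sorted(w for w, n in counts.items() if n > 1)
print(counts["the"], repeated)
```

The same task in a language without built-in dictionaries and list syntax would typically need an explicit container type, declarations, and more bookkeeping.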
These are the “language inherent” features of a scripting language. Scripting languages also share another notable trait: the way they are implemented. Almost all scripting languages are interpreted. In popular usage, the term “scripting” is often attached to a language because its implementation is interpreted rather than because it has language features like the ones listed above. That usage is misleading, however, if we take “ease of programming” to be the compelling feature of scripting languages.
Personally, I give a lot of importance to efficiency, so I am not happy to see slow implementations of good scripting languages. However, I also understand that programmer throughput is now so important that people are willing to sacrifice efficiency for gains in programmer productivity. This brings us to an important question: does programmer productivity imply an interpreted, or slow, implementation of the language? Programmer productivity depends on a number of things:
– The ease of use of the language (availability of high level data structures and operations, relaxation of type declarations).
– A small learning curve for the language itself.
– A small learning curve for the libraries. As a (though not very avid) developer, I have found that most of the learning curve I face comes from learning new APIs for different languages. Getting used to the libraries of a language is often more demanding than learning the language itself.
– An easy way to integrate native (harder but more efficient) languages with the scripting language. The bridge for this is usually the scripting language’s API in a low level language like C.
Note that none of the above features actually requires the language implementation to be interpreted. Scripting languages often place high demands on meta-programming and introspection. However, even this only calls for a powerful language runtime and doesn’t necessarily imply an interpreted implementation. In fact, it is precisely because scripting languages and native languages differ so much that the programmer faces the last three problems.
Mainstream programming languages have also come a long way since the days of assembler and C. Languages now sport constructs for object oriented programming, and sometimes even syntactic sugar for commonly used primitives (C#?). Languages like OCaml feature a very rich type inference mechanism which largely does away with the need to write types at all. This resembles scripting languages with undeclared variable types, except that in scripting languages the types are checked at runtime whereas in OCaml they are checked statically, which helps make the final compiled program faster. Advances like these show that it is possible to keep an efficient implementation even if the language has scripting-language-like features. Languages like Java have a really powerful runtime which enables programs to introspect existing data structures and, what’s more, even create custom types while the program is running. Language features like these go a long way towards supporting better software engineering practices and increasing productivity.
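To make the runtime-versus-static distinction concrete, here is a minimal Python sketch. The function `total_length` and the type `Point` are hypothetical names invented for the example. The first half shows a type error that only surfaces when the offending element is actually reached; a statically checked language with inference, such as OCaml, would reject a list that mixes strings and integers before the program ever ran. The second half shows the kind of introspection and runtime type creation mentioned above, in Python rather than Java.

```python
def total_length(items):
    # No declared types: any argument is accepted when the call is made.
    return sum(len(item) for item in items)

ok = total_length(["ab", "cde"])      # fine: every element has a length

try:
    total_length(["ab", 3])           # the int is rejected only when len() runs
    failed_at_runtime = False
except TypeError:
    failed_at_runtime = True

# Creating a type while the program runs, then inspecting it.
Point = type("Point", (object,), {"x": 0, "y": 0})   # hypothetical example type
p = Point()
fields = sorted(name for name in vars(Point) if not name.startswith("_"))
print(ok, failed_at_runtime, fields)
```

The cost of the dynamic approach is visible here: the bad call is only discovered by executing it, and the check is paid for on every run.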
It is in fact possible to eliminate the other three hurdles to programmer productivity by implementing a well designed language system. Why do I call it a language system? Because, more often than not, scripting languages are used to glue together natively implemented components. The libraries of the scripting language are native, but the program is written in the scripting language. Sometimes, scripted components are called from inside the native application. All this calls for a good amount of interoperability between the two languages, and that’s where the API for the scripting language in the native language comes in. This bridge should be easy to use. In a well designed language system, this bridge can also be entirely eliminated. The best example of such a system that I can think of is BeanShell/Java. The BeanShell scripting language is adapted from Java itself, except that it has scripting language features such as typeless variables and syntactic sugar.
Since the two languages are inherently the same, the system attempts to do away with a bridging API completely. BeanShell scripts can run in the same runtime as the native Java program itself. They have completely interoperable objects and methods, so an object in BeanShell can be thought of as a normal object in Java and vice versa. No API is required because the BeanShell scripting engine takes care of making objects in BeanShell interoperable with those in Java using Java’s powerful reflection mechanisms. Now let us see what this system gives us. Learning curve for the language? Very little, as it is an adaptation of the native language we are supposed to know anyway. Learning curve for libraries? None, as it has the same library suite that Java has, except that it is now available from a language which is easy to program in. Learning curve for the bridging API? None again, as there is no bridging API! However, BeanShell/Java is not a perfect system, for the following reasons.
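The idea of reflection standing in for a hand-written bridge can be sketched in a few lines. This is not how BeanShell is implemented; it is a Python analogy, and `Account` and `invoke` are hypothetical names introduced only for illustration. The point is that the host class contains no scripting-specific code, and the engine reaches its methods purely by name at runtime.

```python
class Account:
    """Hypothetical host-application class; it knows nothing about scripting."""

    def __init__(self, balance):
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount
        return self.balance


def invoke(target, method_name, *args):
    """Look up a method by its name at runtime and call it, as a script engine might."""
    method = getattr(target, method_name)      # reflection: lookup by string name
    if not callable(method):
        raise TypeError(f"{method_name!r} is not callable")
    return method(*args)


account = Account(100)
new_balance = invoke(account, "deposit", 25)   # the "script" supplies the name as text
print(new_balance)
```

Because the lookup is generic, any new host class becomes scriptable without additional glue, which is the property the paragraph above attributes to the BeanShell/Java pairing.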
Java is a neat language and there have been claims that it runs fast enough to be compared to C/C++. However, it still has a large memory footprint and high startup times. This complaint is more against Java than BeanShell. Current implementations of Java still rely on a Java runtime to C runtime bridge for most system calls. It would have been better if Java were used directly for interfacing with the OS… there are just too many issues to discuss here.
BeanShell doesn’t provide convenient high level data structures the way other scripting languages do. It would have been great if BeanShell had syntax for directly manipulating the data structure classes in the Collections Framework. Some more operations could also be automated. This might result in some form of a bridge API emerging. However, it would never be a bridge between the two languages; it would be a bridge between high level data structures only.
BeanShell is interpreted and slow when it need not have been. Perhaps it was in the interest of BeanShell (designed to be embeddable in Java applications) to be implemented in Java itself and to exploit reflection through the Java API. It could have chosen to compile the script to bytecode on the fly and then run it. However, this is not the goal of BeanShell. BeanShell is supposed to convert a script into data structures, objects and changes in the runtime such that it seems as if the script executed as a Java program in the same runtime.
By a language system we mean targeting the same runtime and the same libraries via two languages: one for rapid prototyping and one for actual type checked, efficiently compiled development. This is close to the .NET architecture, except that instead of having “the one” runtime for all possible languages in the world, you have one runtime for each native language, with a scripting dialect also compiled for the same runtime. Having “the one” runtime forces languages out of their natural shape into a model more suitable to the runtime. Thus, if .NET has a predominantly object oriented runtime, Haskell or Lisp will look rather odd on it.
Another compelling feature of scripting languages is that they are embeddable in applications. The native program itself can convert a supplied script string into meaningful operations on its own data structures. BeanShell is essentially aimed at this. Such language implementations could be interpreted. However, it is again both possible and desirable to compile them on the fly and embed them in the existing language runtime. Such an API for executing scripts could actually be derived from the scripting dialect implementation itself, as discussed in the previous paragraph.
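A rough sketch of this embedding pattern, using Python as both host and script language: the host compiles a script string once and then runs it against its own data. The `inventory` dictionary and the script text are hypothetical. Note that Python’s `compile` produces bytecode for its own virtual machine rather than native code, so this only illustrates the “compile once, run in the host’s runtime” shape, not a native Just-In-Time compiler.

```python
inventory = {"apples": 3, "pears": 0}            # hypothetical host data structure

script = """
for name in list(inventory):
    if inventory[name] == 0:
        del inventory[name]
inventory["plums"] = 7
"""

# Compile the script once into a code object; it can then be run repeatedly.
code = compile(script, "<user-script>", "exec")

# Run the compiled script with the host's objects exposed under chosen names.
exec(code, {"inventory": inventory})
print(inventory)
```

The script operates on the very same dictionary object the host holds, so no copying or conversion layer is involved. This form of `exec` is not a sandbox; a real application would need to decide how much of its runtime to expose to untrusted scripts.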
And now some rant ;-). I have been looking at a language called Objective-C lately. It seems to be a natively compiled version of a language like Java. Apart from that, it is fully and efficiently interoperable with C (this is a big plus!) and has a powerful runtime. Thus it is an excellent candidate for being the native language counterpart of a scripting dialect. The scripting dialect could be used for rapid prototyping without straying far from the base language, while at the same time contributing to a common pool of libraries. The scripting dialect would basically consist of the scripting language definition and two implementations of it: one that compiles scripts to native code for the same runtime, and another that provides an interpreting API which would allow us to embed scripts into applications. The interpreting API could just as well be a Just-In-Time compiler, reusing the effort spent in making the native compiler for the dialect. Has anybody done this for Objective-C? (I have heard about a Python implementation written in Python from one of my friends… we now know a use for that. But again, I wouldn’t choose Python as the native language for a good language system.)
About the Author:
Ritesh Kumar is a graduate student in the Department of Computer Science, University of North Carolina at Chapel Hill. Operating Systems, Systems design and Development Platforms are things that interest him the most. He can be reached at ritesh at cs dot unc dot edu.
If you would like to see your thoughts or experiences with technology published, please consider writing an article for OSNews.