About 3 years ago I was looking around for something to add multimedia capabilities to my GNOME desktop. At that point in time there wasn’t really that much around. I think the most advanced video player for Linux in those days was XAnim, which was neither moving quickly nor could qualify as free software, except in the beer context. Projects like Xine and MPlayer had either just started up or not come into existence yet.
Also my interest wasn’t purely aimed at playing back media files on my own machine, but also at seeing if there was something out there I could help push forward to give Linux developers and users something competitive with Microsoft’s DirectMedia and Apple’s Quicktime. That meant I was looking for something which would also allow developers to do more advanced stuff relatively easily.
Anyway I started looking around and discovered GStreamer. I guess what pulled me in was the screenshots of the pipeline editor which gave me the clear feeling that this was more than just a playback application, but something that could be used for a much wider range of applications. Based on that I decided to do an interview with the developers for the news website I was involved with at the time (now gone). I guess I never left after doing that 🙂
The core concept in GStreamer is that of a pipeline system which your media streams through. This means you have one or more sources, which can be anything like a file, a URL or a hardware device. Depending on how you construct your pipeline you can then have lots of things happening to that media stream before it ends up in one or more sinks at the other end of your pipeline. The sinks can be like the sources: a web stream, a file or a hardware device, all depending on what plugins and elements you have installed.
So what can happen in the pipeline? Well, the possibilities are almost endless. In GStreamer there are some different classes of elements. There is the basic stuff like elements for decoding different formats, demuxers for splitting the audio and video into separate streams, and muxers and encoders for merging the streams back together and encoding them in a format of choice. Then there is a class of filter elements. Filters can do things ranging from technical transformations like colorspace conversions and stereo to mono and vice versa, to adding effects to video images like making them look old or psychedelic, or making the video look as if a bug is looking at it. There is no limit to the number of elements you can apply to a given pipeline except the limitations your hardware imposes on you. For instance, if you are writing an application that needs to do work in real-time, it puts limitations on what kind of things you can do, because if you do too much your machine will simply not be able to do the computations fast enough. The GStreamer architecture is designed, however, so that the pipeline system itself adds no latency to the pipeline. This is a prerequisite for many types of applications which demand low latency.
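To make the source, filter and sink idea a bit more concrete, here is a minimal sketch in plain Python. It is a toy model and not the GStreamer API: the names `source`, `stereo_to_mono`, `gain` and `run_pipeline` are all made up for illustration, and a "buffer" is just a list of integers.

```python
from typing import Callable, Iterable, Iterator, List

# Toy model only: every name here is hypothetical, none of it is GStreamer API.
Buffer = List[int]  # a "buffer" is simply a list of sample values
Element = Callable[[Iterator[Buffer]], Iterator[Buffer]]


def source(samples: Iterable[int], chunk: int) -> Iterator[Buffer]:
    """Source: cut a stream of samples into buffers of at most `chunk` samples."""
    block: Buffer = []
    for sample in samples:
        block.append(sample)
        if len(block) == chunk:
            yield block
            block = []
    if block:
        yield block


def stereo_to_mono(buffers: Iterator[Buffer]) -> Iterator[Buffer]:
    """Filter: average interleaved left/right pairs into one mono sample."""
    for buf in buffers:
        yield [(buf[i] + buf[i + 1]) // 2 for i in range(0, len(buf) - 1, 2)]


def gain(factor: int) -> Element:
    """Filter factory: multiply every sample by `factor`."""
    def element(buffers: Iterator[Buffer]) -> Iterator[Buffer]:
        for buf in buffers:
            yield [sample * factor for sample in buf]
    return element


def run_pipeline(src: Iterator[Buffer],
                 filters: List[Element],
                 sink: Callable[[Buffer], None]) -> None:
    """Chain the filters after the source and push every buffer into the sink."""
    stream = src
    for element in filters:
        stream = element(stream)
    for buf in stream:
        sink(buf)


# The sink here just collects buffers; it could equally write to a file.
collected: List[Buffer] = []
run_pipeline(source([0, 2, 4, 6, 8, 10, 12, 14], chunk=4),
             [stereo_to_mono, gain(2)],
             collected.append)
print(collected)  # [[2, 10], [18, 26]]
```

Because each element only pulls buffers from the one before it, adding another filter means adding one more entry to the list, which is roughly the property that makes a pipeline design attractive.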
GStreamer also contains an advanced system for negotiating capabilities. This means that GStreamer itself can assemble a line of elements that takes the input it gets and transforms it into a form that the output device you are using supports. So if your output device is something using the I420 colorspace but the video stream coming in is in the RGBA colorspace then GStreamer will be able to automatically assemble a pipeline that converts the colorspace for you. Since GStreamer handles these things itself, a developer who wants to write media applications doesn’t need to learn about colorspaces, bitrates and sound card clock speeds; GStreamer provides you with an easy to use API that lets you focus on your actual application instead of worrying about what kinds of things happen at the lower levels.
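The idea of assembling a conversion chain automatically can be sketched as a small search problem. The following Python snippet is only an illustration of the concept, not how GStreamer actually negotiates capabilities: the converter table and the element names in it (`drop-alpha`, `rgb-to-yuv`, `yuv-to-rgb`) are invented, and the `negotiate` function is hypothetical.

```python
from collections import deque
from typing import Deque, Dict, List, Optional, Set, Tuple

# Hypothetical converter table: (input format, output format) -> element name.
# These element names are invented for the example.
CONVERTERS: Dict[Tuple[str, str], str] = {
    ("RGBA", "RGB"): "drop-alpha",
    ("RGB", "I420"): "rgb-to-yuv",
    ("I420", "RGB"): "yuv-to-rgb",
}


def negotiate(src_format: str, sink_formats: Set[str]) -> Optional[List[str]]:
    """Breadth-first search for the shortest chain of converters.

    Returns the list of converter names to insert between source and sink,
    an empty list if the formats already match, or None if no chain exists.
    """
    queue: Deque[Tuple[str, List[str]]] = deque([(src_format, [])])
    seen = {src_format}
    while queue:
        fmt, chain = queue.popleft()
        if fmt in sink_formats:
            return chain
        for (inp, out), name in CONVERTERS.items():
            if inp == fmt and out not in seen:
                seen.add(out)
                queue.append((out, chain + [name]))
    return None


print(negotiate("RGBA", {"I420"}))  # ['drop-alpha', 'rgb-to-yuv']
print(negotiate("I420", {"I420"}))  # []
print(negotiate("I420", {"RGBA"}))  # None
```

The point for an application developer is the calling side: you state what comes in and what the output accepts, and the search for the steps in between is not your problem.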
Developing GStreamer has been a lot harder than I think was anticipated when the project started out. Unlike many other free software projects, GStreamer was not a simple re-implementation of something which had been done before. I guess we did what Steve Ballmer claims free software never does: we innovated. The basic design and basic idea came from a research project at Portland University, research work in which GStreamer project founder Erik Walthinsen participated. It was loosely modeled on DirectShow. This research gave us the basics, but when you take something out of the lab and place it in the real world, many new issues tend to arise quickly. This means that over the last three years we have had many re-writes of core modules, as the original design needed extending in order to allow people to use GStreamer for more and more varied stuff and as real world worries such as support for legacy formats reared their heads.
It is important to know that GStreamer has always been focused on two things: keeping the core media agnostic and keeping it GUI independent. In fact many of the first commercial users of GStreamer used it on the server for things like audio format conversion at a telecom, recording and storing clips of live news and archiving at a radio station, and similar applications.
GStreamer in the embedded world
From the early stages, GStreamer had a focus on keeping the core small and nimble enough to be useful on various embedded devices. GStreamer’s first corporate sponsor was the embedded company RidgeRun Inc., which was working on a version of Linux for devices using CPUs from Motorola. Unfortunately their cash reserves ran dry before they could get some products to market. The legacy of this work still remains however, and GStreamer can still today be easily slimmed down to fit on even small devices. GStreamer is also the media player back-end of the GPE media player from handhelds.org. Philip Blundell maintains this player; he also made the Tremor Ogg Vorbis plugin in GStreamer.
GStreamer on the server
GStreamer has long been used by various groups and companies on the server side. Known uses have been as the backend for icecast webcasts, format converter at a telecom and media archiver at a radio station. Since the core GStreamer system doesn’t contain any GUI code it is perfect for making server solutions.
There is also a new company being formed around GStreamer these days which will among other things produce server-based multimedia solutions based on GStreamer. We feel confident that the creation of this company, which already has its financing taken care of, will help boost our effort even further.
GStreamer on the desktop
The GStreamer team has from the beginning been very interested in seeing GStreamer integrated into the free desktop projects. As free software developers we are very keen to see the Linux desktop really becoming competitive with the commercial desktops out there and we feel that having top of the line multimedia capabilities is essential to making this happen. From the GStreamer point of view having a media player is just the tip of the iceberg in that regard.
GStreamer never had basic playback as a primary goal: it was, and to some degree still is (even if ‘real’ world concerns have made us increase the focus on it), a secondary concern to us. What has been our focus from the start is to make sure that GStreamer is powerful and flexible enough to allow people to make advanced media applications like non-linear editors, advanced audio mixers, sound editors, soft synthesizers, transcoders and so on. By having this focus we feel we have been able to make a much cleaner and more extensible design than we would have gotten if we had focused on media playback and then added features to support other things as an afterthought.
As part of our desktop push we got GStreamer integrated into GNOME 2.2 and recently moved GStreamer CVS and website to be hosted on freedesktop.org. While the move has little practical effect, we hope it sends out a clear message to developers that we see GStreamer not as the GNOME multimedia framework but as the Unix multimedia framework.
The use of GStreamer in GNOME
We introduced GStreamer at a basic level in GNOME 2.2 and the integration is planned to go much further in GNOME 2.6. This is what we are working on currently. A new addition that developers and users will notice in GStreamer 0.8.x is that it has a mixing interface designed by Ronald Bultje. This means that whenever someone writes an element for a sound system for GStreamer, all the core applications in GNOME will be able to do mixing and volume control on that sound system. Today if you want to add Irix audio volume control to GNOME you would need to add it to a lot of separate applications like the gnome-mixer, the GNOME volume applet, ACME (the gnome multimedia keyboard daemon) and any other GNOME application which adjusts volume and so on. No more.
In GNOME 2.6 we plan to have all these systems use GStreamer, which means that if an Irix audio system plugin is written for GStreamer then all the basic applications in GNOME can adjust the volume on an Irix system. Currently we have OSS and ALSA plug-ins which support this mixing interface, but there is work being done on trying to get a Sun Audio one also. Others will be added as they are contributed. Note that this new interface is for hardware mixing; software mixing we have of course been doing from the beginning.
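The benefit of a shared mixing interface is easiest to see in code. The sketch below is plain Python with invented names (`Mixer`, `FakeOssMixer`, `FakeAlsaMixer`, `volume_up`); it is not the actual GStreamer mixing interface, just an illustration of the principle that applications talk to one interface while each sound system supplies its own backend.

```python
from abc import ABC, abstractmethod


class Mixer(ABC):
    """Hypothetical common interface; volumes are percentages from 0 to 100."""

    @abstractmethod
    def get_volume(self) -> int:
        ...

    @abstractmethod
    def set_volume(self, percent: int) -> None:
        ...


class FakeOssMixer(Mixer):
    """Stand-in backend that stores the volume directly as a percentage."""

    def __init__(self) -> None:
        self._percent = 0

    def get_volume(self) -> int:
        return self._percent

    def set_volume(self, percent: int) -> None:
        self._percent = percent


class FakeAlsaMixer(Mixer):
    """Stand-in backend that stores the volume as a 0.0 to 1.0 fraction."""

    def __init__(self) -> None:
        self._fraction = 0.0

    def get_volume(self) -> int:
        return round(self._fraction * 100)

    def set_volume(self, percent: int) -> None:
        self._fraction = percent / 100


def volume_up(mixer: Mixer, step: int = 10) -> None:
    """Application code: written once, works with any backend."""
    mixer.set_volume(min(100, mixer.get_volume() + step))


for backend in (FakeOssMixer(), FakeAlsaMixer()):
    backend.set_volume(50)
    volume_up(backend)
    print(type(backend).__name__, backend.get_volume())  # both report 60
```

Adding support for another sound system then means writing one more backend class, while `volume_up` and every other application function stay untouched.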
In GNOME 2.6 there will also be two new applications using GStreamer. Sound-Juicer is made by Ross Burton and is a neat little cd-ripping application. Rhythmbox, maintained by Colin Walters, is an advanced music playback application comparable to Apple’s iTunes.
We are also working hard on offering good video playback and video encoding, so even if we will not bundle any video player with GNOME 2.6 I do expect that some of the distributors will start to bundle Totem or gst-player during the lifetime of GNOME 2.6. And then if things go as planned I guess we will see a GStreamer-using video player and maybe even a video recorder bundled with GNOME 2.8. Of course the main problem here in regards to distributors is that a lot of the common media formats out there are covered by a host of patents in some countries, with the US being the one hardest to ignore, which makes it problematic for most Linux distributions and Unix vendors to ship GStreamer or any other media framework with support for these formats enabled.
This distribution problem is why supporting formats such as Ogg Vorbis, Ogg Theora, Ogg Tarkin, Matroska and FLAC, for decoding but also for encoding, has been a top priority for us. We want to be an enabler for people moving to free formats for their own stuff. In fact we have a really cool announcement for the world soon in that regard.
The use of GStreamer in KDE
GStreamer is not officially part of KDE today, which means that there is naturally a lot less integration of GStreamer into the core of KDE. The first release that might see GStreamer officially adopted for KDE is KDE 4.0, as that is the first opportunity where the current arts system can/should be deprecated.
This means that GStreamer use is currently at a more experimental level in the KDE world, with Scott Wheeler’s JuK being the only major application which offers using GStreamer as its backend. It uses the KDE GStreamer bindings developed by Tim Jansen which are hosted in KDE CVS currently.
There are some other KDE projects apart from JuK looking into using GStreamer out there, so hopefully when the time comes for KDE to choose, the choice will be easy to make. Not only will such an event make sure that multimedia development effort is focused on going forward instead of redundant re-implementation, but it will be a portal to taking things a huge step forward in regards to cross-desktop interoperability and co-operation.
Plans for the future
We are pretty happy with the current feature set of the core of GStreamer. So our focus for the next half year will probably be on polishing what is there and updating old elements and creating new ones.
The only major missing component is a subsystem for doing MIDI with GStreamer. Actually it is probably just a set of elements, but until it is actually implemented we can’t be 100% sure that the core doesn’t need some changes to accommodate it. We have had several people express interest in taking it on, but no one has actually committed any code for it so far. Hopefully someone will get a start on it soon, as it is really something we need in before starting to aim for GStreamer 1.0.
The new development series also includes support for interactivity, which is needed for enabling such stuff as DVD menus and Flash animations to work through GStreamer. The interactivity support was created by David Schleef and is currently being completed and extended by Jan Schmidt. I am not sure we will have a finished DVD playback application or complete Flash support ready for 0.8.0, but the basics are there so we should be able to offer those during the 0.8.x series lifetime. Of course the Flash support depends on interactivity being integrated into the Flash decoding library Swfdec that David Schleef is developing in parallel with GStreamer. Developers interested in Flash should really take a look at Swfdec, as it is the only Flash implementation out there available under a license as liberal as the LGPL, and it can be used outside of GStreamer.
Another fun new feature is our advanced new support for metadata/tagging designed and implemented by Benjamin Otte. We have devised a system which should be able to preserve your metadata when you use a GStreamer based application to transcode your files. So for instance if you convert some of your FLAC files to Ogg Vorbis with GStreamer, the metadata tags should be converted along with the music. Making an easy to use application with GStreamer for doing media conversions might be an idea for someone who wants to learn how to program with GStreamer.
Another interesting development is that we currently have a team of about seven French students who are going to make a GStreamer-based non-linear video editor as their final year project. This effort will probably improve the framework in certain areas, like adding new advanced clocking elements for doing things like double speed playback. In relation to this we just completed a major change in our system which enables, among other things, splitting a video signal into two parallel windows. This means that in the case of a video editor you can view the same video signal in two different windows; the first could be the original video, for instance, while the second contains your transformations. A great way to experiment with different transformations in real time.
We also hope to integrate a SMIL implementation done by Malcolm Tredinnick in the coming months. This will actually not be part of GStreamer as such, but it is designed to integrate easily with GStreamer. SMIL is a W3C standard using XML that is meant for doing interactive audiovisual presentations. This is very useful for accessibility for instance, where we hope to see GStreamer tools written that will help popularize the addition of accessibility information using SMIL to formats such as Ogg and Matroska. We are especially excited about this as this will be the first implementation of SMIL available under the LGPL.
Other interesting improvements are the work on Quicktime and ASF done by Jeremy Simon, our improved AVI and Matroska encoding by Ronald Bultje, the color ascii art plug-in based on libcaca by Zeeshan Ali, our improved error handling by Thomas Vander Stichele and the software video scaling and gst-player and Totem development done by Julien Moutte. A more long term project that has been discussed (long term unless someone outside the current group of hackers steps up to the plate) is writing a new sound server using GStreamer.
On the competition
I don’t want to start a huge flamefest, so I will keep this one brief in the hope that the level of toe stepping will be acceptable :). We really feel that GStreamer is at a sweet spot at the moment with little real competition. Other projects within our sphere of application are either much more narrowly targeted (like Xine or MPlayer, which are great backends for creating basic playback applications, but in our opinion not well suited as backends for a wider, more complex range of applications), or they are much more immature (like MAS and NMM), or licensed in a way that makes them uninteresting to most free software developers (like Helix). Or, even, a combination of all of the above issues.
Another major advantage of GStreamer in addition to our relative maturity is our licensing. GStreamer uses the LGPL license, which we feel is the perfect license for a library. It allows plug-in and application developers the freedom to use the license of their choice (our choice for the applications we make ourselves is the GPL), while reducing the chance of major forking or fixes being withheld, which is what a more liberal license could lead to.
GStreamer is also highly portable; we have had reports of people running it on Linux, FreeBSD, Solaris, AIX and Irix. Recently I also got a mail about the core compiling on OSX, with some soon to be merged patches attached. The code also compiles with compilers other than GCC, for instance Sun’s Forte. Porting to Windows shouldn’t be overly difficult if GLib works well. Possibly the only thing that would need coding is a custom scheduler.
Anyway, back to what got me started with GStreamer. As said, it has been a long and at times tough journey, but I feel that we have now finally gotten to the point, both with the library and with the available applications using it, where we are able to deliver what I originally wanted. Just the continuation of the projects and efforts that are already underway is proof enough that we have succeeded in delivering what I was looking for those years ago. That said, we are still not done and there are of course still areas needing work, but that is just the way of software; it is not something static but a continuously evolving entity.
I hope I managed to get you interested in GStreamer and what it can offer developers and through them what it can offer users. If you want to talk you can find us at #gstreamer on irc.freenode.net or you can get in touch through the mailing list.