For a long time I have wanted to write something about the great progress being made on Linux as an OS of choice for a professional Digital Audio Workstation (DAW). With the inclusion of the Advanced Linux Sound Architecture (ALSA) in the 2.6 kernels, the time has come to share my experiences with all of you.
For clarity's sake, and maybe to give my own ideas some order, I'll list the topics this short article will examine. Please don't assume this is everything Linux has to offer us; it's more like a good start:
One fully understands the exceptional flexibility of this whole system only after learning how its different parts can interact and cooperate. Thanks to its modularity, this design satisfies any type of audio requirement. Indeed, one of the most illuminating signs of how well Linux audio is growing is its adherence to the simple rule that made Unix (and Linux) great: every application takes care of *one task*; every requirement is satisfied singly and in a modular way.
ALSA's own task is to manage audio-related hardware in kernel space, from cheap consumer sound cards to professional ones. Jack is the audio server; it allows interaction between all the sound applications that use its API. LADSPA supplies the platform for audio plugin development, with several sound processors and digital effects. Finally, on top of all this, Ardour is in my opinion the most successful example of how far audio and Linux can go: a professional Digital Audio Workstation (DAW) able to transform your PC into a recording studio.
This document is not meant as a how-to or a detailed guide to compilation/installation/configuration: first, because that would be boring to read (and write), and second, because the net is already full of such guides. I will limit myself to supplying some links where necessary.
ALSA timidly started in 1999 with an ambitious target: to unify and standardize the audio layer in the Linux kernel. After approximately five years the objective has been reached: with the release of version 1.0 of the drivers, ALSA is finally integrated into Linux kernel 2.6.
Audio was traditionally supported in Linux through a set of drivers (OSS) that were not standardized at all and were often described as evil by the very audio developers who had to use them. There was also the option of proprietary drivers, though I wonder who ever made that choice. The situation was pretty sad. Alas, even though obsolete, OSS drivers are still used by many Linux users, and are still included as a (deprecated) choice in new kernels.
Do you remember the first brave distributions that included ALSA drivers in their own purposely patched kernels? The first that comes to mind is SuSE, a great fosterer of the project. I also recall a distribution named something like BestLinux being the first to suggest using ALSA in place of the old OSS.
My sound card is a Yamaha OPL3SA2, not exactly the top of the market, but apart from the usual problems that come with being a notorious ISA “plug 'n' pray” card, it has always worked decently with OSS. By decently I mean: listening to some mp3s, hearing “hello, this is Linus Torvalds, and I pronounce Linux as Linux”, or some most hateful login sound 🙂 .
With the OSS drivers, the problems began when trying something more advanced than “cat example.wav > /dev/dsp”. For lack of common guidelines, few advanced audio applications were available, while several applications called “sound servers” were born with the aim of making up in user space for what kernel space lacked; more on this in the section dedicated to Jack.
With ALSA things changed: a positive feedback loop started, generating greater and greater interest and attracting to Linux more and more audio developers who, even with good ideas, previously had no way to put them into practice. Making use of devfs and procfs, and sufficiently abstracting the interaction between software and hardware, ALSA provides a rational, modern and homogeneous API. It does wonders with a card like mine, letting me tweak parameters that OSS did not even see, and does even better with professional cards with 8 or more inputs and outputs, MIDI and everything else you may need.
Thanks to its modular structure, ALSA lets you choose which modules or features to use or exclude from kernel space; it provides userspace libraries for the applications that use its API, and a series of basic utilities such as arecord, aplay and alsamixer. Another advantage of ALSA is complete backward compatibility with the old OSS, to guarantee the usability of obsolete audio applications; this OSS-compatibility layer can be disabled too, if you so decide.
ALSA no longer uses /dev/dsp. As already pointed out, it uses devfs – now replaced by udev – virtual devices in /dev/snd/* that reflect the actual hardware found on the machine. With my opl3sa2 I have a device for LINE IN, one for MIC IN, one for every OUTPUT channel, one for MIDI, and others invisible during normal operation. Each of these devices has a generic identifying name reflecting its function. In this way I can read from and write to the sound card at the same time. Translated into less geeky terms, this means recording and listening to and from the PC at the same time!
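Assuming the alsa-utils package that accompanies the drivers is installed, a quick sketch of what this looks like from a terminal (file names are just examples):

```shell
# See the virtual devices ALSA created for the detected hardware.
ls /dev/snd/

# List the playback devices as ALSA sees them.
aplay -l

# Record five seconds from the default capture device at CD quality...
arecord -f cd -d 5 take.wav

# ...and play it back. With full duplex, both can even run at once.
aplay take.wav
```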
The professional audio server for Linux – Jack
“Full-duplex” mode, that is the ability to record and listen at the same time using a single audio card, was indeed already possible with the old OSS, thanks to software called sound servers. A sound server is a program that manages, in user space, the use of and access to your audio devices. Let's see what the main sound servers are, how Jack differs from the others, and why Jack is called a professional sound server.
ARTS – KDE’s sound server
Arts is KDE's sound server. Personally I am a big fan of this DE: properly configured, I find it well integrated, usable and able to adapt to my needs. One of the best parts of KDE has always been its audio layer and ARTS, the sound server. Arts is very configurable: from a simple GUI I can enable full-duplex mode, set a sampling frequency, enable ALSA or OSS support, or leave everything the way it is. The fact is, arts was never meant for professional audio: when it was written, it was already a success to be able to listen to an mp3 with xmms while licq (who remembers that one?) happily announced “Incoming Mail!”
Esound – the “alternative” sound server
ESD, alias “Enlightened Sound Daemon”, betrays its origins in its name: it was written for the Enlightenment window manager. “E” was one of the most brilliant and sexiest pieces of software a desktop user could meet. I would say it is quite old by now, yet still loved and used. Currently esd has been adopted by GNOME, another amazing Desktop Environment. Esd isn't quite as configurable as arts and it doesn't support full-duplex.
Jack Audio Connection Kit
Jack's first characteristic is that it uses the ALSA devices. To be precise, Jack's operation is closely tied to every aspect of ALSA, which means we cannot install Jack on our machine without having installed and configured ALSA. By using the various devices in /dev/snd/*, Jack leaves part of the audio I/O management job to the kernel.
Jack works in “realtime”. Over the last year the “Linux in Real Time” issue has bounced from one forum to another. There has been a revival of interest in QNX, the birth of projects like RTLinux, and a bunch of patches to bring the new 2.6 kernel's abilities to the stable 2.4 kernels. These “abilities” evoke exotic names like “preemptibility”, “low latency”, “capabilities” and, obviously, “realtime”. Apart from the physical pleasure of pronouncing these words, the direction taken by Linux development favors high performance at all levels: server, desktop, workstation and of course audio. I am not the one to describe what this means at a technical level, but a kernel that works in realtime should guarantee that all I/O operations are carried out exactly in real time. If we think of the need to record and listen to music at the same time… we begin to see the light.
Jack has been designed and written to take full advantage of these new abilities offered by the kernel, providing a standardized layer for all audio applications that support it. Each of these applications, from XMMS to Alsaplayer, from GStreamer to Mplayer, to name a few, “is seen” by our audio server and can be connected to any other. Every application can be thought of as a virtual device with at least one IN and one OUT. In this way the audio stream can flow from one application to another in a homogeneous, standardized way.
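Jack ships with small command-line tools that make these virtual connections visible; a sketch, with port names that are purely illustrative (the real ones depend on which applications are running):

```shell
# List every port Jack currently knows about, hardware and applications alike.
jack_lsp

# Route a player's stereo output into a recorder's input.
# The "alsaplayer" and "jack_rec" port names here are hypothetical examples.
jack_connect alsaplayer:out_1 jack_rec:in_1
jack_connect alsaplayer:out_2 jack_rec:in_2
```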
As an example, let's say an output leaves a video opened with mplayer; then a microphone attached to MIC IN can be added to the signal, or an instrument on LINE IN, a drum track played with hydrogen, and whatever else. They can all be processed with one of the LADSPA digital effects, such as a reverb or a delay, and end up in the INPUT of an application able to record the audio flow: the simple jack_rec, or indeed a more complex application like Ardour.
Though I appreciate the comfort of the shell and command line, from my first recordings and experiments with Jack I tried to find a good GUI to manage complex connections. I am more than sure that the “ONLY CLI is beautiful” attitude is becoming more and more an enemy of Linux, with the sole effect of scaring users away. Qjackctl is a nice application that uses QT to make the task of configuring, launching and “killing” (stopping) Jack a pleasant experience.
Once qjackctl is launched and configured, it is enough to push the start and stop buttons to manage the server. Careful: you cannot use more than one audio server at a time, because Jack is not designed to coexist with other sound servers, so before launching Jack make sure nothing else is using the sound card. Qjackctl's main window contains some other useful buttons, each opening a different window showing messages, statistics and other info about the audio server. Qjackctl also acts as a monitor for Jack's “Transport”, a feature that lets Jack behave as a master for all audio applications.
Particularly useful, for example, is the Connections window, used to monitor but also to modify the state of the connections, thus acting as a so-called “patchbay”: a control center from which to manage which INPUTS to include in or exclude from the audio flow. Obviously, by “INPUT” I mean both the physical ones and those of any application exposing manageable INPUTS/OUTPUTS.
Linux Audio Developer's Simple Plugin API – LADSPA
I mentioned the inclusion of digital effects in the audio flow in the example above. Anyone who has had any experience with sound engineering, recording studios, live concerts and so on knows what I'm writing about. I will describe “my personal” situation and requirements; they obviously may not coincide with yours, but I believe they supply a real-life example of using Linux to do something more concrete than an ever-repeating apt-get dist-upgrade :-)
Before introducing the most ambitious software I've ever seen for Linux, I want to briefly describe the world of digital effects according to Linux. If you have any experience in home recording, you can think of LADSPA as the Linux equivalent of VST plugins.
There are many types of plugins. You can choose among the classic effects: chorus, delay, reverb, echo, etc.; but I found something for basically all tastes: flanger, pitch shifter, phaser, compressor… There are also equalizers and filters of many types, and other less known ones. In my experimenting I can't do without a set of plugins by Tim Goetze that reproduce the sound of a guitar tube amplifier :-). You can find them here.
Nearly all LADSPA plugins can also work in realtime. This means that I can apply, say, an echo effect “live” while strumming my guitar plugged into the LINE IN of my sound card, or in “postprocessing”, that is, modifying a previously recorded “track” that lacks a little echo, for example.
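The LADSPA SDK itself includes tiny command-line tools for exactly this kind of offline processing; a sketch, assuming the SDK's example delay plugin is installed (the wav file names are examples):

```shell
# Show the installed plugins, then inspect one library's controls.
listplugins
analyseplugin delay.so

# Apply the SDK's simple delay (plugin label "delay_5s") to a
# previously recorded take: 0.3 s of delay, 50% dry/wet balance.
applyplugin take.wav take_echo.wav delay.so delay_5s 0.3 0.5
```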
What if I want to record, at the same time, a drum track played with hydrogen, a rhythm guitar and a voice? More: what if I also want to add a bass line but the bass player isn't there? More: what if I want a three-dimensional effect applied to the voice track and, say, a little distortion on the guitar track? Still more: what if I want each of the “tracks” recorded in a different file, so that I can make small changes to the single “tracks”?
First “killer application” – Ardour
What if? Welcome to multitrack recording!
Ardour is a good candidate for being a “killer application” for Linux. It has it all: although it is a beta, it is quite stable; it supports all the standards in digital audio; it is ever simpler and more effective to use; it follows an open development model, and on the mailing lists every input or piece of advice from the beta testers is always welcomed. Ardour is obviously free software, therefore GPL, therefore it can be freely extended and modified by anyone.
Ardour's “definition” is: Digital Audio Workstation. Ardour is closely related to the development of Jack, if we consider that the main developer of the two projects is the same brilliant and somewhat visionary Paul B. Davis, “former employee no. 2 at amazon.com”.
Ardour is a multitrack recorder with audio editing abilities, support for plugins, unlimited undo/redo, full automation support, and a mixing console where the number of tracks/busses is virtually limited only by the available hardware.
Let's go back to the example proposed before in order to introduce a typical ardour session. For this test I want to record an example song. The structure will be simple: drums, bass, two guitars and voice.
First thing to do is to launch the Jack sound server:
$ jackd -R -d alsa -d opl3sa2 -r 44100
as root, or
$ jackstart -R -d alsa -d opl3sa2 -r 44100
as a simple user (on a kernel patched for “capabilities”). Note that those commands make sense only if “opl3sa2” is defined as the name of your sound card; anyway, as I said, I always use qjackctl.
Now my machine is ready to be used as a small recording studio. I set the sample rate to 44100 Hz because my audio card and my PC in general truly suck, but Jack, Ardour and any other Jack-enabled application support sample rates up to 96KHz, the audio quality found on DVDs. Now let's start ardour:
Without going into the details of Ardour's configuration and personalization, it is easy and intuitive enough to set up a session in a rational way. I generally begin by putting together a drum part with hydrogen.
Once it is ready, leaving hydrogen running, I launch ardour and create a new session from the appropriate menu. This creates a folder with the name I decide to give the session; inside that directory ardour will put all the recorded data, together with the metadata it needs. Again from the session menu, I click on “add tracks” and add the following five:
- Drums – stereo
- Bass – mono
- Guitar1 – mono
- Guitar2 – mono
- Voice – mono
While I am there, I also add another stereo track and call it MIX; see below. One thing I generally do is save a “session” like this as a template for future sessions; this is done by saving it as a template from the Session menu. Afterwards I can create other similar sessions from the saved template, as can be seen in the image above.
Now to record hydrogen's output, that is the drum part, directly into ardour, in the track created for it. Remember the “Transport”? Well, to make this work, ardour must be configured as transport “master” and hydrogen as “slave”.
Recording the first track
- one click on the Input button in the appropriate mixer strip; from the Ardour tab add hydrogen_L and hydrogen_R to the L and R channels of the track respectively
- (optional) one click on the Output button to add/modify the track's output, so you can listen to the track while it is being recorded
- one click on the R button in the appropriate strip
- one click on the appropriate and unequivocal red button up in the transport bar
And finally one click on the Play button: I am recording! If not, there must be some misconfiguration: make sure the transport master/slave parameters and those related to Input and Output are set up correctly. Once hydrogen has finished playing the drum part, I stop the recording. Deselecting the R button, I can listen to the drum take.
Applying an example LADSPA plugin
Let's try to add a light reverb on the drums. From the routing preferences dialog under the Window menu, or by right-clicking in the appropriate dark area of the mixer strip, I can assign plugins, inserts and sends. Let's try FreeVerb, then listen to the track again: better. I can change the plugin's parameters at any time.
“Mixdown” – the correct way to export a session to an audio file
Adding the other four tracks works the same way, with the difference that to record bass, guitar and voice we do not need external programs; it is enough to repeat the procedure described above for every other track, choosing the Input and Output accordingly each time.
At this point I have five tracks of recorded material, with volumes and plugins configured to my taste. Now I want to “fix” the session into a file that can later be compressed or burned to a CD. Remember that particular track called “MIX”? I am going to mix all five tracks down to this one, so the Input/Output of the tracks is set accordingly, to record everything on MIX. It is normal to make ugly mixes on the first tries.
As soon as I am happy with how my final MIX sounds, or as soon as I am fed up with re-recording the same tracks, I am ready to export the session. One click on the appropriate entry in the session menu and I can give the file a name. The default settings are usually fine. It is important at this point to select only the left and right channels of the MIX track. A short wait and our file is ready.
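The exported session is a plain WAV file, so the usual encoders can compress it for sharing; a sketch, with a hypothetical file name:

```shell
# Compress the exported mix to Ogg Vorbis at a good quality level...
oggenc -q 6 mixdown.wav -o mixdown.ogg

# ...or to mp3 with lame, using a high-quality VBR setting.
lame -V 2 mixdown.wav mixdown.mp3
```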
The next step is mastering what we have recorded. To do this we need an application designed exactly for this purpose, called Jamin.
In the image here, taken directly from Jamin's homepage, you can see how Jamin integrates into this whole audio workspace, directly receiving ardour's output and using it as its input, ready for mastering.
My short introduction ends here, but not before naming at least some of the most promising applications in the world of Linux audio:
– One of the most interesting programs for editing and multitrack recording; it exists for many OSes and is quite reliable – homepage
– An “audio software package designed for multitrack processing”, as its author defines it. It does virtually everything; its only issue is the difficulty of discovering all of its functions. Various graphical interfaces are available, like those found on ecasound's homepage
- Jack Rack
– From the name one might guess that this is software for guitarists, but not only. Jack Rack is exactly that: an effects rack that lets the user chain LADSPA plugins and apply them to any Jack-managed INPUT. – Homepage
– I don't personally use this app, but I think it's worth mentioning here. – homepage
Tutorials – How-to
The Unix/Linux philosophy is gaining credibility, and professional audio seems to be one of the unexpected resources that could bring Linux to explore fields different from the traditional server, although its great flexibility tends to scare many users. Here are the pages that have been useful to me in installing, configuring and learning to use the several pieces of this large puzzle:
- The links page on ecasound’s homepage
- The list of software that uses Jack audio server
- Documentation, from the basics to the philosophical quarrels on low latency
- Step-by-step guides to learn to use ardour, audacity, jack and others. A must.
Note: Copyright on the story’s big front page icons by Jimmac.
If you would like to see your thoughts or experiences with technology published, please consider writing an article for OSNews.