Have you ever wondered why installing software on other operating systems such as Windows, macOS or even BeOS is so easy compared to Linux? On those OSes you can simply download and decompress a file, or run an installer that walks you through the process.
This doesn’t happen in Linux, where there are only two standard ways to install software: compiling from source and installing packages. Both methods can be inconsistent and complicated for new users, but I am not going to write about them, as that has been done in countless previous articles. Instead I am going to focus on why it is difficult for developers to provide a simpler way.
So, why can’t we install and distribute programs in Linux with the same ease as we do in other operating systems? The answer lies in the Unix filesystem layout, which Linux distros follow strictly for the sake of compatibility. This layout was always aimed at multi-user environments, and at saving and distributing resources evenly across the system (or even sharing them across a LAN). But with today’s technology and the arrival of desktop computers, many of these ideas no longer make much sense in that context.
There are four fundamental aspects that, I think, make distributing binaries on Linux so hard. I am not a native English speaker, so I apologize for possible mistakes.
1-Distribution by physical place
2-“Global installs”, or “Dependency Hell vs Dll hell”
3-Current DIR is not in PATH.
4-No file metadata.
1-Distribution by physical place
Often, Unix directories contain the following subdirectories:
lib/ – containing shared libraries
bin/ – containing binary/scripted executables
sbin/ – containing executables meant only for the superuser
If you search around the filesystem, you will find several places where this pattern repeats, for example:

/
/usr
/usr/local
/usr/X11R6
You might wonder why files are distributed like this. It is mainly for historical reasons: “/” used to live on a startup disk or ROM; “/usr” was a mount point for the global extras, originally loaded from tape, a shared disk or even the network; /usr/local was for locally installed software. I don’t know about X11R6, but it probably has its own directory because it’s too big.
It should be noted that until very recently, Unixes were deployed for very specific tasks, and were never meant to be loaded with as many programs as a desktop computer is. This is why we don’t see directories organized by usage, as we do in other Unix-like OSes (mainly BeOS and OSX); instead we see them organized by physical place (something desktop computers no longer care about, since nearly all of them are self-contained).
Many years ago, big Unix vendors such as SGI and Sun decided to address this problem by creating the /opt directory. It was supposed to contain the actual programs with their data, while shared data (such as libs or binaries) was exported to the root filesystem (in /usr) through symlinks.
This also made removing a program easier, since you simply had to remove the program’s directory and then run a script to remove the invalid symlinks. This approach never became popular enough in Linux distributions, and it still doesn’t address the problem of bundled libraries.
Because of this, all installs need to be global, which takes us to the next issue.
2-“Global installs”, or “Dependency Hell vs Dll hell”
Because of the previous issue, all popular distribution methods (both binary packages and source) force users to install software globally on the system, available to all accounts. With this approach, all binaries go to common places (/usr/bin, /usr/lib, etc.). At first this may look reasonable, and it does have advantages, such as maximizing the use of shared libraries and keeping the organization simple. But then we run into its limits: all programs are forced to use the same exact set of libraries.
This also makes it impossible for developers to simply bundle the needed libraries with a binary release, so we are forced to ask users to install the missing libraries themselves. This is called dependency hell, and it happens when a user downloads a program (either source, a package or a shared binary) and is told that more libraries are needed for it to run.
Although the shared library system in Linux is even more complete than the Windows one (with multiple library versions supported, pre-caching on load, and binaries left unlocked while running), the OS filesystem layout does not let us distribute a binary together with the bundled libraries we developed it against, which the user probably won’t have.
A dirty trick is to bundle the libraries inside the executable itself (this is called “static linking”), but this approach has several drawbacks, such as increased memory usage per program instance, more complex error tracing, and even license limitations in many cases, so it is usually discouraged.
To conclude this item, it has to be said that it is hard for developers to ship binary bundles with specific versions of a library. Remember that not all libraries need to be bundled, only the rare ones that a user is not expected to have. Most widely used libraries, such as libc, libz or even GTK or Qt, can remain system-wide.
Many would point out that this approach leads to the so-called DLL hell, so common on Windows. But DLL hell actually happened because programs that bundled core system-wide Windows libraries overwrote the installed ones with older versions. This happened partly because Windows not only doesn’t support multiple versions of a library the way Unix does, but also because at boot time the kernel could only load libraries with 8.3 filenames (you can’t really have one called libgtk-1.2.so.0.9.1). As a side note, and because of that, since Windows 2000 Microsoft keeps a directory with copies of the newest available versions of the libraries, in case any program overwrites them. In short, DLL hell can simply be attributed to the lack of a proper library versioning system.
3-Current DIR is not in PATH
This is quite simple, but it has to be said. By default on Unixes, the current directory is not recognized as a library or binary path. Because of this, you can’t just unzip a program and run the binary inside. Most shared binaries that are distributed resort to a dirty trick: they ship a small launcher shell script alongside the real binary.
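A sketch of that trick follows: a bundle directory holding the real binary, its private libraries under lib/, and a launcher script that points the dynamic linker at them. All the names here (“myapp”, the bundle layout) are hypothetical, and the “binary” is a small script standing in for a real ELF executable.

```shell
#!/bin/sh
# Sketch of the "runme" launcher trick used by shared binary bundles.
set -e
BUNDLE=$(mktemp -d)
mkdir -p "$BUNDLE/lib"

# Stand-in for the real ELF binary; it just reports where libs are searched.
cat > "$BUNDLE/myapp.bin" <<'EOF'
#!/bin/sh
echo "libs searched in: $LD_LIBRARY_PATH"
EOF
chmod +x "$BUNDLE/myapp.bin"

# The launcher the user actually runs after unzipping the bundle.
cat > "$BUNDLE/myapp" <<'EOF'
#!/bin/sh
HERE=$(cd "$(dirname "$0")" && pwd)
# Prepend the bundled libs so they win over (or supply) system ones.
export LD_LIBRARY_PATH="$HERE/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
exec "$HERE/myapp.bin" "$@"
EOF
chmod +x "$BUNDLE/myapp"

"$BUNDLE/myapp"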
This could simply be solved by adding “.” to the library and binary paths, but no distro does it, because it’s not standard in Unixes. Of course, from inside a program it is perfectly normal to access data through relative paths, so you can still have subdirectories with data.
4-No file metadata
Ever wondered why Windows binaries have their own icons while Linux binaries all look the same? This is because there is no standard way to define metadata on files, which means we can’t bundle a small pixmap inside the file. Because of this, we can’t easily hint the user about the proper binary, or even file, to run. I can’t say this is an ELF limitation, since that format lets you add your own sections to a binary; it’s more the lack of a standard defining how to do it.
In short, I think Linux needs to be less standard and more tolerant in the previous aspects if it aims to achieve the same level of user-friendliness as the ruling desktop operating systems. Otherwise, not only users but also developers will grow frustrated with it.
For the most important issue, which is libraries, I’d like to propose the following for desktop distros, as a spinoff that remains compatible with Unix.
Desktop distros should add “./” to the PATH and library path by default. This would make it easier to bundle certain “not so common”, or simply modified, libraries with a program, and save us the task of writing scripts called “runme”. This way we could get closer to simple “in a directory” installs. I know alternatives exist, but this approach has proven simple, and it works.
Linux’s library versioning system is already great, so why should installing binaries of a library be complicated? A “library installer”’s job would be to take some libraries, copy them to the library directory, and then update the lib symlink to point at the newest one.
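The whole job can be sketched in a few lines, relying on nothing but the existing versioning convention. Here a temp directory stands in for /usr/lib, text files stand in for the actual .so files, and “libfoo” is a hypothetical library.

```shell
#!/bin/sh
# Sketch of what a hypothetical "library installer" would do under the
# existing Unix soname convention.
set -e
LIBDIR=$(mktemp -d)   # stands in for /usr/lib

# An older version of the library is already installed.
printf 'code of version 1.2.0' > "$LIBDIR/libfoo.so.1.2.0"
ln -sf libfoo.so.1.2.0 "$LIBDIR/libfoo.so.1"

# Installing an update: copy the new version in, then retarget the
# soname symlink at it. Both versions remain on disk, so nothing that
# still references the old file breaks.
printf 'code of version 1.2.3' > "$LIBDIR/libfoo.so.1.2.3"
ln -sf libfoo.so.1.2.3 "$LIBDIR/libfoo.so.1"
# (a real installer would also run ldconfig here to refresh the cache)

cat "$LIBDIR/libfoo.so.1"; echo
```

Programs linked against libfoo.so.1 transparently pick up the new version, while the old file stays available for anything pinned to it. That is all the versioning system asks of an installer.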
Agree on a standard way of adding file metadata to ELF binaries. This way, distributed binaries could be more descriptive to the user. I know I am leaving script-based programs out, but those could even add something à la “magic string”.
And the most important thing: understand that these changes are meant to make Linux not only more user-friendly, but also more popular. There are still a lot of Linux users and developers who think the OS is only meant to be a server, many who consider aiming at the desktop too dreamy or too “Microsoft”, and many who think Linux should remain “true to Unix”. Because of this, the focus should be on letting these ideas coexist, so everyone gets what they want.
About the Author:
For some background: I have been programming Linux applications for many years now, and my specialty is Linux audio. I receive emails from troubled users every few days (and many more around each release) with problems related to missing libraries, or to distro-specific or even compiler-specific issues. This kind of thing constantly makes me wonder about easier ways to distribute software to users.