The openMosix software packages turn networked computers running GNU/Linux into a cluster. openMosix automatically balances the load between the different nodes of the cluster, and nodes can join or leave the running cluster without disruption. The workload is spread between nodes according to their connection and CPU speeds.
As the systems administrator for a scientific computing group at our local university, I have quite a bit of hands-on experience with HPC clusters. We’ve deployed two successfully within my group, both running Linux. When I deployed our first cluster, I tried Mosix, only to discover that they had not yet implemented a “relocatable socket” which would allow programs utilizing sockets to migrate.
This was a moot point really, as our modelling program is an MPI application and doesn’t need Mosix to take advantage of a cluster.
Two years later (in 2003) I was asked to help deploy our modelling program on another cluster several states away. The cluster’s administrator had configured it as a Mosix cluster and he was quite excited about trying it out. After reading through the Mosix documentation and trying it out for myself, I discovered that the Mosix developers had implemented their relocatable socket, which meant that Mosix should be able to migrate MPI applications.
Here’s what the author had to say about MPI+Mosix:
MPI applications benefit from running in an openMosix environment. Although a process starts on node 1, the cluster determines whether it would be better to run a certain process on another, less loaded node. openMosix uses an advanced algorithm based on market economics to determine which node best suits the application. This way, even already parallelized applications will gain from openMosix.
Well, I don’t know what MPI implementation the author was using. Perhaps one has been specially written for Mosix. However, I attempted to use MPICH (http://www-unix.mcs.anl.gov/mpi/mpich/) to no avail. MPICH is intimately tied to rsh for network use, so I had to write my own script which took rsh-like arguments to spawn processes locally, and configured MPICH to use that. I disabled all of the shared memory message passing code through ./configure (and MPICH doesn’t support threads, so that isn’t an issue). Regardless, when I started the model, the processes simply would not migrate.
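The wrapper described above can be sketched roughly as follows. It is written here as a shell function for readability (the real thing was a standalone script pointed at via MPICH's `-rsh` configure option), and the exact options MPICH's process manager passes are assumptions:

```shell
# Hypothetical rsh-lookalike: accept rsh-style arguments, ignore the
# target hostname, and spawn the command locally, leaving any
# distribution of work to openMosix's migration.
local_rsh() {
  # Swallow rsh-style options that make no sense for a local spawn.
  while [ $# -gt 0 ]; do
    case "$1" in
      -l) shift 2 ;;   # "-l user": remote username, irrelevant locally
      -n) shift ;;     # "-n": rsh's stdin flag, ignored here
      *)  break ;;
    esac
  done
  shift                # drop the target hostname MPICH asked for
  "$@"                 # run the command locally instead
}
```

For example, `local_rsh node3 some_mpi_rank ...` simply runs `some_mpi_rank ...` on the local machine, regardless of the hostname argument.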
Ultimately, Mosix doesn’t bring much to the table for MPI applications besides some management benefits which can also be had through (commercial) cluster management software. Having all processes appear in the master node’s process table means that “killall -9 <myjob>” will clean up any stale processes from a cluster run, something MPICH seems to be problematic about. It also means that when you deploy a cluster you don’t need to configure rsh/ssh or NIS(+).
In the end, the admin of the second cluster just configured all cluster jobs to run as a single user, using ssh with RSA keys. My clusters both use rsh and NIS (as they are firewalled from the external network).
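For the record, the ssh + RSA key setup amounts to the usual passwordless-login arrangement. A sketch, with placeholder node names:

```shell
# Generate one RSA key pair for the shared cluster user (no passphrase)
# and install the public half on every node's authorized_keys.
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
for node in node01 node02 node03; do   # placeholder node names
  ssh-copy-id -i ~/.ssh/id_rsa.pub "$node"
done
```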
The problem with Mosix is that most MPI implementations designed to operate inside a single system image use shared memory for message passing, while MPI implementations designed for use on a networked cluster use rsh/sockets. What would really be necessary for an MPI implementation which works in a Mosix environment is a custom tailored application designed to spawn migratable processes on the local system but use sockets rather than shared memory for IPC.
As for other advantages of Mosix people like to dream about, such as compiling large trees with make -j and having the compiler jobs automatically exhibit distcc-like behavior, you can forget about it. The compiler processes will rarely migrate, as they don’t run long enough for the Mosix algorithms to decide to migrate them.
Mosix is a neat toy (and MFS is wonderful if the slave nodes of your cluster aren’t diskless) but in reality in most of the cases where you’d expect process migration to be beneficial Mosix will not, for whatever reason, migrate processes. It’s something of a disappointment.
I have to agree with Bascule. My final-year project last year involved comparing a Mosix cluster to an SGI Origin 3800.
I will admit Mosix is a neat addition to Linux, but regarding MPI there’s no comparison when you’re up against a single-system-image machine like an Origin 3800.
I agree with the first summation as well.
After testing openMosix in a real-world situation with varying servers and many idle servers, the only result I could see was that processes that were largely idle got migrated, while any CPU-taxing processes stayed on their original servers.
Not to mention the 2.4.22-2 bug that crashes boxes whenever you use the MFS filesystem at any level. The 2.4.21 kernel patch was better, except that it randomly killed important boxes dead for unknown reasons.
All in all, clustering via Linux Virtual Server or buying beefier boxes is currently the only sure-fire way to counter increasing load reliably.
If you are using Linux Virtual Server, then Mosix is most probably not the solution you are looking for.
The only real disadvantage of MPI/PVM is that your programs need to be written against those libraries, as opposed to Mosix, which can even migrate threads as it likes.
However, this disadvantage doesn’t count for much when you consider that most clusters cost a lot, require plenty of administration work, and really need to make full use of their resources. With that in mind, a little extra work implementing MPI is not really a problem anymore. MPI is easy, and every cluster user should know how to program with it, so I don’t see any real advantages to Mosix anymore.
Its disadvantages are quite obvious (speed).
Another disadvantage is that it’s a rather complex kernel patch, often conflicting with other kernel patches, and it needs a lot of setup work (imho more than MPI), while MPI runs entirely in userspace.
Last but not least, the MPI standard is applicable to nearly every scenario of distributed nodes that can talk to each other, while openMosix is bound to specific versions of Linux and an IP-based environment. Of course, Mosix could be ported to InfiniBand, but I doubt it could make use of InfiniBand’s real advantages (see infiniband.sf.net) the way this InfiniBand patch for MPICH does.
Last supported kernel 2.4.22, as opposed to the newest Linux kernel being 2.4.25, and no sign of 2.6 support + the above comments = openMosix, an interesting novelty?
Well, my experience is exactly the opposite. We run large-scale simulations, and it’s quite easy to split a simulation up N times, with N seeds for the RNG, running on N CPUs. MPI is not necessary. Basically, we find that the computation time scales down linearly with additional CPUs in the cluster, with practically no overhead.
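The seed-splitting scheme described above can be sketched as follows. This is a hedged example: `run_sim` is a stand-in for the real simulation binary, and the point is simply that N independent forked processes are exactly what openMosix migrates well; no MPI calls anywhere.

```shell
# Launch N independent copies of the simulation, one RNG seed each,
# as plain background processes; on an openMosix cluster the kernel
# migrates them to idle nodes.
N=4
run_sim() {
  # Stand-in for the real simulation: takes a seed, writes a result file.
  echo "result for seed $1" > "run.$1.out"
}
seed=1
while [ "$seed" -le "$N" ]; do
  run_sim "$seed" &        # one independent process per seed
  seed=$((seed + 1))
done
wait                       # gather all N runs
```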
At least for this application, we find openMosix to be cost effective, stable, and simple to use. Best of all, no code modifications are needed. It all depends on the application, I guess.
openMosix is no novelty. It’s derived from Mosix, which first showed up in 1977. Porting openMosix to other architectures or newer kernels is a _lot_ of work, as openMosix’s process-migration technology needs to modify important, complex, and frequently changing parts of the kernel (the scheduler, mm, arch-dependent mm code, etc.).
For the same reasons that MPI is preferred on “real” clusters, openMosix also suffers from a lack of programmers. Because it deeply changes the way the kernel works (just think of the redirection of syscalls like ioctl, or catching and handling page-fault exceptions because the needed page is on a different machine), the project needs experienced kernel hackers, while the available MPI implementations (LAM, MPICH) are simpler to maintain.
Mosix is a highly interesting concept and I encourage everybody to read http://openmosix.sourceforge.net/linux-kongress_2003_openMosix.pdf, a paper written by Moshe Bar, the maintainer of openmosix. However, it’s not of much use for most massively parallel applications because MPI-based programs can be optimized for cluster usage in many ways, while the only method to scale your program with mosix is fork().
Isn’t DragonFly going to do stuff like this natively (as in isn’t something like this part of their overall goals for the OS)?
“As for other advantages of Mosix people like to dream about, such as compiling large trees with make -j and have the compiler jobs automatically exhibit distcc-like behavior, you can forget about it. The compiler processes will rarely migrate as they’re not running long enough for the Mosix algorithms to decide to migrate them.”
Well, I guess it all depends on how long you’re talking about. openMosix works pretty well for compiling, but it takes a while for jobs to migrate. If the compile is not long enough for migration to kick in, then it doesn’t really matter, does it? If it can be compiled locally rather quickly, then what’s the point of wasting resources anyway?
What Bascule was talking about is that Mosix isn’t useful when you have a lot of small jobs to process, and compilation is imho a good example. It may take you 10 hours to compile OpenOffice, but you won’t be much faster (you may even be slower) using Mosix. Problems like these require a software solution that specializes in that specific problem, e.g. an MPI program or distcc, which Mosix by definition is not.
Just one example out of many in which mosix doesn’t scale well.
I am an openmosix fan.
If you are in a high-performance computing department or the like, you know well enough whether your applications are suitable for Mosix, and you really ought not be here complaining otherwise.
But if you are in a general office environment, get everyone to run openMosix! I’ve got everyone booting an openMosix CD instead of turning their Windows boxes off overnight, and hey presto: scheduled builds and tests run so much faster!
The things written about MPI in the openMosix documentation refer to the normal usage of MPI, not trying to start all MPI processes on the same node and then relying on openMosix to migrate them. MPI can be used on an openMosix cluster, and the benefit is that other jobs will migrate out of the MPI job’s way to idle CPUs, if there are any. It is true that otherwise oM does not make running MPI any easier or better.
We have used LAM MPI on an oM cluster quite successfully. LAM also uses rsh to distribute jobs, and this has worked without making modifications to rsh or writing wrappers. The default oM configuration does not allow any processes to migrate. The init scripts of the system must be changed in order to allow rlogind and sshd processes to migrate. Then all login shells opened by these servers will inherit the migratability attribute. This way, MPI can be used in the normal way with remote processes being started via rsh and the dynamic load balancing of oM will apply to the MPI processes.
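A minimal sketch of that init-script change, assuming openMosix’s per-process /proc/&lt;pid&gt;/lock interface (1 = pinned to the home node, 0 = free to migrate; the exact path and semantics are from memory and should be checked against your oM version):

```shell
#!/bin/sh
# Hypothetical fragment for /etc/init.d/sshd on an openMosix node.
# Unlocking the shell before starting the daemon means sshd -- and
# every login shell it spawns -- inherits the "may migrate" attribute.
echo 0 > /proc/$$/lock   # assumed oM interface: 0 = migratable
exec /usr/sbin/sshd      # start the daemon with that attribute set
```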
We are a computational and structural biochemistry laboratory with a mixture of experienced and newbie unix users and a diverse mixture of serial and parallel computational jobs to be run on the cluster. OpenMosix has been very good in that it does not require the users to learn a queuing system. It works enough like a single image system that the novice users can simply log in, run their processes and have the system make the best use of the hardware. MPI users mostly will learn to control the migratability of their processes and in my experience will let the system migrate their parallel jobs. We do not use MPICH and have experienced none of the problems Bascule describes.
MFS is buggy, unstable and incompatible with GNU fileutils (now coreutils) to boot. I have used it mostly to distribute configuration files to the cluster nodes, but I find it unsuitable for handling large amounts of data.
It is true that openMosix does not work for parallel compilation using make -j. There have been some patches at some point to address this, but now that distcc is available, there is not much point trying to make oM do distcc’s job. Distcc works extremely well and it works on oM clusters, too.
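For reference, a typical distcc invocation on the same nodes looks something like this (the node names are placeholders, and it assumes the build honors $CC):

```shell
# Farm compile jobs out explicitly with distcc instead of hoping the
# oM load balancer migrates short-lived compiler processes.
export DISTCC_HOSTS='localhost node1 node2'   # placeholder node names
make -j6 CC='distcc gcc'                      # roughly 2 jobs per host
```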
My problem with oM is the string of vulnerabilities that have been found in the Linux kernel. The openMosix project has not made a release since 2.4.22, which has known vulnerabilities. A largish number of students have user accounts on our cluster, so I’d rather have more frequent security updates. I do not have the time to hunt down and test patches from mailing lists and obscure web sites.
Would this let one run applications like GIMP and CinePaint on an old laptop if there were more powerful computers in the same openMosix cluster?
“Would this let one run applications like GIMP and CinePaint on an old laptop if there were more powerful computers in the same openMosix cluster?”
I’ve used openMosix for just that sort of thing with good results. The apps start up locally but quickly migrate to the more powerful boxes (or, in the case of several underpowered boxes, each box just gets a share of the apps), so the slow machine I’m at does not get overloaded when I start a lot of applications. For that sort of thing openMosix is good, in my opinion.
SGI Altix is a Linux server.
“Beyond this, Altix class systems uniquely provide up to 8 terabytes of total addressable memory. ”
http://www.linuxworld.com/story/43722.htm?DE=1 – “Dissolving the Limits of Linux”
This is a free operating system which gives you the source, the ability to modify it, the freedom to distribute it, and more.
But Linux doesn’t have all the administration tools that the older Unix operating systems have. Look at Sun Solaris and others and compare them for your needs.
Hi guys … here is my work about … in Spanish.
I gave a talk about openMosix at the “Congreso Nacional de Software Libre 2004 – Mexico”.
Slides at … http://cipactli.iingen.unam.mx/OpenMosix
Regards.
JoseCC … http://www.josecc.net
But sadly it does not have thread migration or shared memory, which is needed to distribute a single app across the cluster.
All it is good for now is running many copies of the same job simultaneously, or freeing up resources on a single box so you can run a large app.