Linked by Thom Holwerda on Wed 9th Nov 2011 21:26 UTC, submitted by edwin
General Unix Way back in 2002, MIT decided it needed to start teaching a course in operating system engineering. As part of this course, students would write an exokernel on x86, using Sixth Edition Unix (V6) and John Lions' commentary as course material. This, however, posed problems.
Comment by cb88
by cb88 on Wed 9th Nov 2011 22:34 UTC
cb88
Member since:
2009-04-23

I've run it in QEMU before; it's fairly neat. Make sure you get the up-to-date sources: when I first tried it out I got an old tarball with some bugs, which I learned how to fix, only to find they were already fixed upstream.

Reply Score: 1

binary for windows....
by neozeed on Wed 9th Nov 2011 23:34 UTC
neozeed
Member since:
2006-03-03

I slapped together an ELF cross-compiler and set it up so that you can cross-compile the kernel from Windows with a simple 'build' command...

http://vpsland.superglobalmegacorp.com/install/xv6.7z

I've seen patches that add VM and a basic TCP/IP stack... I guess all that is missing is a functional libc, and shared memory, shared libraries...

It is amazing how quickly it compiles! ...with MinGW or Cygwin, dd will work correctly and you can build the whole thing much more easily.

Reply Score: 2

RE: binary for windows....
by bogomipz on Thu 10th Nov 2011 22:35 UTC in reply to "binary for windows.... "
bogomipz Member since:
2005-07-11

I guess all that is missing is a functional libc, and shared memory, shared libraries...

What makes you so sure shared libraries are important or even desirable?

http://9fans.net/archive/2008/11/142

Reply Score: 2

RE[2]: binary for windows....
by Alfman on Fri 11th Nov 2011 04:04 UTC in reply to "RE: binary for windows.... "
Alfman Member since:
2011-01-28

bogomipz,

Regarding shared libraries, I've often wondered this myself. Shared libraries are the cause of all dependency problems, but is there really that much of a net benefit?

I think the answer may have been yes back when RAM was extremely tight. But these days they may be of less value; we really ought to test that hypothesis.

Consider that shared libraries can't be optimized across API call boundaries. It might take 30 bytes of glue to call a libc function, which then shuffles the arguments around again to make a syscall. A static application could theoretically optimize away all of that glue and do the syscall directly, saving space.
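
To make the idea concrete, here is a minimal sketch (assuming Linux on x86-64) of the two routes being contrasted; syscall() is itself a tiny libc shim, but it shows how little is actually required once the wrapper is bypassed:

#define _DEFAULT_SOURCE
#include <unistd.h>
#include <sys/syscall.h>

int main(void)
{
    const char msg[] = "hello\n";

    /* Normal route: the libc wrapper marshals arguments, handles errno,
     * and then traps into the kernel. */
    write(1, msg, sizeof msg - 1);

    /* "Glue-free" route: ask the kernel directly with the same arguments. */
    syscall(SYS_write, 1, msg, sizeof msg - 1);

    return 0;
}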

Obviously we have to look at bigger libraries too, like libjpeg, but even there I wonder how much space would really be wasted if it were statically compiled.

This isn't to preclude the use of shared libraries for applications which are genuinely related and deployed together. But I do see an awful lot of application-specific libraries under /lib which have to be managed in lockstep with their associated application; why should these be shared libraries at all?

Reply Score: 3

RE[3]: binary for windows....
by Vanders on Fri 11th Nov 2011 12:48 UTC in reply to "RE[2]: binary for windows.... "
Vanders Member since:
2005-07-06

Shared libraries are the cause of all dependency problems


Library versioning and symbol versioning are a solved problem. It's only when developers do not follow the standards that they introduce problems. In a properly managed library, dependency issues are non-existent. Glibc is the obvious example here: since Glibc 2 (libc.so.6) was introduced on Linux, the library has remained both forwards and backwards compatible.
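
For readers unfamiliar with how that works in practice, here is a small sketch of GNU symbol versioning as done with gcc/binutils; the library and function names are invented, and a linker version script defining the FOO_1.0 and FOO_2.0 nodes is assumed when the shared object is built:

/* Old ABI: kept around only for binaries linked against FOO_1.0. */
int frobnicate_v1(int x)
{
    return x + 1;
}
__asm__(".symver frobnicate_v1, frobnicate@FOO_1.0");

/* New ABI: the default version that newly linked programs pick up. */
int frobnicate_v2(int x, int flags)
{
    return x + flags;
}
__asm__(".symver frobnicate_v2, frobnicate@@FOO_2.0");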

Consider, that shared libraries can't be optimized over API calls. It might take 30 bytes to call a libc function, which shifts the bytes around again to do a syscall. In a static application, it could theoretically optimize away all the glue code to do a syscall directly while saving space.


The glue code is either needed, or it is not needed. If it's needed, then you can't simply "optimize it away". If it's not needed, then you can just remove it from the syscall shim.

Remember that in many cases the syscall shim will also attempt to avoid a syscall. It's much, much better to do the sanity checking and return early if the arguments are bad before making the syscall. It may require a few more instructions before the syscall happens, but that's still far less expensive than making the syscall only for it to return immediately because the arguments are wrong.

In some cases it's also entirely possible for the syscall shim to satisfy the caller entirely from user space, i.e. it doesn't even need to call into the kernel.
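
A rough illustration of that kind of shim (not the actual glibc code, and assuming a Linux target):

#define _DEFAULT_SOURCE
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>

ssize_t my_write(int fd, const void *buf, size_t count)
{
    /* Cheap sanity checks: fail or succeed without a kernel round trip. */
    if (fd < 0) {
        errno = EBADF;
        return -1;
    }
    if (count == 0)
        return 0;    /* satisfied entirely in user space */

    return syscall(SYS_write, fd, buf, count);
}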

Obviously we have to look at bigger libraries too, like libjpg, but even there I wonder how much space would be wasted if it were statically compiled.

This isn't to preclude to the use of shared libraries for applications which are genuinely related and deployed together. But I do see an awful lot of application specific libraries under /lib which have to be managed in lockstep with their associated application, why should these be shared libraries at all?


Now this is where I do a 180 and agree with you! Shared libraries are overused and in a large percentage of cases are used inappropriately. As a simple rule of thumb I'd prefer that system libraries are shared, and any third party libraries required by an application should be static. That's not fool proof, but it's a starting point.

Reply Score: 3

RE[4]: binary for windows....
by Alfman on Sat 12th Nov 2011 00:08 UTC in reply to "RE[3]: binary for windows.... "
Alfman Member since:
2011-01-28

Vanders,

"The glue code is either needed, or it is not needed. If it's needed, then you can't simply "optimize it away". If it's not needed, then you can just remove it from the syscall shim."


I agree with most of your post, but the problem with shared libraries is that often they just add a layer of indirection to the syscall without adding much value. If you use a shared library to perform a function, then you cannot optimize away the glue code used to call the shared library.

On the other hand if we're willing to internalize the code into a static binary, the glue code becomes unnecessary (I'm not sure that GCC/LD will do this kind of optimization, but the potential is certainly there).

Reply Score: 2

RE[5]: binary for windows....
by Vanders on Sat 12th Nov 2011 01:27 UTC in reply to "RE[4]: binary for windows.... "
Vanders Member since:
2005-07-06

If you use a shared library to perform a function, then you cannot optimize away the glue code used to call the shared library.


That's not how shared libraries work. There is no more code required (at run time) to call a function in a shared library than there is in calling a function within the executable.

Reply Score: 2

RE[6]: binary for windows....
by Alfman on Sat 12th Nov 2011 04:59 UTC in reply to "RE[5]: binary for windows.... "
Alfman Member since:
2011-01-28

Vanders,

"That's not how shared libraries work. There is no more code required (at run time) to call a function in a shared library than there is in calling a function within the executable."

You're kind of missing my point, though. I know that shared library functions are mapped into the same address space as static functions and can be called the same way. But the fact that a function belongs to a shared library implies that it must abide by a well-defined calling convention, and subsequently translate its internal variables to and from this interface. There are optimizations that can take place in a static binary that cannot take place with a shared library.

For example, we obviously cannot do inter-procedural analysis and optimization against a shared library function (since the shared library function is undefined at compile time). Theoretically, using static binaries, an optimizing compiler could analyze the call paths and eliminate all the glue code. Trivial functions could be inlined. Calling conventions could be ignored since there is no need to remain compatible with external dependencies.
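
As a toy example of the gap being described (assuming something like gcc's -flto for the static case; the names are made up):

/* If clamp() comes from a static library and the program is built with
 * link-time optimization (e.g. gcc -flto), the call below can be inlined
 * and the calling convention disappears entirely.  If clamp() comes from a
 * shared library, the compiler must emit a real ABI-conforming call and can
 * assume nothing about the callee. */

/* imagine this living in libclamp.a */
int clamp(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* application code */
int brightness(int raw)
{
    return clamp(raw, 0, 255);   /* candidate for inlining when static */
}

int main(void)
{
    return brightness(300) == 255 ? 0 : 1;
}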

In the ideal world, object files would be an intermediate representation like Java class files or .NET assemblies. Not only would run-time compilation optimize for the current platform, but it could also perform inter-procedural optimization that might eliminate all costs currently associated with glue code.

Reply Score: 2

RE[7]: binary for windows....
by Vanders on Sat 12th Nov 2011 14:09 UTC in reply to "RE[6]: binary for windows.... "
Vanders Member since:
2005-07-06

Theoretically, using static binaries, an optimizing compiler could analyze the call paths and eliminate all the glue code.


I can't help but feel that the time and effort needed to do that well would be significant, yet only save a tiny fraction of the load and link time for a binary that uses classic shared libraries.

Reply Score: 2

RE[2]: binary for windows....
by christian on Fri 11th Nov 2011 10:07 UTC in reply to "RE: binary for windows.... "
christian Member since:
2005-07-06

"I guess all that is missing is a functional libc, and shared memory, shared libraries...

What makes you so sure shared libraries are important or even desirable?

http://9fans.net/archive/2008/11/142
"

All well and good until you find a critical bug in that DNS client library that every network capable program you have installed uses, and you now have to recompile (or relink at least) every single one of them.

Shared libraries may or may not give memory usage benefits, and may cause DLL hell in some cases, but from a modularity and management point of view they're a godsend.

Performance poor? That's an implementation detail of using lookup tables. There's nothing stopping the system from implementing full run-time direct linking, at the expense of memory and start-up performance.

Reply Score: 2

RE[3]: binary for windows....
by bogomipz on Fri 11th Nov 2011 10:51 UTC in reply to "RE[2]: binary for windows.... "
bogomipz Member since:
2005-07-11

The argument that updating a library will fix the bug in all programs that dynamically link said library goes both ways; breaking the library also breaks all programs at the same time.

And if security is a high priority, you should be aware that dynamic linking has some potential risks of its own. LD_LIBRARY_PATH is a rather dangerous thing, especially when combined with a suid root binary.

Reply Score: 2

RE[4]: binary for windows....
by jabjoe on Fri 11th Nov 2011 11:38 UTC in reply to "RE[3]: binary for windows.... "
jabjoe Member since:
2009-05-06

I'll take being able to easily fix everything, along with being able to easily break everything, over not being able to fix anything.

The LD_LIBRARY_PATH suid root binary security hole is one you can avoid if you know about it. It's not something that means you should throw the whole system out.

Update: Looks like it's protected against anyway.
http://en.wikipedia.org/wiki/Setuid

"The invoking user will be prohibited by the system from altering the new process in any way, such as by using ptrace, LD_LIBRARY_PATH or sending signals to it"

Edited 2011-11-11 11:43 UTC

Reply Score: 2

RE[3]: binary for windows....
by Alfman on Fri 11th Nov 2011 23:42 UTC in reply to "RE[2]: binary for windows.... "
Alfman Member since:
2011-01-28

christian,

"All well and good until you find a critical bug in that DNS client library that every network capable program you have installed uses, and you now have to recompile (or relink at least) every single one of them."

Your point is well received.

However this person has a slightly different suggestion:

http://www.geek-central.gen.nz/peeves/shared_libs_harmful.html

He thinks applications shouldn't use shared libraries for anything which isn't part of the OS. This would largely mitigate DLL hell for unmanaged programs.

I realize this answer is gray and therefore unsatisfactory.


A better solution would be to have a standardized RPC mechanism to provide functionality for things like DNS. The glue code would be small and could always be linked statically. This RPC would be kernel/user space agnostic, and could be repaired while remaining compatible. I think a shift from shared libraries to more explicit RPC interfaces would be beneficial, but it'd basically need a new OS designed to use it from the ground up - now that Linux hosts tons of stable code, that's unlikely to happen.
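
Purely as a sketch of what such a statically linked stub might look like (the socket path and wire protocol here are invented for illustration, not an existing interface):

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

#define RESOLVER_SOCKET "/run/resolver.sock"   /* hypothetical service */

/* Resolve a hostname to an address string via the RPC service.
 * Returns 0 on success, -1 on any failure. */
int rpc_resolve(const char *name, char *addr, size_t addrlen)
{
    struct sockaddr_un sa;
    char req[256];
    int fd, len;
    ssize_t n;

    memset(&sa, 0, sizeof sa);
    sa.sun_family = AF_UNIX;
    strncpy(sa.sun_path, RESOLVER_SOCKET, sizeof sa.sun_path - 1);

    /* Request: the hostname, newline terminated. */
    len = snprintf(req, sizeof req, "%s\n", name);
    if (len < 0 || (size_t)len >= sizeof req)
        return -1;

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&sa, sizeof sa) < 0 ||
        write(fd, req, len) != len) {
        close(fd);
        return -1;
    }

    /* Reply: the address as text, newline terminated. */
    n = read(fd, addr, addrlen - 1);
    close(fd);
    if (n <= 0)
        return -1;
    addr[n] = '\0';
    addr[strcspn(addr, "\n")] = '\0';
    return 0;
}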

Reply Score: 2

RE[2]: binary for windows....
by jabjoe on Fri 11th Nov 2011 11:32 UTC in reply to "RE: binary for windows.... "
jabjoe Member since:
2009-05-06

Shared objects aren't really about saving space any more (much of Windows' bloat is having a massive sea of common DLLs that might be needed, in multiple versions, for both x86 and AMD64). They're about abstraction and updates. You get the benefits of shared code from static libs, but taking advantage of new abstractions or updates with static libs requires rebuilding. That's a lot of rebuilding. Check out the dependency graph of some apps you use some time; they are often massive. Keeping those apps up to date would require constant rebuilding, and the update system would then have to work in deltas on binaries or you would be pulling down much, much more with every update. With shared objects you get updated shared code with nothing but the shared object being rebuilt. Easy deltas for free.

Having to rebuild everything would have a massive impact on security. On a closed platform this is even worse, because the vendor of each package has to decide it's worth their while to update. Often it's worse still, because each vendor has their own update system that may or may not be working. Worse, on closed platforms you already end up with things built against many versions of a lib, often needing separate shared object files, which defeats part of the purpose of shared objects (Manifest is crazy with its "exact" version scheme). Static libs would make this worse.

With shared objects you get not only simple updates but abstraction: completely different implementations can be swapped in. Plugins are often a system of exactly that - the same interface to the plugin shared objects, but each one adds new behaviour. Also, put something in a shared object with a standard C interface and many languages can use it.

With an open platform and a single update system, shared objects rock. You can build everything against a single version of each shared object. You update that single version and everything is updated (fixed/secured). You can sensibly manage the dependencies. You remove shared objects if nothing is using them; you only add shared objects something requires. This can be, and is, all automated. It does save space, and I would be surprised if an install where everything was built statically wasn't quite a lot bigger, unless you have some magic compressing filesystem which sees the duplicate code/data and stores only one copy anyway. But space saving isn't the main reason to do it.

Any platform that moves more towards static libs is going in the wrong direction. For Windows, it may well save space to move to static libs for everything, because of the mess of having so many DLLs that aren't actually required. But it would make the reliability and security of the platform worse (though not if it already has an exact-version system; then it's already as bad as it can be).

In short, you can take shared objects only from my cold dead hands!

Reply Score: 3

RE[3]: binary for windows....
by bogomipz on Fri 11th Nov 2011 16:40 UTC in reply to "RE[2]: binary for windows.... "
bogomipz Member since:
2005-07-11

In short, you can take shared objects only from my cold dead hands!

Haha, nice ;)

I agree with those who say the technical problems introduced with shared libraries and their versioning have been solved by now. And I agree that the modularity is nice. Still, the complexity introduced by this is far from trivial.

What if the same benefits could have been achieved without adding dynamic linking? Imagine a package manager that downloads a program along with any libraries it requires, in static form, then runs the linker to produce a runnable binary. When installing an update of the static library, it will run the linker again for all programs depending on the library. This process is similar to what dynamic linking does every time you run the program. Wouldn't this have worked too, and isn't this the natural solution if the challenge was defined as "how to avoid manually rebuilding every program when updating a library"?

Reply Score: 2

RE[4]: binary for windows....
by Vanders on Fri 11th Nov 2011 17:12 UTC in reply to "RE[3]: binary for windows.... "
Vanders Member since:
2005-07-06

What you're describing is basically prelinking (or prebinding). It's worth mentioning that Apple dropped prebinding and replaced it with a simple shared library cache, because the cache offered better performance.

Reply Score: 2

RE[5]: binary for windows....
by bogomipz on Fri 11th Nov 2011 19:18 UTC in reply to "RE[4]: binary for windows.... "
bogomipz Member since:
2005-07-11

What you're describing is basically prelinking (or prebinding).

Prelinking exists to revert the slowdown introduced by dynamic linking. I'm talking about not adding any of this complexity in the first place, and just using xv6 in its current form to achieve the same modularity.

(Well, xv6 apparently relies on cross-compiling and does not have a linker of its own, but I would expect a fully functional version to include a C compiler and linker.)

Reply Score: 2

RE[4]: binary for windows....
by jabjoe on Fri 11th Nov 2011 17:39 UTC in reply to "RE[3]: binary for windows.... "
jabjoe Member since:
2009-05-06

I don't see how your system of doing the linking at update time is really any different than doing it at run time.

Dynamic linking is plenty fast enough, so you don't gain speed. (Actually, dynamic linking could be faster on Windows; it has this painful habit of checking the local working directory before scanning through each folder in the PATH environment variable. On Linux, it just checks /etc/ld.so.cache for what to use. But anyway, dynamic linking isn't really slow, even on Windows.)

You have to compile things differently from normal static linking to keep the libs separate so they can be updated. In effect, the file is just a tar of the executable and the DLLs it needs. A bit like the way resources are tagged on the end now. Plus, you will then need some kind of information recording which libs it was last tar'ed up against, so you know whether to update it or not.

What you really are searching for is application folders. http://en.wikipedia.org/wiki/Application_Directory
That saves joining files up into blobs; there is already a file grouping system: folders.

The system you might want to look at is: http://0install.net/
There was even an article about it on OSNews:
http://www.osnews.com/story/16956/Decentralised-Installation-System...

Nothing really new under the sun.

I grew up on RISC OS with application folders, and I won't go back to them.
Accept dependencies, but manage them to keep them simple. One copy of each file. Fewer files, with clear, searchable (forwards and backwards) dependencies.
Oh, and build dependencies (apt-get build-dep <package>) - I >love< build dependencies.

Debian has a new multi-arch scheme so you can install packages for different platforms alongside each other. The same filesystem will be able to be used on multiple architectures, and cross-compiling becomes a breeze.

Reply Score: 2

RE[5]: binary for windows....
by bogomipz on Fri 11th Nov 2011 19:17 UTC in reply to "RE[4]: binary for windows.... "
bogomipz Member since:
2005-07-11

I don't see how your system of doing the linking at update time is really any different than doing it at run time.

The difference is that the kernel is kept simple. The complexity is handled by a package manager or similar instead. No dynamic linker to exploit or carefully harden.

If you don't see any difference, it means both models should work equally well, so no reason for all the complexity.

You have to compile things differently from normal static linking to keep the libs separate so they can be updated.

What do you mean by this? I'm talking about using normal static libraries, as they existed before dynamic linking, and still exist to this day. Some distros even include static libs together with shared objects in the same package (or together with headers in a -dev package).

In effect, the file is just a tar of the executable and the DLLs it needs. A bit like the way resources are tagged on the end now.

I may have done a poor job of explaining properly. What I meant was that the program is delivered in a package with an object file that is not yet ready to run. This package depends on library packages, just like today, but those packages contain static rather than shared libraries. The install process then links the program.

Plus, you will then need some kind of information recording which libs it was last tar'ed up against, so you know whether to update it or not.

No, just the normal package manager dependency resolution.

What you really are searching for is application folders.

No, to the contrary! App folders use dynamic linking for libraries included with the application. I'm talking about using static libraries even when delivering them separately.

The system you might want to look at is: http://0install.net/

Zero-install is an alternative to package managers. My proposal could be implemented by either.

Reply Score: 2

RE[6]: binary for windows....
by jabjoe on Fri 11th Nov 2011 21:39 UTC in reply to "RE[5]: binary for windows.... "
jabjoe Member since:
2009-05-06

The difference is that the kernel is kept simple. The complexity is handled by a package manager or similar instead. No dynamic linker to exploit or carefully harden.


Not really a kernel problem as the dynamic linker isn't really in the kernel.
http://en.wikipedia.org/wiki/Dynamic_linker#ELF-based_Unix-like_sys...

What do you mean by this?


When something is statically linked, the library is dissolved; whatever is not used, the dead-code stripper should remove. Your system is not like static linking. It's like baking dynamic linking.

This package depends on library packages, just like today, but those packages contain static rather than shared libraries. The install process then links the program.


Then you kind of lose some of the gains. You have to have dependencies sitting around waiting in case they are needed, or you have a repository to pull them down from...

No, just the normal package manager dependency resolution.


That was my point.

No, to the contrary! App folders use dynamic linking for libraries included with the application.


Yes.

I'm talking about using static libraries even when delivering them separately.


As I said before, it's not really static, it's baked dynamic. Also, if you keep dependencies separate, you either have loads kicking about in case they are needed (Windows) or you have package management. If you have package management, all you get out of this is baked dynamic linking, for no gain I can see...

Zero-install is an alternative to package managers.

It's quite different, as it's decentralized and uses these application folders. Application folders are often put forward by some as a solution to dependencies.

Reply Score: 2

RE[7]: binary for windows....
by bogomipz on Sat 12th Nov 2011 16:35 UTC in reply to "RE[6]: binary for windows.... "
bogomipz Member since:
2005-07-11

Not really a kernel problem as the dynamic linker isn't really in the kernel.

Sorry, I should have said that the process of loading the binary is kept simple.

When something is statically linked, the library is dissolved; whatever is not used, the dead-code stripper should remove.

Yes, this is why dynamic linking does not necessarily result in lower memory usage.

Your system is not like static linking. It's like baking dynamic linking.

This is where I do not know what you are talking about.

Creating a static library results in a library archive. When linking a program, the necessary parts are copied from the archive into the final binary. My idea was simple to postpone this last compilation step until install time, so that the version of the static library that the package manager has made available on the system is the one being used.

This way, the modularity advantage of dynamic linking could have been implemented without introducing the load time complexity we have today.

Reply Score: 2

V7 on x86
by Mikaku on Thu 10th Nov 2011 09:29 UTC
Mikaku
Member since:
2007-05-03

This is also interesting:
http://www.nordier.com/v7x86/

Reply Score: 3

Another similar project
by hakossem on Thu 10th Nov 2011 12:32 UTC
hakossem
Member since:
2005-07-15

Thix is a Unix-like OS that implements almost all of the POSIX.1 standard:
http://www.hulubei.net/tudor/thix/
http://thix.eu/

Reply Score: 3

Pfffft.
by nokturnal on Thu 10th Nov 2011 19:04 UTC
nokturnal
Member since:
2009-06-24

No tab completion. What kind of OS is this?

Kidding, of course. It's actually really neat.

Reply Score: 3

RE: Pfffft.
by moondevil on Sat 12th Nov 2011 17:02 UTC in reply to "Pfffft."
moondevil Member since:
2005-07-08

The last time I used HP-UX, AIX and Solaris, you would not get tab completion with the default shells either.

Reply Score: 2

Xv6 versus Minix
by sydbarrett74 on Thu 10th Nov 2011 22:33 UTC
sydbarrett74
Member since:
2007-07-24

I know that Minix is a more complex and feature-rich OS (not least because it's capable of production use), but other than this, can someone tell me what differences it has architecturally compared to Xv6?

Also, I'm wondering why they didn't just use an older, smaller version of Minix since Tanenbaum *did* write it originally as a teaching OS.

Reply Score: 1

RE: Xv6 versus Minix
by christian on Fri 11th Nov 2011 10:24 UTC in reply to "Xv6 versus Minix"
christian Member since:
2005-07-06

I know that Minix is a more complex and feature-rich OS (not least because it's capable of production use), but other than this, can someone tell me what differences it has architecturally compared to Xv6?


I think because of the rich commentary available for V6 UNIX (i.e. the Lions book), and because V6 is very simple.

In fact, I'd go so far as to say that the V6 kernel would probably make a reasonable base for a micro-kernel with a bit of work.

But as it is, V6 is still a monolithic kernel, with all the OS services linked into kernel space, whereas Minix provides only critical services that cannot operate in user space, leaving the rest to user space servers.

I'm torn on the micro versus monolithic kernel debate. Some services are not really restartable without hacks that obviate the benefits of a micro-kernel in the first place (how would you restart the filesystem server if you can't read the filesystem server binary from the filesystem? You'd have to link it with the kernel blob, meaning it couldn't be changed at run time).

Also, I'm wondering why they didn't just use an older, smaller version of Minix since Tanenbaum *did* write it originally as a teaching OS.


The original Minix version was 16-bit and probably not well integrated with development tools like GCC and GDB.

It was a pleasure just typing "make qemu-gdb", attaching a debugger in another window, and stepping through the kernel as it did its work. I guess it'd take a lot of work to get Minix into that state.

Reply Score: 2

RE[2]: Xv6 versus Minix
by sydbarrett74 on Fri 11th Nov 2011 12:52 UTC in reply to "RE: Xv6 versus Minix"
sydbarrett74 Member since:
2007-07-24

Christian,

Thank you for your reply. It was very informative.

Cheers!

Reply Score: 1