Many of the arguments in the article are based on an incorrect understanding of how SystemD works. Ultimately, an auditing system would not be capable of most of the things that SystemD does (unless it were made into a rampant layering violation).
This is not completely true. SystemD doesn't have to actively observe and act upon any of the things that you mentioned. Sockets are created ahead of time, and SystemD leaves them alone (the kernel buffers the data). Hardware events are mainly observed by Udev (SystemD has very little hardware logic). And mount points are handled by AutoFS, also in the kernel.
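To make the point about pre-created sockets concrete, here is a minimal sketch in Python. It is not SystemD's code; the socket path and function name are made up for illustration. It only shows the kernel behaviour described above: a client can connect and write to a listening socket before any service has called accept(), and the data is still there when the service finally starts.

```python
import os
import socket
import tempfile


def demo_pre_created_socket() -> bytes:
    """Send data to a listening socket before anything accepts, then read it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "service.sock")  # hypothetical socket path

        # Step 1: the listening socket is created ahead of time,
        # before any service process exists to handle connections.
        listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        listener.bind(path)
        listener.listen(8)

        # Step 2: a client connects and writes while nobody calls accept().
        # The kernel queues the connection and buffers the bytes.
        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        client.connect(path)
        client.sendall(b"request sent before the service started")
        client.close()

        # Step 3: the "service" starts late, accepts the queued connection
        # and finds the buffered data waiting for it.
        conn, _ = listener.accept()
        chunks = []
        while True:
            chunk = conn.recv(1024)
            if not chunk:  # empty read: the client closed its end
                break
            chunks.append(chunk)
        conn.close()
        listener.close()
        return b"".join(chunks)


received = demo_pre_created_socket()
print(received)
```

Nothing in user space had to watch the socket between steps 1 and 3, which is the point being made.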
Actually, SystemD can and does do this. It sets up AutoFS mounts. Any access will cause the process to block until the real file system is mounted. An auditing system would not be able to do this any better than SystemD can.
An auditing system would not be able to fix this problem. What is needed is a transactional file system. I am actually working on a transactional file system layer for the Linux kernel (about which I may write an article for OSNews someday).
Edited 2010-08-26 00:27 UTC
Fully agree. SystemD is about integration and a lot of basic monitoring. The fact that a subsystem is restarted automatically if it crashes does not make all the other systems (daemons, services, whatever) obsolete, with everything else reduced to "the service". A daemon that automatically restarts the X server, in case it crashes or if the user is in graphical mode, does not make the X server obsolete.
So I think the author correctly grasps that SystemD will provide better logic for restarting services and so on, but the final conclusion is wrong.
i believe that SELinux is such a rampant layering violation. it even spreads into user-space libraries and tools. but what SELinux really proves is that auditing systems are meant to become omnipotent. so, can you safely state that an auditing system will _not_ implement everything systemd is dreaming of into its observing code?
this rather sounds like you agree with the simple truth that things are better done inside the kernel, and supervisors should only feed and serve. i conclude from this that systemd is working on much the same layer/interface inside the kernel as the auditing system, and there _is_, as a result, doubled core functionality.
again, my article targets the cores of the implementations, the observing parts. if the auditing system can do this in the kernel, why do we need it a second time outside the kernel? in other words, if an auditing system can't do it _better_ than systemd, does that justify a layer in userspace? where should the generic observing interface reside, and how should userspace daemons settle on it? that is my question.
this is interesting. could you please tell how it would act (inside the kernel) and why an auditing system is not interested in it?
That is just to configure it. There is really no way to do that without user space tools. But yeah, I don't like SELinux very much... it's way too complicated.
Yes, actually. I highly doubt Linux would ever let an auditing system launch arbitrary daemons. And that's because it wouldn't make any sense. The old uevent helper system proved that it's always better to let user space launch things.
There is absolutely no duplicated functionality. None of the things that SystemD does with the kernel are done by the auditing system, and vice versa. The only possible thing I can think of would be that an auditing system could do the job of AutoFS. But that would be a really bad idea. AutoFS is much better for that purpose.
It's not outside the kernel. AutoFS is part of the Linux kernel. The reason that SystemD has to set up the AutoFS mounts rather than the kernel is that the kernel has no business reading configuration files. Policy decisions belong in user space.
The "generic observing system" is the auditing system. There is really little reason for observation of processes other than for security or debugging.
A transactional file system would allow programs to have a consistent snapshot of the file system. An entire transaction (which could last an indefinite amount of time) is an atomic operation. For example, a package manager could install software in a transaction. Then, if the power goes out, you will not be left with an inconsistent state. The downside is that performance is slightly decreased, and there can be conflicts (e.g. A writes to a file that B is trying to read). Unlike many transaction systems, there is no blocking. Basically, if A reads something in a transaction, and B writes to that thing in a transaction, the transaction with the lower priority is terminated. Individual, normal file operations are treated as transactions with infinite priority, so normal programs never have to worry about the transaction system. If an auditing system were to maintain all this logic, it would be a huge layering violation.
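The priority rule above can be sketched as a toy in-memory model. This is only an illustration of the conflict policy as described, not the actual kernel layer: all class and method names are invented, tie-breaking on equal priority is an assumption, and atomicity and durability on power loss are not modeled at all.

```python
import math


class Transaction:
    def __init__(self, name, priority):
        self.name = name
        self.priority = priority
        self.reads = set()
        self.writes = set()
        self.aborted = False


class TransactionManager:
    """Toy model: conflicts never block, the lower priority side is terminated."""

    def __init__(self):
        self.active = []

    def begin(self, name, priority):
        txn = Transaction(name, priority)
        self.active.append(txn)
        return txn

    def _finish(self, txn, aborted):
        txn.aborted = aborted
        if txn in self.active:
            self.active.remove(txn)

    def commit(self, txn):
        self._finish(txn, aborted=False)

    def access(self, txn, path, write=False):
        """Record a read or write; return False if txn lost a conflict."""
        if txn.aborted:
            return False
        # A conflict exists when another transaction wrote what we touch,
        # or read what we are about to write.
        rivals = [o for o in self.active
                  if o is not txn
                  and (path in o.writes or (write and path in o.reads))]
        # Assumption: on equal priority the newcomer loses.
        if any(o.priority >= txn.priority for o in rivals):
            self._finish(txn, aborted=True)
            return False
        for o in rivals:
            self._finish(o, aborted=True)
        (txn.writes if write else txn.reads).add(path)
        return True

    def plain_write(self, path):
        """An ordinary file operation: a one-shot transaction that always wins."""
        txn = self.begin("plain-op", math.inf)
        ok = self.access(txn, path, write=True)
        self.commit(txn)
        return ok


mgr = TransactionManager()
installer = mgr.begin("package-install", priority=10)
indexer = mgr.begin("indexer", priority=1)
installer_ok = mgr.access(installer, "/usr/lib/libfoo.so", write=True)
indexer_ok = mgr.access(indexer, "/usr/lib/libfoo.so")  # reads what installer wrote
plain_ok = mgr.plain_write("/usr/lib/libfoo.so")        # normal program, infinite priority
```

In this run the low priority indexer is terminated when it touches a file the installer has written, and the installer is in turn terminated by the plain write, which never has to know that transactions exist.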
"That is just to configure it. There is really no way to do that without user space tools."
there are auditing systems that don't taint coreutils, for example
(...)
There is absolutely no duplicated functionality. None of the things that SystemD does with the kernel are done by the auditing system, and vice versa. The only possible thing I can think of would be that an auditing system could do the job of AutoFS. But that would be a really bad idea. AutoFS is much better for that purpose.
(...)
The "generic observing system" is the auditing system. There is really little reason for observation of processes other than for security or debugging.
you seem to have misread my article. possibly the term "to observe" creates such a strong association with the auditing system that you think they are the same. but, please, take a bird's-eye view and survey the kernel landscape. you will see that, even if systemd is not observing by itself, at some point in the chain there is an observer, because otherwise there would be no action on events. you would rather call this event handling or the like, but "to observe" is used fully correctly here in terms of the english language. think of someone observing the stars. yes, in many cases the observation can be placed very deep in the kernel internals. but that is a different matter. either way there must be observation of events, and there is always a reason why.
this reason may be defined far outside the kernel in a user script. but the job ticket must get all the way down to the observing unit, being mangled and translated several times on the way. so it is, and both the auditing system and systemd somehow need to create such tickets for an observer, or even to create an observer themselves, depending on the kernel interfaces they hook into.
besides that - and here we come to what my article is about - both create a system to parse rules, to create types (structs) of contexts, to pass these contexts as tickets, etc. think in terms of structures. many programs re-invent structures for the very same purpose: scanning rules to create internal contexts, to type them and register them at the correct interfaces, and to bind them to chains, an event handler, or whatever.
this generic way of doing things is what i mean. this is what the framework could encapsulate and offer in a way that both the auditing system and systemd, but also udevd and other services, can profit from. i could write my own rules in guile and register them via an ffi, circumventing systemd. but systemd would be notified about the change and could update its state or possibly act against my script - possibly via the auditor.
possibly you now see that what i am aiming at is a more generic, say, job center for kernel observation or instruction, one that provides principles for simple job creation and allows for even more flexibility because it is accessible arbitrarily and even concurrently, managing the states for the listeners and feeders.
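A very rough sketch of what such a "job center" might look like, purely to make the idea discussable. Nothing like this exists in the kernel, systemd or the audit subsystem as far as this thread establishes; every name here (`Job`, `JobCenter`, `subscribe`, `register`, `dispatch`) is hypothetical, and concurrency is left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    owner: str   # who filed the job, e.g. "systemd" or a user script
    event: str   # kind of event to observe, e.g. "access"
    target: str  # what to watch, e.g. a path


class JobCenter:
    """Hypothetical shared registry for observation jobs."""

    def __init__(self):
        self._jobs = {}       # (event, target) -> list of (job, action)
        self._listeners = []  # callbacks told about every new job

    def subscribe(self, listener):
        """Let a service (e.g. systemd) hear about jobs filed by anyone."""
        self._listeners.append(listener)

    def register(self, job, action):
        """File a job and notify all listeners about the change."""
        self._jobs.setdefault((job.event, job.target), []).append((job, action))
        for listener in self._listeners:
            listener(job)

    def dispatch(self, event, target):
        """Called by the observing unit when the event actually happens."""
        return [action() for _job, action in self._jobs.get((event, target), [])]


center = JobCenter()
seen_by_systemd = []
center.subscribe(seen_by_systemd.append)

center.register(Job("systemd", "access", "/home"), lambda: "mount /home")
center.register(Job("my-script", "access", "/home"), lambda: "log the access")

actions = center.dispatch("access", "/home")
```

Here the user script registers its rule without going through systemd, yet systemd's listener still sees the new job and could update its state or react to it.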
If that were the only problem then any one of the init replacements created in the last 15 years would be an improvement.
Speed is secondary. An init replacement primarily needs to solve initialization sequencing. Building an init sequence in which the appropriate things are started at appropriate times, and not before other things that may be needed first, is a highly non-trivial process. Upstart and systemd try to solve this problem, and the different approaches define, more than anything else, the differences between the systems.
After that there are some nice to have things which are lacking on linux. Here I primarily mean service control; it's embarrassing that Windows does this better (yes, better). Both upstart and systemd try to address this in fairly similar ways.
Both (but systemd in particular) do other things, of course, which I consider nonessential but still worthwhile and improvements on current systems. I have to give a great deal of credit to Lennart for not trying to solve just one tiny technical problem but aiming for a holistic approach, while still not greatly violating the *nix philosophy.
But init already does the sequencing correctly, in a linear fashion. That's not too difficult at all.
But linear is slow. We have multi-core CPUs now. Booting would be faster if we loaded things in parallel. OK, but what can we load in parallel with what, and what has to remain serialized? That's the complexity of the sequencing. It's complex due to the parallelization, which is due to the need for speed.
Linear is not 'easy' - linear is hard. Nonlinear is harder, but linear is not trivial! You have a large, unknown set of things to run which must be run in a particular order. What order? If you know you have a good order and want to insert a new item to run, where in the sequence does it go? Can you *safely* alter the order by inserting this new item and can you be sure that doing so does not break anything?
For simple things it's pretty easy to just "drop it in" and hope it will be fine, but there are many non-simple things. Does your ldap daemon need to be started before your remote filesystems are mounted? What if your /home is mounted via nfs and all users are stored in ldap? How do you know which order to load things in and how do you re-order it when it changes? Even in a purely linear situation this is not simple. If things work well today it's because of luck and careful engineering over many years.
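One way to picture "working on purpose" is to declare what each service needs and compute the order from that. The sketch below uses Python's standard `graphlib` module; the service names and their dependencies are assumptions made up for the ldap/nfs example, and this is not how Upstart or systemd are implemented.

```python
from graphlib import CycleError, TopologicalSorter

# service -> services that must be started first (all names hypothetical)
deps = {
    "network": set(),
    "ldap": {"network"},
    "nfs-home": {"network", "ldap"},
    "login": {"nfs-home"},
}


def boot_order(dependencies):
    """Return one valid linear start order, or raise CycleError."""
    return list(TopologicalSorter(dependencies).static_order())


order = boot_order(deps)

# Inserting a new item: declare what it needs and recompute,
# rather than guessing a slot in a hand-maintained sequence.
with_print = {**deps, "print-server": {"network", "ldap"}}
new_order = boot_order(with_print)

# An insertion that cannot be satisfied is reported instead of silently
# breaking the boot: here ldap is (wrongly) made to depend on nfs-home.
broken = {**deps, "ldap": {"network", "nfs-home"}}
try:
    boot_order(broken)
    cycle_found = False
except CycleError:
    cycle_found = True
```

The hard part in practice is getting the dependency declarations right, which no sorting algorithm can do for you.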
It's a management nightmare which only becomes worse over time. Designing a system that works on purpose, instead of accidentally, is a worthwhile effort and a tricky problem.
Being faster is nice, sure, but that's not really a problem that needs to be solved, it's just a nice side effect. Once we can figure out sequencing properly we can get parallelism "for free" and thus some speedups. But no, it's not a goal.
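The "parallelism for free" claim can be illustrated with the same kind of dependency description: once the dependencies are known, everything whose prerequisites are done can start at once. This is a sketch with made-up service names using Python's `graphlib`; a real init system would start each wave's services concurrently and mark them done as they come up.

```python
from graphlib import TopologicalSorter

# service -> services that must be started first (all names hypothetical)
deps = {
    "network": set(),
    "syslog": set(),
    "ldap": {"network"},
    "cron": {"syslog"},
    "nfs-home": {"network", "ldap"},
}


def parallel_waves(dependencies):
    """Group services into waves; members of one wave can start together."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything startable right now
        waves.append(ready)
        sorter.done(*ready)                 # pretend the whole wave came up
    return waves


waves = parallel_waves(deps)
print(waves)
```

Five services boot in three steps here instead of five, without anyone having designed the parallel schedule by hand.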
https://bugzilla.redhat.com/show_bug.cgi?id=615527
If you had read this fully, you would have noticed Eric Paris, the lead developer of fanotify, the replacement for inotify and dnotify. fanotify has none of the issues of inotify.
In fact, fanotify allows you to block, accept and delay file system requests, something the older inotify and dnotify don't allow. That is the reason fanotify supports real-time virus scanning and auditing from user space.
Not all forms of auditing can be done from kernel space. Who in their right mind would run a virus scan in kernel space?
"For example, if systemd recognizes a mountpoint access, it can mount the resource immediately. But, is that quick enough? Systemd has no influence on the accessing process and thus can't turn it into sleep until the mount happened."
This so-called issue can be solved by fanotify delaying its response, which puts the application to sleep. Once fanotify is feature complete, the problem here is solved.
Next, systemd uses cgroups to divide tasks. A whole cgroup can be suspended while waiting for a drive to mount as well. That is a little over the top; catching the access with fanotify would be far less painful.
There are a few experiments in using fanotify to make file recovery from backup transparent, i.e. when you attempt to access a file that has been sent to backup, the program gets delayed until the file is recovered from the backup location and extracted.
systemd is setting up to take advantage of the tech that will be on hand to userspace for auditing over the next 12 months. This really does leave all the other init systems far behind.
cgroup tech is also always expanding. systemd provides many times more control than all the old systems.



