IncludeOS is an includable, minimal unikernel operating system for C++ services running in the cloud and on real HW. Starting a program with #include <os> will literally include a tiny operating system into your service during link-time.
↫ IncludeOS GitHub page
IncludeOS isn’t exactly the only one of its kind, but I’ve always been slightly mystified by what, exactly, unikernels are for. The gist, as far as I understand it: if you build an application using a unikernel, the build figures out at compile time exactly what the application needs from the operating system, and links just those pieces into the resulting application. This can then be booted directly by a hypervisor.
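For a feel of what that looks like in practice, this is roughly the shape of an IncludeOS service as shown in the project’s own hello-world examples (the exact entry-point signature has shifted between releases, so treat this as a sketch):

    // A minimal IncludeOS-style service in C++. The "operating system"
    // is pulled in as a library at link time rather than running
    // underneath the application as a separate layer.
    #include <os>
    #include <iostream>

    // IncludeOS invokes Service::start() instead of main() once the
    // unikernel image has booted inside the hypervisor.
    void Service::start()
    {
      std::cout << "Hello from a unikernel!\n";
    }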
The advantages are clear: you don’t have to deal with an entire operating system just to run that one application or service you need to provide, and footprint is kept to a minimum because only the exact dependencies the application needs from the operating system are linked to it during compilation. The downsides are obvious too – you’re not running an operating system so it’s far less flexible, and if issues are found in the unikernel you’re going to have to recompile the application and the operating system bits inside of it just to fix it (at least, I think that’s the case – don’t quote me on it).
IncludeOS is under heavy development, so take that under advisement if you intend to use it for something serious. The last full release dates back to 2019, but it’s still under development as indicated by the GitHub activity. I hope it’ll push out a new release soon.
Is anyone familiar with both IncludeOS and Nanos (https://nanos.org/)?
It seems that Nanos is more mature, but I haven’t used either.
Back in the day, we would have called these unikernel things DOS extenders (well, almost).
In the company I work at, vulnerability tracking, patching, and SLAs generate a huge amount of overhead. Minimizing software packages to reduce that overhead would be fantastic.
Strylight, I am sorry to burst your bubble: most of these unikernels are bad ideas. Yes, I grant that the vulnerability tracking overhead seems bad.
Next I am going to quote from a pro-unikernel person.
https://nanovms.com/dev/tutorials/assessing-unikernel-security
“””One thing I should quickly point out before we get started – there is now a veritable cornucopia of unikernel implementations out there. Some are focused on security, some are focused on performance, some are focused on size, others on ease of use. Some are actively being worked on – others have languished. Some are master thesis, others are supported by multi-billionaire dollar companies.”””
Now read over the above bit carefully. Yes, there are a lot of unikernels that lack good security features. There are a lot of unikernels without proper funding behind them, and plenty that have languished. The reality is that if you look at the Insights tab on GitHub you very quickly see that IncludeOS is not in a healthy state, with long gaps in development.
Strylight, which is worse: being informed and having fixes for vulnerabilities, or using an unmaintained OS where you don’t know which security flaws need fixing?
Please note I did not say all unikernels are bad ideas. Good ones are not easy to find.
There are a lot of monolithic/microkernel/hybrid operating systems out there as well, with a lot of them not being maintained.
The same could also be said of Linux and all the BSDs out there; by that logic we would all be stuck with just DOS and Windows. Just because it is a common occurrence does not make it the rule.
The comment was that many unikernels either lack proper resources, security design, maintenance, and/or attention generally. None of that can be seriously said about Linux or most of the BSDs. In fact, a better argument could be made to apply some of those to DOS and Windows. But none of those operating systems are really good examples of the kinds of problems that @oiaohm was cautioning against.
I’d recommend checking out Hyperlight: https://github.com/hyperlight-dev/hyperlight
It draws on concepts from both nano-VMs and unikernels to achieve the same thing but with greater flexibility.
This blog post gives a good primer on what is possible with it: https://arschles.com/blog/hyperlight-overview-1/
Here is the 2nd part of the blog: https://arschles.com/blog/hyperlight-overview-2/
This is my problem: where is Hyperlight’s address randomization, and where are its other attack-limiting methods?
Rust is not what seL4 uses. Yes, there are a lot of unikernel-like things that end up selling something they are not.
Yes there are a lot of Linux distributions that you should not go near for a production deployment.
Do go back through those primers with a proper fine-toothed comb and you will see that, hang on, they are not delivering the goods they should.
Really, they are trying to claim they are doing something so-called highly secure while you are depending on a non-validated OS like Linux/Windows for your security core instead of something validated like seL4.
We really do need to take these unikernel things with a serious grain of salt; they like overclaiming, with fundamental flaws in their logic.
oiaohm,
Address randomization is a mitigation for unsafe languages. It’s not as critical for safe languages.
These criticisms of unikernels have been extremely generic in nature. Of course you need to look at everything with a grain of salt, but things need to be evaluated on a case-by-case basis, not just as generic statements. Your posts make clear that you have a bunch of generic criticism to share, but do you have any specific criticisms of IncludeOS? I don’t know if there are actual issues with it, but if you are using generic criticisms against it without even having looked at the project, then it seems a bit unfair. On the other hand, if there is specific criticism to share, that’s more useful to act on (and constructive).
Alfman
https://www.infoq.com/presentations/unikernels-includeos/
“””I think we will try to do proper ASLR at some point, so that every time we reboot we come up with slightly a bit different. I don’t see it as a vector. Yes.”””
I might have seemed to be talking generically, but I was sticking to faults that every unikernel raised so far has.
Alfman
“””Address randomization is a mitigation for unsafe languages. It’s not as critical for safe languages.”””
Common mistake. Address randomization is needed for unsafe and safer languages alike. Remember, the languages like Rust that people call safe are only safer languages: they have the option of unsafe code, and that code needs all the same protections unsafe languages have.
ASLR, like it or not, needs to be a bare-minimum feature checkbox for a properly secure OS kernel, unikernel or not. And it’s not that most of these unikernel developers don’t know it. The one problem with doing ASLR is that it has negative effects on benchmarks: expect to lose up to 5% of performance.
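To make concrete what ASLR buys, here is a minimal sketch in plain C++ (nothing unikernel-specific): build it as a position-independent executable and run it twice.

    // Compile with: g++ -fPIE -pie aslr_demo.cpp -o aslr_demo
    // On a kernel with ASLR the printed addresses change on every run,
    // so an attacker cannot hardcode them into an exploit. On a kernel
    // without ASLR they stay identical across boots.
    #include <cstdio>

    int global_data = 42;

    int main()
    {
      int stack_var = 0;
      std::printf("code:  %p\n", (void*)&main);
      std::printf("data:  %p\n", (void*)&global_data);
      std::printf("stack: %p\n", (void*)&stack_var);
      return 0;
    }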
Prior post.
“””The reality is that if you look at the Insights tab on GitHub you very quickly see that IncludeOS is not in a healthy state, with long gaps in development.”””
Go to Insights on GitHub and look at the contributors for IncludeOS: you see a single active developer, with big gaps in the work.
With Hyperlight you see a project that only started on the 6th of November 2024.
Of those mentioned so far, Nanos is the only one that, looking at Insights on GitHub, is anywhere near acceptable in terms of developer activity. But you still have Nanos overselling its security.
Nanos, Hyperlight and IncludeOS all lack ASLR, yet they let people use programming languages that require it for security, because those languages are either unsafe or merely safer languages that allow unsafe modes.
Alfman, even if you are really using safe languages, yes, the ones that don’t allow the exploits at all because there is no unsafe mode, you still need address randomization to stop attackers who get their hands on the likes of memory dumps (yes, when you transfer a VM from host to host and so on) from having quick and simple access to information. ASLR is not just for buffer overflow/overrun exploits.
Alfman, I would love to find a correctly constructed unikernel.
oiaohm,
The generic advice isn’t very informative in the context of this specific project. If you had a specific criticism, that’d be good to know. But some of your statements do come across as unfair and misplaced in the context of IncludeOS. Perhaps you should have made it clearer that you weren’t criticizing IncludeOS specifically because otherwise it’s sort of implied by your tone.
Correct code doesn’t need address randomization to foil memory exploits the same way invalid code does. This is true even for correct C, but the problem with C is that it doesn’t get verified by the compiler and humans are faulty. Safe languages mitigate the risk of software memory faults. If you disable language safety checks, then of course you’re counteracting this mitigation. It doesn’t mean the code is necessarily vulnerable though, only that the developer takes responsibility for verifying the code manually. Unsafe sections might be justified if used sporadically in places where the overhead to audit them isn’t so onerous.
ASLR isn’t the only (or even the best) solution to memory faults, it wouldn’t be necessary if our languages didn’t produce these faults.
The hardware may have its own vulnerabilities that can benefit from the obfuscation that ASLR offers (these hardware vulnerabilities should be fixed rather than relying on ASLR!), but ASLR isn’t necessary to catch memory faults in software that doesn’t have those memory faults. It’s redundant.
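To be concrete about the class of bug in question, a minimal sketch: this compiles without complaint, yet corrupts memory. A safe language rejects it outright; ASLR only makes the corruption harder to aim.

    // Classic C-family memory fault: the compiler happily accepts this,
    // but strcpy writes 20 bytes (19 characters plus the terminator)
    // into an 8-byte buffer, smashing whatever lives next to it.
    #include <cstring>

    int main()
    {
      char name[8];
      std::strcpy(name, "this string is long"); // silent overflow
      return 0;
    }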
“””The hardware may have its own vulnerabilities that can benefit from the obfuscation that ASLR offers (these hardware vulnerabilities should be fixed rather than relying on ASLR!),”””
Alfman, this is a lack of crystal ball and a false belief that hardware can be fixed. We know having ASLR helps when these hardware issues turn up. When you deploy, you have no way to know the hardware is good, but you can know the hardware is bad. Also, hardware cannot be replaced quickly. Hardware with the Spectre and Meltdown flaws is still in production use, and those flaws were detected in 2017; as you go looking, there are even older ones still in production use. As I said, ASLR should be the bare minimum, and this is not just about coding errors: it is about the hardware errors we know are in production, and the hardware errors we don’t yet know about that are in production.
Hardware vulnerabilities, like it or not, mean we have to depend on software mitigations, because it is not possible, in cost or production terms, to replace all the known defective chips. This is another common generic unikernel problem: no mitigations against known hardware issues, and no safeguards to make unknown hardware issues less of a problem as they turn up.
This is living with real-world limitations, not theory fiction.
“”” This is true even for correct C, but the problem with C is that it doesn’t get verified by the compiler and humans are faulty. “””
https://beta.sel4.systems/Info/FAQ/proof.html
seL4 is written in C, and then they do a formal proof; this results in all the issues safe languages promise to fix actually being fixed. The C that comes out of this process is more secure than most so-called safe languages, which are only safer languages.
“””If you disable language safety checks, then of course you’re counteracting this mitigation.”””
That is if you know you did this. A developer imports a library for Rust or the like, and the library can have unsafe code in it that the developer does not notice.
For a unikernel OS selling safety: where is the tooling to keep the developer using it informed of how safe they currently are?
“””It doesn’t mean the code is necessarily vulnerable though, only that the developer takes responsibility for verifying the code manually.”””
Refer to seL4. That is the tooling that should be here if you are going to claim improved security.
The idea of manually checking code needs to go away in a lot of cases. seL4 demoed that a lot can be automatic. Remember, seL4 demonstrates the tooling can be applied to unsafe languages like C, so there should be automatic tooling for checking the unsafe areas of safer languages if you are trying to sell safety.
oiaohm,
You are wrong when you say the hardware cannot be fixed. Computing has historically evolved to allow the sharing of resources between vastly different security domains. In the past we accepted this risk largely because there just weren’t many resources to go around so sharing was a high priority. However considering how many resources often float near idle in a modern system, I think we should reevaluate the practice of sharing resources across trust domains without flushing state information first. IMHO modern CPU engineering has more than enough cores/cache/tlb to dedicate some to secure processes and completely eliminate the risks of shared resources.
I agree that ASLR memory obfuscation can help with hardware based exploits, but to be honest I don’t think using software mitigations against hardware exploits is the best approach for the long term. Software based fixes should only be considered a stop gap measure…but the hardware still should be fixed. Personally I feel the use of software mitigations for hardware flaws is a cringey solution to our long term security needs.
Unfortunately this attitude of accepting hardware exploits on the basis of software mitigations is part of the problem and it ends up propagating the problem down the line instead of putting an end to it. We can do better than this!
You can write proofs for absolutely any language, assembly language, or even binary machine code. Surely we can agree that the “proof” is not a byproduct of C; however, it’s objectively a lot easier to guarantee the absence of memory faults in a safe language. This is not a big deal in small academic examples, but for large projects with dozens or more chefs in the kitchen it’s a huge deal.
Alfman
“””You are wrong when you say the hardware cannot be fixed. Computing has historically evolved to allow the sharing of resources between vastly different security domains. In the past we accepted this risk largely because there just weren’t many resources to go around so sharing was a high priority. However considering how many resources often float near idle in a modern system, I think we should reevaluate the practice of sharing resources across trust domains without flushing state information first. IMHO modern CPU engineering has more than enough cores/cache/tlb to dedicate some to secure processes and completely eliminate the risks of shared resources.”””
This is you attempting to fight against the known reality.
https://trustworthy.systems/projects/LionsOS/
Yes, the seL4 developers are doing what you are talking about. But that does not change the reality of commodity hardware and how long it takes to get fixed.
“””I agree that ASLR memory obfuscation can help with hardware based exploits, but to be honest I don’t think using software mitigations against hardware exploits is the best approach for the long term. Software based fixes should only be considered a stop gap measure…but the hardware still should be fixed. “””
The thing is, going without the stopgap is not a valid option once you accept how long you have to live with defective hardware. The CPU you buy today, as a brand-new first batch from the factory, was designed 2 years ago. So for a fault found today in your consumer/commodity hardware, it takes 2 years for fixed hardware to ship, and that is if the hardware fix is right the first time around. This is why you have microcode patches and the like, but they cannot address every issue that turns up.
Remember, the majority of these unikernels have no stopgap support. Not having stopgap support should not be classed as acceptable.
“””Unfortunately this attitude of accepting hardware exploits on the basis of software mitigations is part of the problem and it ends up propagating the problem down the line instead of putting an end to it. We can do better than this!”””
No, you are skipping over the limitation. Silicon production is limited. The 2 years to have a chip fabricated is a limitation. Those working on systems that need to be secure for military use have come to the unfortunate realization that the limitations of silicon production mean whatever solution you come up with must be able to support stopgaps for known flaws in silicon.
There is a limit to how high-quality a CPU you can have. Alfman, you need to look at historic examples. The Spectre fault affects all CPUs from 2019 and earlier. Note it was detected in 2017. Yes, 2 years of production had to sell out before, in 2020, fixed CPUs finally turned up.
Alfman, think about replacing every Spectre CPU in 2020: we don’t have the silicon production to do that. It’s going to take until 2026-28 to make them uncommon in production.
Alfman, like it or not, the stopgap requirement, due to the limits on how fast CPUs can be made and replaced, is 9 to 12 years from the point the fault in the CPU is detected.
Yes, this 9 to 12 years of stopgap is the reality when you are not living in theory fiction. It’s not that we can do better; that is the production and replacement limit with the commodity hardware we use in desktop computers and servers.
“””You can write proofs for absolutely any language, assembly language, or even binary machine code. Surely we can agree that the “proof” is not a byproduct of C; however, it’s objectively a lot easier to guarantee the absence of memory faults in a safe language. This is not a big deal in small academic examples, but for large projects with dozens or more chefs in the kitchen it’s a huge deal.”””
seL4 and LionsOS are used by the military. These are not small academic examples. The high-security military use cases want proofs on everything.
https://trustworthy.systems/projects/pancake/
Pancake, from the seL4 developers, is not complete yet, but it shows you what a true safe language should look like. It doesn’t look like Rust and the other safer languages where you have uncontrolled unsafe areas.
Do note Pancake is designed so you can disassemble the binary and check that what the compiler generated matches what was put into the compiler.
Alfman, the seL4 developers started with how to put a proof system on top of C to make the output safe. Now they are working on making Pancake, a language based on what they learnt from that process. Yes, doing it this way means you don’t go making unsafe code sections. Uncontrolled unsafe code sections mean you screwed up making your so-called safe language. The seL4 developers have proved it is possible to make a proper safe language that integrates very well without leaving uncontrolled unsafe areas.
A safer language where you can have uncontrolled unsafe areas that are not forced to be validated is a path to human error: someone misses validating a bit of code in the unsafe areas that happens to be a security flaw.
The seL4 team at trustworthy.systems is a great place to go to learn what a proper secure system should look like. We are not seeing unikernels doing this yet, and we should not be giving them free rides on it.
oiaohm,
I think you have the wrong idea that I’m criticizing seL4.
In any case I’m glad you admit hardware can be fixed and I’d like to call on you to tactfully encourage engineers to do so. In the meantime there are a lot more mitigations, including disabling cache around sensitive data and/or explicitly flushing all sensitive state between task switches. As you know, there are tradeoffs and some of these mitigations may be even more effective than ASLR.
ASLR isn’t the only mitigation though. We should acknowledge that hardware vulnerable to statistical attacks is still flawed with ASLR, and software in unsafe languages is still vulnerable to crashing and denial-of-service attacks with ASLR. In other words, ASLR mitigations clearly are not addressing the root of the problems. Real fixes for these problems would make ASLR redundant. While I don’t mind the option to enable redundant security measures, as a primary solution ASLR remains a dirty hack.
Your continued use of unikernels as a target seems very unfair to me. A developer could go with a different kernel design and it would be equally vulnerable to the risks you are describing. Being a unikernel has nothing to do with it. Can you help me understand why you’re so determined to stereotype and cast shade on “unikernels” even when you aren’t familiar with their specifics? I just don’t think that’s a useful way to think about it.
Not really, chip makers have gone through many chip cycles since then….this is clearly not the main limitation. We need to stop making these excuses though. The concept of isolation needs to be extended to all shared CPU resources, otherwise we’re condoning hardware with statistical vulnerabilities to stay the status quo over the long term. We can do better and I think we should.
Except they haven’t completely fixed the hardware flaws. And asserting they did would reveal a contradiction in your argument, because if they were fixed, then ASLR would become more redundant and less critical.
You keep going on and on about seL4, over 10 times by my count. It’s great that you’re passionate about this project, but with all due respect you’re coming across as a salesman and not an unbiased reviewer.
As cool as seL4 is, it is not a panacea. If you are trying to point to the mathematical proofs of seL4’s C code to make a case against “so called safe languages”, well that doesn’t really hold water. We can write mathematical proofs for code in any language, including brainfuck. This is a very simple language but intentionally designed to be obtuse for humans.
https://en.wikipedia.org/wiki/Brainfuck
Please don’t misconstrue my point as suggesting that C is no better than BF. But the fact that you can have a mathematical proof for BF does not make it a good reason to choose BF over any other turing complete language. This same logic applies to proofs in C. It means nothing in terms of the merits of a programming language.
“””In any case I’m glad you admit hardware can be fixed and I’d like to call on you to tactfully encourage engineers to do so. In the meantime there are a lot more mitigations, including disabling cache around sensitive data and/or explicitly flushing all sensitive state between task switches. As you know, there are tradeoffs and some of these mitigations may be even more effective than ASLR.”””
The reality is that, for most of these options, doing the special handling requires placing hooks in the same places you would have had to for ASLR to work. This is why ASLR is the bare minimum: having ASLR done provides the foundations to do the other mitigations.
“””Your continued use of unikernels as a target seems very unfair to me. A developer could go with a different kernel design and it would be equally vulnerable to the risks you are describing. Being a unikernel has nothing to do with it. Can you help me understand why you’re so determined to stereotype and cast shade on “unikernels” even when you aren’t familiar with their specifics? I just don’t think that’s a useful way to think about it.”””
In fact, I am not being unfair. Unikernels have a habit of selling themselves as providing security advantages without the proofs to back this up.
The horrible part is that I am not really stereotyping. Try to find a unikernel with mitigations: they are rare.
From a security point of view you are better off custom-building a Linux kernel, as compact as possible, than using any of the commonly pushed unikernels.
https://github.com/torvalds/linux/tree/master/tools/memory-model
Something people miss: the Linux kernel’s memory operations have a formal model and proofs.
“””Except they haven’t completely fixed the hardware flaws.”””
It’s that a hardware flaw gets countered, but new ones get found:
https://www.csoonline.com/article/3573888/spectre-flaw-still-haunts-intel-and-amd-chips-putting-security-at-risk.html
Yes, the 2017 Spectre is countered by mitigations. The “new variants” bit is important: more Spectre-type flaws have been found in CPUs since 2017, and more variants still turn up. So more mitigations are required, and more 2-year delays before we can get good silicon.
“””Please don’t misconstrue my point as suggesting that C is no better than BF. But the fact that you can have a mathematical proof for BF does not make it a good reason to choose BF over any other turing complete language. This same logic applies to proofs in C. It means nothing in terms of the merits of a programming language.”””
You are looking at it the wrong way. You can take a perfectly safe language and write a C interpreter in it (https://github.com/kianenigma/c-interpreter), replicating all the security flaws of using C.
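Here is a minimal sketch of that point (a hypothetical toy VM, not the linked project): the host language can be as safe as you like, but the guest still gets C’s flat memory, so guest overflows survive intact.

    // A "guest memory" of raw bytes inside a perfectly safe host
    // program. Every host access is bounds-checked via at(), yet guest
    // "pointers" are just integers, so one guest buffer silently
    // stomps on another: C's flaws reproduced inside a safe language.
    #include <cstdint>
    #include <cstdio>
    #include <vector>

    struct TinyVM {
      std::vector<std::uint8_t> mem = std::vector<std::uint8_t>(1 << 16);
      void store8(std::uint16_t addr, std::uint8_t v) { mem.at(addr) = v; }
      std::uint8_t load8(std::uint16_t addr) { return mem.at(addr); }
    };

    int main()
    {
      TinyVM vm;
      vm.store8(0x1008, 0);             // guest flag at 0x1008: "not admin"
      for (std::uint16_t i = 0; i < 12; ++i)
        vm.store8(0x1000 + i, 0xff);    // guest writes 12 bytes into an
                                        // 8-byte guest buffer at 0x1000
      std::printf("guest flag: %u\n", vm.load8(0x1008)); // prints 255
      return 0;
    }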
Which is more important: the mathematical proof that you are not doing something stupid, or the use of a safe or safer language?
There was a DARPA autonomous vehicle test where one party depended on a safe language without proofs, and the vehicle ran out of RAM and crashed itself.
There was another example, a prototype of an auto-targeting turret, that ended up killing 6 soldiers; in this one the programmers had effectively written C in a safe language, again with no proof to be sure they knew what their code was going to do.
Yes, using BF to perform a task with a full mathematical proof, where you have real information on what that bit of software is going to do, is better than using any other language without a proof, where you do not have the information to be sure of what the code is going to do.
Mathematical proofs over the Linux kernel memory model have removed over 60,000 flaws from the Linux kernel that never got CVEs.
Alfman, use of a safe or safer language can make doing the proof simpler, this is true. But the big thing here is that using a safe or safer language does not mean you should not be doing the proofs. Remember, the Linux kernel does not have a full proof of all its operations, but it does have proofs for a lot more of its operations than the majority of unikernels do.
Yes, you can make secure software with almost any Turing-complete language as long as you use mathematical proofs. And yes, there are a handful of programming languages where the mathematical proof constantly tells you the result will be insecure and unsafe.
The next critical question with safer languages: can you make truly secure/safe software without mathematical proofs? The answer is no, you cannot, because as a human you will miss something that the proof process would have shown. Sometimes something big enough to drive a truck through, like emulating known-broken C semantics inside the safe language.
This is what has led to top-end safe languages having mathematical proofs as part of general operation. This is why I pointed you to seL4’s Pancake: it has the most human-readable documentation on how these safe languages work. Remember, the path leading to safe languages with mandatory mathematical proofs of operation comes from the military, from people who died because software malfunctioned.
Alfman, what is the point of running processes in split containment if by mistake you have effectively mapped critical memory between all of them? This is the problem: without mathematical proofs of operation, there is very little you actually know. Attempting to be secure with no mathematical proofs results in missing something glaringly wrong with the code more often than not.
oiaohm,
It’s not a matter of difficulty, only a matter of doing. Anyone who is capable of writing an OS can implement ASLR, but the more you dig into ASLR the more it becomes clear that it’s a flawed hack.
Even Linux’s implementation of ASLR is on the weak side, because they made compromises to favor the sharing of memory across processes. I don’t blame Linux devs for that, but it’s just another sign that we need to fix the underlying fault; it shouldn’t be leaking trusted state information in the first place. All of us need to stop portraying ASLR as the fix. It should only be regarded as a stopgap measure until we can get a better engineered solution that’s more secure to boot.
Again with these generalizations… I respect your technical expertise, but I don’t know if you are aware of just how bent you seem to be on stereotyping. Let’s just stick to specific instances that you can cite specific evidence for.
And some of us predicted this too when they said it was fixed; I knew it wasn’t and stated as much publicly. We leaned too much on hacky mitigations that effectively treated the symptoms rather than the causes. There are a lot of ways to fix this, from trivial to more involved, all having different pros and cons. Like I said at the top, I figure we have enough cores these days that we can afford to dedicate cores to security domains and to never schedule untrusted code on those cores (until they’re flushed). This way most of the shared resources would be taken off the table for malicious code. They should also pin down or disable level 3 caches to foil attacks on shared resources outside of the cores. The point being, it’s not a mystery why such flaws happen, but we need to commit to fixing it and to stop relying on ASLR as a crutch.
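As a sketch of the kind of partitioning I mean (Linux-specific, and the reserved core number is hypothetical; a real design would also keep untrusted code off that core at the scheduler level, e.g. with the isolcpus= boot parameter):

    // Pin a sensitive worker thread to a core that the rest of the
    // system has been told to avoid, so untrusted code never shares
    // that core's cache/TLB state. Compile with: g++ -pthread pin.cpp
    // (A fuller version would set the affinity attribute before the
    // thread starts running at all.)
    #ifndef _GNU_SOURCE
    #define _GNU_SOURCE
    #endif
    #include <pthread.h>
    #include <sched.h>
    #include <cstdio>

    void* sensitive_work(void*)
    {
      std::puts("handling secrets on a reserved core");
      return nullptr;
    }

    int main()
    {
      // Core 3 is assumed to be reserved for this security domain
      // (e.g. isolated from the general scheduler via isolcpus=3).
      cpu_set_t set;
      CPU_ZERO(&set);
      CPU_SET(3, &set);

      pthread_t t;
      pthread_create(&t, nullptr, sensitive_work, nullptr);
      pthread_setaffinity_np(t, sizeof(set), &set);
      pthread_join(t, nullptr);
      return 0;
    }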
The problem is NOT that C is not interpreted, but that C code as specified by the language doesn’t provide enough information to validate memory access. For example when you pass a byte pointer to a function, the function has no knowledge of the bounds of the structure it’s pointing to. It could be a single byte to gigabytes. Moreover the C compiler doesn’t strictly know if a memory access that happens to be a valid memory address is actually pointing to the correct memory even though it doesn’t crash. We just assume it was intended, but these assumptions open up classic memory vulnerabilities and exploits.
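A minimal illustration: once the pointer crosses the function boundary, the bounds are gone and the compiler has nothing left to verify against.

    // fill() cannot tell whether buf points at 4 bytes or 4 gigabytes;
    // the length is a separate, unchecked claim made by the caller.
    #include <cstddef>

    void fill(char* buf, char value)
    {
      for (std::size_t i = 0; i < 16; ++i)
        buf[i] = value;   // fine for a 16+ byte buffer, UB otherwise
    }

    int main()
    {
      char small[4];
      fill(small, 'x');   // compiles cleanly, overruns by 12 bytes
      return 0;
    }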
You’ve missed the point. The mathematical proof can be applied to any language….it is not a differentiator between turing complete languages. Every last one of them could support the same mathematical proofs.
I feel you may be dragging this out on purpose because I think you know this already, but it should be understood by all that the safe languages we’re talking about are not a guarantee against all software bugs, just the type of memory access bugs that C coders often make by accident. Languages like rust aren’t the first to offer this either. Java, C#, as well as countless interpreted languages offer this as well…but these have often come at the expense of adding run time checks and garbage collectors; techniques that aren’t suitable for low level systems. Languages like rust offer a way to automatically verify memory access checks at compile time. If humans were ideal programmers it wouldn’t matter, but this matters because real humans make mistakes.
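The difference in miniature, using C++’s vector, which offers both styles side by side; compile-time verification in the Rust style is the third option that neither of these gives you:

    // operator[] is the unchecked access C-family code inherits; at()
    // adds the runtime bounds check that Java/C# style languages make
    // mandatory (at the cost of a check on every access).
    #include <iostream>
    #include <stdexcept>
    #include <vector>

    int main()
    {
      std::vector<int> v{1, 2, 3};
      try {
        std::cout << v.at(10) << "\n";          // runtime check: throws
      } catch (const std::out_of_range& e) {
        std::cout << "caught: " << e.what() << "\n";
      }
      // v[10] would compile and silently read out of bounds (UB).
      return 0;
    }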
Of course, developers can still write programs with logic errors in a safe language that protects against memory faults….this is well understood but it doesn’t really negate the merit of using safe languages over unsafe ones. If you want to make the case that “safe language” is semantically confusing, that’s fine and I wouldn’t necessarily disagree, but my interests here are more about net benefits than semantics.
Again I’m going to refer back to the BF argument. The fact that you still need humans to verify code against logic bugs – even in a “safe language” – is NOT an argument that BF is just as good, nor is it an argument that C is just as good. Mathematical proofs are NOT a differentiator of turing complete languages. The merits of a program language are about making the task easier for humans, otherwise we’d all be using machine code, which is mathematically equivalent.
Unikernels were a fad back before container orchestration because they let you cram a bunch of server applications into one hypervisor node. Now they’re basically stuck in the academic development ghetto.