Linked by Thom Holwerda on Mon 8th Oct 2018 23:51 UTC
OSNews, Generic OSes

This paper presents an evaluation of the use of a high-level language (HLL) with garbage collection to implement a monolithic POSIX-style kernel. The goal is to explore if it is reasonable to use an HLL instead of C for such kernels, by examining performance costs, implementation challenges, and programmability and safety benefits.

The paper contributes Biscuit, a kernel written in Go that implements enough of POSIX (virtual memory, mmap, TCP/IP sockets, a logging file system, poll, etc.) to execute significant applications. Biscuit makes liberal use of Go's HLL features (closures, channels, maps, interfaces, garbage collected heap allocation), which subjectively made programming easier. The most challenging puzzle was handling the possibility of running out of kernel heap memory; Biscuit benefited from the analyzability of Go source to address this challenge.

On a set of kernel-intensive benchmarks (including NGINX and Redis) the fraction of kernel CPU time Biscuit spends on HLL features (primarily garbage collection and thread stack expansion checks) ranges up to 13%. The longest single GC-related pause suffered by NGINX was 115 microseconds; the longest observed sum of GC delays to a complete NGINX client request was 600 microseconds. In experiments comparing nearly identical system call, page fault, and context switch code paths written in Go and C, the Go version was 5% to 15% slower.

Scientific papers about operating system experiments - who doesn't love them?

Excellent find Thom!
by Alfman on Tue 9th Oct 2018 03:11 UTC
Alfman
Member since:
2011-01-28

We often debate this very topic without having good data to talk about. This should make for good points of discussion.

Personally I've been leaning towards achieving memory safety through static language analysis rather than garbage collection, but this is very interesting nonetheless. These performance tradeoffs may be worthwhile in some security contexts.


Should one use HLL for a new kernel?
The HLL worked well for kernel development

* Performance is paramount ⇒ use C (up to 15%)

* Minimize memory use ⇒ use C (↓ mem. budget, ↑ GC cost)

* Safety is paramount ⇒ use HLL (40 CVEs stopped)

* Performance merely important ⇒ use HLL (pay 15%, memory)



With garbage collection, additional memory usage is to be expected, but I do wonder if the performance could be improved upon with further optimization.
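Tangentially, the "↓ mem. budget, ↑ GC cost" line in that slide is something you can poke at directly in ordinary user-space Go via GOGC / debug.SetGCPercent. Nothing Biscuit-specific here; the workload and the GOGC values below are made up, just to show the shape of the tradeoff:

// gctune.go - user-space sketch of the "lower memory budget => more GC cost" trade.
// GOGC (set here via debug.SetGCPercent) controls how much the heap may grow
// between collections: a lower value means a tighter memory budget and more
// frequent collection cycles for the same allocation work.
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
	"time"
)

// churn allocates lots of short-lived buffers while keeping a small live set,
// giving the collector plenty of garbage to track.
func churn() {
	live := make([][]byte, 64)
	for i := 0; i < 1<<17; i++ {
		live[i%len(live)] = make([]byte, 8192)
	}
	runtime.KeepAlive(live)
}

func main() {
	for _, pct := range []int{50, 100, 400} {
		debug.SetGCPercent(pct)

		var before, after runtime.MemStats
		runtime.ReadMemStats(&before)
		churn()
		runtime.ReadMemStats(&after)

		// Deltas isolate this run; to compare peak memory fairly you would
		// run each GOGC setting in a separate process.
		fmt.Printf("GOGC=%-4d gc cycles=%-4d total pause=%v\n",
			pct,
			after.NumGC-before.NumGC,
			time.Duration(after.PauseTotalNs-before.PauseTotalNs))
	}
}

With a tighter budget (GOGC=50) you should see noticeably more cycles and more total pause time than with GOGC=400, which is the same curve the slide is gesturing at.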

Reply Score: 5

RE: Excellent find Thom!
by Lennie on Tue 9th Oct 2018 15:46 UTC in reply to "Excellent find Thom! "
Lennie Member since:
2007-09-22

The one thing I wonder about is this: I always get the impression that Rust would be a better choice than Go, because Rust doesn't use GC, uses less memory and less CPU, and is faster in a lot of cases.

Reply Score: 2

RE[2]: Excellent find Thom!
by Alfman on Tue 9th Oct 2018 18:12 UTC in reply to "RE: Excellent find Thom! "
Alfman Member since:
2011-01-28

Lennie,

The one thing I wonder about is this: I always get the impression that Rust would be a better choice than Go, because Rust doesn't use GC, uses less memory and less CPU, and is faster in a lot of cases.


I think so too, but right now I'd welcome development in any of them, because I feel change is long overdue.

High-level developers have been more adventurous with new languages like Python, CoffeeScript, Ruby, etc., but most low-level systems programming hasn't embraced change. I'm just as guilty as anyone. The barrier for me is that the current tools are so well established that retooling/rewriting everything to basically get what we already have is a really hard sell. Of course security is a pro for switching, but investing in platforms with an uncertain future, rather limited support, and a need for niche developer skills can be difficult to justify. For better or worse, there are efficiencies to be had by following the big crowds, regardless of whether it's the best choice or not.

Reply Score: 3

RE[3]: Excellent find Thom!
by Lennie on Wed 10th Oct 2018 17:52 UTC in reply to "RE[2]: Excellent find Thom! "
Lennie Member since:
2007-09-22

Well, you can combine Rust, C(++) and, I assume, assembly language in the same project.

Because Mozilla is doing it for their C++ code base:

https://research.mozilla.org/servo-engines/

"Hopefully, Rust is a pretty intuitive language for C++ programmers. Most of the syntax is pretty similar. The big difference (in my experience) is that the sometimes vague concepts of good systems programming are strictly enforced by the compiler. This can be infuriating at first - there are things you want to do, but the compiler won't let you (at least in safe code), and sometimes these things are safe, but you can't convince the compiler of that. However, you'll quickly develop a good intuition for what is allowed. Communicating your own notions of memory safety to the compiler requires some new and sometimes complicated type annotations. But if you have a strong idea of lifetimes for your objects and experience with generic programming, they shouldn't be too tough to learn."

https://github.com/nrc/r4cppp

Reply Score: 3

RE: Excellent find Thom!
by christian on Tue 9th Oct 2018 20:53 UTC in reply to "Excellent find Thom! "
christian Member since:
2005-07-06

We often debate this very topic without having good data to talk about. This should make for good points of discussion.


Wish this paper had come out a few weeks ago ;)


Personally I've been leaning towards achieving memory safety through static language analysis rather than garbage collection, but this is very interesting nonetheless. These performance tradeoffs may be worthwhile in some security contexts.



Not sure static analysis can handle cases such as multiple threads having references to a shared buffer. We can't statically determine which thread will last access the buffer, for example. Sure, reference counting can handle this, but that has its own, albeit minimal, overhead, and is prone to leak bugs in C (forgetting to release buffers leaks memory).




"Should one use HLL for a new kernel?
The HLL worked well for kernel development

* Performance is paramount ⇒ use C (up to 15%)

* Minimize memory use ⇒ use C (↓ mem. budget, ↑ GC cost)

* Safety is paramount ⇒ use HLL (40 CVEs stopped)

* Performance merely important ⇒ use HLL (pay 15%, memory)



With garbage collection, additional memory usage is to be expected, but I do wonder if the performance could be improved upon with further optimization.
"

To be honest, given that the best-case performance advantage of Linux over Biscuit was about 10%, and that was with RAM-disk-based I/O, and given that once an SSD-based test was included the gap narrowed to within 2%, that says to me the performance is perfectly adequate for most use cases.

We've paid a bigger performance hit than that with Spectre and Meltdown mitigation.

Reply Score: 2

RE[2]: Excellent find Thom!
by Alfman on Wed 10th Oct 2018 11:25 UTC in reply to "RE: Excellent find Thom! "
Alfman Member since:
2011-01-28

christian,


Not sure static analysis can handle cases such as multiple threads having references to a shared buffer. We can't statically determine which thread will last access the buffer, for example. Sure, reference counting can handle this, but that has its own, albeit minimal, overhead, and is prone to leak bugs in C (forgetting to release buffers leaks memory).




This is a hand-wavy answer, but the fact that you can manually prove using logic that two threads cannot deadlock or mishandle pointers means that a software algorithm could apply the same logical rules to prove the same conclusions about correctness. Most languages can't do this because the code is assumed to be correct and the programmer's intent about pointer ownership is not conveyed. But if you have this information, then in principle, static analysis can prove code correctness even across threads.

Rust has taken this concept and run with it, all with static compile-time analysis and no runtime overhead. It's one of the reasons I wish it would be used more.

https://blog.rust-lang.org/2015/04/10/Fearless-Concurrency.html

Reply Score: 3

Comment by The123king
by The123king on Tue 9th Oct 2018 07:38 UTC
RE: Comment by The123king
by feamatar on Tue 9th Oct 2018 10:07 UTC in reply to "Comment by The123king"
feamatar Member since:
2014-02-25

I would suggest not to start this again.

Reply Score: 5

RE[2]: Comment by The123king
by The123king on Tue 9th Oct 2018 12:29 UTC in reply to "RE: Comment by The123king"
Fuchsia?
by AndrewZ on Tue 9th Oct 2018 12:44 UTC
AndrewZ
Member since:
2005-11-15

What is Fuchsia written in, C++?

Reply Score: 4

RE: Fuchsia?
by moondevil on Tue 9th Oct 2018 13:21 UTC in reply to "Fuchsia?"
moondevil Member since:
2005-07-08

A mix of C++, Rust, Go and Dart.

The kernel process, most drivers and UI composition engine are in C++.

The TCP/IP stack and the disk management utilities are in Go.

Rust is still only minimally used.

The UI framework is Dart, as it makes use of Flutter.

Reply Score: 5

goroutines vs coroutines
by kwan_e on Tue 9th Oct 2018 16:27 UTC
kwan_e
Member since:
2007-02-18

The Go collector does most of its work during calls to the heap allocator, spreading out this work roughly evenly among calls. Thus goroutines see delays proportional to the amount that they allocate; §8.5 presents measurements of these delays for Biscuit.


Coroutines will be coming to C++, and so will heap elision. They both have working implementations, leading to what some call "negative overhead abstractions".
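Back on the quoted Go behaviour for a moment: you can see the "delays proportional to allocation" effect from ordinary user-space Go (standard runtime, nothing Biscuit-specific) by timing the worst single step of an allocating loop against a non-allocating one. A rough sketch, with made-up sizes and iteration counts:

// allocdelay.go - the collector piggybacks most of its marking work on heap
// allocator calls ("GC assists"), so a loop that allocates sees worse tail
// latency than one that does not.
package main

import (
	"fmt"
	"time"
)

var sink []byte // package-level sink so the allocations are not optimized away

// worstStep runs step() repeatedly and returns the slowest single call.
func worstStep(iters int, step func()) time.Duration {
	var worst time.Duration
	for i := 0; i < iters; i++ {
		start := time.Now()
		step()
		if d := time.Since(start); d > worst {
			worst = d
		}
	}
	return worst
}

func main() {
	var spin uint64
	noAlloc := worstStep(2_000_000, func() {
		spin++ // pure computation: no heap allocation, so no GC assist work
	})
	withAlloc := worstStep(2_000_000, func() {
		sink = make([]byte, 4096) // heap allocation: may be charged a share of GC work
	})
	_ = spin

	fmt.Println("worst non-allocating step:", noAlloc)
	fmt.Println("worst allocating step:    ", withAlloc)
}

The allocating loop is the one that gets charged assist work, which is the kind of per-goroutine delay §8.5 of the paper measures for Biscuit.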

Reply Score: 4

Comment by FlyingJester
by FlyingJester on Tue 9th Oct 2018 17:45 UTC
FlyingJester
Member since:
2016-05-11

The one thing that smells slightly to me is that they changed the code paths in an existing C kernel to be more like what they had written in Go.

Different languages encourage and allow different operations and methodologies, and make different guarantees. I think a better comparison would be a "natural" C kernel and a "natural" Go kernel, both with similar effort put into optimization.

Reply Score: 3

RE: Comment by FlyingJester
by christian on Tue 9th Oct 2018 20:26 UTC in reply to "Comment by FlyingJester"
christian Member since:
2005-07-06

The one thing that smells slightly to me is that they changed the code paths in an existing C kernel to be more like what they had written in Go.


The changes were made to measure the performance difference of Go versus C. They needed apples-to-apples code for that comparison.



Different languages encourage and allow different operations and methodologies, and make different guarantees. I think a better comparison would be a "natural" C kernel and a "natural" Go kernel, both with similar effort put into optimization.


Linux and the BSDs have had decades, and man-millennia of effort, put into optimization. It's not reasonable to expect a new Go kernel to even approach that sort of effort.

Reply Score: 2

RE[2]: Comment by FlyingJester
by FlyingJester on Tue 9th Oct 2018 20:40 UTC in reply to "RE: Comment by FlyingJester"
FlyingJester Member since:
2016-05-11

What I mean is that what is idiomatic in Go is not idiomatic in C, and that will change the results.

In C, you very often tightly pack structures, whereas in a GC'ed language this is not nearly as natural (and in some cases not possible). You often rely on non-exhaustive logic in the small scale and on partial structure initialization, both of which are usually frowned upon in other languages.

I don't know a lot of Go, but many languages also provide very easy-to-use features that are extremely verbose in C (interfaces, exhaustive disjunctions, compiler-verified memory region re-use).

Making the C similar to Go turns the benchmark into "If I make C look like Go, which is faster?", which I very much suspect will not favor C as much as a comparison of idiomatic C with idiomatic Go would.
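On the struct-packing point above: field order does change layout in Go, but unlike C there is no #pragma pack / __attribute__((packed)) escape hatch, so some padding is out of your hands. A quick check, with made-up struct names, on a typical 64-bit target:

// structpad.go - reordering fields reduces padding, but padding cannot be disabled.
package main

import (
	"fmt"
	"unsafe"
)

// loose interleaves small and large fields, forcing alignment padding.
type loose struct {
	a bool  // 1 byte + 7 bytes of padding so b is 8-byte aligned
	b int64 // 8 bytes
	c bool  // 1 byte + 7 bytes of trailing padding
}

// tight puts the large field first, so the bools share one trailing gap.
type tight struct {
	b int64 // 8 bytes
	a bool  // 1 byte
	c bool  // 1 byte + 6 bytes of trailing padding
}

func main() {
	fmt.Println(unsafe.Sizeof(loose{}), unsafe.Sizeof(tight{})) // typically 24 and 16 on amd64
}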

Reply Score: 4

RE[3]: Comment by FlyingJester
by christian on Tue 9th Oct 2018 21:02 UTC in reply to "RE[2]: Comment by FlyingJester"
christian Member since:
2005-07-06

Making the C similar to Go turns the benchmark into "If I make C look like Go, which is faster?", which I very much suspect will not favor C as much as a comparison of idiomatic C with idiomatic Go would.


Idiomatic C often results in things like leaked buffers, buffer overruns and buffers used after they've been freed.

I personally like C the language, but would happily dump many of its idioms, which are based around its limited runtime environment.

What I miss most from other languages, for example, are things like nice exception handling and nice standard library constructs for strings and data collections.

GC got me in hot water the other week, so I won't mention that ;)

Reply Score: 2

RE[4]: Comment by FlyingJester
FlyingJester Member since:
2016-05-11

I never said it was better to use C. But I think you're close to the point I'm trying to make: C encourages different things than Go. Comparing C made to look like Go doesn't make as much sense to me as comparing "normal" C and "normal" Go. The results will be different. Go will probably have fewer bugs, but C will probably be faster and use less memory. I strongly suspect making C look like Go will reduce the gap in both cases.

Reply Score: 3

RE[5]: Comment by FlyingJester
by kwan_e on Wed 10th Oct 2018 03:34 UTC in reply to "RE[4]: Comment by FlyingJester"
kwan_e Member since:
2007-02-18

I strongly suspect making C look like Go will reduce the gap in both cases.


Yes, when you do slow things in C, of course you'll reduce the gap. But a 15% improvement in important parts of an OS kernel is nothing to sneeze at. If a Go kernel requires a significant slowdown to be safe, then the kernel won't be used, meaning it has added nothing to the safety landscape.

Reply Score: 3

RE[6]: Comment by FlyingJester
by christian on Wed 10th Oct 2018 09:13 UTC in reply to "RE[5]: Comment by FlyingJester"
christian Member since:
2005-07-06

"I strongly suspect making C look like Go will reduce the gap in both case.


Yes, when you do slow things in C, of course you'll reduce the gap. But a 15% improvement in important parts of an OS kernel is nothing to sneeze at. If a Go kernel requires a significant slowdown to be safe, then the kernel won't be used, meaning it has added nothing to the safety landscape.
"

Very much depends on your use case. Desktop and HPC will be largely CPU bound, but CPU bound at user level, so not involving the kernel much.

Anything that is storage I/O bound will find the storage system is the bottleneck, and the kernel simply becomes an occasional arbiter of disk I/O.

The main use case I can think of where this Go kernel might be a problem is anything requiring very low latency. So, for example, high throughput network routing. A 15% dip in performance here might add 15% to the bottom line latencies, and a corresponding loss of throughput.

Anywhere else, that performance loss will be eaten by Moore's Law in a couple of months. Otherwise, it's just 15% on top of typically less than 15% kernel time for the other use cases.


I, personally, would happily pay that price on any of my (non low-latency critical) computers.

Reply Score: 1

RE[7]: Comment by FlyingJester
by Alfman on Wed 10th Oct 2018 11:09 UTC in reply to "RE[6]: Comment by FlyingJester"
Alfman Member since:
2011-01-28

christian,

To be honest, given that the best-case performance advantage of Linux over Biscuit was about 10%, and that was with RAM-disk-based I/O, and given that once an SSD-based test was included the gap narrowed to within 2%, that says to me the performance is perfectly adequate for most use cases.


I agree with this; the overhead is probably reasonable for many applications that could benefit from safe pointers.


Very much depends on your use case. Desktop and HPC will be largely CPU bound, but CPU bound at user level, so not involving the kernel much.

Anything that is storage I/O bound will find the storage system is the bottleneck, and the kernel simply becomes an occasional arbiter of disk I/O.



You are right: the kernel overhead only affects overall performance in proportion to the amount of time spent in the kernel. If the kernel is using 5% of the CPU, then 10-15% of that 5% represents less than 1% overhead overall. It obviously makes sense to have a fast kernel, but we also need to consider the overall context of what it's being used for.


The main use case I can think of where this Go kernel might be a problem is anything requiring very low latency. So, for example, high throughput network routing. A 15% dip in performance here might add 15% to the bottom line latencies, and a corresponding loss of throughput.


The paper said this too. Obviously, if you never collect the garbage, there is zero overhead; the main time overhead occurs during a collection cycle. I do wonder what can be done to address the latency problem. One idea is to use a separate CPU to perform the garbage collection asynchronously. I did a search for this...

https://stackoverflow.com/questions/5684306/possible-pitfalls-of-thi...

The main source of latency comes from "stop the world" GC and sync barriers. Just now I was giving some thought to how a GC could perform its work asynchronously, without stopping execution, by locking the pages so that program code can run concurrently and conflicts are detected and resolved the moment the program tries to access the objects being scanned. This should produce very little latency. Well, it turns out someone else has already thought of that. Here's a paper on a pauseless GC that tries to do exactly this, which is very interesting!

http://www.usenix.org/events/vee05/full_papers/p46-click.pdf
Table 1 shows the worst-case transaction times. The Pauseless algorithm's worst-case transaction time of 26ms is over 45 times better than the next JVM, BEA's parallel collector. Average transaction times are remarkably similar given the wide variation in hardware used.





Anywhere else, that performance loss will be eaten by Moore's Law in a couple of months. Otherwise, it's just 15% on top of typically less than 15% kernel time for the other use cases.


Moore's Law died for single-threaded performance several years back. We keep adding more transistors, but in practice this translates into more cores, which might not do as much to counteract increased kernel overhead occurring on every core.


I've stated my opinion on garbage collectors before: I prefer the deterministic behavior of explicit frees, which often go hand in hand with destructors. But it's still an interesting problem to me, and it's fun to have these kinds of discussions!

Reply Score: 3

RE[7]: Comment by FlyingJester
by kwan_e on Wed 10th Oct 2018 18:12 UTC in reply to "RE[6]: Comment by FlyingJester"
kwan_e Member since:
2007-02-18

The main use case I can think of where this Go kernel might be a problem is anything requiring very low latency. So, for example, high throughput network routing. A 15% dip in performance here might add 15% to the bottom line latencies, and a corresponding loss of throughput.


Ironically, that's one of the many systems where you'd want the most security, which brings us back to square one. The systems where we want the most security are also the systems where performance is paramount, which is why these languages and their security bugs go hand in hand. Software engineering is about information hiding, but performance requires breaking through some abstractions to treat things as they really are.

Anywhere else, that performance loss will be eaten by Moore's Law in a couple of months. Otherwise, it's just 15% on top of typically less than 15% kernel time for the other use cases.


This attitude is why end-user software still hasn't improved in efficiency for years, as discussed many times on OSNews. A wise person once said: "all computers wait at the same speed". Moore's Law doesn't really help, since software also increases in complexity over those same months.

I, personally, would happily pay that price on any of my (non low-latency critical) computers.


How about your battery-powered handheld computers? 15% of their charge being used to clean up garbage?

Reply Score: 3