Your experiment is pretty convincing. :-) Especially when considering your original suggestion to use a database as a workload. For this application, requests for individual database records are certainly much smaller than 10 KiB.
But your experiment also makes another point quite clear: the effectiveness of the Linux block cache. A throughput of 3 GiB/sec is quite nice for accessing a disk. ;-) I think that adding a block-cache component to Genode would be the most valuable performance improvement at the current stage. There is actually a topic in our issue tracker, but nobody is actively working on it at the moment:
https://github.com/genodelabs/genode/issues/113
"I imagine you just ignore the offset that gets passed for files opened in append mode?"
Almost. The file-system interface does not distinguish between modes when opening a file, but there is an append operation that can be applied to an open file by specifying ~0 as the seek position. For reference, here is the interface:
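To illustrate the idea of a sentinel seek position, here is a minimal sketch in C. It is not Genode's interface: toy_file, toy_write and TOY_SEEK_APPEND are hypothetical names, and the "file" is just a small in-memory buffer. The only point it shows is that the offset travels with every write and that an all-ones offset is resolved to the current end of file by the side that owns the file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sentinel: an all-ones seek offset requests an append. */
#define TOY_SEEK_APPEND (~(uint64_t)0)

enum { TOY_CAPACITY = 64 };

/* Toy in-memory "file" standing in for a node owned by a file-system server. */
struct toy_file {
    char     data[TOY_CAPACITY];
    uint64_t size;
};

/*
 * Write len bytes at the seek offset supplied by the caller. No per-client
 * position is stored here; the sentinel is resolved to the current end of
 * file at the time of the write. Offsets beyond the end of file and writes
 * that do not fit are rejected.
 * Returns the number of bytes written, or 0 on rejection.
 */
static size_t toy_write(struct toy_file *f, const char *src, size_t len,
                        uint64_t seek)
{
    assert(f != NULL && src != NULL);

    uint64_t pos = (seek == TOY_SEEK_APPEND) ? f->size : seek;

    if (pos > f->size || len > TOY_CAPACITY - pos)
        return 0;

    memcpy(f->data + pos, src, len);
    if (pos + len > f->size)
        f->size = pos + len;
    return len;
}
```

With this sketch, two writers that each pass TOY_SEEK_APPEND for "1" and then "2" end up with "1122" in the buffer, whereas a writer that passes an explicit offset overwrites whatever is stored there.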
https://github.com/genodelabs/genode/blob/master/os/include/file_sys...
"I'm a little surprised that even with a 10K buffer size, there's still a very noticeable half-second difference with the lseek syscall approach on linux. I suspect Genode-Noux would exhibit similar trends."
I agree. Thanks for investigating. I will keep your findings in the back of my head. Once we stumble over a pread/pwrite-heavy Noux application with suffering performance, getting rid of superfluous lseek calls looks like a worthwhile consideration.




nfeske,
"Your example does indeed subvert the locking scheme. But as Genode does not provide fork(), it wouldn't work anyway. ;-)"
Shows what I know.
"The file-system interface is designed such that the seek offset is passed from the client to the file system with each individual file-system operation."
Makes sense. How do you handle files with the append flag?
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>
int main(void) {
    int f = open("xyz", O_WRONLY | O_APPEND | O_CREAT, S_IRUSR | S_IWUSR);
    sleep(1);
    write(f, "1", 1);
    sleep(1);
    write(f, "2", 1);
    close(f);
    return 0;
}
Running two instances of this program simultaneously on Linux produces "1122". However, if libc uses a process-local file offset, then it would probably output "12", because each instance would write at its own offsets 0 and 1. I imagine you just ignore the offset that gets passed for files opened in append mode?
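For anyone who wants to reproduce the difference without racing two processes, here is a small sketch. It opens the same file through two descriptors inside one process and performs the interleaving that the sleep() calls are meant to provoke. The helper name interleaved_writes and its parameters are mine, purely for illustration. Passing O_APPEND models the kernel-side append; passing 0 models two writers that each track their own offset.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Open `path` twice with the given extra flags, write "1" through both
 * descriptors, then "2" through both, and copy the resulting file content
 * into `out` (NUL-terminated). The file is removed afterwards.
 * Returns the content length, or -1 on error.
 */
static ssize_t interleaved_writes(const char *path, int extra_flags,
                                  char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);

    int a = open(path, O_WRONLY | O_CREAT | O_TRUNC | extra_flags,
                 S_IRUSR | S_IWUSR);
    if (a < 0)
        return -1;
    int b = open(path, O_WRONLY | extra_flags);
    if (b < 0) {
        close(a);
        return -1;
    }

    /* Writer a and writer b take turns: 1, 1, 2, 2. */
    int ok = write(a, "1", 1) == 1 && write(b, "1", 1) == 1 &&
             write(a, "2", 1) == 1 && write(b, "2", 1) == 1;
    close(a);
    close(b);

    /* Read the result back through a fresh descriptor. */
    ssize_t n = -1;
    int r = ok ? open(path, O_RDONLY) : -1;
    if (r >= 0) {
        n = read(r, out, out_size - 1);
        close(r);
    }
    unlink(path);
    if (n >= 0)
        out[n] = '\0';
    return n;
}
```

Calling it with O_APPEND leaves four bytes in the file, while calling it with 0 leaves only two, since the second descriptor overwrites what the first one wrote.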
"To prevent falling into the premature-optimization trap, I'd first try to obtain the performance profile of a tangible workload."
A simple test here on an arbitrary Linux system:
char buffer[1000];  /* size varied per run: 1, 1000, 10000 */
int i;
int f = open("xyz", O_RDWR | O_APPEND | O_CREAT, S_IRUSR | S_IWUSR);
for (i = 0; i < 1000000; i++) {
    /* TEST 1: save the offset, seek, read, restore the offset
    off_t old = lseek(f, 0, SEEK_CUR);
    lseek(f, 10, SEEK_SET);
    read(f, buffer, sizeof(buffer));
    lseek(f, old, SEEK_SET);
    */
    /* TEST 2: positional read, file offset untouched
    pread(f, buffer, sizeof(buffer), 10);
    */
}
I recorded the fastest time of 3 runs...
buffer size=1
TEST 1 - seek + read = 1.072s
TEST 2 - pread = 0.663s
buffer size=1000
TEST 1 - seek + read = 1.254s
TEST 2 - pread = 0.882s
buffer size=10000
TEST 1 - seek + read = 3.636s
TEST 2 - pread = 3.183s
I'm a little surprised that even with a 10K buffer size, there's still a very noticeable half-second difference with the lseek syscall approach on linux. I suspect Genode-Noux would exhibit similar trends. But does it matter? That depends on who we ask. Sometimes design factors are worth some additional overhead. There are always trade-offs.
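In case anyone wants to repeat the comparison, here is a sketch of the two variants as self-timing functions. The names time_seek_read and time_pread are hypothetical, and timing with clock_gettime(CLOCK_MONOTONIC) is an assumption on my part rather than the method behind the numbers above, so absolute results will differ from system to system.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for pread() and clock_gettime() */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * TEST 1: emulate a positional read with lseek + read, restoring the
 * original file offset afterwards (four syscalls per iteration).
 * Returns elapsed seconds, or -1.0 if a read fails.
 */
static double time_seek_read(int fd, char *buf, size_t size, off_t pos,
                             long iterations)
{
    assert(buf != NULL && iterations > 0);
    double start = now_seconds();
    for (long i = 0; i < iterations; i++) {
        off_t old = lseek(fd, 0, SEEK_CUR);
        lseek(fd, pos, SEEK_SET);
        if (read(fd, buf, size) < 0)
            return -1.0;
        lseek(fd, old, SEEK_SET);
    }
    return now_seconds() - start;
}

/*
 * TEST 2: the same read with pread (one syscall per iteration); the file
 * offset is never touched.
 * Returns elapsed seconds, or -1.0 if a read fails.
 */
static double time_pread(int fd, char *buf, size_t size, off_t pos,
                         long iterations)
{
    assert(buf != NULL && iterations > 0);
    double start = now_seconds();
    for (long i = 0; i < iterations; i++) {
        if (pread(fd, buf, size, pos) < 0)
            return -1.0;
    }
    return now_seconds() - start;
}
```

Both functions take an already opened descriptor, so the same file and buffer size can be passed to each, for example with pos set to 10 and iterations set to 1000000 as in the test above.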