Linked by Elad Lahav on Thu 18th Feb 2016 19:27 UTC
QNX

A mutex is a common type of lock used to serialize concurrent access by multiple threads to shared resources. While support for POSIX mutexes in the QNX Neutrino Realtime OS dates back to the early days of the system, this area of the code has seen considerable changes in the last couple of years.

Compliments!
by uridium on Thu 18th Feb 2016 20:39 UTC
uridium
Member since:
2009-08-20

Most interesting story here in years. Well done!

I'm afraid to ask, but does this herald a return to the days of stories with real substance to them instead of zOMGF! NEW PHONE! Widget is clickable now!!1

Seriously.. great story. Finally something with meat again and no froth or bubble in sight.

Reply Score: 14

Awesome
by Morgan on Thu 18th Feb 2016 21:37 UTC
Morgan
Member since:
2005-06-29

Most of this is over my head (I'm not a developer by any stretch of the imagination) but this was well written, easy to read and a wealth of information about an old favorite OS. Bravo!

Reply Score: 6

Cool article!
by tylerdurden on Thu 18th Feb 2016 23:02 UTC
tylerdurden
Member since:
2009-03-17

A piece about Operating Systems, finally!

I have a question: is the Neutrino internal threading model based on pthreads, or does it have a different native model onto which they get mapped?


PS. I used to develop on QNX. I really miss the pervasive message passing, basically its programming model.

Reply Score: 2

RE: Cool article!
by elahav on Thu 18th Feb 2016 23:57 UTC in reply to "Cool article!"
elahav Member since:
2009-05-28

For the most part, the pthread API is implemented as a thin layer on top of the native thread API, e.g., pthread_create() does very little other than invoking the ThreadCreate() kernel call.

Reply Score: 2

RE[2]: Cool article!
by tylerdurden on Fri 19th Feb 2016 00:08 UTC in reply to "RE: Cool article!"
tylerdurden Member since:
2009-03-17

Thanks for the reply.

Follow-up: is the native thread API exposed to the user? (As I said, I programmed for QNX a looong time ago, back when it still used CodeWarrior.)

Also, are there any plans to support QNX on low end ARM systems like Raspberry Pi?

Reply Score: 2

RE[3]: Cool article!
by elahav on Fri 19th Feb 2016 01:17 UTC in reply to "RE[2]: Cool article!"
elahav Member since:
2009-05-28

Yes, the kernel calls are exposed and documented, and you can use them directly. For most pthread functions, however, there is little reason to do that (and, in fact, for mutexes, you will get sub-optimal results from going directly to the kernel calls, as explained in the article).

For low-end systems, you should be able to run QNX on any ARMv7 or x86 system with an MMU, so Pi2 is feasible, but not the older Pi (ARMv6 processor) or Cortex-M-based systems (no MMU).

Reply Score: 3

RE[4]: Cool article!
by tylerdurden on Fri 19th Feb 2016 01:52 UTC in reply to "RE[3]: Cool article!"
tylerdurden Member since:
2009-03-17

Again, thanks for the answers, much appreciated.

Reply Score: 2

Mutex behavior
by Alfman on Fri 19th Feb 2016 00:10 UTC
Alfman
Member since:
2011-01-28

In a recent project, I encountered bugs due to undefined conditions of the pthread mutex.

I needed to capture a mutex, hold it across thread invocation, and then release it. Turns out that there's no defined behavior in this scenario.

http://pubs.opengroup.org/onlinepubs/007908799/xsh/pthread_mutex_lo...

If a thread attempts to unlock a mutex that it has not locked or a mutex which is unlocked, undefined behaviour results.


So I opted to use the POSIX semaphore functions sem_post and sem_wait instead, which behave as expected even under edge conditions.

I wish I could change one thing about POSIX semaphores: that one could wait on any number of events instead of decrementing them one at a time. If 20 threads call sem_post(semaphore), the blocking thread has to call sem_wait 20 times just to decrement the counter back to zero.

while(true) {
    sem_wait(semaphore);
    // process thread events 20 times even if only the first iteration is productive
}


Turns out Linux kernel devs recognized this deficiency and addressed it with eventfd, although it's specific to Linux and works through read/write calls on a file descriptor rather than through the semaphore API.
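As a rough illustration of the eventfd behavior mentioned above, the sketch below posts a number of events one at a time and then collects them with a single read. It is Linux-only, the function name post_then_drain is made up for this example, and the descriptor is opened non-blocking so that a read on an empty counter fails instead of waiting.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Post `count` events one at a time, then collect them with one read().
 * Returns the number of events collected, or 0 on error or if none
 * were pending. The name is hypothetical, used only for this sketch. */
uint64_t post_then_drain(unsigned count)
{
    /* Counter starts at 0; EFD_NONBLOCK makes an empty read fail with
     * EAGAIN instead of blocking. */
    int fd = eventfd(0, EFD_NONBLOCK);
    if (fd < 0)
        return 0;

    for (unsigned i = 0; i < count; i++) {
        uint64_t one = 1;              /* each write adds to the counter */
        if (write(fd, &one, sizeof one) != (ssize_t)sizeof one) {
            close(fd);
            return 0;
        }
    }

    /* A single read returns the accumulated total and resets it to 0. */
    uint64_t pending = 0;
    if (read(fd, &pending, sizeof pending) != (ssize_t)sizeof pending)
        pending = 0;

    close(fd);
    return pending;
}
```

Calling post_then_drain(20) performs 20 writes but only one read, which is the difference from the sem_wait loop above.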

Man, software development entails so many little factoids, if I had to start over, I don't even want to think about it ;)

Reply Score: 2

RE: Mutex behavior
by elahav on Fri 19th Feb 2016 01:27 UTC in reply to "Mutex behavior"
elahav Member since:
2009-05-28

If you are waiting on an event that can be generated by any, or even multiple, threads, then semaphores are not what you are looking for. Perhaps a condition variable?
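One possible shape for the condition-variable approach, as a sketch only: a counter protected by a mutex, where each producer increments the counter and the consumer takes every pending event in one step. The struct and function names (event_counter, events_post, events_wait_all, demo_drain_once) are invented for this example.

```c
#include <pthread.h>

/* Hypothetical event counter: producers add to it, one consumer drains it. */
struct event_counter {
    pthread_mutex_t lock;
    pthread_cond_t  nonzero;
    unsigned        pending;
};

void events_init(struct event_counter *ec)
{
    pthread_mutex_init(&ec->lock, NULL);
    pthread_cond_init(&ec->nonzero, NULL);
    ec->pending = 0;
}

/* Producer side: record one event and wake the consumer. */
void events_post(struct event_counter *ec)
{
    pthread_mutex_lock(&ec->lock);
    ec->pending++;
    pthread_cond_signal(&ec->nonzero);
    pthread_mutex_unlock(&ec->lock);
}

/* Consumer side: block until at least one event is pending, then take
 * all of them at once and return how many there were. */
unsigned events_wait_all(struct event_counter *ec)
{
    pthread_mutex_lock(&ec->lock);
    while (ec->pending == 0)           /* loop guards against spurious wakeups */
        pthread_cond_wait(&ec->nonzero, &ec->lock);
    unsigned n = ec->pending;
    ec->pending = 0;
    pthread_mutex_unlock(&ec->lock);
    return n;
}

static void *poster_thread(void *arg)
{
    events_post(arg);
    return NULL;
}

/* Demo: start `nthreads` posters (at most 32), join them, drain once. */
unsigned demo_drain_once(unsigned nthreads)
{
    pthread_t threads[32];
    struct event_counter ec;
    unsigned started = 0;

    if (nthreads > 32)
        nthreads = 32;
    events_init(&ec);
    while (started < nthreads &&
           pthread_create(&threads[started], NULL, poster_thread, &ec) == 0)
        started++;
    for (unsigned i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    /* Avoid waiting forever if no poster could be started. */
    return started ? events_wait_all(&ec) : 0;
}
```

With 20 posters, the consumer makes a single call to events_wait_all and receives the whole count, at the cost of each poster briefly taking the mutex.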

Also, note that semaphores and mutexes, while sometimes confused for one another, achieve completely different tasks: the first are a synchronization (i.e., temporal) mechanism, the second a data-protection (i.e., spatial) mechanism. Search for "Concurrent Urban Legends" by Peter Buhr and Ashif Harji.

Interestingly, your original goal, that of explicitly handing over a mutex from one thread to another without releasing it is something that has come up several times in the past. It's not hard to implement, I'm just not sure that the semantics of such an operation are fully understood.

Reply Score: 1

RE[2]: Mutex behavior
by Alfman on Fri 19th Feb 2016 02:52 UTC in reply to "RE: Mutex behavior"
Alfman Member since:
2011-01-28

elahav,

If you are waiting on an event that can be generated by any, or even multiple, threads, then semaphores are not what you are looking for. Perhaps a condition variable?


No actually, the purpose of the mutex was really just to protect a data structure from concurrent access.

Interestingly, your original goal, that of explicitly handing over a mutex from one thread to another without releasing it is something that has come up several times in the past. It's not hard to implement, I'm just not sure that the semantics of such an operation are fully understood.



Once the error was tracked to the mutex, it wasn't hard to fix: just replace the mutex with a posix semaphore.

sem_init(&sem, 0, 1); // process-private, initial count 1
...
sem_wait(&sem); // lock
// equivalent to mutex
sem_post(&sem); // unlock (even in new thread)


It's semantically identical to the mutex, with the benefit of having defined behavior across threads.

Edited 2016-02-19 02:54 UTC

Reply Score: 2

RE[3]: Mutex behavior
by Alfman on Fri 19th Feb 2016 07:09 UTC in reply to "RE[2]: Mutex behavior"
Alfman Member since:
2011-01-28

elahav,

If you are waiting on an event that can be generated by any, or even multiple, threads, then semaphores are not what you are looking for. Perhaps a condition variable?



Oh sorry, you were talking about my other example.
Yes a pthread condition variable might work there, there are many options really.


AFAIK pthread condition variables only support synchronization through a mutex, which can potentially block the sender when it needs to change the predicate associated with the condition variable. The block would only be momentary, but with asynchronous programming the main thread should not enter a wait state to wait on locks in other threads - it goes against the design. I felt a spinlock was the best choice here to protect the thread data queue. Both Linux eventfd and semaphores (and even pipes) allow us to send events without risk of putting the main thread in a blocked state.

Reply Score: 2

RE[3]: Mutex behavior
by elahav on Fri 19th Feb 2016 16:13 UTC in reply to "RE[2]: Mutex behavior"
elahav Member since:
2009-05-28


sem_init(&sem, 0, 1); // process-private, initial count 1
...
sem_wait(&sem); // lock
// equivalent to mutex
sem_post(&sem); // unlock (even in new thread)


This is not equivalent to a mutex. While a count-1 semaphore behaves similarly to a mutex, there are subtle, yet important, differences, resulting from a semaphore not having an owner:
1. Any thread can signal the semaphore, leading to a loss of mutual exclusion (which will go unnoticed until data corruption is observed).
2. No way to avoid priority inversion.

You may not care about these problems in this particular case, but it is important to understand the semaphore/mutex divide.
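The first point can be shown in a few lines. The sketch below is single-threaded on purpose so that the outcome is deterministic: one "holder" takes the semaphore, a stray sem_post is issued by code that never took it, and a second sem_trywait then succeeds while the first holder is still inside its critical section. The function name is made up for this example.

```c
#include <semaphore.h>

/* Returns 1 if a second "lock" succeeds while the first is still held,
 * 0 if it is refused, and -1 if the semaphore could not be created. */
int stray_post_breaks_exclusion(void)
{
    sem_t sem;
    if (sem_init(&sem, 0, 1) != 0)     /* process-private, initial count 1 */
        return -1;

    sem_wait(&sem);                    /* holder A "locks": count 1 -> 0 */

    /* Stray post from code that never locked. A semaphore has no owner,
     * so nothing rejects this: count 0 -> 1. */
    sem_post(&sem);

    /* Holder B now "locks" while A has not released: exclusion is lost. */
    int second_holder = (sem_trywait(&sem) == 0);

    sem_destroy(&sem);
    return second_holder;
}
```

No call here reports an error, which is why this kind of bug tends to surface only as corrupted data later on.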

Reply Score: 1

RE[4]: Mutex behavior
by Alfman on Fri 19th Feb 2016 16:58 UTC in reply to "RE[3]: Mutex behavior"
Alfman Member since:
2011-01-28

elahav,

1. Any thread can signal the semaphore, leading to a loss of mutual exclusion (which will go unnoticed until data corruption is observed).


Notice in the link I provided earlier that the pthread mutex behavior is undefined for the conditions where one would try to signal from another thread. It means that no code which correctly uses the mutex is allowed to attempt to signal the mutex from another thread. The fact that the semaphore defines and/or behaves differently for those conditions doesn't make it any less suitable as a mutex in code which makes no assumptions about those conditions. For all conditions that are defined for the mutex, the semaphore works the same way.

In other words, I think the semaphore pseudocode I gave above would be a valid implementation of a pthread mutex.

Note that I agree with you that a mutex is generally the accepted mechanism to use to serialize access to a data structure, but I'm not able to come up with examples where a semaphore doesn't work...can you come up with any counter examples?

2. No way to avoid priority inversion.


Even with a mutex, a low priority thread can block a high priority thread. Am I misunderstanding you?

Edited 2016-02-19 17:03 UTC

Reply Score: 2

RE[5]: Mutex behavior
by elahav on Fri 19th Feb 2016 17:58 UTC in reply to "RE[4]: Mutex behavior"
elahav Member since:
2009-05-28

Notice in the link I provided earlier that the pthread mutex behavior is undefined for the conditions where one would try to signal from another thread


It may be undefined in the spec, but it is certainly defined if you are using error-checking mutexes, which is the default on QNX and optional on Linux.
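For reference, this is roughly what the error-checking behavior looks like with the standard PTHREAD_MUTEX_ERRORCHECK type, requested explicitly here so that the sketch does not depend on any platform default. The helper names are invented for the example; the main thread locks the mutex and a second thread attempts the unlock.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

struct unlock_attempt {
    pthread_mutex_t *mutex;
    int              rc;               /* result of the foreign unlock */
};

static void *unlock_from_other_thread(void *arg)
{
    struct unlock_attempt *a = arg;
    a->rc = pthread_mutex_unlock(a->mutex);   /* caller is not the owner */
    return NULL;
}

/* Lock an error-checking mutex here, try to unlock it from another
 * thread, and return what that unlock call reported (-1 if the helper
 * thread could not be started). */
int foreign_unlock_result(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t mutex;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&mutex, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&mutex);        /* this thread becomes the owner */

    struct unlock_attempt attempt = { &mutex, 0 };
    pthread_t other;
    if (pthread_create(&other, NULL, unlock_from_other_thread, &attempt) == 0)
        pthread_join(other, NULL);
    else
        attempt.rc = -1;

    pthread_mutex_unlock(&mutex);      /* the owner still holds it; release */
    pthread_mutex_destroy(&mutex);
    return attempt.rc;
}
```

With an error-checking mutex the foreign unlock is refused with EPERM and the mutex stays locked by its owner, so the misuse is reported rather than left undefined.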

Even with a mutex, a low priority thread can block a high priority thread. Am I misunderstanding you?


Read the article ;-) With priority inheritance (again, default on QNX, optional on Linux), the low priority thread will be boosted to the priority of the highest waiter until it relinquishes the mutex.
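On systems where priority inheritance is optional, it is requested per mutex through the mutex attributes. A minimal sketch using the POSIX pthread_mutexattr_setprotocol() call follows; the wrapper names are hypothetical, and whether the protocol is supported depends on the platform, so the return code should be checked.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>

/* Initialise *mutex with the priority-inheritance protocol.
 * Returns 0 on success or an errno-style code on failure. */
int init_prio_inherit_mutex(pthread_mutex_t *mutex)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;

    /* While a higher-priority thread waits, the owner runs at that
     * waiter's priority until it unlocks. */
    rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (rc == 0)
        rc = pthread_mutex_init(mutex, &attr);

    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Create such a mutex, lock and unlock it once, and clean up. */
int prio_inherit_roundtrip(void)
{
    pthread_mutex_t mutex;
    int rc = init_prio_inherit_mutex(&mutex);
    if (rc != 0)
        return rc;

    rc = pthread_mutex_lock(&mutex);
    if (rc == 0)
        rc = pthread_mutex_unlock(&mutex);

    pthread_mutex_destroy(&mutex);
    return rc;
}
```

The boost works only because a mutex has an owner to boost, which is the part a semaphore cannot provide.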

Reply Score: 1

RE[6]: Mutex behavior
by Alfman on Fri 19th Feb 2016 19:33 UTC in reply to "RE[5]: Mutex behavior"
Alfman Member since:
2011-01-28

elahav,

It may be undefined in the spec, but it is certainly defined if you are using error-checking mutexes, which is the default on QNX and optional on Linux.


It's true error checking extensions can help catch code that triggers undefined mutex behavior. One could implement the same optional error checking with semaphores too, but I think it is unfortunate that useful scenarios were left undefined in the first place. Now we're forced to "reinvent the wheel" just to work around the cases for which a mutex is not properly defined. Alas, it is what it is.


Read the article ;-) With priority inheritance (again, default on QNX, optional on Linux), the low priority thread will be boosted to the priority of the highest waiter until it relinquishes the mutex.


I did read it, thank you for the article by the way.

This is a good point, you can't really bump the priority of unequal threads when the kernel doesn't know which thread will release a lock. Maybe there should be a way to tell it.

Reply Score: 2

RE[7]: Mutex behavior
by dpJudas on Fri 19th Feb 2016 19:56 UTC in reply to "RE[6]: Mutex behavior"
dpJudas Member since:
2009-12-10

Now we're forced to "reinvent the wheel" just to work around the cases for which a mutex is not properly defined. Alas, it is what it is.

It is left without a "proper definition" on purpose: that keeps more optimal implementations possible, without having to support weird usages such as "lock mutex in one thread and unlock in another".

Exactly why you had to lock it in one thread and release it in another is still not clear (to me), but I'm guessing there is a good chance that some of the other pthread primitives available would have been a better fit for the problem at hand.

Reply Score: 2

try out qnx
by Dawgmatix on Fri 19th Feb 2016 03:03 UTC
Dawgmatix
Member since:
2008-09-28

What's the best way to try out qnx these days? Is it available for download?

Reply Score: 2

RE: try out qnx
by judgen on Fri 19th Feb 2016 05:03 UTC in reply to "try out qnx"
judgen Member since:
2006-07-12

Yes, the desktop OS along with the dev SDK is available for download at the QNX website. However, paid support costs an arm and a leg (or at least it did the last time I tried it), so you are going to depend on written documentation and the community for help unless you want to pay.

Reply Score: 5

What if?
by flyingrobots on Mon 22nd Feb 2016 20:36 UTC
flyingrobots
Member since:
2010-09-30

What if I told you that for most cases you don't need them? Would you believe me? What if I also told you that you would see dramatic increases in performance for those areas of code that had high levels of contention?

QNX only....

Reply Score: 1

RE: What if?
by elahav on Tue 23rd Feb 2016 14:36 UTC in reply to "What if?"
elahav Member since:
2009-05-28

It's hard to say, because you haven't provided any details.
In general, there are a few ways to avoid or reduce the use of mutexes, but, as far as I know, there is always a trade-off:
1. Avoid data sharing, e.g., with a pure event-driven design or by a strict assignment of tasks to threads. While I often go for such a design myself, it does restrict parallelism, and therefore does not scale well with the number of processors.
2. Solve the data-sharing problem with lock-free data structures or read-modify-write schemes. These are typically harder to implement and are not always applicable.
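As a small example of the second option, a shared counter can be updated with an atomic read-modify-write instead of a mutex. The sketch below uses C11 atomics; the thread and iteration counts and the function names are arbitrary choices for the illustration, and it covers only the simplest case (a single word of shared state).

```c
#include <pthread.h>
#include <stdatomic.h>

#define COUNTER_THREADS    4
#define COUNTER_INCREMENTS 100000

static atomic_ulong shared_hits;

static void *counter_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < COUNTER_INCREMENTS; i++) {
        /* Atomic read-modify-write: no lock is taken and no update is
         * lost. Relaxed ordering is enough because only the final total
         * is read, after the threads have been joined. */
        atomic_fetch_add_explicit(&shared_hits, 1, memory_order_relaxed);
    }
    return NULL;
}

/* Run the workers and return the final counter value. */
unsigned long count_without_mutex(void)
{
    pthread_t threads[COUNTER_THREADS];
    int started = 0;

    atomic_store(&shared_hits, 0);
    while (started < COUNTER_THREADS &&
           pthread_create(&threads[started], NULL, counter_worker, NULL) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    return atomic_load(&shared_hits);
}
```

Anything larger than a single word, such as a queue or a linked structure, quickly becomes much harder to get right this way, which is the trade-off described in the list above.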

Nevertheless, the issue is a moot one as far as the article goes: 99% of the multi-threaded code out there uses mutexes, which means that the OS must support this mechanism.

Reply Score: 1