Implementing Mutexes in the QNX Neutrino Realtime OS

Guest post by Elad Lahav 2016-02-18 QNX 28 Comments

A mutex is a common type of lock used to serialize concurrent access by multiple
threads to shared resources. While support for POSIX mutexes in the QNX Neutrino
Realtime OS dates back to the early days of the system, this area of the code
has seen considerable changes in the last couple of years.

This article is written by Elad Lahav, Software Developer, QNX Software Systems Limited.

Introduction

Multi-threaded programming allows software developers to break up a process into
multiple, concurrent streams of execution. There are several good (and various
bad) reasons for using multiple threads within a program:

Achieving true parallelism by executing different threads on different
processor cores.
Adhering to real-time constraints with priority-based pre-emption.
Simulating asynchronous I/O in the presence of blocking calls.

Almost every multi-threaded program must deal with the problem of sharing data
among the threads. Access to data by different threads needs to be serialized in
order to preserve data integrity. Even the most trivial of data
manipulations, such as incrementing a variable by one and reading its value, is
prone to race conditions in the presence of multiple threads that can access
that data. These race conditions can, and often will, lead to incorrect
execution.

There are various techniques to ensure serialization, which is the act of
governing access to data by multiple threads. Most modern processors provide
atomic operations that allow for integrity-preserving manipulation of data, such
as test-and-set and compare-and-exchange. These atomic operations, however, are
limited to well-defined data types, typically those that can fit within a single
register. Nevertheless, such operations are useful as building blocks
for other serialization mechanisms, most notably for various lock implementations.

A spin-lock is the simplest lock, requiring only one shared bit. A thread
that wishes to access a piece of shared data can check whether the bit is set,
and, if not, acquire the lock by setting it. This is done in a loop that spins as
long as the bit is set, i.e., as long as the lock is held. The operation of
testing whether the bit is set and setting it if not itself needs to be atomic,
in order to prevent race conditions among threads contending for the lock.

While spin-locks are indispensable in certain scenarios (mostly on SMP systems
in contexts that cannot block, such as interrupt services routines), these locks
are rarely used in (correct) multi-threaded programs. Since the OS scheduler is
unaware that the thread is waiting for a lock to be released, it may take time
before the spinning thread is pre-empted and the thread currently holding the
lock is scheduled. On single-processor systems, if the spinning thread has a
higher priority than the current lock owner, the former can end up spinning
forever. Spinning is also bad for power consumption.

Most operating systems provide a locking primitive that allows a thread to block
while waiting for the lock to become available. Such a lock is commonly referred
to as a mutex. Blocking a thread requires that the OS
scheduler be aware of the lock, avoid scheduling a thread while it is held by
another thread, and re-schedule the thread once the lock can be acquired. POSIX
defines the pthread_mutex_t type and a set of functions to initialize,
lock, unlock and destroy an object of this type. The implementation of the mutex
data type and functions is the subject of the following sections.

Mutexes in the QNX Neutrino OS

The QNX Realtime OS is a POSIX-certified operating system and, as
such, provides a complete implementation of the POSIX Threads (pthread)
API, including mutexes. The default mutex object in the QNX Neutrino OS (i.e., one that
is statically initialized with PTHREAD_MUTEX_INITIALIZER or via a call
to pthread_mutex_init() with a NULL mutex attribute pointer) is:

Fast: Operations on uncontested locks do not require any kernel calls.
Light weight: The cost of declaring a mutex is 8 bytes in the declaring process’ address space. No kernel resources are required unless a thread is blocked on a locked mutex.
Priority inheriting: The default mutex implements the PTHREAD_PRIO_INHERIT protocol for priority inheritance. Other protocols
(PTHREAD_PRIO_PROTECT and PTHREAD_PRIO_NONE) are also
supported.
Non-recursive: An attempt to lock a mutex more than once by the same thread results in an error.
Shared: A mutex placed in memory that is visible to multiple processes (e.g., memory obtained by mapping a shared-memory object) can be operated on by these processes as a single, common, lock.

The implementation of mutexes is split between the C library and the
QNX Neutrino micro-kernel.

C Library Implementation

The C library provides the POSIX-defined functions for handling mutexes,
(pthread_mutex_init(), pthread_mutex_lock(), etc.), as well as
wrappers for the relevant kernel calls ¹.

User code creates a mutex by declaring a variable of type pthread_mutex_t. As mentioned above, this type is an 8-byte structure,
common to mutexes, semaphores and condition variables. The structure holds the
current owner of the mutex (a system-wide identifier of the thread that had last
locked the mutex successfully), a counter for recursive mutexes and various flag
bits. One bit in the owner field is used to indicate whether other threads are
currently waiting for the mutex. The variable can be initialized statically, or
with a call to pthread_mutex_init().

A call to pthread_mutex_lock() will attempt to lock the mutex by
performing an atomic compare-and-exchange operation on the variable’s owner
field, looking for a current value of 0 (no owner). If the operation succeeds,
then the owner field is updated with the calling thread’s system-unique ID, and
the mutex is now considered locked. No kernel intervention is required in this
case. The atomic operation will fail if the owner field has a value other than
0, which can happen in the following cases:

The mutex is locked by this thread.
The mutex is locked by another thread.
There are other threads waiting for the mutex.

The first case is handled by the C library code (recursive mutexes increment
their lock count, non-recursive mutexes return an error). The other two are handled by
invoking the SyncMutexLock() kernel call.

Unlocking the mutex with a call to pthread_mutex_unlock() is again
handled by a compare-and-exchange operation, which attempts to replace the
calling thread’s ID with the value 0. The operation will fail if:

The mutex is locked by another thread.
There are other threads blocked on the mutex.

The first case returns an error, while the second invokes the SyncMutexUnlock() kernel call.

It can be seen from the description above that an uncontested mutex, i.e., a
mutex that is locked and unlocked by the same thread without any other thread
trying to acquire it at the same time, is handled completely in user mode,
without any kernel intervention. The only overhead is that of a function call
and an atomic operation. While not free (the atomic operation impacts the bus
and memory barriers are required after acquiring and before releasing the
mutex), this operation is orders of magnitude cheaper than a call that requires
a trap into the kernel. Nevertheless, since the information stored in the pthread_mutex_t structure is essential to the correct operation of the
mutex, care must be taken within the kernel when the values of the structure are
read and used. For example, the owner field may be in a state of flux until the
bit indicating waiting threads is set, and all processors are aware of
that. Moreover, bad values in this structure written by a faulty or malicious
process should be handled properly. Such values should cause the kernel calls to
return an error or, in the worst case scenario, cause the process that wrote
them to malfunction, without affecting the kernel or other processes.

Kernel Implementation

The two most important kernel calls dealing with mutexes are SyncMutexLock() and SyncMutexUnlock(). As described in the previous
section, these are invoked when the C library is unable to deal with the mutex
by itself. There are other calls to initialize a mutex to non-default
attributes, assign a priority to a priority-ceiling mutex, associate an event
with a locked mutex whose owner dies unexpectedly, and more.

Since an attempt to lock a mutex may lead to the calling thread blocking, the
kernel needs to maintain a list of threads waiting on the mutex. The list is
sorted first by priority and than on a first-come-first-served basis. When a
mutex that is waited on by other mutexes is unlocked (as indicated by a special
bit in the user-mode owner field) the kernel will choose the next thread to wake
up and attempt to lock the mutex.

This design has two important implications:

The list of waiting threads requires a kernel object to serve as its head,
as a user-mode object cannot be used to store kernel pointers.
The kernel needs to be able to identify this object from the user-mode
mutex.

The kernel object is a sync_entry structure which serves as the head of
the waiting threads list. The association between the user-mode pthread_mutex_t object and the kernel’s sync_entry object is
accomplished by a hash table. However, since mutexes in QNX are shared by
default, we cannot use the user-mode object’s virtual address as the key to the
hash table. Instead, the kernel consults the memory manager, which provides a
globally-unique handle for the object. This handle is used as a key to the hash
function.

Other than holding the head of the waiters list, the sync_entry structure
contains pointers that allow it to be linked on two other lists: the hash table
bucket and a list of mutexes locked by a thread.

Traditionally, a sync_entry object was allocated for every mutex upon a
call to pthread_mutex_init() (or the first call to pthread_mutex_lock() for statically-initialized mutexes), and persisted
until the mutex was deleted with a call to pthread_mutex_destroy(). While this approach is usually acceptable, we
have run into cases where too much kernel memory was tied into these
objects (consider a database holding millions of records with a mutex
associated with each record). To overcome the problem, we made the observation
that, in most cases, the kernel object is required only when threads are
actually blocked waiting for the mutex to be released. The implementation was
therefore changed such that a kernel object is allocated when a thread blocks on
it, and is freed as soon as the last waiter is woken up. Such a strategy,
however, opens up the possibility that kernel memory will not be available when
a thread needs to block, which means that a call to lock a mutex can fail for no
fault of the caller. To avoid this, a pool of sync_entry objects is used,
with a new object reserved each time a thread is created (and unreserved when it
is destroyed). Since a thread cannot block on more than one mutex at a time,
this pool guarantees that an object is available whenever it is required. The
exceptions to this use of dynamic objects are robust mutexes (those whose owner
value needs to be updated if the locking process dies unexpectedly) and
priority-ceiling mutexes, where the priority is held by the sync_entry
structure. Nevertheless, such objects are considerably less common than the
default non-robust, priority-inheriting mutexes, and dynamic allocation works
well in practice.

The implementation of SyncMutexLock() goes through the following steps:

Set the waiters bit in the owner field of the user-mode variable, to force
other threads into the kernel.
Check whether the mutex is currently locked by inspecting the owner field
(it could have been unlocked since the time the C library code decided to
invoke the kernel call). If not locked, and if higher-priority threads are not
on the mutex waiting list, the owner field is set and the kernel call returns.
Look up a sync_entry object in the hash table for the user-mode
variable. If one does not yet exist, a new object is allocated and put in the
hash table and on the list of mutexes held by the current owner thread.
Add the calling thread to the sync_entry waiting list and mark it
as blocked on a mutex.
Adjust the priority of the current owner, if needed.

SyncMutexUnlock() looks up the sync_entry object for the user-mode
mutex in the hash table, removes the first waiter from the list and moves it to
a ready state. Information stored in the thread structure will cause the thread
to try and acquire the mutex when it is next scheduled to run. If that thread is
the last one waiting on the sync_entry object, that object is freed back
to the pool.

Priority Considerations

The correct assignment and handling of thread priorities is essential for
achieving real-time behaviour in an operating system. Unfortunately, the use of
mutexes in multi-threaded code can easily lead to the break-down of an otherwise
properly-designed priority-based system: when a high-priority thread blocks on a
mutex held by a lower-priority thread, it cannot make any forward progress until
the low-priority activity reaches the point of releasing the mutex. Thus, the
priority of the waiter is effectively lowered to that of the owner, a situation
known as priority inversion.

POSIX mutexes provide three protocols for dealing with priority inversion:

Priority inheritance: The priority of the owner thread is at least that of the highest-priority thread waiting on the mutex.
Priority ceiling: Each mutex employing this protocol is associated with a fixed priority. The priority of the owner thread is at least that associated with the mutex.
None: The mutex has no impact on the owner’s priority, and the user accepts the possibility of priority inversion.

To facilitate both the design and implementation of these protocols, it is
easier to associate a priority with each mutex: the priority of a
priority-inheritance mutex is that of the highest-priority waiter, that of a
priority-ceiling mutex is the fixed ceiling value and that of a “none” mutex
is 0. We can now define the mutex-induced priority of a thread as the
maximal priority of all mutexes the thread currently holds. In the QNX Realtime OS, a
thread’s effective priority is the maximum of its base priority (the one set
when the thread is created), its client-inherited priority (if receiving a
message from another thread) and its mutex-induced priority.

Figure 1: The in-kernel representation of a list of mutexes held by a thread (T₁),
each of which is waited on by one or more threads. The mutexes are sorted in
priority order on thread T₁‘s list, and threads are sorted in priority order on
the waiting list of each mutex.

Figure 1 depicts four mutexes of different protocols held by a
single thread². The
mutexes are sorted by their priorities: M₁ is a priority-inheritance mutex
with a priority 20 waiter, M₂ is a priority-ceiling mutex with a ceiling
value of 11 (note that the priority 30 waiter has no impact on a
priority-ceiling mutex), M₃ is a priority-inheritance mutex with a priority
10 waiter and M₄ is a “none” mutex, so is associated with a priority of
0. Since the base priority of T₁ is 10, and since its mutex-induced priority
is 20, the effective priority of this thread is 20.

The mutex-induced priority of T₁ can change every time a thread blocks or
unblocks on one of the mutexes it holds. If a priority 30 thread blocks on
M₃, the priority of that mutex becomes 30, requiring the effective priority of
T₁ to become 30. Conversely, if T₂ stops waiting for M₁ (e.g., because
of a timeout on a pthread_mutex_timedlock() call or because its process
exited abnormally), then the new highest-priority mutex held by T₁ becomes
M₂, and its priority is lowered to 11. Furthermore, any change to the
priority of T₁ can have an impact on the priority of other threads: if T₁
is currently waiting on another thread to reply to a message, that thread (the
server) needs to inherit the newly adjusted priority of T₁. If T₁ is
currently blocked on another priority-inheriting mutex, the priority of that
mutex needs to be adjusted, with potentially implications on the priority of its
owner. The kernel is responsible for transitively adjusting all of these
priorities.

Conclusion

While a basic mutex lock is fairly easy to implement in a kernel, performance
considerations and support for a wide variety of features require a robust
design and a careful implementation. In particular, since mutexes are the
governors of concurrent access to shared resources, care must be taken to
account for potential race conditions within the code.

For a real-time operating system, it is imperative that the kernel provide a
solution for the priority inversion problem associated with mutexes. The
solution needs to deal with threads holding multiple, potentially heterogeneous,
mutexes, with threads of different priorities blocking on mutexes, and with
threads leaving the wait queues unexpectedly. The concept of mutexes associated
with priorities facilitates both the design and the implementation of the mutex
code in the QNX Neutrino microkernel.

¹ The micro-kernel
architecture distinguishes between system calls, many of which are
implemented as messages passed to various services, and kernel calls,
which are traps into the micro-kernel.
² In practice, this situation is indicative of a rather
poor design. Nevertheless, the kernel needs to deal with it correctly.

28 Comments

2016-02-18 8:39 pm
uridium
Most interesting story here in years. Well done!
I’m afraid to ask, but does this herald a return to the days of stories with real substance to them instead of zOMGF! NEW PHONE! Widget is clickable now!!1
Seriously.. great story. Finally something with meat again and no froth or bubble in sight.
2016-02-18 9:37 pm
Morgan
Most of this is over my head (I’m not a developer by any stretch of the imagination) but this was well written, easy to read and a wealth of information about an old favorite OS. Bravo!
2016-02-18 11:02 pm
tylerdurden
A piece about Operating Systems, finally!
I have a question; is the neutrino internal threading model based on pthreads, or does it have a different native model onto which they get mapped?
PS. I used to develop on QNX. I really miss the pervasive message passing, basically its programming model.

2016-02-18 11:57 pm
elahav
For the most part, the pthread API is implemented as a thin layer on top of the native thread API, e.g., pthread_create() does very little other than invoking the ThreadCreate() kernel call.

2016-02-19 12:08 am
tylerdurden
Thanks for the reply.
Follow up, is the native thread API exposed to the user? (as I said, I programmed for QNX a looong time ago, back when still used Codewarrior)
Also, are there any plans to support QNX on low end ARM systems like Raspberry Pi?

2016-02-19 1:17 am
elahav
Yes, the kernel calls are exposed and documented, and you can use them directly. For most pthread functions, however, there is little reason to do that (and, in fact, for mutexes, you will get sub-optimal results from going directly to the kernel calls, as explained in the article).
For low-end systems, you should be able to run QNX on any ARMv7 or x86 system with an MMU, so Pi2 is feasible, but not the older Pi (ARMv6 processor) or Cortex-M-based systems (no MMU).

2016-02-19 1:52 am
tylerdurden
Again, thanks for the answers much appreciated.

2016-02-19 12:10 am
Alfman verbose=1
In a recent project, I encountered bugs due to undefined conditions of the pthread mutex.
I needed to capture a mutex, hold it across thread invocation, and then release it. Turns out that there’s no defined behavior in this scenario.
http://pubs.opengroup.org/onlinepubs/007908799/xsh/pthread_mutex_lo…
If a thread attempts to unlock a mutex that it has not locked or a mutex which is unlocked, undefined behaviour results.
So I opted to use the posix semaphore mechanisms sem_post sem_wait instead, which behaves as expected even under edge conditions.
I wish I could change one thing about posix semaphores: that one could wait on any number of events instead of decremented them one at a time. If 20 threads call sem_post(semaphore), the blocking thread has to call sem_wait 20 times just to decrement the counter back to zero.
while(true) {
sem_wait(semaphore);
// process thread events 20 times even if only the first iteration is productive
}
Turns out Linux kernel devs recognized this deficiency and corrected it with eventfd, although it’s proprietary to linux and uses a new read/write syscall API.
Man, software development entails so many little factoids, if I had to start over, I don’t even want to think about it

2016-02-19 1:27 am
elahav
If you are waiting on an event that can be generated by any, or even multiple, threads, then semaphores are not what you are looking for. Perhaps a condition variable?
Also, note that semaphores and mutexes, while sometimes confused for one another, achieve completely different tasks: the first are a synchronization (i.e., temporal) mechanism, the second a data-protection (i.e., spacial) mechanism. Search for “Concurrent Urban Legends” by Peter Buhr and Ashif Harji.
Interestingly, your original goal, that of explicitly handing over a mutex from one thread to another without releasing it is something that has come up several times in the past. It’s not hard to implement, I’m just not sure that the semantics of such an operation are fully understood.

2016-02-19 2:52 am
Alfman verbose=1
elahav,
If you are waiting on an event that can be generated by any, or even multiple, threads, then semaphores are not what you are looking for. Perhaps a condition variable? [/q]
No actually, the purpose of the mutex was really just to protect a data structure from concurrent access.
[q]Interestingly, your original goal, that of explicitly handing over a mutex from one thread to another without releasing it is something that has come up several times in the past. It’s not hard to implement, I’m just not sure that the semantics of such an operation are fully understood.
Once the error was tracked to the mutex, it wasn’t hard to fix: just replace the mutex with a posix semaphore.
sem_init(&sem, 1);
…
sem_wait(sem); // lock
// equivalent to mutex
sem_post(sem); // unlock (even in new thread)
It’s semantically identical to the mutex, with the benefit of having defined behavior across threads.
Edited 2016-02-19 02:54 UTC

2016-02-19 7:09 am
Alfman verbose=1
elahav,
If you are waiting on an event that can be generated by any, or even multiple, threads, then semaphores are not what you are looking for. Perhaps a condition variable?
Oh sorry, you were talking about my other example.
Yes a pthread condition variable might work there, there are many options really.
AFAIK pthread condition variables only support mutex synchronization which can potentially block the sender when it needs to change the predicate condition variable. The block would only be momentary, but with asynchronous programming the main thread should not enter a wait state to wait on locks in other threads – it goes against the design. I felt a spinlock was the best choice here to protect the thread data queue. Both linux eventfd and semaphores (and even pipes) allow us to send events without risk of putting the main thread in a blocked state.
2016-02-19 4:13 pm
elahav
sem_init(&sem, 1);
…
sem_wait(sem); // lock
// equivalent to mutex
sem_post(sem); // unlock (even in new thread)
This is not equivalent to a mutex. While a count-1 semaphore behaves similarly to a mutex, there are subtle, yet important, differences, resulting from a semaphore not having an owner:
1. Any thread can signal the semaphore, leading to a loss of mutual exclusion (which will go unnoticed until data corruption is observed).
2. No way to avoid priority inversion.
You may not care about these problems in this particular case, but it is important to understand the semaphore/mutex divide.

2016-02-19 4:58 pm
Alfman verbose=1
elahav,
1. Any thread can signal the semaphore, leading to a loss of mutual exclusion (which will go unnoticed until data corruption is observed). [/q]
Notice in the link I provided earlier that the pthread mutex behavior is undefined for the conditions where one would try to signal from another thread. It means that no code which correctly uses the mutex is allowed to attempt to signal the mutex from another thread. The fact that the semaphore defines and/or behaves differently for those conditions doesn’t make it any less suitable as a mutex in code which makes no assumptions about those conditions. For all conditions that are defined for the mutex, the semaphore works the same way.
In other words, I think the semaphore pseudocode I gave above would be a valid implementation of a pthread mutex.
Note that I agree with you that a mutex is generally the accepted mechanism to use to serialize access to a data structure, but I’m not able to come up with examples where a semaphore doesn’t work…can you come up with any counter examples?
[q]2. No way to avoid priority inversion.
Even with a mutex, a low priority thread can block a high priority thread. Am I misunderstanding you?
Edited 2016-02-19 17:03 UTC
2016-02-19 5:58 pm
elahav
Notice in the link I provided earlier that the pthread mutex behavior is undefined for the conditions where one would try to signal from another thread [/q]
It may be undefined in the spec, but it is certainly defined if you are using error-checking mutexes, which is the default on QNX and optional on Linux.
[q]Even with a mutex, a low priority thread can block a high priority thread. Am I misunderstanding you?
Read the article 😉 With priority inheritance (again, default on QNX, optional on Linux), the low priority thread will be boosted to the priority of the highest waiter until it relinquishes the mutex.
2016-02-19 7:33 pm
Alfman verbose=1
elahav,
It may be undefined in the spec, but it is certainly defined if you are using error-checking mutexes, which is the default on QNX and optional on Linux. [/q]
It’s true error checking extensions can help catch code that triggers undefined mutex behavior. One could implement the same optional error checking with semaphores too, but I think it is unfortunate that useful scenarios were left undefined in the first place. Now we’re forced to “reinvent the wheel” just to work around the cases for which a mutex is not properly defined. Alas, it is what it is.
[q]Read the article 😉 With priority inheritance (again, default on QNX, optional on Linux), the low priority thread will be boosted to the priority of the highest waiter until it relinquishes the mutex.
I did read it, thank you for the article by the way.
This is a good point, you can’t really bump the priority of unequal threads when the kernel doesn’t know which thread will release a lock. Maybe there should be a well to tell it.
2016-02-19 7:56 pm
dpJudas
Now we’re forced to “reinvent the wheel” just to work around the cases for which a mutex is not properly defined. Alas, it is what it is.
It is not “properly defined” to increase the chance more optimal implementations are possible without having to support weird usages such as “lock mutex in one thread and unlock in another”.
Exactly why you had to lock it in one thread and release it in another is still not clear (to me), but I’m guessing there is a good chance that some of the other pthread primitives available would have been a better fit for the problem at hand.
2016-02-19 11:39 pm
Alfman verbose=1
dpJudas,
It is not “properly defined” to increase the chance more optimal implementations are possible without having to support weird usages such as “lock mutex in one thread and unlock in another”.
Exactly why you had to lock it in one thread and release it in another is still not clear (to me), but I’m guessing there is a good chance that some of the other pthread primitives available would have been a better fit for the problem at hand.
It’s not really as weird as you might think. Your thread is holding a lock and wishes to finish the request in a new thread without blocking the main thread. Therefor the easiest most natural place to release the lock is in the new thread. This pattern comes up a lot as elahav said himself.
Using a mutex is very desirable because it’s the conventional choice for protecting resources from concurrent access, the only problem is pthread’s implementation doesn’t allow it here. Since a semaphore closely resembles the mutex’s normal semantics with the added benefit that releasing the lock from a new thread is well defined, it seemed like the best alternative. However elahav was very observant and pointed out that thread priority bumping would be lost with semaphores.
Hypothetically we could have used a mutex from Thread A, spawned thread B to do the work needed, and then used signaling back to A to release the mutex. While this would have worked using only defined pthread mutex operations, the kernel still would have gotten the thread priority bumping wrong because it doesn’t know that thread B needs to finish it’s work before thread A can release the mutex. I can’t think of an easy way to represent this complex thread inter-dependency to the kernel, it would be so ugly that nobody would really want to.
I think the best solution would be if a locked mutex could be explicitly transferred to a new thread using a new pthread function. Not only would this resolve the issue of undefined behavior across threads, it would also resolve the issue of giving the correct thread a priority boost – something a semaphore can’t do.
If I could go back in time to demonstrate that such a use case is useful, I probably could have convinced the original developers that it should be defined and it would not have been more computationally expensive to implement with atomic CPU operations. Alas, that didn’t happen…
We could come up with a big list of things that could have been improved, in hindsight
Edited 2016-02-19 23:47 UTC
2016-02-20 6:47 am
dpJudas
It’s not really as weird as you might think. Your thread is holding a lock and wishes to finish the request in a new thread without blocking the main thread. Therefor the easiest most natural place to release the lock is in the new thread. [/q]
Locking something for this long a duration is really poor behavior and kind of ruins the point multi-threading in the first place. As a result, mutexes aren’t designed for this.
Using a mutex is very desirable because it’s the conventional choice for protecting resources from concurrent access, the only problem is pthread’s implementation doesn’t allow it here.
A mutex is designed to mutually exclude access to a resource for a short duration of time. It is intended for the situations where your data won’t fit into atomic operations.
What you are trying to do with the mutex IMO doesn’t fit into the intended usage pattern of that construct and it shouldn’t even try to support it. It is like complaining that a std::vector doesn’t scale for very large lists where you should have used std::list or some other container type instead.
Since a semaphore closely resembles the mutex’s normal semantics with the added benefit that releasing the lock from a new thread is well defined, it seemed like the best alternative. However elahav was very observant and pointed out that thread priority bumping would be lost with semaphores.
Not knowing the code and problem at hand, I can’t say for sure exactly what you should have used instead of a mutex. Maybe it should have been semaphores, maybe condition variables, maybe pull the working set out of the mutex protected data and put it back in when done. In any case, if posix mutexes had been forced to support what you’re asking for they might have taken a performance hit for their intended usage.
Hypothetically we could have used a mutex from Thread A, spawned thread B to do the work needed, and then used signaling back to A to release the mutex. While this would have worked using only defined pthread mutex operations, the kernel still would have gotten the thread priority bumping wrong because it doesn’t know that thread B needs to finish it’s work before thread A can release the mutex. I can’t think of an easy way to represent this complex thread inter-dependency to the kernel, it would be so ugly that nobody would really want to.
Typically when I end up in situations requiring solutions as creative as you just described, it is almost always a sign that I somehow ended up over-complicating things and there’s a much simpler problem to solve if I just change my approach. In this case, the key question to ask is probably somewhere along the lines of: Why do I need to maintain such a long lock in the first place? Is there a way to avoid it?
If I could go back in time to demonstrate that such a use case is useful, I probably could have convinced the original developers that it should be defined and it would not have been more computationally expensive to implement with atomic CPU operations. Alas, that didn’t happen…
Keep in mind that when posix was originally defined CPU architecture looked quite different from what it did today. When doing an standard like pthreads there’s a fine balance between defining something too exact that will lock it into current architecture and algorithms, and too loosely so that the standard is effectively useless. Even if you could today (on Intel/AMD hardware) make your cross-thread mutex lock/unlock without cost, it might not translate to future architectures.
[q]We could come up with a big list of things that could have been improved, in hindsight
No doubt the computing world look very different.
2016-02-20 8:56 am
Alfman verbose=1
dpJudas,
Locking something for this long a duration is really poor behavior and kind of ruins the point multi-threading in the first place. As a result, mutexes aren’t designed for this….
A mutex is designed to mutually exclude access to a resource for a short duration of time. [/q]
That’s not really true though, locking will take however long it needs to take depending on the context. In an intensive simulation, something may lock for a long while and be perfectly normal. Obviously you wouldn’t want to do that for a GUI, but it depends on the context. There’s nothing wrong with longer locks if that’s what the granularity of the problem calls for.
Secondly, it’s quite reasonable to use a mutex for shared file descriptors or whatever else even if they’re known to block. If we were talking about a spinlock, then I would agree with you that any blocking is bad, but with a mutex that’s not true. The mutex would become much less useful if you couldn’t perform blocking operations while holding a mutex lock.
Thirdly, starting a new thread is not a blocking operation anyways. We can start many such threads on a multicore CPU in the same time it takes to complete one blocking disk or network operation. The “long duration” argument against passing a mutex into threads doesn’t really hold up.
In any case, if posix mutexes had been forced to support what you’re asking for they might have taken a performance hit for their intended usage.
There wouldn’t be a performance hit at all, it’s already locked, you just need a syscall to tell the kernel about the new thread association. Code that doesn’t invoke this syscall wouldn’t change.
Typically when I end up in situations requiring solutions as creative as you just described, it is almost always a sign that I somehow ended up over-complicating things and there’s a much simpler problem to solve if I just change my approach. In this case, the key question to ask is probably somewhere along the lines of: Why do I need to maintain such a long lock in the first place? Is there a way to avoid it?
It might not have been clear that paragraph was describing the convoluted approach needed to use the mutex as is today from a single thread. If it were fixed, it would be just as straitforward as the single threaded case.
Switching to semaphores is dead simple, maybe this discussion should end on that. However when elahav pointed out the fact that semaphores can’t tell the kernel which thread to prioritize, it made a good argument as to why mutexes would have been better assuming this case was implemented.
[q]Keep in mind that when posix was originally defined CPU architecture looked quite different from what it did today. When doing an standard like pthreads there’s a fine balance between defining something too exact that will lock it into current architecture and algorithms, and too loosely so that the standard is effectively useless.
I know what you are saying, but I don’t think any new mechanics would have to be added to the mutex to support this case. I get the impression that even if I proved this, you’d still want to object
Oh well, that’s ok. It’s not like we carry any weight with IEEE or posix anyways.
Edited 2016-02-20 09:14 UTC
2016-02-20 4:11 pm
dpJudas
Secondly, it’s quite reasonable to use a mutex for shared file descriptors or whatever else even if they’re known to block. If we were talking about a spinlock, then I would agree with you that any blocking is bad, but with a mutex that’s not true. The mutex would become much less useful if you couldn’t perform blocking operations while holding a mutex lock. [/q]
It isn’t so much a question not being allowed to perform blocking operations, but more that the longer time the mutex is locked the further you move away from what a (posix) mutex was intended for. Starting a new thread already makes me wonder things like why didn’t you acquire the lock in the worker thread in the first place?
I know what you are saying, but I don’t think any new mechanics would have to be added to the mutex to support this case. I get the impression that even if I proved this, you’d still want to object
Your posix_transfer_mutex_to_other_thread(pid) function might not add much of an overhead, but if I was part of posix I’d want some kind of solid proof/example where it would be preferable other more well-known methods.
[q]Oh well, that’s ok. It’s not like we carry any weight with IEEE or posix anyways.
Indeed. Personally I’m not even remotely an expert in this subject – I just found your needs of a cross-thread mutex somewhat odd. Clearly you don’t and that’s OK.
2016-02-20 5:57 pm
Alfman verbose=1
dpJudas,
Starting a new thread already makes me wonder things like why didn’t you acquire the lock in the worker thread in the first place? [/q]
You may not want other users coming in and grabbing the mutex between the time the thread is created and the time the thread grabs the mutex. Say the protected resource is a memory mapped file or a socket that must not be touched until the thread has unlocked them.
[q]Your posix_transfer_mutex_to_other_thread(pid) function might not add much of an overhead, but if I was part of posix I’d want some kind of solid proof/example where it would be preferable other more well-known methods.
To be honest I’m not too bothered about using a semaphore as a lock (with the caveats brought up earlier), but you will hear some groans in the community because a semaphore is meant for signaling rather than locking. In principal a mutex would be the correct primitive to use.
If I have to live with a mutex that does not implement edge cases, they will have to live with the fact that semaphores will be used as locks. I guess it’s fair, haha
2016-02-21 12:14 am
kwan_e
dpJudas,
Starting a new thread already makes me wonder things like why didn’t you acquire the lock in the worker thread in the first place?
You may not want other users coming in and grabbing the mutex between the time the thread is created and the time the thread grabs the mutex. Say the protected resource is a memory mapped file or a socket that must not be touched until the thread has unlocked them.
In that scenario, couldn’t you just have the thread remove the resource from global access until the new thread is created then hand ownership over when it is ready?
2016-02-21 1:36 am
Alfman verbose=1
kwan_e,
In that scenario, couldn’t you just have the thread remove the resource from global access until the new thread is created then hand ownership over when it is ready?
Hopefully I follow your meaning… I’m really tired.
Thread A:
Lookup object, lock mutex, remove object from collection, unlock mutex, invoke Thread B with object as parameter.
Thread B:
Lock mutex re-add object to collection.
This what you have in mind? Well, it kind of changes the semantics a bit. Thread C might come in and it won’t find the object, even though it’s supposed to exist.
Also, Thread D might already be blocked on the mutex even before the object was removed from the collection. When Thread A releases the mutex, there’s no way to guaranty new Thread B gets it next.
Maybe more synchronization and structures could solve it, but it adds quite a bit of complexity over just keeping the mutex locked across threads.
We probably can’t beat the trivial simplicity of implementing our own mutex with semaphores ( http://www.osnews.com/comments/29090 ).
Anyways thanks for the discussion, I won’t be around much this week.
2016-02-21 3:28 am
kwan_e
Thread C might come in and it won’t find the object, even though it’s supposed to exist.
Also, Thread D might already be blocked on the mutex even before the object was removed from the collection. When Thread A releases the mutex, there’s no way to guaranty new Thread B gets it next.
Yes, there will be a change in semantics. The way I may do it, seeing as I haven’t seen the code, is to make Thread C, D, etc operate expecting that they may not see the resource they wanted. In a way, they already do because they’re already expecting to have to own a mutex before they can use it any way.
So if Thread D is blocked on the mutex, Thread A removes the resource from global access, once Thread D is unblocked, it won’t see the resource it was after. In which case it will have to do something else or wait until the resource becomes available again.

2016-02-19 3:03 am
Dawgmatix
What’s the best way to try out qnx these days? Is it available for download?

2016-02-19 5:03 am
judgen
Yes the desktop OS along with the dev SDK is available for download at the qnx website. However payed support costs an arm an a leg (or at least last time i tried it) so you are going to depend on written documentation and community for help unless you want to pay.

2016-02-22 8:36 pm
flyingrobots
What if I told you that for most cases you don’t need them? Would you believe me? What if I also told you would see dramatic increases in performance for those areas of code that had high levels of contention?
QNX only….

2016-02-23 2:36 pm
elahav
It’s hard to say, because you haven’t provided any details.
In general, there are a few ways to avoid or reduce the use of mutexes, but, as far as I know, there is always a trade-off:
1. Avoid data sharing, e.g., with a pure event-driven design or by a strict assignment of tasks to threads. While I often go for such a design myself, it does restrict parallelism, and therefore does not scale well with the number of processors.
2. Solve the data-sharing problem with lock-free data structures or read-modify-write schemes. These are typically harder to implement and are not always applicable.
Nevertheless, the issue is a moot one as far as the article goes: 99% of the multi-threaded code out there uses mutexes, which means that the OS must support this mechanism.