Linked by Hadrien Grasland on Fri 27th May 2011 11:34 UTC
General Development After having an interesting discussion with Brendan on the topic of deadlocks in threaded and asynchronous event handling systems (see the comments on this blog post), I just had something to ask the developers on OSnews: could you live without blocking API calls? Could you work with APIs where lengthy tasks like writing to a file, sending a signal, or doing network I/O are done in a non-blocking fashion, with callbacks as the only mechanism for returning results and notifying your software when an operation is done?
Permalink for comment 474893
Mutex-less computing
by xiaokj on Sat 28th May 2011 14:39 UTC

@ Original Article,

There are actually alternatives to plain old threaded / async / semi-async (you defined threaded + threshold == semi-async): non-blocking algorithms of a more radical type exist. I have read that people have successfully implemented system-level code using unconventional non-blocking mechanisms, most notably compare-and-swap (CAS).

This is actually very useful, especially in conjunction with a client-server model. It ends up behaving more like async-threaded than the threaded-async you are looking at.

Let's use a client-server model analogous to X.Org's. The owner of the resource (the graphics card) is the server. Within this server there is only one point of I/O: the server's queue number. A client wanting to perform any operation first formulates an atomic transaction. It then reads the queue number and attaches it to the transaction, and tries to compare-and-swap the server's queue number forward: if the number has changed in the meantime, the server has just accepted a transaction from another client, and the client must re-read the queue number and retry. Only once it succeeds in updating the queue number does it send the transaction details to the server, which processes them at its leisure.

The original paper (it's been years since I last saw it, and I can no longer find it) says that, even without a hardware compare-and-swap instruction, this way of setting things up tends to be very simple yet performs much better than it looks: collisions are rare even in busy environments. Also, because the resource has its own monolithic server, that server can look ahead in the queue and apply sensible algorithms, especially against deadlock-style problems. Thanks to its atomicity, the scheme is very useful for databases too.

One thing this really requires is that the memory allocation and reclamation implementation be excellent, because the compare-and-swap action is most naturally implemented as a linked-list updating mechanism.

There is no reason whatsoever that, once the client has obtained its queue number properly and securely, the server cannot establish a callback to notify the client of progress -- and it should. The server should also keep its own internally consistent history of transactions: should it decide to reject transactions, something must hold the unarguably correct state of the system.

The above implements an async model (sequential handling, even) that deals well with threaded requests. To the clients it appears threaded, but it actually implements centralised database-style handling. For really busy systems, the server itself can be threaded later. It is rather difficult to avoid a choke point at this server, but given multitasking, the choke point is actually a plus, because it is exactly where load-balancing algorithms can be applied.

Most CPUs can do either CAS or LL/SC, with LL/SC much better than CAS (it detects any intervening write, so it is immune to the ABA problem), although when push comes to shove, CAS with a fixed algorithm around it can be good enough -- just without the same guarantee. Damn legacy x86 with its idiosyncrasies.

Reply Score: 3