The DRI uses a fairly straightforward locking scheme to arbitrate access to the hardware. The old design docs are a bit out of date by now, so this page attempts to record how it works.
NOTE: This page is a work in progress. If it looks like a mess, it is.
The primary arbitration mechanism is the DRM context lock. This guards access to a hardware context, allowing it to be shared among multiple rendering contexts. This is essential for single-context hardware, which includes most pre-DX9 cards.
DRM clients - that is, any process using DRM services, including the X server - take the lock whenever they modify the state of that context. The lock is advisory; clients are not required to take it before touching the card, but if they don't, madness will follow. The unoptimized version of the lock is just an ioctl operation on the DRM device itself. While one client is holding the lock, other clients will block in the ioctl call until the holding client unlocks, at which point the kernel schedules in the next client.
Most DRI drivers only take the lock right as they're about to submit commands to the card. The X server takes it every time it wakes up to process requests from X clients, and releases it before executing any possibly-blocking select call.
Optimized context locking
On most architectures, there is an optimization layer above this. The context lock itself is stored as a single dword in the SAREA, a shared memory page owned by the kernel and mapped into the address space of every DRM client. Most of this word is devoted to the context ID, a small positive integer uniquely identifying a given context. The high bits are used to indicate two lock conditions: held and contended. When a process goes to take the lock for the first time, it issues the lock ioctl; the kernel then sets the context ID appropriately and sets the HELD flag. At unlock, the client performs an atomic compare-and-swap on the lock word: if the lock value is still the original context ID plus the HELD flag, it is atomically replaced with just the context ID; otherwise the client ioctls to the kernel to unlock. If the CAS succeeded, then the next time the client attempts to take the lock it performs the same CAS operation in reverse: if the lock word is just its context ID, it sets the HELD flag; otherwise it ioctls to acquire the lock. If the kernel receives a lock ioctl while the lock is already held, it sets the CONTENDED bit in the lock word, thus ensuring that the holding client will ioctl to unlock and allow the next DRM client to schedule in.
Okay, so that sounds pretty complex. What's the result for the client?
- If the context is about to lock, and it was the last context to hold the lock, it can re-take it entirely in userspace, and can be assured that no card state has changed.
- If the context is about to lock but was not the last context to hold the lock, it must ioctl to acquire the lock. This is potentially a blocking operation, and when the client continues it must re-emit state.
- If the context is about to unlock, but no one is waiting for the lock, the lock is released with no kernel notification.
- If the context is about to unlock and another context is waiting for the lock, the context must ioctl to the kernel to release it.
The above description is slightly simplified, because there's one other piece of 3D state that is involved.
The window system maintains a clip list, which describes the overlaps among windows. In order for a direct rendering client to draw to the screen correctly, it must respect the server's idea of window boundaries. To achieve this, the server maintains in the SAREA a small table of timestamps, one per direct rendered drawable. Currently the limit is 256 direct rendered drawables.
The SAREA also contains a drawable lock. This lock is logically ordered after the context lock described above; it may only be taken by a context that already holds the context lock. The server takes this lock just before it modifies the clip list for any direct rendered drawable, and releases it just before processing the next client request. After taking the drawable lock, the server increments the timestamp for that drawable.
Once a GLX client has taken the DRM context lock, it needs to verify that its copy of the cliplist is current. It takes the drawable lock, and then verifies that the timestamp for the drawable it's about to draw to matches the last timestamp it saw. If the timestamp has changed, the cliplist has changed; the client then drops the drawable lock, then the context lock, and then sends DRI protocol to the X server to request the new cliplist for the drawable. Once it has received the reply, the sequence repeats, until the timestamp received in the DRI protocol reply matches the timestamp in the SAREA.
Context Switch Modes
DRI_SERVER_SWAP, DRI_KERNEL_SWAP, DRI_HIDE_X_CONTEXT. TODO: remember what these are and write them down.
More than one, but finite, hardware contexts
We don't currently have any drivers like this; glint may have worked this way, need to check. Would also be good to note here what hardware falls into this class.
One way to handle this kind of hardware would be to use the base DRM context lock as the mechanism for creating new contexts and protecting the drawable lock (and for the server's context, I suppose), and to create auxiliary context locks in the SAREA for each hardware context. If you ever failed the CAS and had to ioctl for access to the context, you'd try to grab the master context and let the kernel assign you your next hardware context.
Infinite hardware contexts but shared cliplist
We don't currently have any finished drivers like this, but nouveau might behave this way. TODO: other hardware fitting this model?
In this model you basically never need to hold the context lock except for cliplist validation.
Interaction with AIGLX
The DRI driver is normally the component that takes the context lock. In the AIGLX server model, GLX rendering is handled in the server by the DRI driver, but the server also takes the lock. Clearly a handoff is needed.
The X server takes the lock when it wakes up to process requests, and then attempts to dispatch into the GLX code. GLX notices that a DRI driver is loaded and drops the DRM lock on the server's behalf. If the DRI driver ever needs to take the lock it will now succeed (albeit with an ioctl, since the X server was the last DRM client to hold the lock). Once the request has finished dispatch, the GLX core notifies the DRI layer so the server can reacquire the context lock.
Note that since the ioctl is potentially a blocking operation, this does give greedy clients a chance to starve the X server of time. Thus far the consensus seems to be "don't do that then".
Interaction with multiple screens
The X server takes one lock per X screen. Note that this means screen as in the protocol-visible object: :0.0 and :0.1 are different screens on the same display. MergedFB mode presents one screen per card; Zaphod mode presents one screen per output. As a result, multiscreen systems can present the X server with some interesting scheduling challenges. This probably needs to be re-thought for the future.
Future optimization ideas
The AIGLX handoff can be made slightly more robust in the face of greedy direct clients, by making the server set an additional property on the DRM lock when it first acquires it. This flag would instruct the kernel to block any other process from acquiring the DRM context until the server released it for good. This would have the pleasant side effect of fixing glucose correctness.
If the DRM were significantly smarter, it could do away with the context lock for state emission altogether. DRM clients would simply submit all rendering commands to the DRM and let the kernel schedule their dispatch. Cliplist changes would simply be represented as reordering barriers in scheduling.