Background of API dispatch in Mesa

The operation of nearly every function in the OpenGL API depends on the state of the currently bound context. Even if all state in two contexts is identical, the operation can be different between the two contexts. Consider a couple of common examples:

  • One context is created for direct rendering (requiring that function calls be directed into the local driver) and the other context is created for indirect rendering (requiring that function calls be converted to GLX protocol and sent to the server).
  • Both contexts are created for direct rendering, but one context is created on screen 0 of the display, which happens to be on a card by manufacturer Foo, and the other context is created on screen 1, which happens to be on a card by manufacturer Bar. In this case, function calls are directed to a different local driver for each context.

To implement this, Mesa uses something similar to the virtual function table of a C++ class. Each context has an associated dispatch table. This table contains a function pointer for every function in the API. When a context is bound with glXMakeCurrent (or a similar function), a pointer to its dispatch table is stored in a global variable. When an application calls an API function, such as glVertex3f, it is actually calling a generic dispatch stub in Mesa (dispatch stubs that are called directly are also referred to as static dispatch stubs). This dispatch stub fetches a pointer to the current context's dispatch table, looks up a pointer to the desired function in that table, and, finally, calls the function.
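
The dispatch-table idea can be illustrated with a minimal, single-threaded sketch. Every name here (toy_dispatch, toy_make_current, toy_Vertex3f, and the two backends) is hypothetical and chosen for illustration; Mesa's real table has an entry for every API function, and its real stubs are generated.

```c
#include <stddef.h>

/* Hypothetical two-entry stand-in for Mesa's much larger dispatch table. */
struct toy_dispatch {
    void (*Vertex3f)(float x, float y, float z);
    void (*Flush)(void);
};

/* Records which backend handled the most recent call (illustration only). */
const char *toy_last_backend = "none";

/* Backend for a direct-rendering context: would call the local driver. */
static void direct_Vertex3f(float x, float y, float z)
{
    (void) x; (void) y; (void) z;
    toy_last_backend = "direct";
}
static void direct_Flush(void) { toy_last_backend = "direct"; }

/* Backend for an indirect-rendering context: would emit GLX protocol. */
static void indirect_Vertex3f(float x, float y, float z)
{
    (void) x; (void) y; (void) z;
    toy_last_backend = "indirect";
}
static void indirect_Flush(void) { toy_last_backend = "indirect"; }

/* Each context owns one table. */
const struct toy_dispatch toy_direct_table   = { direct_Vertex3f, direct_Flush };
const struct toy_dispatch toy_indirect_table = { indirect_Vertex3f, indirect_Flush };

/* Table of the currently bound context (one global in this toy). */
static const struct toy_dispatch *toy_current = NULL;

/* Stand-in for glXMakeCurrent: remember the bound context's table. */
void toy_make_current(const struct toy_dispatch *table)
{
    toy_current = table;
}

/* Static dispatch stub: fetch the table, look up the entry, call it. */
void toy_Vertex3f(float x, float y, float z)
{
    (*toy_current->Vertex3f)(x, y, z);
}
```

Calling toy_make_current(&toy_direct_table) followed by toy_Vertex3f(...) reaches the direct backend; rebinding to toy_indirect_table sends the same call to the other backend without the caller changing.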

It is also important to note that, since the current context is set per-thread in a multithreaded application, the dispatch pointer is also stored per-thread.

The Linux OpenGL ABI requires that certain functions be statically exported by the system libGL. However, nearly any useful implementation will want to expose more functionality. Since applications cannot depend on these additional functions being statically available, another method is needed to access them. Initially implemented as an extension and later incorporated into GLX 1.4, glXGetProcAddressARB (or glXGetProcAddress) is used to get a pointer to a named API function. GLX requires that pointers returned by glXGetProcAddressARB be context independent. This means that calling glXGetProcAddressARB((const GLubyte *) "glVertex3f") will return the same value no matter what context is bound.

In addition, glXGetProcAddressARB can be called when no context is bound. Since Mesa is capable of loading drivers with unknown functionality, Mesa has no way to know a priori that a requested function, such as glWillNeverExist, doesn't exist. For this reason, Mesa's implementation of glXGetProcAddressARB will never return NULL for a well-formed API function name. Other libGL implementations that only operate with a limited set of known drivers (e.g., Nvidia's closed-source libGL) can know which functions will never exist and may return NULL.

Mesa's implementation of glXGetProcAddressARB performs two important steps when called with a function name that is currently unknown. It first assigns an offset in the dispatch table to the new function. Once the location of the function pointer in the dispatch table is known, Mesa generates a dynamic dispatch stub for the function. A pointer to this stub is returned by glXGetProcAddressARB. The name of the function, the assigned offset, and the dispatch stub pointer are all stored in a table used internally by Mesa.
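
The two steps can be sketched in portable C. This is an illustration under stated assumptions, not Mesa's code: all names (toy_get_proc_address, toy_make_current, the stub pool) are hypothetical, and because portable C cannot generate machine code at run time, a small pool of pre-built stubs stands in for dynamically generated ones. In this toy, calling a stub with no context bound is simply a no-op.

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX_DYNAMIC 4      /* hypothetical pool size */
#define TOY_NAME_LEN    64     /* hypothetical name length limit */

typedef void (*toy_proc)(void);

/* Toy dispatch table: just an array of slots. */
struct toy_table { toy_proc slot[TOY_MAX_DYNAMIC]; };

/* Table of the currently bound context, or NULL if none is bound. */
static const struct toy_table *toy_current;

/* Pre-built stubs, one per slot. Each forwards to its slot in the current
 * table and does nothing if no context or no implementation exists. */
#define TOY_DEFINE_STUB(n)                                   \
    static void toy_stub_##n(void)                           \
    {                                                        \
        if (toy_current != NULL && toy_current->slot[n])     \
            toy_current->slot[n]();                          \
    }
TOY_DEFINE_STUB(0)
TOY_DEFINE_STUB(1)
TOY_DEFINE_STUB(2)
TOY_DEFINE_STUB(3)

static const toy_proc toy_stub_pool[TOY_MAX_DYNAMIC] = {
    toy_stub_0, toy_stub_1, toy_stub_2, toy_stub_3
};

/* Internal record: function name, assigned offset, and stub pointer. */
struct toy_entry { char name[TOY_NAME_LEN]; int offset; toy_proc stub; };
static struct toy_entry toy_known[TOY_MAX_DYNAMIC];
static int toy_num_known;

/* Return the stub for `name`, assigning a new offset on first request.
 * The result does not depend on which context (if any) is bound. */
toy_proc toy_get_proc_address(const char *name)
{
    int i;
    for (i = 0; i < toy_num_known; i++)
        if (strcmp(toy_known[i].name, name) == 0)
            return toy_known[i].stub;

    /* Toy limitation only: the pool is finite and names are bounded. */
    if (toy_num_known == TOY_MAX_DYNAMIC || strlen(name) >= TOY_NAME_LEN)
        return NULL;

    strcpy(toy_known[toy_num_known].name, name);
    toy_known[toy_num_known].offset = toy_num_known;            /* step 1 */
    toy_known[toy_num_known].stub = toy_stub_pool[toy_num_known]; /* step 2 */
    return toy_known[toy_num_known++].stub;
}

/* Stand-in for binding a context. */
void toy_make_current(const struct toy_table *t) { toy_current = t; }

/* Demo driver function used to observe calls through a stub. */
int toy_demo_calls;
void toy_demo_impl(void) { toy_demo_calls++; }
```

Requesting the same name twice yields the same pointer, and the pointer can be obtained before any context exists; a driver that later fills in the slot makes the stub start doing real work.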

Threading models supported by Mesa

In terms of API dispatch, Mesa currently supports four different threading models. The compile-time choice of threading model dictates the implementation of the dispatch stubs. The core difference lies in how the global dispatch pointer is stored and retrieved. The threading model is selected by defining one of PTHREADS, SOLARIS_THREADS, WIN32_THREADS, USE_XTHREADS, or BEOS_THREADS. When PTHREADS is selected, GLX_USE_TLS can also be defined. If none of these values is defined, Mesa uses a single-threaded mode of operation.

Single-threaded

In single-threaded mode, a single global variable is used to store the dispatch table pointer. This results in the simplest possible dispatch stubs.

void glVertex3fv( const GLfloat * v )
{
    (*_glapi_Dispatch->Vertex3fv)( v );
}

Non-TLS threading models

Mesa implements a generic wrapper function, called _glapi_get_dispatch, that is used to get the per-thread dispatch pointer. The naive implementation of a dispatch stub is shown below.

void glVertex3fv( const GLfloat * v )
{
    const struct _glapi_table * d = _glapi_get_dispatch();
    (*d->Vertex3fv)( v );
}

This implementation is very simple, but it results in poor performance in the single-threaded case. Single-threaded applications are far more common than multi-threaded ones, so it makes sense to optimize for that case. The old variable _glapi_Dispatch continues to exist, but its semantics are slightly modified. When the implementation of glXMakeCurrent detects that a new thread is setting a current context, _glapi_Dispatch is set to NULL and the true dispatch table pointer is stored in some piece of thread-local storage. This allows the dispatch function to use the state of _glapi_Dispatch to determine whether the application is single- or multi-threaded.

void glVertex3fv( const GLfloat * v )
{
    const struct _glapi_table * d = (_glapi_Dispatch != NULL)
        ? _glapi_Dispatch : _glapi_get_dispatch();
    (*d->Vertex3fv)( v );
}

The result is vastly improved single-threaded performance with a small penalty to multi-threaded performance.
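
The stub is only half of the scheme; the other half is the code that sets the dispatch pointer. The sketch below shows one way that side could look. It is an assumption-laden illustration, not Mesa's implementation: all mt_ names are hypothetical, C11 _Thread_local stands in for whatever per-thread storage the threading model provides, a thread is identified by the address of a per-thread object, and the locking a real implementation needs is omitted.

```c
#include <stddef.h>

struct mt_table { void (*Vertex3fv)(const float *v); };

/* Fast-path global. Non-NULL only while a single thread has been seen. */
const struct mt_table *mt_Dispatch;

/* Per-thread copy of the dispatch pointer (stand-in for thread-specific
 * data obtained through the threading library). */
static _Thread_local const struct mt_table *mt_thread_dispatch;

/* The address of a per-thread object serves as a toy thread identity. */
static _Thread_local char mt_thread_marker;
static const char *mt_first_thread;
static int mt_multi_threaded;

/* Called when a context is made current. Unsynchronized: a sketch only. */
void mt_set_dispatch(const struct mt_table *t)
{
    if (mt_first_thread == NULL)
        mt_first_thread = &mt_thread_marker;      /* first thread ever */
    else if (mt_first_thread != &mt_thread_marker)
        mt_multi_threaded = 1;                    /* a new thread appeared */

    mt_thread_dispatch = t;
    /* Once a second thread is seen, the global is disabled for good. */
    mt_Dispatch = mt_multi_threaded ? NULL : t;
}

/* Slow path: fetch the calling thread's dispatch pointer. */
const struct mt_table *mt_get_dispatch(void)
{
    return mt_thread_dispatch;
}

/* Dispatch stub with the single-threaded fast path. */
void mt_Vertex3fv(const float *v)
{
    const struct mt_table *d = (mt_Dispatch != NULL)
        ? mt_Dispatch : mt_get_dispatch();
    (*d->Vertex3fv)(v);
}

/* Demo implementation that records the first coordinate it receives. */
float mt_demo_last_x;
void mt_demo_Vertex3fv(const float *v) { mt_demo_last_x = v[0]; }
```

As long as only one thread ever calls mt_set_dispatch, every stub takes the cheap global read; after a second thread binds a context, all stubs fall back to the per-thread lookup.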

Pthreads optimization

Pthreads is by far the most common threading model used by Mesa. Some of the overhead of _glapi_get_dispatch can be avoided by directly calling pthread_getspecific from the dispatch stub. This helps the multi-threaded case slightly but has no impact on the single-threaded case.

void glVertex3fv( const GLfloat * v )
{
    const struct _glapi_table * d = (_glapi_Dispatch != NULL)
        ? _glapi_Dispatch : pthread_getspecific( & _gl_DispatchTSD );
    (*d->Vertex3fv)( v );
}

TLS

Thread-local storage on Linux provides compiler-supported, per-thread variables. Once a variable is defined as being per-thread, it can be accessed in C code like any global variable. However, the compiler (and linker) will perform some magic behind the scenes to ensure that each thread has its own data. On x86 this requires the use of a slightly more expensive addressing mode to access the TLS variables. The performance penalty of this addressing mode in the single-threaded case has been measured to be comparable to the penalty of the _glapi_Dispatch test in the non-TLS case. The performance advantage of the TLS access versus the call to either _glapi_get_dispatch or pthread_getspecific is quite large.

The name of the dispatch table pointer is changed in the TLS case to prevent conflicts between a TLS libGL and a non-TLS DRI driver.

void glVertex3fv( const GLfloat * v )
{
    (*_glapi_tls_Dispatch->Vertex3fv)( v );
}
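
For readers unfamiliar with compiler-supported TLS, the following minimal sketch shows how such a per-thread pointer might be declared and set. The tls_ names are hypothetical. C11 _Thread_local is used here; GCC's older __thread keyword behaves the same way for this purpose.

```c
#include <stddef.h>

struct tls_table { void (*Vertex3fv)(const float *v); };

/* One pointer per thread. The compiler and linker arrange the storage,
 * so C code reads and writes it like an ordinary global. */
static _Thread_local const struct tls_table *tls_Dispatch;

/* Called when a context is made current on the calling thread. */
void tls_set_dispatch(const struct tls_table *t)
{
    tls_Dispatch = t;
}

/* Dispatch stub: no NULL test and no function call to find the table. */
void tls_Vertex3fv(const float *v)
{
    (*tls_Dispatch->Vertex3fv)(v);
}

/* Demo implementation that records the first coordinate it receives. */
float tls_demo_last_x;
void tls_demo_Vertex3fv(const float *v) { tls_demo_last_x = v[0]; }
```

Each thread that calls tls_set_dispatch changes only its own copy of the pointer, which is why the stub needs no check for the multi-threaded case.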

Implementation of static dispatch functions in Mesa

In the Mesa source tree, the file gl_API.xml describes all of the known API functions. In addition to describing the parameters to the function, each entry in gl_API.xml also declares a static offset in the dispatch table for that function.

Python generator scripts

A series of Python scripts is used to generate both platform-independent and platform-dependent API files. For the API dispatch code, the most significant scripts that generate platform-independent files are summarized in the following table.

Script Name      Generated File    Notes
gl_apitemp.py    glapitemp.h       C-code dispatch function templates
gl_offsets.py    glapioffsets.h    List of defines of dispatch offsets
gl_procs.py      glprocs.h         Tables used by glXGetProcAddressARB
gl_table.py      glapitable.h      C structure definition of the dispatch table
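
The relationship between the generated table structure and the generated offset defines can be sketched as follows. The contents are hypothetical: the struct members, the TOY_OFFSET_ names, and the offset values are made up for illustration and are not the contents of glapitable.h or glapioffsets.h. The sketch also assumes, as is true on mainstream platforms, that all function pointers have the same size and representation, so the table can be indexed as an array of slots.

```c
#include <stddef.h>
#include <string.h>

typedef void (*toy_proc)(void);

/* In the spirit of glapitable.h: one function pointer per API entry,
 * laid out in offset order. */
struct toy_table {
    void (*NewList)(unsigned list, unsigned mode);   /* offset 0 */
    void (*EndList)(void);                           /* offset 1 */
    void (*Vertex3fv)(const float *v);               /* offset 2 */
};

/* In the spirit of glapioffsets.h: the same offsets as defines, usable
 * from code (such as assembly stubs) that cannot name struct members. */
#define TOY_OFFSET_NewList   0
#define TOY_OFFSET_EndList   1
#define TOY_OFFSET_Vertex3fv 2

/* Convert a member's byte offset into a slot index. */
#define TOY_SLOT_OF(member) \
    (offsetof(struct toy_table, member) / sizeof(toy_proc))

/* The two generated views must agree, so check at compile time. */
_Static_assert(TOY_SLOT_OF(Vertex3fv) == TOY_OFFSET_Vertex3fv,
               "struct layout and offset defines disagree");

/* Fetch a slot by index, the way an offset-based stub would. memcpy
 * avoids reading one function pointer type through another. */
toy_proc toy_slot(const struct toy_table *t, size_t index)
{
    toy_proc p;
    memcpy(&p, (const char *) t + index * sizeof(toy_proc), sizeof p);
    return p;
}

/* Demo implementation used to populate one slot. */
void toy_demo_Vertex3fv(const float *v) { (void) v; }
```

Keeping both views generated from a single description is what lets a C stub write d->Vertex3fv while an assembly stub for the same function loads the pointer at a fixed offset from the table base.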

There are also several scripts that generate platform-dependent files. In all cases, the generated files are assembly language versions of the dispatch stubs for a particular platform.

Script Name         Generated File           Notes
gl_ppc_asm.py       ppc/glapi_ppc.S          PowerPC dispatch stubs
gl_SPARC_asm.py     sparc/glapi_sparc.S      SPARC (32-bit and 64-bit) dispatch stubs
gl_x86_asm.py       x86/glapi_x86.S          x86 (32-bit) dispatch stubs
gl_x86-64_asm.py    x86-64/glapi_x86-64.S    x86-64 dispatch stubs

Platform specifics

x86

x86-64

SPARC (32-bit and 64-bit)

Implementation of dynamic dispatch functions in Mesa

This may not be the same as the address of the static dispatch stub.