This is an attempt to document the Radeon architecture and make things a little easier for new and would-be DRM hackers. It describes some of the higher-level hardware concepts and points to the DRM/X server code that manipulates each part of the hardware. Many of these concepts are similar on other graphics architectures, but this document only attempts to cover the Radeon.
Registers & Framebuffer
Let's begin at the beginning. A Radeon card typically exposes two regions of memory that software on the system can manipulate. One region is the framebuffer, and the other (much smaller) region is the register region. We can view these resources by using lspci to display information about the devices on the PCI bus. My laptop has an ATI Radeon Mobility, and here is its device information:
/sbin/lspci -v
...
01:00.0 VGA compatible controller: ATI Technologies Inc M22 [Radeon Mobility M300] (prog-if 00 [VGA])
        Subsystem: IBM Unknown device 056e
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Memory at c0000000 (32-bit, prefetchable) [size=128M]
        I/O ports at 3000 [size=256]
        Memory at a8100000 (32-bit, non-prefetchable) [size=64K]
        [virtual] Expansion ROM at a8120000 [disabled] [size=128K]
        Capabilities: <access denied>
...
In this case, the framebuffer is mapped at 0xc0000000 and the registers are mapped at 0xa8100000. These are the addresses as seen from the CPU. If you opened /dev/mem and wrote to offset 0xc0000000, you would be writing to the very start of the framebuffer.
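As a rough illustration of reading those two regions out of the lspci text, here is a minimal Python sketch. The sample string is copied from the output above. The helper name `parse_memory_regions` and the rule "prefetchable means framebuffer, non-prefetchable means registers" are assumptions made for this example. That rule happens to match this particular card and is not a general guarantee.

```python
import re

# "Memory at" and I/O lines copied from the lspci output above.
LSPCI_SAMPLE = """\
Memory at c0000000 (32-bit, prefetchable) [size=128M]
I/O ports at 3000 [size=256]
Memory at a8100000 (32-bit, non-prefetchable) [size=64K]
"""

UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

MEMORY_LINE = re.compile(
    r"Memory at ([0-9a-f]+) \((\d+)-bit, (non-prefetchable|prefetchable)\) "
    r"\[size=(\d+)([KMG]?)\]"
)

def parse_memory_regions(text):
    """Return (base, size_in_bytes, prefetchable) for each 'Memory at' line."""
    regions = []
    for match in MEMORY_LINE.finditer(text):
        base = int(match.group(1), 16)
        size = int(match.group(4)) * UNITS[match.group(5)]
        regions.append((base, size, match.group(3) == "prefetchable"))
    return regions

regions = parse_memory_regions(LSPCI_SAMPLE)
for base, size, prefetchable in regions:
    # Assumption for this card only: the prefetchable region is the
    # framebuffer and the non-prefetchable one is the register region.
    kind = "framebuffer" if prefetchable else "registers"
    print(f"{kind}: base=0x{base:08x} size={size} bytes")
```

The I/O port line is deliberately ignored, since only the two memory regions matter for the discussion here.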
If you wanted to manipulate register 0x100, you would have to open up /dev/mem and write to 0xa8100100. There are other ways to map things, but basically you are always writing to the same memory location.
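The address arithmetic is simple enough to sketch in a few lines of Python. The base and size come from the lspci output above. The function name `register_cpu_address` and the 32-bit alignment check are assumptions made for this example, not something taken from the driver code.

```python
MMIO_BASE = 0xA8100000   # register region base from the lspci output
MMIO_SIZE = 64 * 1024    # [size=64K]

def register_cpu_address(offset, base=MMIO_BASE, size=MMIO_SIZE):
    """Return the CPU address of the register at `offset` in the register region."""
    if not 0 <= offset < size:
        raise ValueError(f"offset 0x{offset:x} is outside the register region")
    if offset % 4:
        # Assumption: registers are accessed as aligned 32-bit words.
        raise ValueError("register offsets are assumed to be 32-bit aligned")
    return base + offset

print(hex(register_cpu_address(0x100)))   # prints 0xa8100100
```

The bounds check is there because an offset past the 64K window would land in whatever happens to be mapped next, which is never what you want.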
The X server and/or the DRM kernel module map the registers and the framebuffer so that they can manipulate the card.
The Register Region
The register region contains all the registers that are used to control the card. The registers are manipulated by the DRM kernel module, the X server, and sometimes even the kernel itself (in the case of the framebuffer driver).
The functions of some registers are well known; others have been reverse engineered, and their functions are only partly understood.
The registers and some of their values are defined all over the place. The X server has some definitions: http://gitweb.freedesktop.org/?p=xorg/driver/xf86-video-ati.git;a=blob;f=src/radeon_reg.h
There are some definitions which are only for the R300: http://gitweb.freedesktop.org/?p=mesa/drm.git;a=blob;f=shared-core/r300_reg.h
The kernel has some, too: http://www.gelato.unsw.edu.au/lxr/source/include/video/radeon.h
Why are things spread out? Mainly because each set of definitions grew up to support the code that uses it. For example, the Mesa/DRI version has information about the registers that control the 3D portions of the hardware, while the kernel and X server mainly have the 2D and modesetting definitions (although there is some overlap).
With the radeon, the entire card can be controlled by manipulating registers.
If you want to color the screen green (for example), you can make a series of register writes and the card will turn the screen green.
It is important to note that EVERYTHING you do to the Radeon card ultimately boils down to a series of register writes. The X server takes its high-level functions and translates them into a series of register writes. Mesa takes the OpenGL commands and (working with DRI) translates them into a series of register writes.
There are ways to make this more efficient (using the CP with the ring buffer and indirect writes), but it all still boils down to a series of register writes.
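To make the "high-level call becomes register writes" idea concrete, here is a toy Python model. Every register name and offset in it (`HYPO_FILL_COLOR` and friends) is made up for illustration and does not correspond to real Radeon registers. The real names and values live in the header files linked above.

```python
# Toy model: all register names and offsets below are hypothetical.
HYPO_FILL_COLOR = 0x0010
HYPO_FILL_ORIGIN = 0x0014   # packed as (y << 16) | x
HYPO_FILL_SIZE = 0x0018     # packed as (height << 16) | width

class RegisterLog:
    """Stands in for the card: records each register write in order."""

    def __init__(self):
        self.writes = []

    def write(self, offset, value):
        self.writes.append((offset, value & 0xFFFFFFFF))

def fill_rect(regs, x, y, width, height, argb):
    """Translate one high-level 'fill rectangle' request into register writes."""
    regs.write(HYPO_FILL_COLOR, argb)
    regs.write(HYPO_FILL_ORIGIN, (y << 16) | x)
    regs.write(HYPO_FILL_SIZE, (height << 16) | width)

regs = RegisterLog()
fill_rect(regs, 0, 0, 1024, 768, 0xFF00FF00)   # opaque green over a 1024x768 screen
for offset, value in regs.writes:
    print(f"reg 0x{offset:04x} <- 0x{value:08x}")
```

The point is only the shape of the translation: one request from above, an ordered list of (register, value) pairs below.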
The other region of memory contains the framebuffer (the area where the final pixels on the screen end up). However, other things such as the GART table and textures are sometimes also placed in this memory.
This memory is typically on-board, dedicated video memory for the graphics card, although that is not always the case on some embedded/laptop chipsets.
Command Processor (CP)
As stated above, with the radeon the entire card can be controlled by manipulating registers.
However, if the CPU had to sit and wait for each and every register write to the Radeon to complete, it would not be able to do anything else. This is where the Radeon's command processor (CP) comes in handy. It can read register/value pairs from a ring buffer and execute those commands while the CPU is off doing other things.
The Radeon has a ring buffer which you can fill with the register/value pairs you want written, tell the Radeon about it, and check back later to see whether the work has been completed. The command processor works through the ring buffer until it has completed everything. One of the nice features is that the CPU can still add new work to the end of the ring buffer while the CP is processing entries at the front.
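The producer/consumer behaviour described above can be modelled in a few lines of Python. This is a simplified sketch, not the layout used by the hardware or by radeon_cp.c: the class and method names, the `head`/`tail` naming, and the register offsets are all assumptions made for the example.

```python
class RingBuffer:
    """Simplified command ring shared by a producer (CPU) and a consumer (CP)."""

    def __init__(self, size):
        self.slots = [None] * size
        self.head = 0   # next slot the CP will read
        self.tail = 0   # next slot the CPU will write

    def free_slots(self):
        # One slot always stays empty so head == tail means "empty", never "full".
        return (self.head - self.tail - 1) % len(self.slots)

    def cpu_submit(self, reg, value):
        """CPU side: queue one register/value pair; return False if the ring is full."""
        if self.free_slots() == 0:
            return False
        self.slots[self.tail] = (reg, value)
        self.tail = (self.tail + 1) % len(self.slots)
        return True

    def cp_step(self, registers):
        """CP side: execute one queued write; return False if the ring is empty."""
        if self.head == self.tail:
            return False
        reg, value = self.slots[self.head]
        registers[reg] = value
        self.head = (self.head + 1) % len(self.slots)
        return True

    def idle(self):
        return self.head == self.tail

ring = RingBuffer(4)
registers = {}
ring.cpu_submit(0x100, 1)
ring.cpu_submit(0x104, 2)
ring.cp_step(registers)        # the CP executes the first write...
ring.cpu_submit(0x108, 3)      # ...while the CPU keeps appending at the tail
while ring.cp_step(registers):
    pass
print(registers, ring.idle())
```

"Checking back later" in this model is just a call to `idle()`: once the head has caught up with the tail, everything that was queued has been executed.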
The CP is initialized in the kernel DRM module, in this file: radeon_cp.c
If DRM writes all of the commands directly to the ring buffer (called direct access), the application needs to make a system call for every register that it wants to write. The alternative is indirect access, where the application (or Mesa acting on its behalf) creates a list of commands in an indirect buffer and hands that buffer to DRM. The DRM kernel module then issues a command to the CP which says "execute the commands in this piece of memory" (or, in CP speak, "execute the commands in this indirect buffer"). That way, not only can the CPU do other things while the CP is running, but the list of commands for the CP can be built in user space.
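The difference between the two paths can be sketched with another toy Python model. `ToyDrm`, `submit_write`, `submit_indirect`, and `cp_run` are hypothetical names invented for this example, and the register offsets and values are arbitrary. The only thing the model tries to capture is the number of kernel entries needed for the same set of register writes.

```python
from collections import deque

def build_indirect_buffer():
    """User-space side: assemble register/value pairs without any system call."""
    return [(0x100, 0xAA), (0x104, 0xBB), (0x108, 0xCC)]   # arbitrary values

class ToyDrm:
    """Stands in for the kernel module; each submit_* call models one system call."""

    def __init__(self):
        self.ring = deque()
        self.syscalls = 0

    def submit_write(self, reg, value):      # direct access
        self.syscalls += 1
        self.ring.append(("write", reg, value))

    def submit_indirect(self, buffer):       # indirect access
        self.syscalls += 1
        self.ring.append(("indirect", list(buffer)))

def cp_run(ring, registers):
    """CP side: drain the ring, following 'indirect' entries into their buffers."""
    while ring:
        entry = ring.popleft()
        if entry[0] == "write":
            _, reg, value = entry
            registers[reg] = value
        else:
            for reg, value in entry[1]:
                registers[reg] = value

direct, indirect = ToyDrm(), ToyDrm()
for reg, value in build_indirect_buffer():
    direct.submit_write(reg, value)
indirect.submit_indirect(build_indirect_buffer())

direct_regs, indirect_regs = {}, {}
cp_run(direct.ring, direct_regs)
cp_run(indirect.ring, indirect_regs)
print(direct.syscalls, indirect.syscalls, direct_regs == indirect_regs)   # prints: 3 1 True
```

Both paths leave the registers in the same state; the indirect path just gets there with a single trip into the kernel.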
Use of the CP
TODO (Head pointer, tail pointer, etc.)
Format of Messages
Memory mapping on the Radeon can make your head spin. A single location in memory can have as many as four different addresses, depending on where it is being accessed from:
- GART Table (physical address)
- DRM module (kernel virtual address)
- Radeon Card (bus address)
- User space app (user virtual address)
This is similar to many hardware devices, and once you understand why and how it is done, things become much easier.