R300 To Do list

The R300 to-do list is at the end of this page. If you feel something is missing, please add it. If you have finished an item and it no longer needs to be investigated, feel free to erase it from here :).

Hardware limitations

Should this be its own page? Probably not.

Some of these are errata, some aren't. You can't work around them in hardware, though.

  • Vertex shader limits

    Chip  ALU   Const  Temp  Alt.Temp
    R300  256   256    32    20
    R400  256   256    32    20
    R500  1024  256    32    20

  • Fragment shader limits

    Chip  ALU  TEX  Const  Temp  Samplers  TEX Indirections
    R300  64   32   32     32    16        4
    R400  512  512  32     64    16        4
    R500  512  512  256    128   16        Unlimited

  • Shader opcodes (unlisted opcodes are not supported)

    R300 VS R500 VS R300 FS R500 FS
    ABS (./) (./) (./) Source operand modifier
    ADD (./) (./)
    ARL (./) (./)
    ARR (./) (./)
    CND (./) (./) src0 > 0.5 ? src1 : src2; can be used for e.g. src0 ? src1 : src2
    CMP (./) (./) src0 < 0.0 ? src1 : src2
    COS (./) (./)
    DDX (./)
    DDY (./)
    DP2A (./) (./)
    DP3 (./) (./)
    DP4 (./) (./) (./) (./)
    DST (./) (./)
    EX2 (./) (./) (./) (./)
    EXP (./) (./)
    FRC (./) (./) (./) (./)
    KIL (./) (./)
    LG2 (./) (./) (./) (./)
    LIT (./) (./)
    LOG (./) (./)
    MAD (./) (./) (./) (./)
    MAX (./) (./) (./) (./)
    MIN (./) (./) (./) (./)
    MUL (./) (./)
    POW (./) (./)
    RCP (./) (./) (./) (./)
    RSQ (./) (./) (./) (./)
    SEQ (./)
    SGE (./) (./)
    SGT (./)
    SIN (./) (./)
    SLT (./) (./)
    SNE (./)
    *_SAT (./) (./) (./) Instruction modifier
    TEX (./) (./)
    TXB (./) (./)
    TXD (./)
    TXL (./)
    TXP (./) (./)
    IF (./) (./)
    ELSE (./) (./)
    ENDIF (./) (./)
    BGNLOOP (./) (./) (./) There are a lot of restrictions on loops.
    BRK (./) (./)
    CONT (./)
    ENDLOOP (./) (./) (./)

  • Geometry

    • Vertex formats GL_*INT and GL_DOUBLE are not supported. GL_*SHORT is supported only for 2- and 4-component vertex attributes, and GL_*BYTE only for 4-component attributes. All individual vertex attribute fetches must be DWORD-aligned.
    • Quads and quadstrips cannot have the first vertex be the provoking vertex.
  • Rasterizer
    • The interpolated colors have a range of [0, 1] and are limited to 12 bits of precision. Unlike texture coordinates, when multisampling is enabled, the colors are sampled at the centroid of the covered portion of the fragment. With PS3 mode enabled (GUESS), the interpolated colors have 32 bits of precision.
  • Textures
    • Floating-point mipmap LOD clamping is not supported. Min LOD is rounded down (floor) and max LOD is rounded up (ceil).
    • NPOT textures are not available; only rectangle textures are (with the possibility of normalized texture coordinates and bilinear filtering).
    • Floating-point textures cannot be filtered; the only allowed filter modes are GL_NEAREST and GL_NEAREST_MIPMAP_NEAREST.
    • Floating-point textures cannot be used with the wrap modes GL_CLAMP and GL_CLAMP_TO_BORDER, except for R500, which does support these modes.
    • 16-bits-per-channel textures cannot be used with the wrap modes GL_CLAMP and GL_CLAMP_TO_BORDER.
  • Vertex shader
    • Vertex shaders evaluate 0^0 as NaN.
  • Fragment shader
    • Fragment derivatives are calculated at half resolution.
    • R300 fragment shaders evaluate 0^0 as NaN.
  • Color buffers
    • The maximum color buffer size is 2560x2560 on R300, 4021x4021 on R400, and 4096x4096 on R500.
    • R300-R400 cannot do blending, alpha-testing, and multisampling with floating-point colorbuffers. R500 can do these only for GL_RGBA16F.
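
The vertex format restrictions listed under Geometry can be read as a small checklist. The sketch below is only an illustration of that checklist, not driver code: the helper name, the string-based type names, and the problem messages are hypothetical, and "DWORD-aligned fetch" is modelled (as an assumption) as both the attribute offset and the stride being multiples of 4 bytes. Types the list does not mention, such as GL_FLOAT, are simply not flagged.

```python
# Hypothetical helper: checks one vertex attribute against the restrictions
# listed under "Geometry". Nothing here is taken from the actual driver.

UNSUPPORTED_TYPES = {"GL_INT", "GL_UNSIGNED_INT", "GL_DOUBLE"}
SHORT_TYPES = {"GL_SHORT", "GL_UNSIGNED_SHORT"}
BYTE_TYPES = {"GL_BYTE", "GL_UNSIGNED_BYTE"}


def vertex_attrib_problems(gl_type, components, offset, stride):
    """Return the listed restrictions that this attribute violates."""
    problems = []
    if gl_type in UNSUPPORTED_TYPES:
        problems.append("type not supported")
    if gl_type in SHORT_TYPES and components not in (2, 4):
        problems.append("GL_*SHORT needs 2 or 4 components")
    if gl_type in BYTE_TYPES and components != 4:
        problems.append("GL_*BYTE needs 4 components")
    # Assumption: a fetch counts as DWORD-aligned when both the byte offset
    # and the byte stride are multiples of 4.
    if offset % 4 != 0 or stride % 4 != 0:
        problems.append("fetch is not DWORD-aligned")
    return problems
```

For example, a 3-component GL_SHORT attribute is reported as a problem, while a 3-component GL_FLOAT attribute at offset 0 with a 12-byte stride is not.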

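The fragment shader limits table earlier on this page can also be written down as data, which makes it easy to see which chips a given shader would fit on. This is a minimal sketch: the numbers are copied from that table, but the dictionary layout, the key names, and the function name are made up for illustration, and None is used here to stand for "Unlimited".

```python
# Limits copied from the "Fragment shader limits" table on this page.
# None stands for "Unlimited". Key names are illustrative only.
FS_LIMITS = {
    "R300": {"alu": 64, "tex": 32, "const": 32, "temp": 32,
             "samplers": 16, "tex_indirections": 4},
    "R400": {"alu": 512, "tex": 512, "const": 32, "temp": 64,
             "samplers": 16, "tex_indirections": 4},
    "R500": {"alu": 512, "tex": 512, "const": 256, "temp": 128,
             "samplers": 16, "tex_indirections": None},
}


def exceeded_fs_limits(chip, usage):
    """Return the sorted names of resources whose usage exceeds the chip limit."""
    limits = FS_LIMITS[chip]
    return sorted(
        name
        for name, used in usage.items()
        if limits[name] is not None and used > limits[name]
    )


# A shader with 100 ALU instructions, 8 TEX instructions and 6 TEX indirections:
usage = {"alu": 100, "tex": 8, "tex_indirections": 6}
for chip in ("R300", "R400", "R500"):
    print(chip, exceeded_fs_limits(chip, usage))
```

With this usage, R300 is over on ALU instructions and TEX indirections, R400 only on TEX indirections, and R500 on nothing.
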
Gallium3D Driver

This is a TODO list for r300g.

  • Driver enhancements
    • Add support for the centroid GLSL varying modifier (undocumented in the hw docs).
    • Add texture-stuffing-based things: smoothed points + lines + polygons, and stippled lines + polygons. This needs changes in the rasterizer, the RS block, and the fragment shading.
    • Enable Hyper-Z by default on r3xx/r4xx (needs additional testing).
  • Compiler enhancements
    • Improve the VS register allocator to be more useful with ifs and loops (we should be able to reuse most of the FS register allocator for this).
    • VS: Implement Dual-Op Vector/Math (4D+1D) instruction pair scheduling. Performance++.
  • State tracker enhancements
    • Implement accelerated pixel buffer objects (we can do the aligned copies at least).
    • Implement accelerated CopyBufferSubData.
  • Bugs