R600 To Do list

This is the R600g to-do list. If you feel something is missing, please add it. If you have finished something and it no longer needs to be investigated, feel free to erase it from here :).

LLVM

  • Testing on R600, R700 and Cayman GPUs
    • Difficulty: Easy
    • Who's working on it:
    • Date Started:
    • Status:
    • Description:
    • Most of the LLVM backend testing has been with Evergreen or Northern Islands GPUs. R600, R700, and Cayman GPUs are slightly different, so there may be bugs on these GPUs that aren't present on Evergreen/Northern Islands. The best method for testing is to compare piglit results with and without the LLVM backend and report the regressions. For easy switching between the two compilers, compile Mesa with --enable-r600-llvm-compiler. This will make the LLVM backend the default compiler, and you can switch back to the old compiler by setting the R600_LLVM=0 environment variable.
    • It would also be interesting to see if there are any performance gains from using the LLVM backend, so a related task would be to compare the performance of your favorite programs with and without the LLVM backend and then report any discrepancies.
  • Remove target specific intrinsics
    • Difficulty: Easy
    • Who's working on it: N/A
    • Date Started: N/A
    • Status: N/A
    • Description:
    • The code in gallium/drivers/radeon/radeon_setup_tgsi.c uses a lot of target specific intrinsics for translating from TGSI to LLVM. Most of these are unnecessary and should be replaced with LLVM IR or LLVM intrinsics.
  • Add support for 4 x i8 stores
    • Difficulty: Medium
    • Who's working on it: N/A
    • Date Started: N/A
    • Status: N/A
    • Description:
    • We may be able to implement this by using a 32-bit store instruction, with the vector packed into a single 32-bit register. I think LLVM may be able to do this for us, but we need to tell it that some i8 operations are illegal.
  • Use LLVM analysis passes to clear the Barrier bit on Control Flow instructions
    • Difficulty: Hard
    • Who's working on it: N/A
    • Date Started: N/A
    • Status: N/A
    • Description:
    • Clearing the Barrier bit on Control Flow instructions allows them to execute in parallel, which can improve performance. We can use the LLVM analysis passes to determine when it is safe to clear this bit. This task will also require some changes to the way code is generated by the backend, because we don't currently have much control over Control Flow instructions from inside the backend.
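
The packing idea in the "4 x i8 stores" item above can be sketched in a few lines of Python. This is only a model of the arithmetic, not backend code: the helper names pack_4xi8 and store_u32 are hypothetical, and placing element 0 in the low byte (little-endian lane order) is an assumption.

```python
import struct


def pack_4xi8(vec):
    """Pack four 8-bit lanes into one 32-bit word.

    Assumption: lane 0 lands in the least significant byte.
    """
    if len(vec) != 4:
        raise ValueError("expected exactly four i8 elements")
    word = 0
    for lane, value in enumerate(vec):
        # Mask to 8 bits so negative (signed) i8 values wrap correctly.
        word |= (value & 0xFF) << (8 * lane)
    return word


def store_u32(memory, offset, word):
    """Model a single 32-bit store into a bytearray (little-endian)."""
    struct.pack_into("<I", memory, offset, word)


# One 32-bit store replaces four separate byte stores.
memory = bytearray(8)
store_u32(memory, 4, pack_4xi8([0x11, 0x22, 0x33, -1]))
```

After the store, bytes 4..7 of memory hold the four elements in order, which is what four individual i8 stores would have produced.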

Compute

  • Enable constant address space
    • Difficulty: Medium
    • Who's working on it: N/A
    • Date Started: N/A
    • Status: N/A
    • Description:
    • Clover currently uses the global address space for all memory, even constant memory, which is not optimal from a performance perspective. To complete this task a developer would need to:
      1. Modify clover to create a module::argument::constant for pointers in the constant address space (see the XXX comment in build_module_llvm() @ llvm/invocation.cpp).
      2. Add support to r600g for creating constant buffers for compute. (We should be able to reuse evergreen_emit_constant_buffer() for this, and we should create a separate constbuf_state atom for compute.)
      3. Update the LLVM backend to handle loads from the constant address space.
  • Enable local address space
    • Difficulty: Medium
    • Who's working on it: N/A
    • Date Started: N/A
    • Status: N/A
    • Description:
    • To complete this task a developer would need to:
      1. Modify clover to create a module::argument::local for data in the local address space (see the XXX comment in build_module_llvm() @ llvm/invocation.cpp).
      2. Have r600g initialize the LDS with the correct size (see evergreen_set_lds() in evergreen_compute_internal.c).
      3. Add instructions to the LLVM backend for reading from and writing to LDS.
  • Handle #include directive in OpenCL C source code
    • Difficulty: Medium
    • Who's working on it: N/A
    • Date Started: N/A
    • Status: I think this works. It just needs to be tested and verified.
    • Description:
    • The Clang frontend supports the #include directive, but Clover is not taking advantage of this. gallium/state_trackers/clover/llvm/invocation.cpp is the place to start investigating how to complete this task.
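
Step 1 of the two address space tasks above amounts to picking an argument type from the address space of each kernel pointer argument. Below is a minimal sketch of that decision only. The address-space numbering (1 = global, 2 = constant, 3 = local) and the names ArgType and classify_argument are assumptions made for illustration; the real change belongs in build_module_llvm() and may look quite different.

```python
from enum import Enum


class ArgType(Enum):
    """Stand-ins for the clover argument kinds discussed above."""
    SCALAR = "scalar"
    GLOBAL = "global"
    CONSTANT = "constant"
    LOCAL = "local"


# Hypothetical numbering; check the backend's real address-space IDs.
ADDRESS_SPACE_TO_ARG = {
    1: ArgType.GLOBAL,
    2: ArgType.CONSTANT,
    3: ArgType.LOCAL,
}


def classify_argument(is_pointer, address_space=None):
    """Choose an argument type for one kernel argument."""
    if not is_pointer:
        return ArgType.SCALAR
    # Unknown address spaces fall back to global, which mirrors the
    # current behaviour of treating all memory as global.
    return ADDRESS_SPACE_TO_ARG.get(address_space, ArgType.GLOBAL)
```

With a mapping like this in place, constant and local pointers stop being treated as global, and r600g can then set up constant buffers or LDS for them as described in steps 2 and 3.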

General Optimization

  • [MEDIUM] Convert the remaining pipe_state into atoms

3D Features

  • Fast color clear on Evergreen-Cayman
    • Difficulty: Easy
    • Who's working on it: N/A
    • Note: CMASK is enabled already, but not being fast-cleared yet.
  • Polygon stippling
    • Difficulty: Easy
    • Who's working on it: N/A
    • Note: Take advantage of util/u_pstipple. It contains the shader transformation code and can create a stipple texture.
  • Polygon/line/point smoothing
    • Difficulty: Medium
    • Who's working on it: N/A
  • HyperZ support
    • Difficulty: Hard
    • Who's working on it: Jerome Glisse
    • Note: Patches available on the mailing list, but they currently cause hangs in a lot of cases.
  • Geometry shaders
    • Difficulty: Medium
    • Who's working on it: N/A
    • Note: Core Mesa/Gallium support might need some work too.

Bugs

Notes

  • The input to the R600 trig functions must be normalized from radians by dividing by 2*PI. The valid input domain is [-256, 256], which corresponds to an unnormalized input domain of [-512*PI, 512*PI]. An out-of-range input results in 0.0f for SIN and 1.0f for COS.
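
The rule in the note above can be modelled in Python to make the domain limits concrete. This is a sketch of the described behaviour, not of the hardware itself: the function names are hypothetical, and in-range results are simply taken from the math library.

```python
import math

TWO_PI = 2.0 * math.pi
MAX_NORMALIZED = 256.0  # valid normalized domain is [-256, 256]


def normalize(radians):
    """Convert radians to the normalized form the hardware expects."""
    return radians / TWO_PI


def model_sin(radians):
    """SIN as described in the note: out-of-range input gives 0.0."""
    if abs(normalize(radians)) > MAX_NORMALIZED:
        return 0.0
    return math.sin(radians)


def model_cos(radians):
    """COS as described in the note: out-of-range input gives 1.0."""
    if abs(normalize(radians)) > MAX_NORMALIZED:
        return 1.0
    return math.cos(radians)


# 1025*PI/2 normalizes to 256.25, just outside the valid domain, so the
# model returns the fallback values instead of the true sine and cosine.
out_of_range = 1025.0 * math.pi / 2.0
fallback_sin = model_sin(out_of_range)
fallback_cos = model_cos(out_of_range)
```

A driver that needs correct results for large angles would have to reduce the input into the valid domain before issuing the instruction.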