R600 To Do list

This is the R600g to-do list. If you feel something is missing, please add it. If you have finished an item and it no longer needs to be investigated, feel free to remove it from here :).

LLVM

  • Testing on R600, R700 and Cayman GPUs
    • Difficulty: Easy
    • Who's working on it:
    • Date Started:
    • Status:
    • Description:
    • Most of the LLVM backend testing has been done on Evergreen or Northern Islands GPUs. R600, R700, and Cayman GPUs are slightly different, so there may be bugs on these GPUs that aren't present on Evergreen/Northern Islands. The best method for testing is to compare piglit results with and without the LLVM backend and report the regressions. For easy switching between the two compilers, compile Mesa with --enable-r600-llvm-compiler. This makes the LLVM backend the default compiler, and you can switch back to the old compiler by setting the R600_LLVM=0 environment variable.
    • It would also be interesting to see if there are any performance gains from using the LLVM backend, so a related task would be to compare the performance of your favorite programs with and without the LLVM backend and then report any discrepancies.
  • Remove target-specific intrinsics
    • Difficulty: Easy
    • Who's working on it: N/A
    • Date Started: N/A
    • Status: N/A
    • Description:
    • The code in gallium/drivers/radeon/radeon_setup_tgsi.c uses a lot of target-specific intrinsics for translating from TGSI to LLVM. Most of these are unnecessary and should be replaced with plain LLVM IR or generic LLVM intrinsics.
  • Use LLVM analysis passes to clear the Barrier bit on Control Flow instructions
    • Difficulty: Hard
    • Who's working on it: N/A
    • Date Started: N/A
    • Status: N/A
    • Description:
    • Clearing the Barrier bit on Control Flow instructions allows them to execute in parallel, which can improve performance. We can use the LLVM analysis passes to determine when it is safe to clear this bit. This task will also require some changes to the way code is generated by the backend, because we don't currently have much control over Control Flow instructions from inside the backend.
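
For the piglit comparison described in the testing task above, a regression is a test that passes with the old compiler but not with the LLVM backend. The following is a minimal sketch of that comparison, assuming each run's results have already been loaded into a plain dictionary mapping test name to status. The function name, the dictionary layout, and the test names are hypothetical and are not piglit's own result format.

```python
def find_regressions(baseline, candidate):
    """Return (name, old_status, new_status) for tests that pass in
    `baseline` (old compiler) but do not pass in `candidate` (LLVM)."""
    regressions = []
    for name, old_status in sorted(baseline.items()):
        # A test absent from the candidate run is reported as "missing".
        new_status = candidate.get(name, "missing")
        if old_status == "pass" and new_status != "pass":
            regressions.append((name, old_status, new_status))
    return regressions


# Hypothetical results; real piglit output would first have to be
# parsed into this shape.
old_compiler = {
    "shaders/glsl-sin": "pass",
    "shaders/glsl-cos": "pass",
    "spec/fbo-blit": "fail",
}
llvm_backend = {
    "shaders/glsl-sin": "pass",
    "shaders/glsl-cos": "fail",
    "spec/fbo-blit": "fail",
}

for name, old, new in find_regressions(old_compiler, llvm_backend):
    print(f"{name}: {old} -> {new}")
```

Tests that fail with both compilers are deliberately left out, since they are not regressions caused by the LLVM backend.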

Compute

  • Enable constant address space
    • Difficulty: Medium
    • Who's working on it: N/A
    • Date Started: N/A
    • Status: N/A
    • Description:
    • Clover currently uses the global address space for all memory, even constant memory, which is not optimal from a performance perspective. To complete this task a developer would need to:
      1. Modify clover to create a module::argument::constant for pointers in the constant address space. (See the XXX comment in build_module_llvm() in llvm/invocation.cpp.)
      2. Add support to r600g for creating constant buffers for compute. (We should be able to reuse evergreen_emit_constant_buffer() for this, and we should create a separate constbuf_state atom for compute.)
      3. Update the LLVM backend to handle loads from the constant address space.
  • Handle #include directive in OpenCL C source code
    • Difficulty: Medium
    • Who's working on it: N/A
    • Date Started: N/A
    • Status: I think this works. It just needs to be tested and verified.
    • Description:
    • The Clang frontend supports the #include directive, but Clover is not taking advantage of this. gallium/state_trackers/clover/llvm/invocation.cpp is the place to start investigating how to complete this task.

General Optimization

  • Convert the remaining pipe_state into atoms
    • Difficulty: Medium

3D Features

  • Polygon stippling
    • Difficulty: Easy
    • Who's working on it: N/A
    • Note: Take advantage of util/u_pstipple. It contains the shader transformation code and can create a stipple texture.
  • Polygon/line/point smoothing
    • Difficulty: Medium
    • Who's working on it: N/A
  • HyperZ support
    • Difficulty: Hard
    • Who's working on it: N/A
    • Who was working on it: Jerome Glisse
    • Note: See also HyperZ Meta Bug.

Bugs

Notes

  • The input to the R600 trig functions must be normalized from radians by dividing by 2*PI. The valid input domain is [-256, 256], which corresponds to an unnormalized input domain of [-512*PI, 512*PI]. An out-of-range input results in 0.0f for SIN and 1.0f for COS.
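
The normalization and out-of-range behavior in the note above can be modeled in software. This is a minimal sketch that mirrors only the rules stated in the note; it does not reproduce hardware precision, and the function and constant names are hypothetical.

```python
import math

# Valid normalized input domain is [-256, 256] (closed interval).
MAX_NORMALIZED = 256.0


def normalize_angle(radians):
    """Convert radians to the normalized form: divide by 2*PI."""
    return radians / (2.0 * math.pi)


def r600_sin(radians):
    """Model of SIN: an out-of-range input yields 0.0."""
    if abs(normalize_angle(radians)) > MAX_NORMALIZED:
        return 0.0
    return math.sin(radians)


def r600_cos(radians):
    """Model of COS: an out-of-range input yields 1.0."""
    if abs(normalize_angle(radians)) > MAX_NORMALIZED:
        return 1.0
    return math.cos(radians)


# 512*PI radians normalizes to 256, the edge of the valid domain.
print(normalize_angle(512 * math.pi))
# 1000*PI radians normalizes to about 500, which is out of range.
print(r600_sin(1000 * math.pi), r600_cos(1000 * math.pi))
```

A shader compiler would emit the division by 2*PI before the trig instruction; the range check here only illustrates what happens when the normalized value falls outside [-256, 256].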