Xe Configfs

Overview

Configfs is a filesystem-based manager of kernel objects. The xe kernel-mode driver (KMD) registers a configfs subsystem called xe, which creates a directory under the mounted configfs directory. The user can create devices under this directory and configure them as necessary. See Configfs - Userspace-driven Kernel Object Configuration for more information about how configfs works.

Create devices

The xe module must already be loaded before a device can be created. However, some attributes can only be set before the driver binds to the device. Both conditions can be met by disabling driver autoprobe before loading the module:

# echo 0 > /sys/bus/pci/drivers_autoprobe
# modprobe xe

In order to create a device, the user has to create a directory inside xe:

# mkdir /sys/kernel/config/xe/0000:03:00.0/
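
As a minimal sketch of the step above, the helpers below check that a name looks like a full PCI address before creating the directory. The function names and the XE_CONFIGFS override are hypothetical, and the address format is inferred from the examples in this document rather than from the driver's own parsing.

```shell
# Hypothetical helper: succeeds if $1 looks like domain:bus:device.function,
# e.g. 0000:03:00.0. The format is inferred from the examples in this text.
is_pci_address() {
    h='[0-9a-fA-F]'
    case "$1" in
        $h$h$h$h:$h$h:$h$h.[0-7]) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical wrapper around the mkdir step. XE_CONFIGFS defaults to the
# path used in this document and can be overridden for a dry run.
xe_configfs_create() {
    if ! is_pci_address "$1"; then
        echo "invalid PCI address: $1" >&2
        return 1
    fi
    mkdir "${XE_CONFIGFS:-/sys/kernel/config/xe}/$1"
}
```

On a real system this would be called as root, for example as xe_configfs_create 0000:03:00.0.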

Every device created is populated by the driver with entries that can be used to configure it:

/sys/kernel/config/xe/
├── 0000:00:02.0
│   └── ...
├── 0000:00:02.1
│   └── ...
:
└── 0000:03:00.0
    ├── survivability_mode
    ├── engines_allowed
    └── enable_psmi
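
To inspect what is currently configured, a small sketch like the one below can walk the tree shown above and print each attribute with its value. It assumes the attribute files can be read back, and the function name and the XE_CONFIGFS override are hypothetical.

```shell
# Print "<device>/<attribute>=<value>" for every regular file directly under
# each device directory. Nested groups (subdirectories) are not descended into.
# XE_CONFIGFS is a hypothetical override for testing against another path.
xe_configfs_dump() (
    root="${XE_CONFIGFS:-/sys/kernel/config/xe}"
    for dev in "$root"/*/; do
        [ -d "$dev" ] || continue
        for attr in "$dev"*; do
            [ -f "$attr" ] || continue
            printf '%s/%s=%s\n' \
                "$(basename "$dev")" "$(basename "$attr")" "$(cat "$attr")"
        done
    done
)
```

Multi-line attribute values are printed as-is, so the output is meant for reading rather than for parsing.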

After configuring the attributes as described in the next section, the device can be probed with:

# echo 0000:03:00.0 > /sys/bus/pci/drivers/xe/bind
# # or
# echo 0000:03:00.0 > /sys/bus/pci/drivers_probe
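
The create, configure and bind steps can be combined into one helper. The following is only a sketch: it assumes autoprobe has already been disabled and the xe module loaded as described earlier, and the function name and the XE_CONFIGFS and XE_PCI_BUS overrides are hypothetical.

```shell
# Create the device directory, write each "name=value" argument to the
# attribute of that name, then bind the device to the xe driver.
# XE_CONFIGFS and XE_PCI_BUS are hypothetical overrides for dry runs.
xe_configure_and_bind() (
    bdf="$1"
    shift
    cfg="${XE_CONFIGFS:-/sys/kernel/config/xe}/$bdf"
    bus="${XE_PCI_BUS:-/sys/bus/pci}"

    mkdir "$cfg" || exit 1
    for kv in "$@"; do
        # Split "name=value" at the first '='.
        echo "${kv#*=}" > "$cfg/${kv%%=*}" || exit 1
    done
    echo "$bdf" > "$bus/drivers/xe/bind"
)
```

For example, xe_configure_and_bind 0000:03:00.0 survivability_mode=1 would create the directory, enable survivability mode and then bind. The helper stops at the first failing write so that the device is not bound with a partial configuration.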

Configure Attributes

Survivability mode:

Enable survivability mode on supported cards. This setting only takes effect when probing the device. Example to enable it:

# echo 1 > /sys/kernel/config/xe/0000:03:00.0/survivability_mode

This attribute can only be set before binding to the device.

Allowed engines:

Allow only a set of engine(s) to be available, disabling the other engines even if they are available in hardware. This is applied after HW fuses are considered on each tile. Examples:

Allow only one render engine and one copy engine, nothing else:

# echo 'rcs0,bcs0' > /sys/kernel/config/xe/0000:03:00.0/engines_allowed

Allow only the compute engines and the first copy engine:

# echo 'ccs*,bcs0' > /sys/kernel/config/xe/0000:03:00.0/engines_allowed

Note that the engine names are the per-GT hardware names. On multi-tile platforms, writing rcs0,bcs0 to this file would allow the first render and copy engines on each tile.

The requested configuration may not be supported by the platform, in which case the driver may fail to probe. For example, this can happen if at least one copy engine is expected to be available for migrations but it is disabled. This attribute is intended for debugging purposes only.

This attribute can only be set before binding to the device.
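
When the engine list is assembled by a script, it can help to check each entry before writing it. The sketch below joins its arguments with commas after verifying that each one looks like a class name followed by an instance number or a wildcard. That pattern is inferred from the examples above and is not the driver's actual parser. Both function names are hypothetical.

```shell
# Succeeds if $1 looks like "<class><instance>" (rcs0) or "<class>*" (ccs*).
# The accepted pattern is an assumption based on the examples in this text.
valid_engine() {
    case "$1" in
        *[!a-z0-9*]*) return 1 ;;   # reject characters outside the expected set
    esac
    printf '%s\n' "$1" | grep -Eqx '[a-z]+([0-9]+|\*)'
}

# Print the arguments joined by commas, or fail on the first bad entry.
engines_allowed_value() (
    out=""
    for e in "$@"; do
        if ! valid_engine "$e"; then
            echo "unexpected engine name: $e" >&2
            exit 1
        fi
        out="${out:+$out,}$e"
    done
    printf '%s\n' "$out"
)
```

The result can be redirected to the attribute, for example engines_allowed_value 'ccs*' bcs0 > /sys/kernel/config/xe/0000:03:00.0/engines_allowed. The wildcard needs to be quoted so that the shell does not expand it.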

PSMI

Enable extra debugging capabilities to trace engine execution. This is only useful during early platform enabling and requires additional hardware to be connected. Once it is enabled, additional workarounds (WAs) are added and runtime configuration is done via debugfs. Example to enable it:

# echo 1 > /sys/kernel/config/xe/0000:03:00.0/enable_psmi

This attribute can only be set before binding to the device.

Context restore BB

Allow executing a batch buffer during any context switch. When the GPU is restoring the context, it executes additional commands. This is useful for testing additional workarounds and validating certain HW behaviors. It is not intended for normal execution and will taint the kernel with TAINT_TEST when used.

The syntax allows passing raw instructions to be executed by the engine in a batch buffer, or setting specific registers.

  1. Generic instruction:

    <engine-class> cmd <instr> [[dword0] [dword1] [...]]
    
  2. Simple register setting:

    <engine-class> reg <address> <value>
    

Commands are saved per engine class: all instances of that class will execute those commands during context switch. The instruction, dword arguments, addresses and values are in hex format, as in the examples below.

  1. Execute an LRI command to write 0xDEADBEEF to register 0x4f100 after the normal context restore:

    # echo 'rcs cmd 11000001 4F100 DEADBEEF' \
            > /sys/kernel/config/xe/0000:03:00.0/ctx_restore_post_bb
    
  2. Execute an LRI command to write 0xDEADBEEF to register 0x4f100 at the beginning of the context restore:

    # echo 'rcs cmd 11000001 4F100 DEADBEEF' \
            > /sys/kernel/config/xe/0000:03:00.0/ctx_restore_mid_bb
    
  3. Load certain values in a couple of registers (it can be used as a simpler alternative to the cmd action):

    # cat > /sys/kernel/config/xe/0000:03:00.0/ctx_restore_post_bb <<EOF
    rcs reg 4F100 DEADBEEF
    rcs reg 4F104 FFFFFFFF
    EOF
    

    Note

    When using multiple lines, make sure to use a command that is implemented with a single write syscall, such as a heredoc.
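
When the lines are generated by a script rather than typed into a heredoc, one option is to assemble the whole payload first and emit it with a single command. The sketch below does that with the shell's printf. It assumes that the shell flushes a small printf output in one write, which is worth confirming with strace on the target system. The function name is hypothetical.

```shell
# Write the remaining arguments, one per line, to the file named by $1.
# The payload is assembled first and emitted by one printf invocation, on
# the assumption that the shell flushes it with a single write.
ctx_bb_write() (
    target="$1"
    shift
    payload=$(printf '%s\n' "$@")
    printf '%s\n' "$payload" > "$target"
)
```

For example, ctx_bb_write /sys/kernel/config/xe/0000:03:00.0/ctx_restore_post_bb 'rcs reg 4F100 DEADBEEF' 'rcs reg 4F104 FFFFFFFF' writes the same two lines as the heredoc example.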

Currently this is implemented only for post and mid context restore. These attributes can only be set before binding to the device.
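
The 11000001 dword in the cmd examples above is the header of a single-register LRI (MI_LOAD_REGISTER_IMM). As a sketch of how such lines could be generated, the helpers below compute that header and format cmd and reg lines in bare hex. The encoding used here (opcode 0x22 shifted left by 23, length field 2 * N - 1 for N registers) should be checked against the hardware programming reference. All function names are hypothetical.

```shell
# Header dword of MI_LOAD_REGISTER_IMM for $1 registers, printed as bare hex.
# Assumed encoding: opcode 0x22 in bits 28:23, length field 2 * N - 1.
lri_header() {
    printf '%X\n' $(( (0x22 << 23) | (2 * $1 - 1) ))
}

# "<engine-class> reg <address> <value>" with numbers printed as bare hex.
ctx_bb_reg() {
    printf '%s reg %X %X\n' "$1" "$2" "$3"
}

# "<engine-class> cmd <instr> <address> <value>" for a single-register LRI.
ctx_bb_lri() {
    printf '%s cmd %s %X %X\n' "$1" "$(lri_header 1)" "$2" "$3"
}
```

Addresses and values can be passed with a 0x prefix, for example ctx_bb_lri rcs 0x4F100 0xDEADBEEF. The prefix is dropped in the output because the attribute expects bare hex.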

Max SR-IOV Virtual Functions

This setting limits the number of Virtual Functions (VFs) that can be managed by the Physical Function (PF) driver. A value of 0 disables PF mode (no VFs).

The default max_vfs config value is taken from the max_vfs modparam.

How to enable PF mode with support for an unlimited (up to the HW limit) number of VFs:

# echo unlimited > /sys/kernel/config/xe/0000:00:02.0/sriov/max_vfs
# echo 0000:00:02.0 > /sys/bus/pci/drivers/xe/bind

How to enable PF mode with support for up to 3 VFs:

# echo 3 > /sys/kernel/config/xe/0000:00:02.0/sriov/max_vfs
# echo 0000:00:02.0 > /sys/bus/pci/drivers/xe/bind

How to disable PF mode and always run as native:

# echo 0 > /sys/kernel/config/xe/0000:00:02.0/sriov/max_vfs
# echo 0000:00:02.0 > /sys/bus/pci/drivers/xe/bind

This setting only takes effect when probing the device.
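
The three cases above differ only in the value written, so a script might wrap them in one helper that checks the value first. In this sketch the accepted values are limited to the ones shown in this section ("unlimited" or a non-negative decimal integer). Whether the driver accepts other spellings is not covered here. The function name and the XE_CONFIGFS override are hypothetical.

```shell
# Write $2 to the sriov/max_vfs attribute of device $1 after checking that it
# is "unlimited" or a non-negative decimal integer.
# XE_CONFIGFS is a hypothetical override for dry runs.
xe_set_max_vfs() (
    bdf="$1"
    val="$2"
    case "$val" in
        unlimited) ;;
        '' | *[!0-9]*)
            echo "invalid max_vfs: $val" >&2
            exit 1
            ;;
    esac
    echo "$val" > "${XE_CONFIGFS:-/sys/kernel/config/xe}/$bdf/sriov/max_vfs"
)
```

As with the examples above, the device still has to be bound afterwards for the value to take effect.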

Remove devices

The created device directories can be removed using rmdir:

# rmdir /sys/kernel/config/xe/0000:03:00.0/
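
To remove every device directory in one go, the rmdir step can be looped as in the sketch below. The function name and the XE_CONFIGFS override are hypothetical, and the helper makes no assumption about when the driver permits removal: it only reports directories that rmdir could not remove.

```shell
# Remove every device directory under the xe configfs subsystem.
# Returns non-zero if any directory could not be removed.
# XE_CONFIGFS is a hypothetical override for dry runs.
xe_configfs_remove_all() (
    rc=0
    for dev in "${XE_CONFIGFS:-/sys/kernel/config/xe}"/*/; do
        [ -d "$dev" ] || continue
        if ! rmdir "$dev"; then
            echo "could not remove $dev" >&2
            rc=1
        fi
    done
    exit "$rc"
)
```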

Internal API

void xe_configfs_check_device(struct pci_dev *pdev)

Test if device was configured by configfs

Parameters

struct pci_dev *pdev

the pci_dev device to test

Description

Try to find the configfs group that belongs to the specified PCI device and print a diagnostic message if any setting differs from its default value.

bool xe_configfs_get_survivability_mode(struct pci_dev *pdev)

get configfs survivability mode attribute

Parameters

struct pci_dev *pdev

pci device

Return

survivability_mode attribute in configfs

u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev)

get engine allowed mask from configfs

Parameters

struct pci_dev *pdev

pci device

Return

engine mask with allowed engines set in configfs

bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev)

get configfs enable_psmi setting

Parameters

struct pci_dev *pdev

pci device

Return

enable_psmi setting in configfs

u32 xe_configfs_get_ctx_restore_mid_bb(struct pci_dev *pdev, enum xe_engine_class class, const u32 **cs)

get configfs ctx_restore_mid_bb setting

Parameters

struct pci_dev *pdev

pci device

enum xe_engine_class class

hw engine class

const u32 **cs

pointer to the bb to use - only valid during probe

Return

Number of dwords used in the ctx_restore_mid_bb setting in configfs

u32 xe_configfs_get_ctx_restore_post_bb(struct pci_dev *pdev, enum xe_engine_class class, const u32 **cs)

get configfs ctx_restore_post_bb setting

Parameters

struct pci_dev *pdev

pci device

enum xe_engine_class class

hw engine class

const u32 **cs

pointer to the bb to use - only valid during probe

Return

Number of dwords used in the ctx_restore_post_bb setting in configfs

unsigned int xe_configfs_get_max_vfs(struct pci_dev *pdev)

Get number of VFs that could be managed

Parameters

struct pci_dev *pdev

the pci_dev device

Description

Find the configfs group that belongs to the PCI device and return maximum number of Virtual Functions (VFs) that could be managed by this device. If configfs group is not present, use value of max_vfs module parameter.

Return

maximum number of VFs that could be managed.