Xe Device Wedging

The Xe driver uses the drm device wedged uevent as documented in Userland interfaces. When the device is in a wedged state, every IOCTL is blocked and the GT cannot be used. Certain critical errors, such as a GT reset failure or firmware failures, can cause the device to be wedged. The default recovery method for a wedged state is rebind/bus-reset.

Another recovery method is vendor-specific. The cases below send the WEDGED=vendor-specific recovery method in the drm device wedged uevent.

Case: Firmware Flash

Identification Hint

A WEDGED=vendor-specific drm device wedged uevent together with Runtime Survivability mode is used to notify the admin/userspace consumer that a firmware flash is needed.
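
As a rough illustration of the identification step, a userspace consumer might inspect the WEDGED property of a received uevent as sketched below. This is a minimal sketch, not part of the driver or any existing tool: the function names are hypothetical, the uevent properties are assumed to have already been collected into a dict (for example by a udev monitor), and the value is assumed to follow the comma-separated WEDGED=<method>[,<method>] form described in the DRM userland interface documentation.

```python
def wedged_methods(uevent_env):
    """Return the recovery methods listed in a uevent's WEDGED property.

    uevent_env is assumed to be a dict of uevent KEY=VALUE properties.
    An absent WEDGED property yields an empty list.
    """
    value = uevent_env.get("WEDGED", "")
    # Assumption: multiple methods are separated by commas.
    return [m.strip() for m in value.split(",") if m.strip()]


def needs_vendor_recovery(uevent_env):
    """True if the uevent asks for the vendor-specific recovery method."""
    return "vendor-specific" in wedged_methods(uevent_env)
```

A consumer would call needs_vendor_recovery() on each drm uevent and, when it returns True, continue with the recovery procedure for the matching case.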

Recovery Procedure

Once a WEDGED=vendor-specific drm device wedged uevent is received, follow the steps below:

  • Check the Runtime Survivability mode sysfs attribute. If it is enabled, a firmware flash is required to recover the device.

    /sys/bus/pci/devices/<device>/survivability_mode

  • The admin/userspace consumer can use firmware flashing tools like fwupd to flash the firmware and restore the device to normal operation.
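
The sysfs check in the first step could be scripted along the following lines. This is only a sketch under stated assumptions: read_survivability_mode is a hypothetical helper, the sysfs_root parameter exists purely so the function can be pointed at a test directory, and the contents of the attribute are returned as-is because their exact format is not described here. A missing attribute is treated as "survivability mode not reported"; reading the real attribute may require admin privileges, in which case the resulting PermissionError is left to propagate.

```python
from pathlib import Path


def read_survivability_mode(pci_device, sysfs_root="/sys/bus/pci/devices"):
    """Return the stripped contents of the survivability_mode attribute.

    pci_device is the PCI device directory name under sysfs_root.
    Returns None when the attribute does not exist for that device.
    """
    attr = Path(sysfs_root) / pci_device / "survivability_mode"
    try:
        return attr.read_text().strip()
    except FileNotFoundError:
        # Assumption: no attribute means the mode is not reported.
        return None
```

If the helper returns a value for the wedged device, the consumer can proceed to the firmware flashing step with its tool of choice.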