Xe Device Wedging
The Xe driver uses the drm device wedged uevent as documented in Userland interfaces. When the device is in a wedged state, every IOCTL is blocked and the GT cannot be used. Certain critical errors, such as a GT reset failure or firmware failures, can cause the device to be wedged. The default recovery method for a wedged state is rebind/bus-reset.
Another recovery method is vendor-specific. Below are the cases that send the WEDGED=vendor-specific recovery method in the drm device wedged uevent.
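As a rough illustration of how a userspace consumer might tell the two recovery paths apart, the sketch below pulls the WEDGED property out of a uevent's KEY=VALUE text. The function names `parse_wedged` and `needs_vendor_recovery` are hypothetical, and the code assumes the property may carry a comma-separated list of methods, as the generic DRM wedged-uevent documentation describes; it is not taken from the Xe driver or any existing tool.

```python
def parse_wedged(uevent_text):
    """Return the recovery methods named by a WEDGED= property.

    `uevent_text` is the uevent payload as newline-separated KEY=VALUE
    pairs. Assumption: methods are comma-separated (e.g. "rebind,bus-reset").
    Returns an empty list if no WEDGED property is present.
    """
    for line in uevent_text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key == "WEDGED":
            return [m.strip() for m in value.split(",") if m.strip()]
    return []


def needs_vendor_recovery(uevent_text):
    """True if the uevent asks for the vendor-specific recovery method."""
    return "vendor-specific" in parse_wedged(uevent_text)
```

A consumer would typically feed this the properties of a received uevent and fall back to the default rebind/bus-reset handling when `needs_vendor_recovery` returns False.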
Case: Firmware Flash
Identification Hint
The WEDGED=vendor-specific drm device wedged uevent, together with Runtime Survivability mode, is used to notify the admin/userspace consumer about the need for a firmware flash.
Recovery Procedure
Once the WEDGED=vendor-specific drm device wedged uevent is received, follow the steps below:
Check Runtime Survivability mode sysfs. If enabled, firmware flash is required to recover the device.
/sys/bus/pci/devices/<device>/survivability_mode
The admin/userspace consumer can use firmware flashing tools like fwupd to flash the firmware and restore the device to normal operation.
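The sysfs check in the first step could be scripted along the following lines. This is a minimal sketch, not an official tool: the helper name `read_survivability_mode` and the `sysfs_root` parameter are hypothetical, and it assumes that the presence of the `survivability_mode` attribute is what signals that the mode is enabled. The format of the file's contents is not described here, so the text is returned uninterpreted. Reading the attribute may require elevated privileges.

```python
from pathlib import Path


def read_survivability_mode(pci_address, sysfs_root="/sys/bus/pci/devices"):
    """Return the raw survivability_mode text for a PCI device, or None.

    `pci_address` is the <device> path component, e.g. "0000:03:00.0".
    Assumption: a missing attribute means survivability mode is not
    enabled, so None is returned in that case.
    """
    path = Path(sysfs_root) / pci_address / "survivability_mode"
    try:
        return path.read_text().strip()
    except FileNotFoundError:
        return None


def firmware_flash_required(pci_address, sysfs_root="/sys/bus/pci/devices"):
    """True if the survivability_mode attribute is present for the device."""
    return read_survivability_mode(pci_address, sysfs_root) is not None
```

If `firmware_flash_required` returns True after a WEDGED=vendor-specific uevent, the consumer would proceed to the firmware flashing step with its tool of choice.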