00:00 ngcortes: i'll post a pathological example here in a moment
00:18 anholt: ngcortes: sounds like your kernel recovery might be busted?
00:39 ngcortes: anholt, thing is we're running a production kernel :/
00:39 ngcortes: maybe we have some weird kernel params set?
00:40 ngcortes: we're running 6.17.0 in CI
00:50 airlied: setting the iommu to on will either slow things down or stop some fault, but it's more likely a kernel driver bug than anything
01:07 ngcortes: airlied, guess i could file a bug on jira
02:10 ngcortes: anholt, is there a recommended way to set the "tests_per_group" param in deqp-runner? I'm dividing the total caselist count by the number of threads run. maybe that's not right? seems like my test runs slowed down a lot
02:11 ngcortes: compared to directly calling the binary on multiple child processes
02:12 ngcortes: i could also be running on slow hardware but i'll need to confirm that
02:13 anholt: definitely not right. Take a look through mesa ci -- we generally use 5000 for VK (amortizing its worse startup time), default (500) for GL. In deqp-runner git, we default to 5000 for VK.
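
To make the arithmetic behind the two approaches concrete, here is a minimal sketch comparing how many groups a caselist is split into when "tests_per_group" is set to total/threads versus the fixed values anholt mentions (500 and 5000). The caselist size and thread count are hypothetical numbers chosen for illustration, and `group_count` is a hypothetical helper, not part of deqp-runner; how deqp-runner actually schedules groups is not shown here.

```python
def group_count(total_cases: int, tests_per_group: int) -> int:
    """Number of groups needed to cover the caselist (integer ceiling division)."""
    return (total_cases + tests_per_group - 1) // tests_per_group


# Hypothetical values, not taken from the conversation.
total_cases = 1_000_000
threads = 16

# Approach described by ngcortes: one large group per thread.
per_thread_size = total_cases // threads
groups_per_thread_split = group_count(total_cases, per_thread_size)

# Fixed group sizes mentioned by anholt.
groups_vk = group_count(total_cases, 5000)  # VK value used in mesa ci
groups_gl = group_count(total_cases, 500)   # default, used for GL

print(per_thread_size, groups_per_thread_split, groups_vk, groups_gl)
```

With these assumed numbers, the divide-by-threads approach yields exactly as many groups as threads, each far larger than either fixed size, while the fixed sizes yield many more, smaller groups.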