IRC Logs of #dri-devel on irc.freenode.net for 2024-08-08

08:31 tursulin: sima, airlied: do you tend to rebase drm-fixes on the latest weekly rc, or not?
08:37 sima: usually, but sometimes only when starting to process pr
08:38 sima: tursulin, just rolled it to -rc2
08:39 tursulin: sima: thank you, makes dim pull-request warnings less scary
08:48 tzimmermann: airlied, sima, last week's drm-misc-next has not been merged yet.
10:17 jani: another maintainer-tools doc update, redirects for moved pages: https://gitlab.freedesktop.org/drm/maintainer-tools/-/merge_requests/64
10:17 jani: sima: ^
12:35 zmike: apinheiro jasuarez: any chance one of you can check out the broadcom trace crashes in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30524 and get me a backtrace/bisect?
12:36 MrCooper: zamundaaa[m]: remind me, how many microseconds before start of vblank does kwin aim to call drmModeAtomicCommit at the latest?
12:38 jasuarez: I'll try but can't promise. Let's see if apinheiro can do it
13:04 zamundaaa[m]: MrCooper: 1500
13:05 MrCooper: hmm, seems pretty high; how did you arrive at that?
13:05 zamundaaa[m]: Testing on various hardware until barely any frames were too late anymore
13:07 zamundaaa[m]: Note that when you move the cursor for example, it might still do some work before the actual atomic commit happens. So the real commit time can be a bit later
13:09 MrCooper: right, I mean specifically when drmModeAtomicCommit is called, but it sounds like you're not measuring that?
13:11 zamundaaa[m]: Right now we're not, but I plan to change that because we've received bug reports that on some CPU+driver combinations the atomic tests in those 1500us take so long we miss the deadline
13:13 MrCooper: ah, that's including test commits, makes sense then, thanks!
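[Editor's note] The timing budget discussed above can be sketched as simple arithmetic. This is only an illustration, not kwin's code: the function names and the idea of predicting the next vblank from the last vblank timestamp plus the refresh period are assumptions; only the 1500 us margin comes from the conversation.

```python
# Hypothetical helpers; only the 1500 us margin is taken from the log.
MARGIN_US = 1500


def latest_commit_time_us(last_vblank_us, refresh_period_us, margin_us=MARGIN_US):
    """Latest time (in microseconds) at which the commit work should start."""
    next_vblank_us = last_vblank_us + refresh_period_us
    return next_vblank_us - margin_us


def is_too_late(now_us, last_vblank_us, refresh_period_us, margin_us=MARGIN_US):
    """True if starting now would leave less than the margin before vblank."""
    return now_us > latest_commit_time_us(last_vblank_us, refresh_period_us, margin_us)


if __name__ == "__main__":
    # Roughly 60 Hz: one refresh every 16667 us (rounded, assumed value).
    print(latest_commit_time_us(1_000_000, 16_667))
    print(is_too_late(1_015_500, 1_000_000, 16_667))
```

As noted in the conversation, atomic test commits and cursor updates happen inside that margin, so the actual drmModeAtomicCommit call can land later than the computed time.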
13:25 Hazematman: zmike did you check the stderr log from the crashes in the ci? https://mesa.pages.freedesktop.org/-/mesa/-/jobs/61954241/artifacts/results/summary/results/trace@broadcom-rpi5@supertuxkart@supertuxkart-mansion-egl-gles-v2.trace.html I suspect the crash is in the app itself because it can't find `glXGetCurrentDisplay`. I have a pi5 & pi4 I can test with later (currently working from a coworking space and don't have the hw with me). But I
13:25 Hazematman: suspect one of the changes is stopping the extension `GLX_EXT_import_context` from being advertised (or perhaps changing the glx version), which is preventing this function from being available
13:28 DemiMarie: Which GPUs have decent fault isolation, in that they allow reliably killing one context?
13:31 DemiMarie: That could take the form of a VRAM-preserving reset combined with a driver configured to only allow one context to run at a time.
13:31 DemiMarie: I know that some hardware faults cannot be recovered from this way, but I also don't care, because hardware problems can always crash the system. I only care about software faults.
13:32 zmike: Hazematman: I don't think any of my changes affected anything related to versioning or (that) extension enablement
13:32 zmike: there is no diff in glxinfo output either
13:33 Hazematman: If apinheiro is not able to do it, in a few hours when I'm back home I can try to bisect it.
13:33 Hazematman: Is there any easy way to grab the trace file to run locally?
13:33 zmike: https://gitlab.freedesktop.org/gfx-ci/tracie/traces-db/
13:34 Hazematman: Thanks!
15:10 zmike: jenatali: any chance you can tell me what's wrong here https://gitlab.freedesktop.org/mesa/mesa/-/jobs/62033322
15:10 zmike: I did look at the raw log this time
15:14 jenatali: zmike: A heap allocation was leaked.
15:14 jenatali: That test runs with the Windows equivalent of valgrind (app verifier)
15:14 zmike: how is that possibly a new issue from this MR which just changes build stuff?
15:16 jenatali: zmike: why is there a link_whole in there?
15:16 jenatali: That... seems like a really bad idea
15:16 zmike: I was getting build errors
15:19 jenatali: That duplicates gallium into libva
15:20 zmike: I'm really not a meson expert, and people keep raising issues that I'm struggling to fix
15:20 jenatali: It's probably some thread local thing that's reporting the leak but it's just a symptom of things being in that library that shouldn't be, I think
15:21 jenatali: zmike: that's not really a meson thing. Link_whole means to take all the obj files and forcibly link them, instead of specifying a .a archive lib and letting the linker just pull what it needs
15:21 zmike: ...I'm also not a linking expert
15:21 jenatali: Heh
15:22 zmike: you ask a guy with a toolbox full of hammers for help, and you get what he's got
15:22 jenatali: But yeah that's the wrong change. I haven't been following the refactoring too closely to know what the right one is but it's definitely not that
15:43 zmike: I will pray that some kind meson god takes notice and posts a one-liner to fix everything
15:45 gpiccoli: Hey folks, a maybe silly question: do we have a "nodrm" parameter, to fully disable DRM and all potential GPU drivers that rely on that? Could be nographics, nogfx..the name of such parameter is a matter of choice
15:45 gpiccoli: I think we don't have that, I've achieved that (I think?) by initcall_blacklist'ing drm_core_init
15:46 gpiccoli: a bit hacky, but seems to work. In case we don't have such nogfx parameter, would that be useful/accepted?
15:53 jenatali: zmike: Got a link to the linker errors you were seeing without link_whole?
15:53 zmike: jenatali: you can check some of the earlier pipelines in the MR
16:04 jenatali: zmike: https://gitlab.freedesktop.org/mesa/mesa/-/jobs/61940382/raw indicates that mesa_util isn't getting linked into the dynamic pipe loader targets I think
16:04 jenatali: But that also indicates that the dynamic pipe loader targets also have a full copy of gallium?
16:05 jenatali: Lemme see how bad we've messed things up so far
16:24 jenatali: zmike: On Windows, all of the targets that link to libgallium_dri should link to libgallium_wgl instead. If it's easier, we can rename the meson target to libgallium_dri (or maybe libgallium_shared would be better)
16:25 zmike: hm
16:25 jenatali: Then if you're still getting linker errors after that from unresolved externals, it should just be a matter of adding them to src/gallium/targets/wgl/gallium_wgl.def
16:25 zmike: so you're suggesting do that instead of link_whole?
16:26 jenatali: I think so. Let me double-check with Sil if that's a problem for va, but I don't think it's a problem for any of the other gallium-based targets
16:37 zmike: I pushed an update
16:57 sima: airlied, I guess we can wait with drm-next until monday so it starts with -rc3?
16:57 sima: re tzimmermann's question
16:57 sima: ah there's already an xe one in there
16:58 sima: I guess I'll go and merge something
17:03 Company: mareko: are you the right person to poke about https://gitlab.freedesktop.org/mesa/mesa/-/issues/11629 ?
17:09 karolherbst: uhhh.. another bug with printf..
17:24 jenatali: zmike: If you're okay with it, I can push some Windows tweaks. Chatted with Sil and we'd like to keep the gallium bits for video statically linked into the various frontends (for better or worse)
17:24 karolherbst: if somebody wants to review some simple u_printf change: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30574
17:25 jenatali: So we've got (OpenGL32.dll / libEGL.dll / libGLESv2.dll) dynamically linked against libgallium_wgl.dll, and then the video frontends each have a copy of gallium
17:25 Hazematman: <zmike> "https://gitlab.freedesktop.org/..." <- I'm not able to reproduce the crash on my pi :/ I suspect I'm not running eglretrace in the same way as ci. Maybe I'm missing some env vars or something
17:29 zmike: ughhhhh
17:29 zmike: jenatali: I'm 100000% okay with literally anyone pushing anything to that MR
17:31 zmike: Hazematman: I tried to repro using all the same env vars on a few setups and I got nothing
17:31 zmike: I don't get how it only triggers on those few jobs when there's so many more trace jobs
17:31 daniels: Hazematman: eric_engestrom can help you reproduce when he’s back in a week
19:57 airlied: sima: I didn't merge drm-misc-next yet because it had some amd stuff in there that was getting reverted
19:57 airlied: I just forgot to say that out loud until now
19:57 airlied: since it was a uapi change that needed reverting
19:57 sima: oops, but I guess tzimmermann will send the next pr tomorrow
19:58 sima: which should have the reverts
19:58 airlied: yup, just have to make sure nobody regens the mesa uapi headers for xe from drm-next :)
19:59 airlied: since they might get the amdgpu ones as well
20:04 sima: oh right, perfect timing :-/
20:04 sima: but I guess usually people only update the one header they care about ...
21:44 Lynne: gfxstrand: yay, your branch works perfectly
21:54 jenatali: zmike: Ok yeah I tried looking at it and from the Windows side of things it looks like it should be 036e75a9 (i.e. your second try in that MR). I think I don't have enough of an understanding of the Linux linking problems...
22:22 gfxstrand: Lynne: What are you testing with?
22:22 Lynne: ffmpeg
22:23 gfxstrand: Cool
22:23 gfxstrand: I've got one more CTS failure to track down and then I think it's pretty much ready to go
22:23 Lynne: there is an issue with multiplane images, you don't seem to consider them yet
22:24 gfxstrand: Oh? I don't see any reason why they wouldn't work.
22:25 gfxstrand: That branch is also fast-moving and I literally pushed 30 seconds ago, so...
22:28 gfxstrand: Which is to say that descriptor buffer should work just as well with multiplane as normal descriptors do
22:56 Lynne: you're right, I do get all planes
22:57 Lynne: but the brightness is 2 bits or so, not 8, so the output is almost fully dark (==green in yuv terms)
22:58 Lynne: multiplane or single-plane images don't matter, as long as they pass through a descriptor buffer for processing
23:01 gfxstrand: Weird. Maybe something's wrong with your samplers?
23:01 gfxstrand: Are you passing the sampler into GetDescriptorEXT?
23:01 gfxstrand: You have to with descriptor buffer even though it's an immutable sampler in the descriptor set layout
23:03 Lynne: no, I'm using combined image samplers
23:04 gfxstrand: Yes, you have to use combined image/samplers for YCbCr
23:05 gfxstrand: But there's a difference between descriptor buffer and legacy descriptor sets. With legacy descriptor sets, the sampler you pass in is explicitly ignored in favor of the one in the descriptor set layout. With descriptor buffer, you have to pass the sampler into GetDescriptorEXT because we don't know anything about the descriptor set layout at that time.
23:06 gfxstrand: Just me spit-balling as to why there would be a difference between the two paths
23:08 Lynne: just checked, I am passing the sampler properly
23:08 gfxstrand: :-/
23:09 gfxstrand: And the classic descriptor path works okay?
23:10 Lynne: I don't have a classic descriptor path, I deleted it with a passion
23:10 Lynne: if you've got ffmpeg 7.0, you could probably replicate
23:12 gfxstrand: Ah
23:18 Company: (GTK has a classic YUV path, but no descriptor buffers yet)
23:19 gfxstrand: I'm pretty sure YUV works. The CTS has a LOT of YUV tests and we pass them all.
23:20 gfxstrand: Why it's not working for ffmpeg, I don't know.
23:20 Lynne: this issue happens on rgba too, just tested
23:20 gfxstrand: Okay, then it's not the YUV path
23:20 gfxstrand: That's very odd
23:20 Company: might be dmabufs
23:20 Company: imported ones I mean
23:20 Lynne: we don't use dmabufs in ffmpeg, we create basic pure vulkan images
23:21 gfxstrand: Not unless the format is wrong.
23:21 gfxstrand: What format are you using?
23:21 gfxstrand: Everybody's favorite 8-bit or something a bit more interesting?
23:21 Lynne: no, everyone's favorite UNORM 8bit RGBA
23:21 Company: what formats does nvidia do btw?
23:22 Company: NV12 and P010? More fancy ones>
23:22 Company: ?
23:22 gfxstrand: Okay, that's even weirder then
23:24 gfxstrand: I know RGBA8 works
23:24 Lynne: it's weird, even fancy 10bit multiplane yuv images experience the same issue
23:24 Lynne: in the same way, so underneath, they do work, just that everything breaks in the same way
23:25 zmike: gfxstrand: did you test zink?
23:25 gfxstrand: zmike: not yet
23:25 zmike: the final boss awaits.
23:25 gfxstrand: lol
23:26 gfxstrand: Lynne: Are you using buffer views for anything?
23:26 ccr: "why did that autosave just happen and this ominous music started playing?"
23:26 Lynne: gfxstrand: false alarm, the filter I was testing with was bitrotten, everything works perfectly
23:26 gfxstrand: lmao
23:26 gfxstrand: \o/
23:27 gfxstrand: Once again, NVK has no bugs. :P
23:29 Lynne: download performance is legendarily slow though, 1fps on 1080p yuv420p
23:30 gfxstrand: Are you using HOST_CACHED memory or just HOST_VISIBLE?
23:30 Lynne: just host visible
23:30 gfxstrand: Then you're reading through a write-combine map
23:32 Lynne: HOST_CACHED is normal speed though
23:32 gfxstrand: Yeah, you want HOST_CACHED for downloads
23:33 Lynne: the thing is we read out the downloaded buffer to ram via the CPU-native non-temporal read instructions
23:33 Lynne: so non-cached paths would be more optimal
23:34 gfxstrand: I'm not sure how that stuff interacts with going across PCIe
23:35 gfxstrand: But I'm pretty sure non-temporal won't save you if you're going across the PCI bus.
23:35 Lynne: is there a way to tell that CACHED is more optimal?
23:35 gfxstrand: That's tricky
23:36 gfxstrand: On a discrete card, typically HOST_VISIBLE + DEVICE_LOCAL means that it lives in VRAM and you're getting a write-combine map
23:36 gfxstrand: On a UMA, everything is going to be HOST_VISIBLE + DEVICE_LOCAL + HOST_CACHED except for the uncached stuff which will be HOST_VISIBLE + DEVICE_LOCAL
23:37 gfxstrand: We don't have a bit for "Seriously, don't read from this map. You'll regret it"
23:37 gfxstrand: D3D12 actually does better at this because they simply advertise upload vs. download vs. device
23:37 zmike: see also https://www.basnieuwenhuizen.nl/the-catastrophe-of-reading-from-vram/
23:37 gfxstrand: Which isn't as flexible as Vulkan but at least makes it clear what you should do.
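[Editor's note] The advice above (use HOST_CACHED memory for downloads, and avoid reading through a write-combine map) can be sketched as a memory-type selection routine. This is a minimal illustration, not code from ffmpeg or NVK: the function name and the example memory-type table are hypothetical, and the table is modeled as a plain list of flag masks rather than a real VkPhysicalDeviceMemoryProperties query. The flag values match the Vulkan VkMemoryPropertyFlagBits enum.

```python
# Bit values as defined by VkMemoryPropertyFlagBits in the Vulkan headers.
DEVICE_LOCAL = 0x1
HOST_VISIBLE = 0x2
HOST_COHERENT = 0x4
HOST_CACHED = 0x8


def pick_readback_type(memory_types, type_bits):
    """Pick a memory type index for a buffer the CPU will read back.

    memory_types: list of property-flag masks, one per memory type.
    type_bits: bitmask of allowed indices (like memoryTypeBits).
    Prefers HOST_VISIBLE + HOST_CACHED; falls back to any HOST_VISIBLE
    type; returns None if nothing is mappable.
    """
    fallback = None
    for index, flags in enumerate(memory_types):
        if not (type_bits >> index) & 1:
            continue  # this type is not allowed for the resource
        if not flags & HOST_VISIBLE:
            continue  # cannot be mapped by the CPU at all
        if flags & HOST_CACHED:
            return index  # cached CPU reads: what you want for downloads
        if fallback is None:
            fallback = index  # mappable, but reads may be very slow
    return fallback


# Hypothetical layout loosely resembling a discrete GPU.
EXAMPLE_TYPES = [
    DEVICE_LOCAL,                                # 0: plain VRAM
    HOST_VISIBLE | HOST_COHERENT,                # 1: uncached system RAM
    HOST_VISIBLE | HOST_COHERENT | HOST_CACHED,  # 2: cached system RAM
    DEVICE_LOCAL | HOST_VISIBLE | HOST_COHERENT, # 3: VRAM, write-combine map
]

if __name__ == "__main__":
    print(pick_readback_type(EXAMPLE_TYPES, 0b1111))
```

As gfxstrand points out, the flags alone do not say "don't read from this map", so a heuristic like this is a starting point rather than a guarantee; drivers may configure their heaps differently.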
23:38 Lynne: interestingly enough, nvidia's proprietary driver on the same device deals better without CACHED
23:38 Lynne: I get 130fps (no cached) vs 124fps of nvk (with cached)
23:39 gfxstrand: They might be configuring their heaps differently. The no cached might not be write-combine. It might be uncached system RAM.
23:39 gfxstrand: I'm pretty sure they have a write-combined VRAM heap, though.
23:40 gfxstrand: When you say no cache on the prop driver, is it DEVICE_LOCAL?
23:40 gfxstrand: (Not that that's 100% accurate. They might be hiding things from you.)
23:40 Lynne: ah, nevermind, I forgot to disable host mapping
23:41 Lynne: nvk is slightly slower than the proprietary drivers for the same codepath (140fps with cached)
23:44 gfxstrand: That's to be expected. We've still got a good bit of perf work to do.
23:44 Lynne: I think optimizing that is a lower prio though, a few percent or so isn't too bad
23:47 Lynne: so assuming HOST_VISIBLE and HOST_CACHED bits are the only ones set, would most driver implementations prefer GPU-visible system RAM for it?