05:09ity: Is there some way I can log page flips? Specifically, for the sake of tracking down random scanout freezes. I cannot tell if
05:09ity: - driver stopped scanning-out
05:09ity: - the display server (sway) stopped doing page flips
05:09ity: etc etc. Nothing in dmesg (besides some random, as far as I can tell unrelated iGPU "Atomic update failure" from an unrelated time several hours before the actual issue), so I don't actually know where to look, haha.
05:10ity: I know it's not a system freeze since I can ssh and do things just fine. Just cannot tell if it's sway locking up or the driver breaking, and unsure how to track that down.
05:24mareko: tarceri: have you wanted to look at https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36023 ?
05:43tarceri: mareko: did you run the final patch through shader-db to be sure it doesn't do anything?
05:48tarceri: I have a feeling I tested it at some stage but it was still doing something
05:56Lynne: was the limit on FindSMsb only being usable on 32-bit variables lifted yet?
05:56Lynne: I remember there being an extension to remove this limitation
06:06mareko: tarceri: yes, there are 0 shader-db changes with radeonsi
06:14tarceri: things might have improved :)
08:19glehmann: Lynne: no, that ext only covered bitfield extract/insert/reverse
08:20glehmann: FindSMsb/UMsb/Lsb are harder because the actual spir-v instruction is defined as 32bit only, so it can't be changed by just changing a Vulkan VUID
08:21Lynne: oh, well, looking at the rdna3 instruction sheet, the hardware only accepts 32-bit numbers anyway
08:22glehmann: VALU is 32bit only yes, SALU has a 64bit version
08:22glehmann: Lynne: but if you want 64bit findmsb, don't hesitate to open a Vulkan-Docs issue
08:26glehmann: or you can use this lowering, I guess: https://gitlab.freedesktop.org/freedesktop/snippets/-/snippets/7849
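Editor's sketch of the kind of lowering mentioned above: a 64-bit find-MSB built from a 32-bit one by splitting the value into halves. This is not the content of the linked snippet, which is not reproduced here. The helper names (`find_umsb32`, `find_umsb64`, `find_smsb64`) are hypothetical, and the 32-bit helper is a plain loop standing in for the native 32-bit instruction. It assumes the usual FindUMsb/FindSMsb convention of returning -1 when no significant bit exists.

```c
#include <stdint.h>

/* Stand-in for the native 32-bit FindUMsb: index of the highest set bit,
 * or -1 when the input is zero. */
static int find_umsb32(uint32_t v)
{
    int idx = -1;
    while (v != 0) {
        v >>= 1;
        idx++;
    }
    return idx;
}

/* 64-bit unsigned variant: check the high half first, fall back to the
 * low half when the high half is empty. */
static int find_umsb64(uint64_t v)
{
    uint32_t hi = (uint32_t)(v >> 32);
    uint32_t lo = (uint32_t)v;

    if (hi != 0)
        return 32 + find_umsb32(hi);
    return find_umsb32(lo);
}

/* 64-bit signed variant: for negative inputs the most significant 0 bit
 * is wanted, so invert the bits and reuse the unsigned search. */
static int find_smsb64(int64_t v)
{
    uint64_t bits = (uint64_t)v;

    if (v < 0)
        bits = ~bits;
    return find_umsb64(bits);
}
```

In a shader compiler the same split would be expressed on the two 32-bit halves of the 64-bit value, with a select instead of a branch.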
08:36mareko: zmike: FYI, https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36023/diffs?commit_id=08b522d21e19a9891ce745aa9c9c8106ba064728#2c8b4a9f3ead3a7c4f085bd9b156e2923c838226_414_414
09:04karolherbst: Are there any reasons why a DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD + DRM_SYNCOBJ_HANDLE_TO_FD_FLAGS_EXPORT_SYNC_FILE might fail on a valid syncobj?
09:04karolherbst: (besides OOM)
09:05emersion: karolherbst: maybe if the syncobj has not yet materialized?
09:05karolherbst: mhhh, what does that mean?
09:05karolherbst: but it sounds plausible
09:07emersion: it's for wait-before-submit
09:08emersion: before GPU work has been enqueued, the syncobj sync_file for a given point will be NULL
09:08karolherbst: yeah, soo if I submit the syncobj then exporting works fine, but I need to export before anything waits on or submits it
09:09karolherbst: _but_ maybe I should just not use sync files then?
09:09karolherbst: it's just a lot of the code assumes sync_file...
09:09emersion: you need to wait for GPU work to be enqueued before being able to get a sync_file out of the syncobj
09:09karolherbst: mhhh
09:09karolherbst: well, it's not under my control
09:09emersion: sync_file is a special "privileged" fence
09:10emersion: which promises it's going to complete "soon"
09:10karolherbst: ahh
09:10karolherbst: mhhh
09:10karolherbst: annoying
09:10emersion: can only be created by the kernel
09:10emersion: that is, there is no user-controlled sync_file
09:11karolherbst: it seems like I only get sync files when anything uses VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_SYNC_FD_BIT, and the code I'm working on (external semaphore in CL) needs to be able to consume that one
09:11emersion: you can turn a sync_file into a syncobj, if that helps
09:11emersion: but i'm not sure i understand what you want to do
09:12karolherbst: so vulkan applications create an FD through VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_SYNC_FD_BIT, and I need to import that one, but I also need to be able to create fds that vulkan applications can import
09:12karolherbst: it's for cl_khr_external_semaphore
09:12emersion: ah, the last bit is the hard part
09:12karolherbst: cl_khr_external_semaphore_sync_fd specifically
09:13emersion: well, i wanted to add a vulkan ext to import/export syncobj for ages :o
09:13karolherbst: right... I have it working with iris and zink, just radeonsi is resisting a bit...
09:13emersion: but it just so happens that "opaque FD" on mesa gives you a syncobj
09:13karolherbst: mhhhh
09:13emersion: if you can assume mesa, maybe that'd be enough for you
09:13karolherbst: VULKAN_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD?
09:13karolherbst: ehh
09:13emersion: yea
09:14karolherbst: EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD
09:14karolherbst: mhhh
09:14emersion: eh
09:14emersion: right, the latter
09:14karolherbst: sooo.. the thing is, I don't know what's the vulkan impl :)
09:14karolherbst: though might as well assume mesa...
09:14emersion: ok, then that's toast
09:14emersion: i would fully support and help a new vulkan ext :)
09:14karolherbst: there is cl_khr_external_semaphore_opaque_fd
09:14karolherbst: and it's "implementation defined"
09:15emersion: but then that only works if CL and Vulkan are the same driver
09:15karolherbst: I see
09:15karolherbst: might as well use opaque then
09:15emersion: ;_;
09:15karolherbst: but I also wanted to get the sync_fd thing working
09:16emersion: what vulkan does is block (!) until the sync_file materializes in some places
09:16karolherbst: anyway.. on the mesa side I need to create some sort of semaphore for which I can reliably get an fd without having any queue
09:16karolherbst: yeah.. that's fine
09:16emersion: e.g. if you give a wait point, and submit work, the submission function will block until it can extract a sync_file
09:16karolherbst: not sure if clGetSemaphoreHandleForTypeKHR is allowed to block, but the signal/wait variants are
09:17emersion: i have no idea what the vulkan sync_file export func does
09:17emersion: maybe a good place to look?
09:17emersion: either fails or blocks, i'd say
09:18karolherbst: mhhh... yeah maybe I just need to place a wait somewhere, the amdgpu winsys might just assume the fence has already been attached to a queue (which would make sense given that the normal code flushes a context and uses that fence to export)
09:18karolherbst: and yeah.. vulkan blocks
09:18karolherbst: I had to uhm.. deal with it, because I had zink blocking trying to export the fence
09:18emersion: right
09:19emersion: this is quite annoying to deal with from the wlroots side :<
09:19karolherbst: but yeah, if I just need to wait, I just have to see how mesa waits on the vulkan side then
09:19karolherbst: yeah...
09:19emersion: there's an ioctl for WAIT_FOR_SUBMIT
09:19karolherbst: this external semaphore/memory business is sure fascinating
09:19karolherbst: ahh
09:19emersion: https://docs.kernel.org/gpu/drm-mm.html#host-side-wait-on-syncobjs
09:19emersion: the flags are a bit of a mess
09:20emersion: there are libdrm wrappers for these ioctls
09:20karolherbst: the world with cl_gl sharing was wonderful, because you knew the GL context on the CL side, and you could establish a back channel to do all this stuff behind the application's back (like sharing modifier information, stride, size, etc...)
09:20karolherbst: DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT ahh
09:20emersion: ah no WAIT_FOR_SUBMIT is not the one
09:20emersion: WAIT_AVAILABLE is the one
09:21karolherbst: okay
09:21emersion: WAIT_FOR_SUBMIT waits for the sync_file to materialize, then waits on the sync_file
09:21karolherbst: ehh
09:21emersion: yeah
09:21karolherbst: that's not there for drivers without timeline semaphores
09:21emersion: not confusing at all (:
09:21karolherbst: apparently
09:21emersion: hm
09:21emersion: that sounds strange
09:21karolherbst: there is this handy spin_wait_for_sync_file function inside vk_drm_syncobj.c
09:22emersion: there are two variants of the wait ioctls
09:22emersion: one for timelines (with a wait point), one for binary
09:23karolherbst: https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/vulkan/runtime/vk_drm_syncobj.c#L212
09:24karolherbst: but yeah.. atm I'm looking at binary semaphores
09:25emersion: possible_flags = DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL |
09:25emersion: DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT |
09:25emersion: DRM_SYNCOBJ_WAIT_FLAGS_WAIT_DEADLINE;
09:25emersion: "nice"
09:25emersion: (from drm_syncobj_wait_ioctl)
09:25emersion: kinda sad nobody sent the kernel patch for this
09:26karolherbst: heh
09:26emersion: tbh it looks like an oversight
09:26emersion: drm_syncobj_timeline_wait_ioctl is literally the same code
09:26emersion: but with the points passed in instead of NULL
09:28karolherbst: sooo.. guess I go spinning then
09:28karolherbst: which.. is annoying because it might as well spin forever 🙃
09:28emersion: so i think just adding the flag to the allow-list should just work
09:28karolherbst: well if you have a new kernel, yes
09:29emersion: with this reasoning we'll never fix any kernel issue :P
09:29karolherbst: I'm just trying out the spinny solution first and see if that helps
09:31karolherbst: ohh right the workaround I've added was to flush the last queue I saw that semaphore on...
09:32karolherbst: mhhh
09:32karolherbst: I could turn it into a wait, but I also don't want to risk deadlocks
09:32karolherbst: but yeah.. it works..
09:32karolherbst: maybe I just spin after the flush, because it's gonna give me a proper fd at some point
09:36karolherbst: emersion: anyway, thanks for the help! Seems to be working correctly now
09:37emersion: cool!
09:42karolherbst: now I just need to figure out why the queue is randomly dying... but it's probably unrelated because the vulkan <-> CL interop tests do a few successful iterations until it breaks
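Editor's sketch of the "spin after the flush" idea from this exchange, with an attempt limit so it cannot spin forever. Everything here is hypothetical: `try_export_fn` stands in for whatever attempts the sync_file export, and `fake_fence` only simulates a fence that materializes after a few polls. It does not reproduce Mesa's `spin_wait_for_sync_file`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: returns true once the export has succeeded. */
typedef bool (*try_export_fn)(void *ctx);

/* Retry the export up to max_attempts times.
 * Returns the 1-based attempt that succeeded, or 0 if the limit was hit. */
static unsigned poll_until_exported(try_export_fn try_export, void *ctx,
                                    unsigned max_attempts)
{
    assert(try_export != NULL);

    for (unsigned attempt = 1; attempt <= max_attempts; attempt++) {
        if (try_export(ctx))
            return attempt;
        /* A real caller would yield or sleep briefly here, and would
         * likely bound the loop by elapsed time rather than a count. */
    }
    return 0;
}

/* Demonstration-only stand-in for a fence that becomes exportable after
 * a fixed number of failed polls. */
struct fake_fence {
    unsigned polls_until_ready;
};

static bool fake_try_export(void *ctx)
{
    struct fake_fence *fence = ctx;

    if (fence->polls_until_ready == 0)
        return true;
    fence->polls_until_ready--;
    return false;
}
```

Returning 0 on exhaustion lets the caller decide whether to report an error or fall back to another path instead of blocking indefinitely.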
11:47sima: karolherbst, yeah fixing the kernel would be nice for at least the future
11:47sima: like the fd compare function that's finally generally available and mesa is even going to use it now
11:48karolherbst: okay. If there are no technical reasons for this restriction I could try to figure out if it's possible
11:50sima: as long as it's an opt-in flag I really can't see how this would break anything
11:54emersion: yeah, i don't see any reason why binary syncobj should not support this
12:19sima: I mean userspace might deadlock if you use it wrong, but that applies to timeline syncobj too
12:21karolherbst: yeah.. I mean.. use a short enough timeout and then spin on the wait in case you don't
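Editor's sketch of the allow-list change discussed in this thread: the binary wait path rejects any flag outside `possible_flags`, so supporting WAIT_AVAILABLE there would amount to adding one bit to that mask. The `SKETCH_*` constants and `check_wait_flags` are local, hypothetical names for illustration only, not the kernel's definitions, and the rejection with `-EINVAL` is an assumption about how such a check would typically report an unsupported flag.

```c
#include <errno.h>
#include <stdint.h>

/* Local illustrative flag bits (not the uapi definitions). */
#define SKETCH_WAIT_ALL        (1u << 0)
#define SKETCH_WAIT_FOR_SUBMIT (1u << 1)
#define SKETCH_WAIT_AVAILABLE  (1u << 2)
#define SKETCH_WAIT_DEADLINE   (1u << 3)

/* Mask for the binary wait path as quoted in the log. */
#define BINARY_FLAGS_TODAY \
    (SKETCH_WAIT_ALL | SKETCH_WAIT_FOR_SUBMIT | SKETCH_WAIT_DEADLINE)

/* Same mask with the opt-in WAIT_AVAILABLE bit added. */
#define BINARY_FLAGS_PROPOSED \
    (BINARY_FLAGS_TODAY | SKETCH_WAIT_AVAILABLE)

/* Reject any flag bit that is not in the allow-list. */
static int check_wait_flags(uint32_t flags, uint32_t possible_flags)
{
    if (flags & ~possible_flags)
        return -EINVAL;
    return 0;
}
```

Because the new bit is opt-in, callers that never set it see no change in behavior, which matches sima's point about it being hard to break anything.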
12:31zmike: mareko: 🤔
12:31zmike: why are there tests which exceed the max value?
14:47zmike: mareko / pepp: do you have plans to fix your deprecation warnings ?
14:49Sachiel: s/DEPRECATED// should be easy enough
14:49zmike: you have your own deprecation problems
14:49Sachiel: same solution applies
15:01phasta: my maintainer tools / dim is hanging at "Fetching linux-upstream (local remote origin)... ". Still worked this morning. Is there an outage?
15:03ukleinek: in another channel someone wailed about git.k.o being slow.
15:04jannau: phasta: maybe an infrastructure problem on kernel.org side? lore.k.o fails with "Error 503 Backend fetch failed" for me
15:04phasta: wants to finally merge what he's been working on since September <.< -.-''''''
15:05phasta: yoah, seems a server is dead or broken or sth "Fetching linux-upstream (local remote origin)... error: RPC failed; HTTP 502 curl 22 The requested URL returned error: 502"
15:08mareko: zmike: yes
15:11ukleinek: ..ooOO(There were some git security fixes released just yesterday. Coincidence?)
15:14phasta: Worked now. Maybe the server admin got a new cup of coffee ^^'
20:18anholt: daniels: can we delete old traces from traces-db again, yet? Would love to clean it up so that people can reasonably use it in perf testing outside of mesa ci.
20:56dcbaker: zmike: I'm looking at the pipe_surface changes, if that's what you're asking about
20:57dcbaker: I'm not planning to look at Crocus or i915 though
21:22zmike: dcbaker: who is responsible for those drivers?
21:50alyssa: zmike: is anybody?
21:50zmike: that's my question
21:50alyssa: I thought we just play hot potato (:
21:51zmike: why do we have drivers in the tree that are unmaintained is the question I always ask
21:51zmike: and usually someone pops out to scream NO IT'S ME I'M THE ONE IN CHARGE
21:51alyssa: CODEOWNERS lists anholt for i915 and nobody for Crocus
21:52alyssa: also that file seems pretty unmaintained
21:52alyssa: problematically, there is no listed code owner for the CODEOWNERS file
21:52alyssa: that explains the unmaintainedness
21:52alyssa: (((:
21:52zmike: I don't think I've ever opened that file
21:53HdkR: Sounds like this "nobody" can fix the problems when it happens :P
21:53Sachiel: I thought that was what "Community maintained" meant
21:53alyssa: HdkR: Great idea
21:53Sachiel: you change something that affects them, you are on the hook to fix them up
21:53zmike: that's not really how drivers are supposed to work though
21:53zmike: common code sure
21:57Sachiel: I may be wrong, but I'm under the impression that that's how it had always worked
21:58zmike: drivers are supposed to have maintainers
21:59zmike: otherwise there's nobody who is actually testing them and addressing issues