01:17Lynne: zzoon[m]: thanks, I'll test it now
01:17Lynne: did you figure out the seeking in hevc issue btw?
02:12zzoon[m]: Lynne: I remember I mentioned that: it tries to get query results without a prior query.
02:13zzoon[m]: That's also what's happening for seeks during AV1 decoding
02:14Lynne: but h264 works?
02:14zzoon: yeah
02:26Lynne: I think the driver should just return not ready or invalid or something
02:29Lynne: and I don't even see how it's possible we query results without a prior query, because the code explicitly checks
02:31zzoon: ok.. could u let me know where it is (the explicit check)? I'm going to work on it again.
02:34Lynne: libavcodec/vulkan_decode.c, anywhere had_submission is used
02:34zzoon: ok thanks.
02:35Lynne: ah... I see what you mean, when you reset, the previous queries are gone
02:36Lynne: not sure how to solve that, but afaik the vulkan spec doesn't specify that anything should happen to queries when a video reset cmd is used
02:37zzoon: you mean, it resets after the query but before getting the results?
02:37Lynne: yes, and then it gets results from a submission from before the reset
02:38zzoon: ok. I think I need to read spec first and I'll be thinking about that case.
02:39Lynne: we could remove the queries entirely, since they're only there to provide feedback for the user that something went wrong, though since we parse and verify the slices and headers ourselves, decoding on corrupt files doesn't generally happen
05:35zzoon: Lynne: or you could avoid setting VK_QUERY_RESULT_WAIT_BIT so it couldn't lead to a timeout ..
05:35zzoon: no idea other than that.
05:52Lynne: we don't set the bit because we wait on the fence before
06:24zzoon: Hmm... there is "ret = ff_vk_exec_get_query(&ctx->s, exec, (void **)&result, VK_QUERY_RESULT_WAIT_BIT); "
06:35Lynne: oh, we do, nevermind
06:35Lynne: I'll send a patch to remove the query, it was a hack anyway since what good is an error after a frame's been decoded
06:36Lynne: and probably presented
06:37zzoon[m]: ok
06:38Lynne: btw, llvmpipe regressed and errors out when trying to create exportable opaque FD semaphores
06:43Lynne: not sure who to blame, git log is unclear
08:43MrCooper: soreau: indeed it would, it would be a rather blunt solution though, so I'd avoid it if possible
08:44MrCooper: I'd rather argue this is a special enough case that compositors should handle it explicitly
08:45DemiMarie: MrCooper: Would this require compositors to special-case llvmpipe in any way? If so, please don’t do that. Sincerely, a Qubes OS developer who is often frustrated by stuff assuming GPU acceleration.
08:45soreau: MrCooper: Is there a way to detect the case of having no fences/sync mechanism?
08:46MrCooper: there is checking the renderer string for llvmpipe :)
08:46soreau: i.e. can it be done conditionally so that the 'normal' paths stay intact?
08:46MrCooper: sure
08:46DemiMarie: checking the renderer string is not a good option
08:46soreau: ok, I'll look out for that MR ;)
08:46MrCooper: neither is glFlush behaving like glFinish
08:47MrCooper: it's a "pick your poison" kind of situation
08:48DemiMarie: What about an env var to select the slow behavior?
08:48soreau: MrCooper: DemiMarie: I just got wayfire working in a gh ci workflow runner using udmabuf+llvmpipe+headless: https://oshi.at/LjWP/screenshot.png
08:48DemiMarie: Also, is this just compositors or are apps also affected?
08:48MrCooper: soreau: I'm not planning to work on this, just trying to help find a solution
08:48DemiMarie: soreau: udmabuf sounds great
08:48DemiMarie: especially if explicit sync can be used with it
08:49soreau: MrCooper: Aw, that's too bad. There's room for improvement, with your name all over it ;)
08:49MrCooper: not really near my core of interest I'm afraid
08:49DemiMarie: soreau: I just want to be able to run stuff in my no-GPU VMs and not have it break
08:50soreau: DemiMarie: I think maybe apps work because they use swapbuffers?
08:50soreau: but I have no idea really
08:50DemiMarie: MrCooper: Why would glFlush and glFinish be noticeably different with SW rendering? I thought the difference was because the GPU is separate from the CPU and running in parallel with it.
08:50MrCooper: yeah, this is only an issue if the compositor uses udmabuf to share its drawing with KMS or another entity
08:51MrCooper: DemiMarie: glFinish still blocks the calling thread and might prevent it from doing something else
08:52soreau: MrCooper: yea, an env var doesn't sound too bad.. LP_FORCE_SYNC?
08:52DemiMarie: MrCooper: is there something that is more important than running the llvmpipe compute threads?
08:52MrCooper: not sure I see the point, the compositor can just call glFinish as needed?
08:52DemiMarie: MrCooper: this is for compositors that don't check for SW rendering
08:53soreau: MrCooper: but every compositor has to special case this instead of doing it once in the driver?
08:53DemiMarie: soreau: exactly
08:53soreau: doesn't make sense
08:53MrCooper: and they expect what exactly with udmabuf?
08:53soreau: that it works as the other drivers
08:53DemiMarie: yup
08:53MrCooper: well it can't
08:53DemiMarie: why?
08:54DemiMarie: Is it the lack of dma-fence finite-time signalling guarantees?
08:54MrCooper: dma-fences can't be backed by software
08:54soreau: MrCooper: that it works as where llvmpipe has fences
08:54MrCooper: it would allow user space to hang / deadlock the kernel
08:55soreau: this has 'driver bug' written all over it
08:55soreau: there's no reason to start specially handling the llvmpipe case in compositors
08:55MrCooper: it's deeper than that
08:56MrCooper: it's an unavoidable issue in the core kernel memory management
08:56soreau: well maybe we can start with a glFinish in glFlush patch to see if it provably works
08:57DemiMarie: MrCooper: It is unavoidable with implicit sync, but what about explicit sync?
08:57DemiMarie: Syncobjs explicitly have no finite time guarantees at all.
08:57MrCooper: false dichotomy
08:58MrCooper: it's rather dma-fence vs drm_syncobj, and as I wrote on GitLab, KMS only supports the former
08:58DemiMarie: Is it? Yes, it's all dma-fence under the hood, but an already-signaled fence doesn't need to be backed by anything.
08:58DemiMarie: Ah, I was missing (and do not care about) KMS.
08:59DemiMarie: All of my use-cases are with nested compositors.
08:59soreau: but there are many more
08:59DemiMarie: soreau: I'm not saying there aren't.
09:00DemiMarie: Actually, now that I think about it, there are GPU passthrough situations on Arm where what you are talking about would matter.
09:01MrCooper: I guess adding drm_syncobj support to KMS should be possible
09:18soreau: MrCooper: reportedly, lavapipe works fine with udmabuf, do you happen to know what the difference is there?
09:28MrCooper: with KMS, or nested?
09:29MrCooper: if the latter, could be due to using drm_syncobj for synchronization
09:32soreau: MrCooper: do you know whether KMS drm_syncobj would require kernel changes?
09:33MrCooper: the K does stand for kernel, yes :)
09:33soreau: heh
09:33soreau: but I'm not sure what the case was for lavapipe
09:35soreau: is there a way to use zink on top of lavapipe btw?
09:53MrCooper: there is indeed, don't remember the magic incantation offhand though, it's tested in Mesa's CI FWIW
09:54soreau: ok, thanks
10:03MrCooper: just to be clear, if drm_syncobj support was added to KMS, the compositor would have to explicitly make use of that, it's not an automagic thing
10:04MrCooper: and since it would involve somehow waiting for the GL rendering to finish and then signalling the drm_syncobj timeline acquire point, not sure there's really a point vs just delaying the KMS atomic commit until GL rendering finishes
10:05MrCooper: maybe the drm_syncobj timeline acquire point could be signalled on another thread
10:53ccr: happy holidays etc to all of you :)