02:39 xeyler: i added https://gitlab.freedesktop.org/drm/misc/kernel.git as a remote to my local linux kernel git repository
02:41 xeyler: nevermind, i've answered my own question even as i'm in the middle of asking it...
02:42 xeyler: have a nice evening, graphics nerds
06:46 sima: tzimmermann, I wrote a long email in the private thread with linus, please take a look
06:47 sima: especially double checking whether my revert list is complete
06:47 tzimmermann: sima, reading right now
06:47 sima: I'll go try and wake up my brain now with some breakfast
06:49 tzimmermann: sima, i did not get your reply. only linus' comments
06:49 tzimmermann: sima, now i received it. never mind
06:51 sima: I'm suspecting that ->dma_buf is becoming invalid at times for imported buffers, like for natively allocated ones, and we're blowing up on that
06:51 sima: and since dma_buf fd are generally short-lived with the usual X/wayland protocols, there's a race window
06:51 sima: but I'm definitely not awake enough to check that
06:55 sima: sent another mail with that wild guess
07:08 sima: tzimmermann, I sent the pile of reverts on top of drm-tip to the intel-gfx-ci trybot
07:09 sima: https://lists.freedesktop.org/archives/intel-gfx-trybot/2025-July/141971.html
07:09 sima: it's not yet on patchwork
07:17 sima: tzimmermann, thanks for completing the list, somehow I ignored the driver patches
07:26 tzimmermann: sima, thanks for the detailed replies
07:30 sima: tzimmermann, wrt the hangs, I'm honestly not sure what's happening there, could be gpu hang
07:31 sima: but also wondering whether something with the import/export goes out of sync and that then leads to rendering to the wrong buffer or something like that
07:31 tzimmermann: sima, i've not seen that. i don't think it's related, but i'm not going to take a chance here
07:31 sima: yeah same
07:32 sima: it could be related to the ->dma_buf change in prime maybe, and your patch to hold a handle_count reference moved the race around enough for things to blow up
07:32 tzimmermann: sima, if we cannot unify dma_buf and import_attach->dmabuf in the gem object; we should document that one is only for export and one is only for import
07:32 sima: yeah
07:33 sima: was also thinking that some kunit to repro all the corner cases we've hit would be good
07:33 sima: since pretty clearly we (you, christian, me) don't really understand what's all going on there
07:33 tzimmermann: i'm preparing a patchset that reverts all the commits
07:33 sima: I was also completely taken by surprise how christian's patch in drm-misc-next a month ago blew up
07:33 sima: thanks a lot!
07:34 tzimmermann: yeah, i'm somewhat concerned that we missed all the corner cases
07:35 tzimmermann: can i see the result of that intel-gfx-ci run somewhere?
07:35 sima: should show up on patchwork, but maybe I'm stuck in a moderation queue?
07:35 sima: https://patchwork.freedesktop.org/project/intel-gfx-trybot/series/?ordering=-last_updated
07:40 sima: tzimmermann, I guess if you're prepping the full series just cc intel-xe or intel-gfx to get it into ci, if mine stays stuck
07:41 tzimmermann: i'll do
07:51 sima: tzimmermann, I thought linus' hang happens when logging in, not when booting up
07:51 sima: during bootup I don't have a theory for hangs either
07:52 tzimmermann: login? ok
07:53 tzimmermann: but even in multi-gpu setups, it's hard to see how gdm does complicated reexports during login. yeah, i don't have a good theory either
08:02 sima: yeah I'm lost too
09:06 sima: tzimmermann, mlankhorst https://lore.kernel.org/dri-devel/0a087cfd-bd4c-48f1-aa2f-4a3b12593935@oss.qualcomm.com/ I think the conversion to drm client broke something with locking?
09:06 sima: as a guess at least
09:08 tzimmermann: sima, i've seen this. i'll check it later
09:09 sima: hwentlan_, no amd fixes pull this week?
09:10 sima: agd5f seems not around
09:40 tzimmermann: sima, sent out the reverts. smoke testing on various drivers looks good. the week's PR of drm-misc-fixes is already out. could you merge it, and i'll send out a second PR with the reverts later?
09:40 sima: tzimmermann, was thinking I'll just apply them to drm-fixes directly
09:41 tzimmermann: ok, sure
09:41 sima: drm-misc-fixes is stuck on -rc1 anyway, backmerge can't hurt :-P
09:41 tzimmermann: i've cc'ed the intel ci bot
09:41 sima: thx
09:41 sima: somehow mine didn't make it on patchwork, no idea
09:41 sima: but the one I cc'ed to intel-xe did work earlier this week
09:42 tzimmermann: hmm, ok
09:42 sima: https://patchwork.freedesktop.org/series/151494/ it's there from you
09:42 tzimmermann: that's the one
09:43 tzimmermann: i'll also send out reverts for the few patches that have made it into per-driver trees. but these don't seem to be in upstream yet
09:44 sima: yeah I guess we'll need to run after them for a bit, or reply to the patches with a heads-up to please drop them
09:44 sima: might want to check linux-next too
09:44 sima: tzimmermann, I also noticed some more additions of checks for bo->import_attach, which probably wants to use the is_imported() helper instead
09:45 sima: just for clearer code
09:45 tzimmermann: makes sense
09:46 tzimmermann: sima, and while testing, i found that this series is also broken: https://patchwork.freedesktop.org/series/148809/
09:47 sima: tzimmermann, has this landed already?
09:47 tzimmermann: only in drm-misc-next
09:47 sima: I guess reply on-list with what's busted (I don't see it immediately) and maybe more reverts :-/
09:49 sima: and I guess we need to re-baseline in this entire area with improved docs with all we learned, and kunit to hit the corners
09:49 tzimmermann: it breaks dma_buf_vmap when sharing between amdgpu and udl
09:49 sima: and then restart :-(
09:49 sima: same NULL check as the others?
09:49 tzimmermann: yeah, it's really not going well recently
09:49 sima: yeah
09:49 sima: so yeah, I guess more reverts because clearly we collectively forgot how this all works :-(
09:50 tzimmermann: i didn't debug closely, but the vmap'ed pointer in udl appears to be NULL
09:50 tzimmermann: the buffer comes from amdgpu
09:50 sima: uh
09:51 tzimmermann: it's another easy revert, but still sucks to hit the error
09:51 sima: yeah I think we need to figure out how to kunit the common cross-driver and sharing paths
09:52 sima: since igt also didn't really hit these, with the exception of the issue from Christian's re-export cleanup (which turned out to just break that)
09:52 sima: since trying to cover it all with hw testing seems unrealistic, and we have a lot of tricky corner cases in this entire area
10:10 sima: tzimmermann, replied to one patch with a bit more text that I want to include when merging, I think for the others I'll just add Fixes: lines
11:34 tzimmermann: sima, thanks a lot
11:36 sima: thanks to you too!
11:37 sima: tzimmermann, also I guess the getfb/getfb2 issues christian pointed out again need an igt and then either a quick hack or maybe redo the handle_count stuff as a fix for that (but only after we're a lot more confident we have good kunit test coverage for this all imo)
11:39 tzimmermann: sima, i've not seen them with the original code
11:39 tzimmermann: they seem rare
11:40 javierm: sima, tzimmermann: probably worth adding an entry to Documentation/gpu/todo.rst ? About the need for kunits on this area
11:41 tzimmermann: javierm, could do
11:41 tzimmermann: it's non-trivial though. the number of possible candidates is small :(
11:42 javierm: tzimmermann: that's OK I think, there are other tasks already with 'Level: Expert'
11:42 sima: tzimmermann, it needs a specific sequence of ioctls that doesn't make sense for real userspace
11:43 sima: but you can hit a WARN_ON in the kernel with it, so it's kinda a very mild CVE issue
11:43 sima: especially if you have oops-on-warn and reboot-on-oops enabled :-)
11:43 sima: javierm, might need a "Level: Too hard for maintainers" for this one :-/
11:44 tzimmermann: well, oops
11:44 javierm: sima: I see...
11:44 tzimmermann: javierm +1 :D
11:44 sima: tzimmermann, but it's also a really old one, so imo not a good reason to somehow try to squeeze the handle_count change in still
11:44 sima: tbh surprised syzkaller hasn't found it yet, it should be able to
11:49 tzimmermann: sima, i guess it could be solved by flagging the handle as gone, and later break the cycle during the framebuffer cleanup if flagged
11:50 tzimmermann: but that needs to handle the situation without framebuffer as well
11:54 sima: yeah it all looks complicated and a bunch of cases
11:54 sima: I'm kinda leaning towards just detecting the case and bailing out with EALREADY or something
11:54 sima: since it's clearly userspace doing something funny, or we'd have gotten bug reports
12:03 zmike: mareko: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36069
12:35 javierm: tzimmermann: while reviewing your vesadrm 8-bit palettes patches, I remembered that I stumbled across the following while working on the st7567 support:
12:35 javierm: https://elixir.bootlin.com/linux/v6.16-rc5/source/drivers/gpu/drm/sitronix/st7586.c#L50
12:36 tzimmermann: javierm, thanks for the review
12:37 javierm: I wonder if that driver could use your new helpers too with a custom drm_crtc_set_lut_func callback
12:38 tzimmermann: that's a funny thing. i've seen it before
12:39 tzimmermann: but it encodes 3 gray-scale pixels in one byte. that's something else
12:40 tzimmermann: we need a more flexible way of writing these pixels to output buffers. right now it's just system or i/o memory. that leaves out all the devices on serial busses. i've been thinking about that though
12:41 tzimmermann: i first want to fix the 'read side' where we fetch the pixel data from source buffers.
12:42 tzimmermann: that funny format would likely be DRM_FORMAT_R3R3R2
12:42 javierm: tzimmermann: yeah, but it's still a C8 right? And it has this custom st7586_xrgb8888_to_gray332() too that might be worth moving to drivers/gpu/drm/drm_format_helper.c
12:43 javierm: or maybe not, since it's too driver specific?
12:43 tzimmermann: javierm, it's a grayscale display AFAICT. and the first and second 3 bits index into a 8-entry grayscale lut
12:43 tzimmermann: the final 2 bits index into a 4-entry lut
12:43 javierm: tzimmermann: ahh, I see
12:44 javierm: such a weird format...
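A minimal sketch of the 3-3-2 packing described above, where one byte carries three grayscale pixels: two 3-bit values (8 gray levels each) and one 2-bit value (4 gray levels). The helper names pack_gray332() and gray8_line_to_gray332() are hypothetical, and both the bit order (first pixel in the top bits) and the plain truncation of 8-bit gray values are assumptions made for illustration. The real st7586 driver does its own conversion in st7586_xrgb8888_to_gray332() and may map levels differently.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: pack three 8-bit gray values into one byte.
 * Assumed layout: bits 7..5 = first pixel (3 bits), bits 4..2 = second
 * pixel (3 bits), bits 1..0 = third pixel (2 bits).
 */
static uint8_t pack_gray332(uint8_t g0, uint8_t g1, uint8_t g2)
{
	return (uint8_t)(((g0 >> 5) << 5) | ((g1 >> 5) << 2) | (g2 >> 6));
}

/*
 * Convert a line of 8-bit gray pixels to packed 3-3-2 bytes.
 * A trailing group with fewer than three pixels is padded with 0.
 * Returns the number of bytes written to dst.
 */
static size_t gray8_line_to_gray332(uint8_t *dst, const uint8_t *src,
				    size_t pixels)
{
	size_t i, n = 0;

	for (i = 0; i < pixels; i += 3) {
		uint8_t g0 = src[i];
		uint8_t g1 = (i + 1 < pixels) ? src[i + 1] : 0;
		uint8_t g2 = (i + 2 < pixels) ? src[i + 2] : 0;

		dst[n++] = pack_gray332(g0, g1, g2);
	}

	return n;
}
```

Since the third pixel of each group only has 4 levels while the others have 8, the byte is not a uniform indexed format, which is why a plain C8 palette helper does not map onto it directly.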
12:46 javierm: tzimmermann: anyways, your patches looked good to me. Even when I'm not an expert on these ancient displays and LUTs :) Sorry for taking me so long to review them
12:47 tzimmermann: thanks for reviewing. if it wasn't for you, i'd not make much progress with this
12:47 tzimmermann: javierm, if you want to help a bit more, you might take a look at https://patchwork.freedesktop.org/series/150889/ and https://patchwork.freedesktop.org/series/150888/
12:47 tzimmermann: those convert two drivers to use shadow-plane helpers.
12:48 tzimmermann: with gem-dma, it's possible to get the buffer's vaddr in kernel space directly. but it's not recommended. the clean way is to call vmap and use the returned pointers
12:49 tzimmermann: we have a maintainer for those two drivers, but i haven't heard from him yet
12:49 javierm: tzimmermann: sure, let me take a look at those now
12:50 tzimmermann: thank you so much
13:04 javierm: tzimmermann: reviewing these patches reminded me that we talked at some point about getting rid of the drm_atomic_helper_damage_merged() helper (which both drivers use)
13:05 javierm: jfalempe found that it is much better to iterate over the damage clips instead of using a merged rect, which could be quite big in some cases
13:06 javierm: tzimmermann: that might be a good candidate for an entry in Documentation/gpu/todo.rst since the changes are trivial
13:06 tzimmermann: i think so. but you also mentioned that i2c is really slow. there might be corner cases where merging updates makes sense
13:06 javierm: tzimmermann: hmm, right
13:08 javierm: you are correct, better to keep it in these drivers then since we don't know what the trade offs are for these particular display controllers
13:10 tzimmermann: javierm, IIRC the original quake software renderer had to avoid redraws of overlapping updates at all cost (because of pentium performance). it used a dedicated algorithm to do that. it drew the scenery and the monsters on top, but each pixel was only written once. maybe that is something to look into here as well. both problems seem related.
13:29 alyssa: jenatali: there's a dzn commit in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36050.patch
13:30 alyssa: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36050 rather
13:32 jenatali: alyssa: I saw, lgtm
21:47 sima: javierm, yeah the damage_merged() helper really should be only for hw upload where you might only have a single rectangle to program
21:47 sima: or a limited set where you need a fallback
21:47 sima: sw controlled upload should just iterate, but I guess it's more code
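A rough sketch of why iterating the damage clips can beat a merged rectangle for software-controlled upload. Everything here is illustrative: struct rect only mirrors the x1/y1/x2/y2 layout of the kernel's struct drm_rect (with x2/y2 exclusive), and rect_area(), merge_clips() and sum_clip_areas() are hypothetical names, not DRM helpers.

```c
#include <stddef.h>

/* Illustrative rectangle; x2/y2 are exclusive, as in struct drm_rect. */
struct rect {
	int x1, y1, x2, y2;
};

static long rect_area(const struct rect *r)
{
	return (long)(r->x2 - r->x1) * (long)(r->y2 - r->y1);
}

/* Bounding box of all clips: what a "merged" damage rect covers. n >= 1. */
static struct rect merge_clips(const struct rect *clips, size_t n)
{
	struct rect m = clips[0];
	size_t i;

	for (i = 1; i < n; i++) {
		if (clips[i].x1 < m.x1)
			m.x1 = clips[i].x1;
		if (clips[i].y1 < m.y1)
			m.y1 = clips[i].y1;
		if (clips[i].x2 > m.x2)
			m.x2 = clips[i].x2;
		if (clips[i].y2 > m.y2)
			m.y2 = clips[i].y2;
	}

	return m;
}

/* Pixels touched when each clip is uploaded on its own. */
static long sum_clip_areas(const struct rect *clips, size_t n)
{
	long total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += rect_area(&clips[i]);

	return total;
}
```

With two 16x16 clips in opposite corners of a 1920x1080 framebuffer, the merged rect covers the whole screen (2073600 pixels) while the per-clip path touches 512. Overlapping clips get counted twice by the per-clip sum, and each transfer has a fixed setup cost on slow buses such as i2c, which is the trade-off raised earlier in the discussion.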