IRC Logs of #dri-devel on irc.freenode.net for 2025-01-30

00:39 daniels: mareko: ask ajax and MrCooper, but I think they only really need llvmpipe and spice
00:43 daniels: jenatali: thankyou!
00:44 jenatali: I'd been putting it off. I really hate building LLVM
00:45 daniels: me too buddy
00:59 jenatali: Apparently the Vulkan runtime no longer installs unattended with /S but now the SDK includes it?
01:54 zmike: tarceri: actually I assigned for you to make sure it goes in since it's blocking another MR from landing
02:15 mareko: MrCooper: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33211/diffs?commit_id=70398ff5140891899927590c46d27ef8c48c6898
02:45 jenatali: Ugh how do I see which LLVM module is needed but missing?
02:49 airlied: usually grep
07:45 MrCooper: mareko: technically I'm in a different team now (focus on mutter & Xwayland), as is ajax, so you rather need to ask airlied or José Exposito; AFAIR we do support amdgpu with acceleration on ppc64el in RHEL in principle though, so not having any CI coverage isn't great
08:45 sima: dakr, good mail, thanks for doing the wrestling
08:46 sima: also chatted with airlied and we're at 15+ years of dma-api maintainers randomly nacking stuff gpu drivers want/need
10:57 sima: DemiMarie, on the amd/virtio discussion, all these issues you point out is why I think there's either pup(FOLL_LONGTERM) or real hw support so that the iommu/gpu handles page faults/invalidations at the hw level
10:57 sima: and the mmu notifiers just pass tlb flush commands forward as needed
10:57 sima: anything else indeed just falls apart everywhere at the seams
13:25 zmike: mareko: is LINEAR really not supported for RGBA32F formats?
13:28 zmike: cuz it seems to work...
14:56 DemiMarie: sima: In this case pup(FOLL_LONGTERM) is even more attractive because device memory is just virtual memory.
14:57 DemiMarie: sima: Can the forced migration to device memory be done reliably?
14:58 DemiMarie: Also, time to bypass the DMA API maintainers and send something directly to Linus?
15:00 phasta: You should think long-term. Are then fixes and reworks also to be sent directly to him 3 years down the road?
15:02 sima: DemiMarie, I didn't really follow that part since it was about virtio specific things
15:03 sima: the kernel really can't, because if you do this like hmm you again need hw support for pagefaults
15:03 sima: plus hmm cannot guarantee migration to device memory
15:04 DemiMarie: sima: the idea I had is to move the pages to device memory and leave them there
15:05 sima: anon memory probably freaks out to no end if it's suddenly device memory without a struct page
15:05 DemiMarie: If you don't have HW support for pagefaults then it's up to the host kernel to fail the operation.
15:05 DemiMarie: What about device memory with a struct page?
15:05 sima: you could do it as coherent device memory, then anon memory works in your device memory (unlike device private memory that hmm uses)
15:06 sima: but you're again stuck on the core mm's inability to guarantee migration
15:06 sima: migration is all best effort
15:06 DemiMarie: stop_machine()? Only half joking.
15:06 sima: not enough
15:07 DemiMarie: Why can't migration be reliable?
15:07 sima: linux core mm does a lot of randomly grabbing a page/folio reference, and those all block migration
15:07 sima: with enough whacking it mostly works for stuff like cma or memory hotunplug with zone_moveable, but it's brittle
15:08 DemiMarie: What about make_device_exclusive_range() or similar, but without the exclusive part?
15:08 sima: pup(FOLL_LONGTERM) is one of the pieces to make it less brittle, so that you know whether an elevated refcount is temporary and more retrying should help
15:08 sima: or a permanent pin, and more retrying is only going to heat the world
15:08 sima: DemiMarie, that doesn't move anything
15:09 heat:the world
15:10 sima: DemiMarie, I guess you could try with coherent device memory and just migrating really, really hard
15:10 sima: then you're at the same peril like cma or memory hotunplug
15:10 DemiMarie: sima: could there be a way to lock out anyone who tries to grab a reference?
15:10 sima: but for per critical stuff like hmm migration it's fundamentally fallible
15:11 sima: DemiMarie, disable all the cool features like transparent hugepages
15:11 sima: numa load balancing
15:11 sima: ksm
15:11 sima: writeback too iirc
15:11 sima: constantly more getting added
15:11 sima: defo direct i/o
15:11 DemiMarie: sima: I meant "grab a mutex so they block"
15:12 sima: no
15:12 sima: DemiMarie, https://chaos.social/@sima/113911739075079093
15:12 heat: in theory you could do that but you'd create "heating the world" on the opposite, refgrabbing direction
15:12 DemiMarie: Why is that?
15:13 sima: see link but tldr is the linux core mm is designed on the principle that quicksand is awesome
15:13 heat: because if there was a refcount lock-out you'd spin on folio_get
15:13 heat: because there isn't, you spin on page migration (or fail)
15:14 heat: it's way easier to fail page migration than failing a normal-ass refcount
15:14 sima: it's also that core mm is lockless to the max
15:14 DemiMarie: For performance reasons?
15:14 heat: yes
15:14 sima: so even if you hold a reference and the lock for something, it's really surprising how little guarantees that often gives you
15:15 sima: like the entire pte walking is just pure yolo, and it happens absolutely everywhere all the time
15:15 DemiMarie: Why does it not crash? RCU?
15:15 heat: hey it's not pure yolo it's homebred RCU
15:15 sima: some of the best people in the world banging their heads at it for decades
15:16 heat: gup_fast generally just disables interrupts and doesn't use RCU
15:16 sima: heat, oh yeah it's a work of art
15:16 heat: to free a page table you need to do a TLB shootdown thus IPI thus if your IRQs are disabled it's safe to traverse
15:16 heat: it is in effect homebred RCU
15:17 sima: there's also so much fun due to locking inversions
15:17 sima: where you lookup a thing, grab the locks and then recheck whether you got the right one
15:17 sima: and there's fundamentally no way to just take a lock to make things stable
15:17 sima: and it's getting worse every year, like with lockless vma traversals and page faults
15:17 DemiMarie: I wonder at what point it would actually have been faster (dev time wise) to formally prove the whole thing correct and not have to do the debugging.
15:18 sima: DemiMarie, open random file in mm/ and stand back in awe at the if ladders
15:18 sima: especially anything handling pagetable entries
15:18 sima: but yeah formal proof probably good idea
15:19 sima: but the issue is also, what do you even want to proof
15:19 DemiMarie: "no memory corruption"
15:19 sima: because some things look very, very fish from a "will it livelock" pov
15:19 sima: not even close to enough
15:19 DemiMarie: no deadlocks, no livelocks, etc
15:19 sima: the livelocks are real pain
15:20 sima: and often stochastic stuff
15:20 sima: like the race windows align such that you win often enough to never pile up, but if you'd have consistently bad luck you'd pile up
15:20 DemiMarie:wonders if past a certain point people should just be using multiple machines, rather than trying to make mm scale to huge machines
15:20 sima: yes
15:20 sima: cloud didn't happen just for fun
15:20 heat: this is not just about making mm scale to huge machines
15:21 heat: small machines are also heavily impacted
15:21 heat: big locks suck
15:21 sima: yeah small cros tend to really thrash mm
15:21 heat: the per-vma locking patches address problems <checks notes> in android when apps create like 80 threads at startup
15:21 DemiMarie: Big locks suck unless you care about reliability and security way more than performance. I suspect that is why OpenBSD is so full of them.
15:22 heat: OpenBSD is full of them because it's a hobby kernel
15:22 sima: yup
15:22 heat: they would like to get rid of them and are slowly doing so
15:22 sima: that too
15:22 sima: like I think core mm is probably one place where rust wont help
15:23 sima: like some of the memory barrier comments in there are just pure nightmare fodder
15:23 DemiMarie: ATS might, though. That's full dependent & linear types.
15:23 sima: since it's not just about your cpu code, but also about stuff like how tlb fetches actually walk pagetables on your machine
15:24 heat: like, yes big locks make for simpler code, which is nice for security and reliability. but they also make you prone to suffer terrible choking on those huge locks, thus a reliability problem (and in effect, probably a security one, depending on what you're running)
15:25 sima: DemiMarie, I think more formal proofing would be good, afaik only rcu in upstream linux is fully formally proved
15:26 DemiMarie: sima: I was thinking of extracting core mm from F* or Coq.
15:27 DemiMarie: heat: I think safety critical systems prefer to use multiple components that are individually single-threaded. They can scale by having many cores that don't share memory.
15:27 sima: DemiMarie, e.g. https://lore.kernel.org/dri-devel/887df26d-b8bb-48df-af2f-21b220ef22e6@redhat.com/ last paragraph
15:27 sima: device-exclusive was added, but not everywhere, boom in way too many places
15:28 DemiMarie: Honestly I think userptr is rather cursed.
15:31 DemiMarie: Can migration be reliable enough to make uAPI depend on it?
15:33 DemiMarie: I also wonder if this could be dealt with using hypervisor magic: "hey, that page of mine is a blob object now"
15:42 mareko: zmike: why wouldn't it be supported?
15:42 zmike: mareko: I have an MR to fix
17:54 jenatali: Ugh. Meson 1.5.1 can't use CMake to find LLVM 19
17:54 jenatali: What a mess
17:55 daniels: jenatali: ...
17:56 jenatali: Means I need to rebuild the primary Windows container too to get a new meson apparently
17:58 daniels:twitches
17:58 daniels: that was a deeply unpleasant time of my life
17:58 daniels: the bit where I broke up with my long-term girlfriend was probably way less damaging than Windows + Meson + LLVM + CMake + CI
17:59 jenatali: Yeah... I got the build working locally with llvm19 so at least I'm pretty confident that just bumping meson should work
17:59 daniels: heading out now, fingers crossed for you tho :)
18:11 dj-death: daniels: and you do this for work...
18:49 jenatali: Aaaand new meson doesn't install without long paths enabled
18:50 jenatali: I hate dependency updates
18:57 mareko: wouldn't it be nice if LLVM wasn't required by Mesa
19:03 jenatali: Mhmm
19:04 jenatali: LLVM as a runtime dependency is terrible
19:07 kisak: mareko: hypothetically, how would you feel about delaying pulling llvm<18 support until after mesa 25.0-branchpoint and hopefully radeonsi/ACO is good to go for the newer AMD gfx generations by the time 25.1 rolls around? ~non-sequitor~ If the mesa build sees llvm 15 is around, but not usable with radeonsi/llvm, will it automatically build radeonsi/ACO or will it fail the build as requirements not met
19:07 kisak: for radeonsi/llvm?
19:09 kisak: jenatali: llvm being too new for meson autodetect is a chronic issue. Over in Debian land, the build system adds in the equivilent to
19:09 kisak: export PATH:=/usr/lib/llvm-15/bin/:$(PATH)
19:10 jenatali: Yeah but Windows doesn't do llvm-config :(
19:10 kisak: well, that's dandy
19:11 jenatali: Fun, LLVM 19 requires /Zc:preprocessor for MSVC to be able to compile its headers
19:11 jenatali: Hopefully Mesa likes that too
19:15 jenatali: Looks like yes, phew
19:20 dcbaker: jenatali: we shouldn’t require long paths in meson. That sounds like a bug on our end
19:21 jenatali: dcbaker: It was a test that got run during chocolatey install that was too long
19:21 jenatali: I'll grab the log, one sec
19:22 alyssa: mareko: llvmpipe's existence makes that kind of a nonstarter..
19:23 jenatali: dcbaker: Ah pip, not choco. Log: https://gitlab.freedesktop.org/mesa/mesa/-/jobs/70278327#L391
19:24 jenatali: And I was wrong it's not meson it's numpy :(
19:24 jenatali: Oh it's meson's tests running as part of numpy's install. Gross
19:25 dcbaker: jenatali: of course it’s cmake… and of course it’s in numpy which has a vednored copy of meson while we get some of their stuff upstream…
19:25 dcbaker: I wonder if I can ask the numpy folks to not run our tests on install
19:26 jenatali: Seems like the right call
19:30 dcbaker: Although that’s also an old version of numpy and numpy >=2.0 should work
19:35 mareko: kisak: I can delay that. LLVM isn't required by AMD drivers and ACO is used when LLVM is disabled at build time, but it's also not a tested or optimized configuration on RDNA 1-4. It's possible that when you enable llvmpipe, it also enables LLVM for radeonsi.
19:36 mareko: radeonsi+ACO likely won't be ready by 25.1
19:37 jenatali: dcbaker: There's an issue with some of Mesa's scripts that prevent it from working with >= 2.0
19:38 dcbaker: Sigh. I guess i only fixed piglit. I probably fix that
19:38 jenatali: Oh maybe it was piglit, I don't remember. That same container gets used to build both
19:40 jenatali: Yeah https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29649#note_2493559 says it was piglit
19:40 jenatali: Should've checked if that constraint could be removed. Oh well
21:01 jenatali: Uh... glsl compiler warnings test is failing with access violation (segfault) and I don't repro it :(
21:04 DemiMarie: sima: Actually, there is another option: try to migrate the pages, and if that is not possible, either return an error to userspace or leave the pages on the CPU and try again later.
21:10 jenatali: Uh... and passed on re-run. That's not good