03:01airlied: zmike: sorry I screwed up in gitlab and canned one of your ci jobs
03:02zmike: 😬
04:35u-amarsh04: git bisecting
04:35u-amarsh04: git bisect skip 18 times in a row was tedious
06:40mareko: zmike: the vertex shader input is an untyped 32-bit or 16-bit number. The input type doesn't matter, but only the number of bits matters. R8_UINT fully determines the contents of those bits, so in this case, R8_UINT is always zero-extended to 32 or 16 bits for the shader input.
06:42mareko: zmike: while the shader input type doesn't matter for how the input is initialized, it does matter for algebraic instructions, for example, signed and unsigned integer comparison instructions behave differently if the top bit is 1
06:43mareko: zmike: so the full input type is really for the shading language itself, not for the input initialization
06:44mareko: zmike: so yes, it's legal to have mismatching vertex formats and shader input types
07:33countrysergei: https://github.com/uzh/signal-collect/blob/master/src/main/scala/com/signalcollect/util/SplayIntSet.scala it seems that W3C triplestore RDFa and SPARQL solutions have the needed structure in the signal-collect case. https://www.zora.uzh.ch/id/eprint/119575/1/20172959.pdf compressed splay root nodes. A SplayNode is an index into the list, and nodes contain the whole interval.
07:34countrysergei: there are tools to convert between triplestore and sparql
07:36countrysergei: same as there are tools to convert from XML formats like OpenMath to triplestore and sparql.
08:40u-amarsh04: git bisect so far: 91 good, 93 bad and 60 skipped commits
10:27u-amarsh04: still bisecting
11:46countrysergei: but the intset of bitsets uses some division to understand how to scan nodes from intervalFrom to intervalTo.
11:50countrysergei: but it might be only for debugging also, not sure
12:53countrysergei: it does seem interesting though: the last definition in the bitset seems to suggest an encoding made from a base value of 64, so 64+1 is the first bit; the next ones I don't know yet, likely 64+2 etc. The bytes are handled separately, and min/max has some while loops, but those can be rewritten once the encoding is understood.
12:54countrysergei: This goes very close to what i had been suggesting, so i think the code is usable
13:03zmike: mareko: I was afraid you were going to say that 🤕
13:04zmike: what was your ask the other day? whether rgba8 with stride=1 was legal?
13:08countrysergei: the docs suggest that some primitive check-pointing is supported. I haven't inspected what that means, but technically one would want to get rid of the sparql or triplestore parsing overhead for openmath dictionaries.
13:17countrysergei: also the docs mention some limitations that don't seem very relevant to this use case on 64-bit architectures, so the biggest limitation is that arbitrary precision is not supported, but a refinement id of 2 to the power of 31 already seems incredibly large
13:18countrysergei: fully bounded sparql queries do not support this splay cache (I looked up on wikipedia what that means), so those are not needed either.
14:54Hazematman: Hey, does anyone here happen to know the status of mesa on android? I'm messing around with the latest aosp main and trying to build a cuttlefish emulator image that includes the latest mesa master (not the mesa that's vendored in AOSP, but the latest on the fdo gitlab), and when I try to build I get the error "FAILED: ninja: unknown target 'MODULES-IN-external-mesa3d'"
15:19pq: hwentlan, did you notice https://lists.freedesktop.org/archives/dri-devel/2024-February/441285.html yet?
17:05sima: more people should know about drm_vblank_work
17:57pepp: sima: thx for your comments on the trace events series. Did you get a chance to look at v3?
17:57pepp: sima: because it could answer your chained fences question with the addition made to dma_fence_chain_init
18:03sima: pepp, oops missed that, was a bit chaos this week
18:03sima: looking now
18:07sima: pepp, it doesn't really answer any of the big design questions, since I still don't see how exactly you're going to tie it all together
18:07sima: like what if you have a pile of apps and compositors rendering
18:07sima: since I'm guessing you're guessing the actual dependencies through the processes that do stuff?
18:08sima: or it's _extremely_ amdgpu specific, and that doesn't sound very useful
18:09sima: or at least quite suboptimal design point since both atomic commit machinery and drm/sched is very generic by now and knows what's going on in driver-independent code entirely
18:10sima: pepp, or put another way: if the generic events are only of use with the amdgpu specific stuff, they're not really generic
18:11sima: (including existing amdgpu specific trace events imo)
18:11pepp: sima: they shouldn't be amdgpu-specific. But it's also possible that I baked amdgpu-specific assumptions because that's the only hardware I can test on
18:12sima: pepp, I mean if you can do the gpuvis dependency tracing with all amdgpu trace points disabled, then I think it's solid
18:12sima: if you need any amdgpu specific events, then it doesn't look like a solid design yet
18:12sima: e.g. https://lore.kernel.org/dri-devel/20240216151006.475077-6-pierre-eric.pelloux-prayer@amd.com/
18:13sima: that seems needed, and it definitely won't exist on other drivers which also use drm/sched and so _do_ have the dependency information fully available in generic data structures
18:13sima: and so the generic trace events should be able to get that out to userspace
18:14pepp: no it's not needed; it also works fine without this. But the application doing the parsing needs to know how to transform a series of individual events into a list of jobs (= with a begin and an end)
18:14sima: yeah, that's the part which doesn't really work
18:15sima: and why I think we need a clear fence->fence trace event or it's just a mess
18:15sima: and we kinda have that
18:17pepp: even with a fence->fence event, the parsing app would have to determine which N-events form a job
18:29sima: pepp, https://paste.debian.net/hidden/ae666b5f/ some notes sprinkled around in generic code which should be all the places you need
18:30sima: won't cover i915-display because despite being atomic, it still hand-rolls this stuff
18:30sima: but not your problem imo
18:30sima: also we have really annoying tracepoints because they're not even close to consistent with dumping stuff like fences or crtc
18:31sima: but I'm not sure whether we can break them all or whether we need _v2 versions that are consistent
18:31sima: so it's a pretty solid mess, but zero guessing needed for which dependencies belong to which work, that info is all there
18:33pepp: sima: interesting, thanks!
18:34sima: pepp, also I /think/ but not entirely sure that on the drm renderD side of things (not amdkfd) all the ttm memory management also goes through drm_sched_job
18:34sima: so might instead want to annotate those better with "what is this" than add driver specific events
18:36sima: pepp, I think what would be really great is uapi docs about how you need to use those, and how to assemble them back into meaningful stuff, in the drm uapi section
18:36sima: so we can officiate this as "fully uapi tracepoints that we'll promise to never break"
18:36sima: I think that would also help a lot in reviewing the overall design and whether it is something drm can sign up to support for a close approximation of "forever"
18:37pepp: sima: alright, makes sense. Getting feedback from gpuvis dev would probably be useful too
18:37sima: oh absolutely, if we make this forever uapi we need userspace that's reviewed by the userspace project
18:38sima: so if they're not happy about the bazillion different ways we dump dma_fence into tracepoints, that would be something we need to fix
18:39sima: pepp, oh a really fun testcase would be a multi-gpu system
18:39sima: like amd apu+discrete or so
18:39sima: can't be i915 because that is neither using drm/sched and also hand-rolling too much atomic
18:40pepp: sima: I've tested multi-GPU a couple of times, was working fine
18:40sima: but xe.ko should be on board with all this
18:40sima: pepp, like rendering on discrete and displaying on integrated?
18:40pepp: yes
18:40sima: that's the fun one ...
18:40sima: nice :-)
18:41sima: pepp, oh if it's not clear how to get the atomic commit dependencies - drm_atomic_helper_wait_for_fences has the authoritative answer
18:43pepp: sima: noted, thanks
18:45sima: or should have, and I /think/ all drivers except should be covered including any memory management fences
18:45sima: *except i915
18:46DemiMarie: sima: is it reasonable for native contexts to `mlock()` all buffers passed to the GPU?
18:46sima: if not that would be a good reason to fix these drivers by moving them to standard functions
18:46sima: DemiMarie, it won't work for drm buffer objects
18:46DemiMarie: sima: are those pageable?
18:46sima: yeah
18:46DemiMarie: can that be disabled?
18:47sima: where's the fun in that :-P
18:47sima: especially for discrete you'd probably break the world since this breaks vram swapping ...
18:48sima: so I'd expect any gpu buffer object mlock to be somewhat driver specific :-/
18:48sima: DemiMarie, my brain fails me and I can't remember why you need mlocked gpu memory?
18:49sima: we chatted about how hard preemption is, but I forgot why mlock matters
18:50DemiMarie: sima: First, it avoids taking a bunch of complex code paths in the driver that are likely to have bugs. Second, any memory that ever gets mapped into a Xen VM must be pinned. Third, any page _provided_ by a Xen VM will be (inherently) pinned.
18:52DemiMarie: I expect most users to have iGPUs primarily, simply because those are the majority of GPUs on the market.
18:53DemiMarie: So far, my understanding is that the two biggest concerns about virtio-GPU native contexts for Qubes OS are the rate of memory unsafety vulnerabilities in drivers and the lack of a mechanism to notify the security team when such a vulnerability is discovered.
18:54sima: hm so I'm not sure how many bugs you avoid, because we're still going to run all the code to prepare&map the memory
18:54sima: maybe a few less corner cases
18:55sima: DemiMarie, wrt security team, airlied&me aren't on that list, but we do get pulled in as needed for any gpu security issue
18:55sima: so security@kernel.org should work for these too
18:56DemiMarie: sima: so the context is that Qubes OS issues a Qubes Security Bulletin whenever a vulnerability is discovered that affects the security of Qubes OS. This includes vulnerabilities in Qubes OS’s dependencies.
18:56sima: and aside from some really old hw horrors we do expect that if a drm driver has renderD nodes, each open file of that is isolated from the others
18:57sima: ah yeah that one you won't get, simply because there are way too many of these and analyzing them all is hard work
18:57DemiMarie: how many (give or take a factor of 2) do you mean?
18:58DemiMarie: Obviously the total number of kernel vulns is huge, but only a small subset of those are relevant here.
18:58sima: well I mean including all gotchas in drm code with uaf, locking fun that looks exploitable, input validation lolz and bugs you can probably break
18:58sima: I'd expect a substantial part of cc: stable patches are exploitable for drm
18:58DemiMarie: how many of those per year do you have roughly?
18:59sima: assuming they are for a driver you care about
18:59sima: let me ask git for some guesstimates
19:02Hazematman: <gallo[m]> "Hazematman: welcome to the club...." <- I was able to resolve that specific build using this MR https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27648
19:02Hazematman: I'm trying to get llvmpipe&lavapipe building so I also had to make some changes to include a LLVM build that made them happy
19:02Hazematman: Right now waiting for the image to build
19:05sima: DemiMarie, ok, from a rough, inaccurate git log: the 6.1 lts kernel has ~65 fixes to shared drm core
19:05sima: about every 5th looks real scary, most of the others are hw quirks and stuff like that
19:05Hazematman: gallo: also thanks for the compliment :)
19:05sima: real scary = might be exploitable, but I'm definitely not going to make a fool of myself and guess
19:06sima: DemiMarie, 6.1 is a bit older than a year
19:07sima: I didn't look at drivers because a) that's much harder to assess and b) there's a lot more noise that's probably just fixing display issues and can't be exploited in any meaningful way
19:08sima: so 10 per year for drm core code sounds about right, plus whatever is for the drivers you're using
19:09DemiMarie: sima: 10 per year is about what we have for everything else in Qubes OS, plus the number of stuff in drivers which is about the same IIRC
19:09sima: (a bit of an aside, but that's what I expect the firehose of CVEs will also match once the new kernel CNA gets going)
19:10DemiMarie: hopefully that gets big companies (Google?) to throw more people at increasing overall code quality
19:10sima: I think the expectation is that there'll be on the order of hundreds of CVEs for each kernel release
19:10sima: ofc a really big amount only apply to specific hw support, but there's a _lot_
19:11sima: DemiMarie, there's also the issue that with all the kernel hardening enabled, a lot of these are a lot less exploitable
19:11sima: but since those are all Kconfig knobs, they all count
19:12sima: plus you probably still want to patch, just in case someone figures out how to knock out the hardening
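(Editor's note: the hardening knobs sima alludes to are Kconfig options; a representative .config fragment is sketched below. The option names are real upstream Kconfig symbols, but the exact set and its performance cost are a per-deployment judgment call, not a recommendation from this discussion.)

```
# Example hardening fragment of the kind discussed above:
CONFIG_HARDENED_USERCOPY=y
CONFIG_FORTIFY_SOURCE=y
CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y
CONFIG_RANDOMIZE_BASE=y
CONFIG_SLAB_FREELIST_HARDENED=y
```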
19:12DemiMarie: sima: I wonder if companies will start pushing to rip out old code, simplify stuff, etc
19:12sima: DemiMarie, old code is generally dead code
19:13sima: and I think in practice the issue is a lot more that upgrading breaks too much, so I expect that users who care hopefully are a lot more motivated to build up _really_ good CI
19:13sima: so that they can validate new upstream release faster
19:13DemiMarie: which will in turn help everybody
19:13sima: a leisurely year or so that even the good android vendors take to upgrade is just not going to work
19:13sima: yeah
19:13sima: and hopefully also catch issues faster and before they hit a release
19:15sima: DemiMarie, for actual fundamental improvement I think weeding out the stupid "undefined behaviour lolz" in the linux kernel C flavor is going to help a lot more
19:15DemiMarie: I also suspect enterprise distros will start trimming their Kconfigs.
19:15sima: it's really hard, but a lot has been achieved already
19:15sima: oh yeah
19:15sima: plus probably enable a lot more hardening, even if it costs
19:15sima: since it doesn't help with the flood, but it helps with the severity
19:15DemiMarie: It is extremely obvious that a CVE does not affect a distro if the fix is to code that is not included in the build
19:16sima: so you have a bit more time since it's not obvious stuff that even fools can exploit
19:16sima: yeah
19:16sima: so maybe also some build tools that generate the actual list of CVEs impacting your build
19:16DemiMarie: For Qubes OS turning on GEM_BUG_ON() might be a good idea
19:16DemiMarie: looks like it would catch at least one OOB write
19:17sima: also I figure that stuff like CONFIG_VT=n will hopefully accelerate
19:17sima: there's some really horrible stuff there
19:17DemiMarie: syzbot is also going to start auto-assigning CVEs, IIUC
19:17sima: yeah maybe
19:17sima: although that one kinda boils down to "who pays for the work to fix stuff"
19:18DemiMarie: enterprise distro vendors
19:18DemiMarie: they will have no choice due to compliance requirements
19:18sima: yeah maybe someone will find some budget hopefully
19:18sima: plus disabling a lot of the old horrors like CONFIG_VT
19:18DemiMarie: at what point will non-enterprise distros be able to turn that off?
19:19sima: entirely depends upon how hard they care about the kernel console
19:20DemiMarie: the problem is that client hardware has no OOB management and no serial port
19:20sima: I think most desktop distros are actually leading enterprise distros, since there's some infra work missing still that needs really recent kernels
19:20sima: android/cros have it disabled since years
19:20DemiMarie: So until userspace comes up you are booting up blind
19:21sima: oh do not rely on drm for emergency logging
19:21sima: it's entirely busted
19:21sima: but we'll get a new drm panic handler to fix this properly for real, which is one of the infra issues
19:21DemiMarie: My long-term hope is for much of DRM and the GPU drivers to be rewritten in Rust
19:22sima: DemiMarie, also defacto desktop distros load the gpu drivers from initrd, and at that point you can have a small userspace logger running too
19:22DemiMarie: Heck, even a way to make use-after-free and OOB memory access defined at the cost of a 4x slowdown and serializing everything in the kernel might be a win for some setups.
19:22sima: because of the entire fw issues
19:23DemiMarie: sima: I think we will see stuff moving to a dm-verity volume for fw
19:23sima: DemiMarie, all the integer math stuff is getting fully defined at least
19:23sima: and I think range-validate arrays are coming, yay
19:23DemiMarie: sima: if there was a way to have overflow trap in release builds that would be awesome
19:23sima: compiler-checked range-validated arrays I mean
19:23sima: DemiMarie, it's coming
19:23sima: the issue is that there's a lot of integer math that intentionally overflows
19:23sima: to check userspace input
19:24sima: so you can't oops on that or you just made an exploit :-)
19:24DemiMarie: that still leaves UAFs and locking problems, though
19:24sima: yeah
19:24sima: although there's scope based automatic cleanup now
19:24DemiMarie: you can solve those with fat pointers and a global lock but it means a 4x or more slowdown last I checked
19:24sima: that should at least help a lot with bugs in error paths
19:24sima: but ofc, huge amounts of work
19:25DemiMarie: ultimately, though, I think C needs to be replaced
19:25sima: DemiMarie, yeah given that rcu use is growing steadily I don't think that'll work
19:25DemiMarie: C is just not a viable language in the 2024 threat environment
19:25sima: outside of some very niche cases
19:25sima: yeah, but the issue is a bit that there's too much C
19:25sima: so I think both improving C as much as possible and working on replacing it is needed
19:26sima: there's some good stuff coming in C standards discussions afaik too
19:26DemiMarie: And also deprivileging it by moving it to userspace
19:26DemiMarie: Unfortunately for DRM/GPU stuff that fails miserably, because (at least in the Qubes OS threat model) the GPU driver is privileged by definition!
19:26sima: oh yeah absolutely
19:26sima: like config_vt=n is just a must
19:27sima: DemiMarie, well if you look at the entire gpu stack we already have like 90% in userspace
19:28DemiMarie: sima: I also mean stuff like networking, USB, Bluetooth, Wi-Fi, filesystems, etc
19:29sima: yeah those are all fairly tricky
19:29sima: DemiMarie, I wonder whether per-gpu-ctx processes on the host side for virtio-gpu would be doable
19:30DemiMarie: sima: Maybe? How would it help?
19:30sima: that way if you can exploit the userspace part of the hw driver, it should be a lot more limited
19:30DemiMarie: sima: the current plan is native contexts, so the userspace part runs in the guest
19:30sima: yeah that's another one, also should be faster
19:30DemiMarie: The old virgl/venus stuff is never going to be supported in Qubes OS
19:33DemiMarie: In fact one major reason that work for GPU accel in Qubes OS has not started yet is that native contexts are still not upstream except for Qualcomm last I checked
20:51DemiMarie: Google said (at XDC 2023) that 80% of KMD vulnerabilities are not exploitable in the native context case. So that reduces the flood from 20 VM escapes per year to something smaller.
20:54 #intel-3d: mattst88: damn -- from #dri-devel:
23:12karolherbst: jenatali: sooo.. we need to support opencl-c.h with llvm-15+ after all, because support for some extensions is missing without it (e.g. cl_intel_subgroups). Do you want a compile-time switch to include it in the binary, or should I just unconditionally embed it with static llvm? It's like 800kb
23:13karolherbst: ehh wait..
23:13karolherbst: it's probably less after compression
23:13karolherbst: though we only compress libclc?
23:13karolherbst: anyway...
23:14karolherbst: do you want a flip for the file? Though given that some AI/ML stuff just depends on that intel subgroup ext might as well already ship it...
23:16jenatali: karolherbst: I don't care one way or another. It's a drop in the bucket compared to actual clang
23:17karolherbst: true
23:17karolherbst: makes it easier for me to always include it :)