00:00Lynne: how is that relevant then?
00:00jenatali: > I think what I really want is a proper linker for spirv
00:00Lynne: > I don't see why you couldn't use it as a standalone GLSL -> SPIR-V compiler
00:00Company: it allows me to write tiny GLSL components, turn them into spirv, and link them together before sending them off to Mesa
00:01jenatali: Lynne: I was saying you could use Mesa as a standalone compiler. My comments about spirv-tools were just about the linker
00:01unit327: Do you think this is a mesa problem or a godot engine problem? https://github.com/godotengine/godot/issues/94604
00:02Lynne: ah, ok
00:02Lynne: but sure, it's just a lot of work and code
00:06Company: I actually like precompiled shader code - both because I don't trust random GLSL parsers to actually work and because I get proper syntax checking of my code at compile time
00:06Company: so spirv is cool with me
00:06Company: what isn't cool is the limited default featureset compared to what you can do with printf() and GLSL
04:22kurufu: Not sure if this is the right place, but is it possible to synchronize with the clock used for drmHandleEvent's page_flip_handler2? On my amd system it seems to generally report 500us ahead of CLOCK_MONOTONIC.
09:31Lynne: so RDNA3 decided to have FOUR 64-bit add instructions for memory, but not have one acting on registers...
09:33Lynne: just chuck one in next time, okay? there's 56 billion transistors in there, a few more aren't going to break the bank
09:36glehmann: rdna4 will actually have 64bit add scalar instructions
09:37glehmann: and for vector, what's wrong with 2 32bit add with carry-in/out? iirc there is even some fast forwarding for the mask so total worst case latency isn't really worse than a "real" 64bit instruction
10:01Lynne: nothing about the latter, I was unhappy about scalar 64-bit adds missing
10:01Lynne: since it means that BDAs are even more expensive to index
10:14glehmann: don't get me wrong, 64bit SALU adds are nice to have, but in practice SALU is rarely a bottleneck
10:20enunes: psa there was a power failure this morning in the lima lab for mesa CI for ~1-2 hours, I think it went back to normal and is picking up pending jobs now. Sorry about that and I hope it didn't cause any block during that time
10:31zamundaaa[m]: kurufu: to translate between the two clocks, you can fetch the current time with both clocks and use the difference. But it should really be in CLOCK_MONOTONIC already... are you sure the 500us don't come from something else?
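[editor's note] The "fetch both clocks and use the difference" approach can be sketched roughly as below. This is only an illustration under assumptions: the helper names (timespec_to_ns, clock_offset_ns, translate_ns) are hypothetical and not from any library, and sampling the first clock twice around the second is one possible way to reduce sampling skew, not something suggested in the discussion.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdint.h>
#include <time.h>

/* Convert a timespec to nanoseconds. */
static int64_t timespec_to_ns(const struct timespec *ts)
{
    return (int64_t)ts->tv_sec * 1000000000LL + (int64_t)ts->tv_nsec;
}

/*
 * Estimate (clock b - clock a) in nanoseconds.
 * Clock a is read before and after clock b, and the midpoint of the two
 * reads is used as the "simultaneous" value of a. Returns 0 on success,
 * -1 if any clock_gettime() call fails.
 */
static int clock_offset_ns(clockid_t a, clockid_t b, int64_t *offset_ns)
{
    struct timespec a0, bt, a1;

    if (clock_gettime(a, &a0) != 0 ||
        clock_gettime(b, &bt) != 0 ||
        clock_gettime(a, &a1) != 0)
        return -1;

    int64_t a0_ns = timespec_to_ns(&a0);
    int64_t a_mid = a0_ns + (timespec_to_ns(&a1) - a0_ns) / 2;

    *offset_ns = timespec_to_ns(&bt) - a_mid;
    return 0;
}

/* Translate a timestamp taken on clock a into clock b's timeline. */
static int64_t translate_ns(int64_t t_on_a_ns, int64_t offset_ns)
{
    return t_on_a_ns + offset_ns;
}
```

For example, calling clock_offset_ns(CLOCK_MONOTONIC, CLOCK_REALTIME, &off) once and then translate_ns(event_ns, off) on each event timestamp would map monotonic timestamps onto the realtime clock. The offset between those two clocks can change over time, so it may need to be re-sampled.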
13:04eric_engestrom: zmike: please assign to marge all your dril/etc. fixes that are ready, I'll do the release in a couple of hours :)
13:05zmike: I'm working on the last one
13:05zmike: almost there
13:05eric_engestrom: I mean, this is true for everyone, but you have more urgent fixes than most :P
13:05eric_engestrom: ack
13:05eric_engestrom: ping me once you do?
13:05zmike: k
13:06eric_engestrom: thx
14:00mareko: Lynne: what is BDA?
14:00Lynne: buffer device address, e.g. real pointers
14:16mareko: the 64-bit memory ops must be 64-bit because they are atomic, can't be split
14:28Lynne: yeah, my point was that if they could put 64-bit multiply units which handle memory, they should've also put 64-bit units which handle registers
14:28Lynne: *adds, not mults
14:30glehmann: speaking of mults, rdna4 also adds s_mul_u64 btw
14:50mareko: Lynne: it would just be a macro opcode that is decoded as 2 32-bit adds, there's probably not much point doing that
14:54Lynne: that's right, that's why I asked for 64-bit add units, which implies true 64-bit adds
14:55mareko: 64-bit add units are unlikely to exist in any hw, internally they are usually just 32-bit add units executing in 2 passes
14:56Lynne: really? even the memory 64 bit adds?
14:57mareko: it's not about what is real 64-bit or not, it's about what the throughput and latency is
14:58mareko: and pipelining
14:59Lynne: obviously, you'd prefer real 64-bit adds because then you have shorter dependency chains
15:00mareko: there is no dependency between low and high 32 bits, the carry out is forwarded as carry in to the next instruction without any delay
15:01zmike: eric_engestrom: I think I'm done, but there's nobody around to ack the updated version and the testers seem to have disappeared
15:01zmike: dunno what you wanna do about that
15:03mareko: that's an architectural optimization for carry out/in, and things like that make it difficult to tell just from the ISA how fast it is
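[editor's note] The "2 32bit add with carry-in/out" sequence discussed above can be modelled in plain C as a low-half add followed by a high-half add that consumes the carry. This is a sketch of the arithmetic only, with hypothetical names (u64_pair, split64, join64, add64_via_32); it says nothing about how any particular GPU schedules or forwards the carry.

```c
#include <stdint.h>

/* A 64-bit value held as two 32-bit halves, as in a pair of 32-bit registers. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} u64_pair;

static u64_pair split64(uint64_t v)
{
    u64_pair p = { (uint32_t)v, (uint32_t)(v >> 32) };
    return p;
}

static uint64_t join64(u64_pair p)
{
    return ((uint64_t)p.hi << 32) | p.lo;
}

/* 64-bit add built from two 32-bit adds. */
static u64_pair add64_via_32(u64_pair a, u64_pair b)
{
    u64_pair r;

    /* First add: low halves. Unsigned overflow wraps, so a result smaller
     * than an operand means a carry out was produced. */
    r.lo = a.lo + b.lo;
    uint32_t carry = r.lo < a.lo;

    /* Second add: high halves plus the carry in from the low add. */
    r.hi = a.hi + b.hi + carry;
    return r;
}
```

The only value passed from the first add to the second is the single carry bit, which is the dependency the discussion says can be forwarded without delay.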
15:11eric_engestrom: zmike: which one is it?
15:11austriancoder: MESA_SHADER_FRAGMENT vs. PIPE_SHADER_FRAGMENT - which is the preferred one?
15:11zmike: eric_engestrom: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30311
15:11zmike: austriancoder: first one
15:12austriancoder: zmike: thx
15:28eric_engestrom: zmike: I can't see anything wrong, but I'm not confident I understand enough to say this is right
15:28eric_engestrom: it could use a code formatting pass though :]
15:28eric_engestrom: lots of mis-indentations and things like that
15:29eric_engestrom: but I think this is good enough to merge
15:29eric_engestrom: (also, I'd love fewer 1-letter variables ;P)
15:30zmike: smh loop iterators have to be a single letter
15:30zmike: everyone knows that
15:33eric_engestrom: hehe
15:55eric_engestrom: https://gitlab.freedesktop.org/mesa/mesa/-/jobs/61458015
15:55eric_engestrom: "error: 'c' may be used uninitialized" and pointing at `unsigned c = 0;` -> I have never seen a more obvious gcc bug
15:55eric_engestrom: zmike: not sure how to "fix" that one
15:57daniels: well, you could fix the code to not trip the completely correct warning
15:58daniels: if (c) is hit via 'goto out' from places before 'unsigned c = 0' is declared
15:58tintou: c = 0u ?
15:58daniels: so c is technically not yet initialised
15:59eric_engestrom: oh
15:59eric_engestrom: thanks daniels
15:59eric_engestrom: I completely missed that
16:00eric_engestrom: zmike: ^
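[editor's note] The situation daniels describes, a `goto out` that jumps past `unsigned c = 0;` so the initialiser never runs on that path, is usually avoided by declaring the variable before the first jump. The function below is a hypothetical illustration of that fix; it is not the code from the merge request, and count_nonzero is an invented name.

```c
#include <stddef.h>

/*
 * Hypothetical example. The problematic shape would be:
 *
 *     if (!vals)
 *         goto out;        // jumps over the initialiser below
 *     unsigned c = 0;
 *     ...
 * out:
 *     if (c) ...           // c is in scope here but was never initialised
 *
 * Moving the declaration above the first goto means every path that
 * reaches `out` sees an initialised c.
 */
static unsigned count_nonzero(const int *vals, size_t n)
{
    unsigned c = 0; /* initialised before any goto can skip it */

    if (!vals)
        goto out;

    for (size_t i = 0; i < n; i++) {
        if (vals[i])
            c++;
    }

out:
    return c;
}
```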
16:02zmike: eric_engestrom: if you want to squash it in, I've stepped out for a bit and won't be at a pc
16:03eric_engestrom: zmike: sorry no, I'll do the release now, there are other MRs being merged so it would take a while until this one is merged
16:03eric_engestrom: it will be in the next -rc
16:03zmike: eric_engestrom: errr
16:04zmike: but the release will be unusable
16:04zmike: without this
16:04zmike: lmao
16:04eric_engestrom: is it really that bad?
16:04zmike: just cancel marge for everything else and push it through
16:04eric_engestrom: ack
16:06zmike: eric_engestrom: or just do the rc tomorrow
16:06eric_engestrom: yeah, I think I'll do that
16:07zmike: 🤝
16:07zmike: should be smooth sailing after this one
16:07zmike: we've uncovered a lot of ci coverage gaps
16:07zmike: which will never be filled
16:08MrCooper: famous last words
16:10zmike: trust me buddy it's fixed
16:16DavidHeidelberg: does anyone run rusticl on nouveau? I would like to see if piglit/bin/cl-api-enqueue-fill-buffer passes
16:16DavidHeidelberg: gfxstrand: ^ ?
16:20kurufu: zamundaaa[m]: Yes but which clock to fetch, and how to fetch it is the issue. I did some digging and it does seem to generally just be ktime which should be CLOCK_MONOTONIC which seems it should be mostly aligned across cpus? So I guess the question is page flips getting signalled before they flip expected.
16:21zamundaaa[m]: > is page flips getting signalled before they flip expected
16:21zamundaaa[m]: yes
16:24zamundaaa[m]: About which clock to fetch, that's what DRM_CAP_TIMESTAMP_MONOTONIC is for
16:24zamundaaa[m]: On any not super insanely old kernel it should be set though
16:29kurufu: https://github.com/mikesart/gpuvis/issues/30 seems to cover many of the questions i had actually. Thanks I was expecting them to be signalled after.
16:34eric_engestrom: DavidHeidelberg: I don't know about rusticl on nouveau, but rusticl on zink on nvk yes, it passes
16:35eric_engestrom: api@clenqueuefillbuffer,Pass,2.076497
16:35DavidHeidelberg: eric_engestrom: zink is fine, it's handled differently, I'm wondering if the legacy nouveau code passes
16:35eric_engestrom: ack
16:35DavidHeidelberg: it looks good as an inspiration, but I'm afraid it may have the same problem I have