IRC Logs of #dri-devel on irc.freenode.net for 2024-07-24

00:00 Lynne: how is that relevant then?
00:00 jenatali: > I think what I really want is a proper linker for spirv
00:00 Lynne: > I don't see why you couldn't use it as a standalone GLSL -> SPIR-V compiler
00:00 Company: it allows me to write tiny GLSL components, turn them into spirv, and link them together before sending them off to Mesa
00:01 jenatali: Lynne: I was saying you could use Mesa as a standalone compiler. My comments about spirv-tools were just about the linker
00:01 unit327: Do you think this is a mesa problem or a godot engine problem? https://github.com/godotengine/godot/issues/94604
00:02 Lynne: ah, ok
00:02 Lynne: but sure, it's just a lot of work and code
00:06 Company: I actually like precompiled shader code - both because I don't trust random GLSL parsers to actually work and because I get proper syntax checking of my code at compile time
00:06 Company: so spirv is cool with me
00:06 Company: what isn't cool is the limited default featureset compared to what you can do with printf() and GLSL
04:22 kurufu: Not sure if this is the right place, but is it possible to synchronize with the clock used for drmHandleEvent's page_flip_handler2? On my amd system it seems to generally report 500us ahead of CLOCK_MONOTONIC.
09:31 Lynne: so RDNA3 decided to have FOUR 64-bit add instructions for memory, but not have one acting on registers...
09:33 Lynne: just chuck one in next time, okay? there's 56 billion transistors in there, a few more aren't going to break the bank
09:36 glehmann: rdna4 will actually have 64bit add scalar instructions
09:37 glehmann: and for vector, what's wrong with 2 32bit add with carry-in/out? iirc there is even some fast forwarding for the mask so total worst case latency isn't really worse than a "real" 64bit instruction
10:01 Lynne: nothing about the latter, I was unhappy about scalar 64-bit adds missing
10:01 Lynne: since it means that BDAs are even more expensive to index
10:14 glehmann: don't get me wrong, 64bit SALU adds are nice to have, but in practice SALU is rarely a bottleneck
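The "2 32bit add with carry-in/out" approach glehmann describes can be sketched in plain C. This only illustrates the arithmetic, not the actual RDNA instruction sequence; the function name `add64_via_32` is hypothetical and the carry is modelled as an ordinary integer rather than a hardware carry flag or mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: add two 64-bit values using only 32-bit adds.
 * The low halves are added first; the carry out of that add is fed
 * into the add of the high halves. */
uint64_t add64_via_32(uint64_t a, uint64_t b)
{
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

    uint32_t lo = a_lo + b_lo;         /* wraps modulo 2^32 */
    uint32_t carry = lo < a_lo;        /* 1 if the low add overflowed */
    uint32_t hi = a_hi + b_hi + carry; /* carry-in to the high add */

    return ((uint64_t)hi << 32) | lo;
}
```

For any inputs, `add64_via_32(a, b)` should equal the native `a + b`, including wrap-around at 2^64. Whether this is as fast as a single 64-bit instruction depends on how the hardware forwards the carry, which is the point discussed later in the log.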
10:20 enunes: psa there was a power failure this morning in the lima lab for mesa CI for ~1-2 hours, I think it went back to normal and is picking up pending jobs now. Sorry about that and I hope it didn't cause any block during that time
10:31 zamundaaa[m]: kurufu: to translate between the two clocks, you can fetch the current time with both clocks and use the difference. But it should really be in CLOCK_MONOTONIC already... are you sure the 500us don't come from something else?
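A minimal sketch of what zamundaaa[m] suggests, assuming POSIX `clock_gettime`: sample both clocks back to back and use the difference as an offset. The function names are hypothetical, and sampling the first clock twice around the second is an added refinement to reduce skew from the time the calls themselves take; it is not something stated in the log.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Convert a timespec to nanoseconds. */
int64_t timespec_to_ns(struct timespec ts)
{
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Hypothetical helper: estimate (to - from) in nanoseconds.
 * 'from' is read before and after 'to', and the midpoint is used. */
int64_t clock_offset_ns(clockid_t from, clockid_t to)
{
    struct timespec f0, t, f1;

    clock_gettime(from, &f0);
    clock_gettime(to, &t);
    clock_gettime(from, &f1);

    int64_t f0_ns = timespec_to_ns(f0);
    int64_t from_mid = f0_ns + (timespec_to_ns(f1) - f0_ns) / 2;
    return timespec_to_ns(t) - from_mid;
}

/* Translate a timestamp taken on 'from' into the timebase of 'to'. */
int64_t translate_ns(int64_t t_from_ns, int64_t offset_ns)
{
    return t_from_ns + offset_ns;
}
```

For example, `clock_offset_ns(CLOCK_MONOTONIC, CLOCK_REALTIME)` gives an offset that `translate_ns` can apply to a monotonic timestamp. The offset is only an estimate and can drift if one of the clocks is adjusted, so it would need to be refreshed periodically.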
13:04 eric_engestrom: zmike: please assign to marge all your dril/etc. fixes that are ready, I'll do the release in a couple of hours :)
13:05 zmike: I'm working on the last one
13:05 zmike: almost there
13:05 eric_engestrom: I mean, this is true for everyone, but you have more urgent fixes than most :P
13:05 eric_engestrom: ack
13:05 eric_engestrom: ping me once you do?
13:05 zmike: k
13:06 eric_engestrom: thx
14:00 mareko: Lynne: what is BDA?
14:00 Lynne: buffer device address, i.e. real pointers
14:16 mareko: the 64-bit memory ops must be 64-bit because they are atomic, can't be split
14:28 Lynne: yeah, my point was that if they could put 64-bit multiply units which handle memory, they should've also put 64-bit units which handle registers
14:28 Lynne: *adds, not mults
14:30 glehmann: speaking of mults, rdna4 also adds s_mul_u64 btw
14:50 mareko: Lynne: it would just be a macro opcode that is decoded as 2 32-bit adds, there's probably not much point doing that
14:54 Lynne: that's right, that's why I asked for 64-bit add units, which implies true 64-bit adds
14:55 mareko: 64-bit add units are unlikely to exist in any hw, internally they are usually just 32-bit add units executing in 2 passes
14:56 Lynne: really? even the memory 64 bit adds?
14:57 mareko: it's not about what is real 64-bit or not, it's about what the throughput and latency is
14:58 mareko: and pipelining
14:59 Lynne: obviously, you'd prefer real 64-bit adds because then you have shorter dependency chains
15:00 mareko: there is no dependency between low and high 32 bits, the carry out is forwarded as carry in to the next instruction without any delay
15:01 zmike: eric_engestrom: I think I'm done, but there's nobody around to ack the updated version and the testers seem to have disappeared
15:01 zmike: dunno what you wanna do about that
15:03 mareko: that's an architectural optimization for carry out/in, and things like that make it difficult to tell just from the ISA how fast it is
15:11 eric_engestrom: zmike: which one is it?
15:11 austriancoder: MESA_SHADER_FRAGMENT vs. PIPE_SHADER_FRAGMENT - which is the preferred one?
15:11 zmike: eric_engestrom: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30311
15:11 zmike: austriancoder: first one
15:12 austriancoder: zmike: thx
15:28 eric_engestrom: zmike: I can't see anything wrong, but I'm not confident I understand enough to say this is right
15:28 eric_engestrom: it could use a code formatting pass though :]
15:28 eric_engestrom: lots of mis-indentations and things like that
15:29 eric_engestrom: but I think this is good enough to merge
15:29 eric_engestrom: (also, I'd love fewer 1-letter variables ;P)
15:30 zmike: smh loop iterators have to be a single letter
15:30 zmike: everyone knows that
15:33 eric_engestrom: hehe
15:55 eric_engestrom: https://gitlab.freedesktop.org/mesa/mesa/-/jobs/61458015
15:55 eric_engestrom: "error: 'c' may be used uninitialized" and pointing at `unsigned c = 0;` -> I have never seen a more obvious gcc bug
15:55 eric_engestrom: zmike: not sure how to "fix" that one
15:57 daniels: well, you could fix the code to not trip the completely correct warning
15:58 daniels: if (c) is hit via 'goto out' from places before 'unsigned c = 0' is declared
15:58 tintou: c = 0u ?
15:58 daniels: so c is technically not yet initialised
15:59 eric_engestrom: oh
15:59 eric_engestrom: thanks daniels
15:59 eric_engestrom: I completely missed that
16:00 eric_engestrom: zmike: ^
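The situation daniels describes can be illustrated with a small hypothetical function (this is not the code from the merge request). If `unsigned c = 0;` sits after an early `goto out`, the jump skips the initialiser and `if (c)` at the label reads an indeterminate value, which is what gcc warns about. One possible fix, sketched below, is to declare and initialise the variable before the first `goto`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical example. The problematic layout would be:
 *
 *     if (fail_early)
 *         goto out;        // jumps over the initialiser below
 *     unsigned c = 0;
 *     ...
 * out:
 *     if (c) ...           // c may be used uninitialised here
 *
 * Here the declaration is moved above the first 'goto out'. */
unsigned count_or_bail(bool fail_early, unsigned n)
{
    unsigned c = 0; /* initialised on every path that reaches 'out' */

    if (fail_early)
        goto out;

    for (unsigned i = 0; i < n; i++)
        c++;

out:
    if (c)
        return c;
    return 0;
}
```

With the declaration hoisted, the early-exit path sees `c == 0` and the warning should no longer apply.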
16:02 zmike: eric_engestrom: if you want to squash it in, I've stepped out for a bit and won't be at a pc
16:03 eric_engestrom: zmike: sorry no, I'll do the release now, there are other MRs being merged so it would take a while until this one is merged
16:03 eric_engestrom: it will be in the next -rc
16:03 zmike: eric_engestrom: errr
16:04 zmike: but the release will be unusable
16:04 zmike: without this
16:04 zmike: lmao
16:04 eric_engestrom: is it really that bad?
16:04 zmike: just cancel marge for everything else and push it through
16:04 eric_engestrom: ack
16:06 zmike: eric_engestrom: or just do the rc tomorrow
16:06 eric_engestrom: yeah, I think I'll do that
16:07 zmike: 🤝
16:07 zmike: should be smooth sailing after this one
16:07 zmike: we've uncovered a lot of ci coverage gaps
16:07 zmike: which will never be filled
16:08 MrCooper: famous last words
16:10 zmike: trust me buddy it's fixed
16:16 DavidHeidelberg: does anyone run rusticl on nouveau? I would like to see if piglit/bin/cl-api-enqueue-fill-buffer passes
16:16 DavidHeidelberg: gfxstrand: ^ ?
16:20 kurufu: zamundaaa[m]: Yes but which clock to fetch, and how to fetch it is the issue. I did some digging and it does seem to generally just be ktime, which should be CLOCK_MONOTONIC, which it seems should be mostly aligned across cpus? So I guess the question is: is page flips getting signalled before they flip expected?
16:21 zamundaaa[m]: > is page flips getting signalled before they flip expected
16:21 zamundaaa[m]: yes
16:24 zamundaaa[m]: About which clock to fetch, that's what DRM_CAP_TIMESTAMP_MONOTONIC is for
16:24 zamundaaa[m]: On any not super insanely old kernel it should be set though
16:29 kurufu: https://github.com/mikesart/gpuvis/issues/30 seems to cover many of the questions i had actually. Thanks I was expecting them to be signalled after.
16:34 eric_engestrom: DavidHeidelberg: I don't know about rusticl on nouveau, but rusticl on zink on nvk yes, it passes
16:35 eric_engestrom: api@clenqueuefillbuffer,Pass,2.076497
16:35 DavidHeidelberg: eric_engestrom: zink is fine, it's handled differently, I'm wondering if the legacy nouveau code passes
16:35 eric_engestrom: ack
16:35 DavidHeidelberg: it looks good as an inspiration, but I'm afraid it may have the same problem I have