04:59DavidHeidelberg: dj-death: quick ping, just confirm ur happy with https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29207
04:59DavidHeidelberg: it's latest master/main
05:16mareko: karolherbst: visit_store_global ignores the writemask
05:16mareko: it's a Mesa bug
05:28dj-death: DavidHeidelberg: great, thanks a bunch
05:40mareko: karolherbst: some NIR pass should trim stored vectors if the writemask skips trailing components, or visit_store_global should be fixed
05:46dj-death: DavidHeidelberg: you'll probably get a fail on CI due to https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19414/
05:46dj-death: DavidHeidelberg: we merged the old failing tests to the fail list just before your MR was assigned
08:08glehmann: karolherbst: what missing aco bits?
08:10glehmann: as far as I know aco supports everything needed for rusticl, radeonsi just needs to call some more lowering passes
08:26karolherbst: glehmann: some minor things, it's probably a day's worth of work
08:29mareko: there will be shader_info::use_aco that you can set
08:31karolherbst: mareko: I'm kinda confused on why the wrmask matters here as the value to be written is a single component anyway?
08:32karolherbst: ohh wait
08:32karolherbst: not all of them
08:32karolherbst: "@store_global (%48, %79) (wrmask=x, access=none, align_mul=8, align_offset=0)" and %48 is a vec4
08:32karolherbst: okay...
08:34karolherbst: but yeah.. I think it's technically a radeonsi bug for ignoring the write mask, but here it would also make sense to have a single component value... though it's still a bug and I'm kinda surprised it's not causing issues elsewhere, but maybe the glsl input is more sane and it doesn't cause any practical issues
08:35karolherbst: mhh
08:35karolherbst: opengl doesn't use load/store global anyway...
08:35karolherbst: (though I guess the _amd ones it does)
08:35karolherbst: mhhhh
08:39karolherbst: ngl, I kinda don't like that write mask stuff anyway
08:45karolherbst: ahhh
08:45karolherbst: there is nir_lower_wrmasks
08:54karolherbst: yeah, that fixes it
09:03karolherbst: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29214
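The write-mask issue discussed above can be illustrated with a small sketch. It is not the Mesa implementation of `nir_lower_wrmasks`. It only shows the general idea of turning one write-masked vector store into one store per contiguous run of enabled components, so that a backend which ignores the mask still writes only the intended bytes. The function name `split_writemask` and the 8-byte component size are assumptions made for the example.

```python
def split_writemask(wrmask: int, num_components: int, component_bytes: int):
    """Split a write-masked vector store into contiguous runs.

    Hypothetical helper for illustration only. Returns a list of
    (byte_offset, first_component, component_count) tuples, one per
    contiguous run of set bits in wrmask.
    """
    runs = []
    comp = 0
    while comp < num_components:
        # Skip components that the write mask disables.
        if not (wrmask >> comp) & 1:
            comp += 1
            continue
        # Extend the run while consecutive components are enabled.
        start = comp
        while comp < num_components and (wrmask >> comp) & 1:
            comp += 1
        runs.append((start * component_bytes, start, comp - start))
    return runs


# The case from the log: a vec4 value stored with wrmask=x.
# Assuming 8-byte components, only the first component is written.
print(split_writemask(0b0001, 4, 8))  # [(0, 0, 1)]

# A mask with a gap (x, y and w enabled) becomes two separate stores.
print(split_writemask(0b1011, 4, 4))  # [(0, 0, 2), (12, 3, 1)]
```

With a mask that only skips trailing components, the result is a single, shorter store, which matches the "trim stored vectors" suggestion made earlier in the log.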
11:45mareko: karolherbst: ac_nir_to_llvm is shared with radv, so it has the same bug
11:50karolherbst: yeah.. maybe I should call it unconditionally even and in a more global place (or fix it in radv as well). I'm just a bit surprised it hasn't been hit before, but the lowering to global_amd might not run into those things. And I don't know what radv ends up using for bda
11:58glehmann: aco handles global write masks
11:59glehmann: radv+llvm is only used for debugging, so that could explain why nobody noticed the issue
12:05gardinsay: don't say I was not there for you, I was and though I am busy with compiler of modern kind, I am open to questions, the method of course works as long as you subtract 1times selected distance and banktimes-1 values and banktimes-1 distance from two times distance and banktimes-1 distance and banktimes-1 values, as banktimes-1 values cancel out , and banktimes-1 distances are now at 1times
12:05gardinsay: and banktimes distances at 1, times as well, in other words selected elements distance appears once less, I had some inconsistent elaborating before.
12:51gardinsay: so all in all as you add all distances after eliminating the selected element, the focal distance is present where as it was missing before, so it will flip the inverse which is fastest way. it's cubious amount of horsepower the commodity hw has yet to expose with storage being insane at least when you add some compressed file system format.
12:53zmike: daniels:
12:53daniels:
12:53zmike: ^
12:54daniels: dwfreed: ^
12:58karolherbst: glehmann: oh right, I wasn't thinking of that
13:00tzimmermann: jfalempe, i give up on the generic solution for the bmc problem. i'll submit an ast-style fix for mgag200 after the DDC patchset https://patchwork.freedesktop.org/series/133537/ got merged.
13:02tzimmermann: no matter what, the generic solution is too invasive and too complicated to be worth it for a workaround to userspace
13:27karolherbst: mareko, glehmann: is there any kind of common code for nir lowering for the LLVM path? Or should I just call it from radv as well? Maybe move the lowering into `ac_nir`?
13:41karolherbst: anyway... should be easy to call from radv now, just that I have no idea where I should add the call :)
15:08mareko: karolherbst: I think NIR should trim stores according to the writemask in general, but for the new pass, we need to decide for both ACO and LLVM whether we want to lower store writemasks or not, so that there are no differences in NIR that ACO and LLVM consume
15:38karolherbst: yeah... I'm just not looking forward to replicating the same logic everywhere. ACO seems to handle it itself, taking care of alignment and other details. But a bunch of other compilers seem to do something like that as well.
15:55mareko: karolherbst: it's really up to the ACO folks to decide, and ac_nir_to_llvm will just mirror that
18:12DavidHeidelberg: dj-death: I think the CI will fail for backports, since the original patch is not there yet (as it's not in the 4.6.4 release, or gles equivalent)
18:29dj-death: DavidHeidelberg: you mean backporting the angle bump?
18:30DavidHeidelberg: dj-death: nope, the EGL/X11/transp stuff
18:33DavidHeidelberg: hmm, but I mixed it up, Lorenzo sent the Khronos request for backporting
18:33DavidHeidelberg: is he on IRC? (checked who's who and haven't found an IRC nick)
18:35allineffective: I did my part very well already, it's your part that is lacking thought. if you do not understand the relation of index+2*distance+value=elementmaxconst and elementmaxonst-distance=index+value-1*distance than you should not deal with programming and by no means harassing people that do. the hardware paradigm is way easier than sw views, and you see why I tell this and always told so to
18:35allineffective: Sortie and other #osdev residents. the thing is that software views and os has been ready for a while now and ever since 2012 opengl es was ready, vulkan is only a trouble, compilers were stable for ages already, you are asking for trouble with opencl2.1 too a trouble there is no reason to ever ask for, vulkans idea is to assemble a context with multiple threads, that should never in
18:35allineffective: the first place utilized for multithreaded execution, but should live in systems os for faster context switch. so cgroups way of doing longer execution is correct indeed.
18:48zmike: DavidHeidelberg: any progress on that tomb raider trace
18:49DavidHeidelberg: zmike: sorry, didn't manage to look at it yet, I need to finish some prio stuff before :(
18:49DavidHeidelberg: but if you kick into me next week, I'll try
19:11airlied: dwlsalmeida: dang it I forgot the thing you told me in dm and lost backlog, ^
19:12airlied: oops dwfreed ^ sorry dwlsalmeida
19:19allineffective: vulkan does not give you faster execution on multiple CPU threads even if it's matter of life and death, processes are human pipelined. it is misconception it's time derivative as to how they get spawned, unless you call some batched jobs that they get launched together but that's not the intelligent way in my paradigm since it gives not a single benefit.
19:30daniels: dwfreed: ^
20:02allineffective: I will never play any vulkan game, let me elaborate: the bottleneck is not on the pipeline fill, if you go too aggressive there's more discard, even Bitcoin mining would not benefit from a vulkan miner that spawned more CPU threads.
20:03airlied: dwlsalmeida: I think I might have asked you previously, but any ideas on the feasibility of writing a vulkan video driver on top of v4l2 kernel interfaces?
20:08allineffective: cause the job of Bitcoin is blocks get fully occupied and you have to search for blocks that have room to claim a new receipt, you do pointless work if you try to upfront execute the same block the code does not know where blocks start and end. server responds that.
20:10allineffective: and locks or mutexes semaphores would also break operand forwarding
20:21dwfreed: airlied: thanks
21:03kisak: This has got to be some kind of sysadmin curse, looked over at a system updating, but it's not fatal: [882/928] cd /var/tmp/portage/dev-libs/libclc-17.0.6/work/libclc_build && /usr/lib/llvm/15/bin/llvm-spirv --spirv-max-version=1.1 -o spirv64-mesa3d-.spv builtins.link.spirv64-mesa3d-.bc
21:04kisak: (libclc 17 built with llvm 15)
21:56DavidHeidelberg: folks, who broke RPi? :P https://mesa.pages.freedesktop.org/-/mesa/-/jobs/58794048/artifacts/results/summary/results/trace@broadcom-rpi4@behdad-glyphy@glyphy-v2.trace.html
21:59DavidHeidelberg: this is the second pipeline failing on it :'(
23:02airlied: agd5f: linus on fire situation with the 6.10 mr