00:12anholt: mazarax: no, other than glehmann's comment above.
00:59mazarax: There is some weirdness going on. When I use INTEL_DEBUG="ann cs mda" then I get a TAR file with the shader source code at different optimization steps. And I get the shader code on stderr.
01:00mazarax: So, the TAR files produced by the debug/optim builds are identical. But the assembly code on stderr differs. E.g. the shader is 10 instructions longer for the debug run.
01:00mazarax: What does MDA stand for?
03:51karolherbst: mazarax: maybe it's just a different order of compilation?
03:52karolherbst: some drivers/applications also do on-the-fly shader optimizations and replace shaders when a more optimized variant finishes compiling
03:53karolherbst: though maybe you also run into some form of UB that hasn't been caught before
05:48lumag: I tried enabling piglit job on arm32 target in Mesa CI, but then I got an error:
05:48lumag: [04:52:50.000] Loading module '/usr/local/lib/arm-linux-gnueabihf/libweston-14/gl-renderer.so'
05:48lumag: [04:52:50.000] Failed to load module: /install/lib/libEGL.so.1: undefined symbol: wl_display_dispatch_queue_timeout
05:48lumag: [04:52:50.-1097980428] fatal: failed to create compositor backend
05:48lumag: did I miss anything?
05:48lumag: It seems libwayland-client0 is too old, but I'm not sure where it comes from
05:50lumag: https://gitlab.freedesktop.org/lumag/mesa/-/jobs/90345948
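A quick way to test the "library is too old" theory is to ask the dynamic loader whether the symbol resolves at all. The following is a minimal sketch, not part of the Mesa CI tooling: `has_symbol` is a hypothetical helper, and the soname `libwayland-client.so.0` is an assumption about what is installed in the test container.

```python
import ctypes


def has_symbol(library, symbol):
    """Return True if `symbol` resolves in the shared library `library`.

    `library` is a soname or path passed to dlopen; None means the
    symbols already loaded into the current process.
    """
    try:
        lib = ctypes.CDLL(library)
    except OSError:
        # The library itself could not be loaded.
        return False
    # ctypes raises AttributeError for unknown symbols, so hasattr works.
    return hasattr(lib, symbol)


if __name__ == "__main__":
    # Assumed soname; adjust to the library actually present in the image.
    found = has_symbol("libwayland-client.so.0",
                       "wl_display_dispatch_queue_timeout")
    print("symbol present:", found)
```

Running this inside both the build container and the test container would show whether the two images disagree about the library version.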
06:05lumag: mupuf ^^ any ideas?
06:06mupuf: lumag: Will check when I get back to work. In the meantime, you can check what etnaviv does
06:07mupuf: it is possible it doesn't use Weston, just Xorg or straight gbm
06:35mupuf: lumag: I don't think you've missed anything. It is just that the build system for our images is based on wishing for ABI stability and for the build and test containers to be relatively in sync.
06:37mupuf: The issue is that we don't want the test containers to have complete toolchains... so we try creating packages that we install somewhere else. But Debian being so out of date, we have to build many projects from source and that leads to funky behaviours like that
06:40mupuf: Not sure if the solution is to drop debian altogether in favour of a more up-to-date distribution, eat the cost of the toolchain in the test containers (only affects container-based testing), or find a way to build the test images based on the build images without the toolchain.
06:42mupuf: oh, another option: switch our containers to zstd:chunked, then use the podman composefs backend to deduplicate the toolchain and effectively make the presence of the toolchain in the test container a non-issue as it would remain until the next debian release (which is... forever in the future)
08:51MrCooper: mupuf: for "forever" in "~2 years" :)
08:51mupuf: MrCooper: This was a bit tongue in cheek ;)
08:52mupuf: the reality is clear though: it is too slow for our needs since we end up recompiling packages we really shouldn't have to
09:01MrCooper: I get where you're coming from, but I doubt it's as simple as "shouldn't have to", unless the alternative is a bleeding-edge rolling distro, which incurs a different kind of pain instead
09:03MrCooper: e.g. Fedora would be a more up-to-date trade-off, it doesn't really support cross-compiling though
09:04soreau: gotta say, installing needed deps for compositors with arch on rpi is a lot easier on the processor than with raspbian :P
09:08soreau: and it's usually much easier to downgrade a bleeding package via the package manager than build an updated one
09:30karolherbst: maybe we should just use debian sid...
09:31karolherbst: or maybe testing is good enough
09:36karolherbst: anyway, we could also just have our own CI internet deb repo where we push builds into instead of making it part of the mesa pipeline
09:38valentine: https://gitlab.freedesktop.org/gfx-ci/ci-deb-repo
09:39valentine: I don't think there's a better compromise than using the latest Debian releases, testing/sid would be too unstable
09:43valentine: the containers need to be able to be rebuilt reliably without random unrelated changes, and that pretty much rules out rolling distros
09:53MrCooper: FWIW, snapshot.debian.org could help for that concern
09:54MrCooper: still means moving to a newer snapshot could always bring random breakage though
10:38karolherbst: valentine: there seem to be tagged testing/unstable/sid containers tho
10:39karolherbst: they seem to make a new tag every two weeks
10:40karolherbst: though I guess that wouldn't help with package updates..
10:41karolherbst: ohh maybe in combination with the snapshot repos that would be good enough
10:41karolherbst: and we just pin the base container + the repo
16:27lumag: karolherbst, valentine, mupuf: I tried using snapshots.d.o, it was a nightmare because of expiring signatures :-(
16:27karolherbst: great :)
16:27lumag: I ended up using datefudge, but then it breaks https
16:44lumag: So, yes, most likely the best option is stable (or backports) + overlaying packages.
16:48lumag: valentine: we ended up using debdiff instead of completely duplicating the debian/ dir, see https://github.com/qualcomm-linux/qcom-deb-images/tree/main/overlay-debs
17:54cwabbott: is there any reason there's both a common NIR pass for alpha to coverage and an intel pass that has a better calculation for the mask to use?
17:54cwabbott: can we just use the better intel thing in the common code?
19:59anholt: cwabbott: it does seem like we're (turnip side) coming to a consensus that the spec-minimum atoc is not good enough and we want something like intel's in common.
20:00cwabbott: anholt: yes, I have something almost done to emulate it
20:01mareko: karolherbst: FYI https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38802/diffs?commit_id=30153ea1e067cceae90876c00d0328baee0a4b39
20:02cwabbott: seems like only agx passes has_intrinsic = false and uses the common code to calculate the mask
20:07glehmann: mareko: how can an optimization be required?
21:24mareko: glehmann: it's gallium, frontends lower and optimize everything before shader caching and before passing NIR to drivers
21:26mareko: GLSL -> NIR (link, lower, optimize) -> (cache NIR to skip GLSL compilation and linking) -> driver NIR compilation -> (cache binary to skip driver NIR compilation)
21:27mareko: drivers can require passes to be run to prevent running them redundantly
21:28mareko: it's also why many NIR passes get options from nir->options and not from parameters
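The two cache levels mareko describes can be modelled with a toy example. This is only a sketch of the idea, not Mesa's shader cache: `TwoLevelShaderCache`, its method names, and the use of plain strings for GLSL, NIR and binaries are all hypothetical stand-ins.

```python
import hashlib


class TwoLevelShaderCache:
    """Toy model: source -> NIR cache, then NIR -> binary cache."""

    def __init__(self, frontend_compile, driver_compile):
        self.frontend_compile = frontend_compile  # GLSL -> NIR (link, lower, optimize)
        self.driver_compile = driver_compile      # NIR -> driver binary
        self.nir_cache = {}
        self.binary_cache = {}
        self.frontend_runs = 0
        self.driver_runs = 0

    @staticmethod
    def _key(text):
        return hashlib.sha1(text.encode()).hexdigest()

    def get_binary(self, glsl_source):
        # Level 1: a hit skips GLSL compilation and linking.
        src_key = self._key(glsl_source)
        nir = self.nir_cache.get(src_key)
        if nir is None:
            nir = self.frontend_compile(glsl_source)
            self.frontend_runs += 1
            self.nir_cache[src_key] = nir

        # Level 2: a hit skips the driver's NIR compilation.
        nir_key = self._key(nir)
        binary = self.binary_cache.get(nir_key)
        if binary is None:
            binary = self.driver_compile(nir)
            self.driver_runs += 1
            self.binary_cache[nir_key] = binary
        return binary
```

Because the frontend's passes run before the first cache level, their result is stored once and reused, which is the redundancy the driver-required passes are meant to avoid.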
21:39mazarax: So, some more info on my optim/debug discrepancy: The NIR code is identical, but the native code is not.
21:40mazarax: debug: Native code for unnamed compute shader (null) (src_hash 0x0d31090f) (sha1 73bb404b6699c18de43e5b91582ad578a183ca01)
21:40mazarax: SIMD32 shader: 331 instructions. 1 loops. 14738 cycles. 0:0 spills:fills, 33 sends, scheduled with mode top-down. Promoted 2 constants. Non-SSA regs (after NIR): 120. Compacted 6384 to 6160 bytes (4%)
21:40mazarax: START B0 (964 cycles)
21:40mazarax: optim: Native code for unnamed compute shader (null) (src_hash 0x0d31090f) (sha1 6bc8733ef985f20cc6970cdbf22007aa15b20f7b)
21:40mazarax: SIMD32 shader: 321 instructions. 1 loops. 14030 cycles. 0:0 spills:fills, 33 sends, scheduled with mode top-down. Promoted 2 constants. Non-SSA regs (after NIR): 120. Compacted 6224 to 6000 bytes (4%)
21:40mazarax: START B0 (970 cycles)
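The two statistics lines above can be compared mechanically. The sketch below parses the fields with regular expressions written against the text as pasted here; the pattern table and the helper names `parse_stats` and `diff_stats` are hypothetical, and the line format may differ between Mesa versions.

```python
import re

# Patterns match the stats line as it appears in the log above.
STAT_PATTERNS = {
    "instructions": r"(\d+) instructions",
    "loops": r"(\d+) loops",
    "cycles": r"(\d+) cycles",
    "sends": r"(\d+) sends",
    "compacted_bytes": r"Compacted \d+ to (\d+) bytes",
}


def parse_stats(line):
    """Extract the integer fields found in one shader stats line."""
    stats = {}
    for name, pattern in STAT_PATTERNS.items():
        match = re.search(pattern, line)
        if match:
            stats[name] = int(match.group(1))
    return stats


def diff_stats(a, b):
    """Return {field: (a_value, b_value)} for fields that differ."""
    return {k: (a[k], b[k]) for k in a if k in b and a[k] != b[k]}


DEBUG_LINE = ("SIMD32 shader: 331 instructions. 1 loops. 14738 cycles. "
              "0:0 spills:fills, 33 sends, scheduled with mode top-down. "
              "Promoted 2 constants. Non-SSA regs (after NIR): 120. "
              "Compacted 6384 to 6160 bytes (4%)")
OPTIM_LINE = ("SIMD32 shader: 321 instructions. 1 loops. 14030 cycles. "
              "0:0 spills:fills, 33 sends, scheduled with mode top-down. "
              "Promoted 2 constants. Non-SSA regs (after NIR): 120. "
              "Compacted 6224 to 6000 bytes (4%)")

if __name__ == "__main__":
    print(diff_stats(parse_stats(DEBUG_LINE), parse_stats(OPTIM_LINE)))
```

For these two lines only the instruction count, cycle estimate and compacted size differ; loops and sends are the same in both builds.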
21:42mazarax: What subsystem in Mesa converts NIR to XE assembly?
22:11glehmann: brw (src/intel/compiler/brw)
22:15mazarax: Thanks. What is the difference between brw and elk? I suspect my path hits elk instead, but I will verify that.
22:16glehmann: elk is for ancient gpus
22:16glehmann: brw for skylake (gen9) and newer
22:16glehmann: confusingly, brw doesn't actually support broadwell anymore
22:16mazarax: huh.. interesting. So my log contains the message: " Native code for unnamed compute shader"
22:18mazarax: "Native code for unnamed compute shader" is a message from elk, not brw.
22:19mazarax: Are you sure it is not the other way around? elk for new gpus, brw for the old ones?
22:19mazarax: My gpu is a B580.
22:19glehmann: https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/intel/compiler/brw/brw_generator.cpp#L1531
22:20mazarax: I don't think that is the message.
22:20mazarax: The "unnamed" is only fed into that by elk:
22:22mazarax: https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/intel/compiler/elk/elk_vec4_generator.cpp#L2254
22:22glehmann: https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/intel/compiler/brw/brw_compile_cs.cpp#L287
22:23mazarax: Ah, yes.
22:23mazarax: Sorry about that.
22:24mazarax: So brw it is. I will try to take out those NDEBUG tests one by one, to see which one triggers different code. Thanks.
22:24dj-death: you might want to just run with NIR_DEBUG=print_cs
22:25dj-death: and check if there is a difference in NIR
22:25dj-death: then with INTEL_DEBUG=mda
22:25dj-death: that will generate a tarball of the shader
22:25mazarax: It prints the NIR, and I saw identical code.
22:25mazarax: I also ran mda tars, and those are identical.
22:25dj-death: with each step in the backend compiler
22:25mazarax: What does MDA stand for?
22:25dj-death: I don't know
22:26dj-death: the sha1 being different seems to indicate a different NIR
22:26dj-death: or whatever generated the NIR
22:27dj-death: then if the tarballs are identical, but the shaders aren't, you likely screwed up something in capturing the tarballs
22:28dj-death: since the tars include the final assembly
22:33mazarax: I am pretty sure I generated the tarballs correctly; I tried multiple times, and they are identical. The NIR is identical too, but the output from INTEL_DEBUG="ann cs mda" is not. I will try to find the assembly in question in the tar, and find out why the one on stderr is different.
22:34mazarax: I also made sure the mesa cache was disabled with MESA_SHADER_CACHE_DISABLED=1
22:41mazarax: I have 3 kernels in my shader module. I wonder if the tarballs are identical because they contain only one of them.
22:42dj-death: ah yeah my bad the sha1 is generated from the ISA, not taken from NIR
22:42dj-death: but still the tarballs should differ, since they contain the printed stuff
22:48mazarax: I think the tar balls are incomplete. Only 1 of 3 kernels shows up as CS/ASM32/0
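To check which kernels actually ended up in each archive, the two tarballs can be compared member by member instead of as whole files. This is a minimal sketch using only the Python standard library; `member_digests` and `compare_tars` are hypothetical helpers, and the file names in the usage note are placeholders rather than the names Mesa writes.

```python
import hashlib
import tarfile


def member_digests(tar_path):
    """Map each regular file in the archive to the SHA-1 of its contents."""
    digests = {}
    with tarfile.open(tar_path) as tar:
        for member in tar.getmembers():
            if member.isfile():
                data = tar.extractfile(member).read()
                digests[member.name] = hashlib.sha1(data).hexdigest()
    return digests


def compare_tars(path_a, path_b):
    """Return (only_in_a, only_in_b, changed) as sorted lists of member names."""
    a = member_digests(path_a)
    b = member_digests(path_b)
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    changed = sorted(name for name in set(a) & set(b) if a[name] != b[name])
    return only_a, only_b, changed
```

Calling `compare_tars("debug.tar", "optim.tar")` (placeholder paths) and printing the keys of `member_digests` for each archive would show whether all three kernels are present or only one.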