00:01cwabbott: seems gitlab is still very unhealthy atm
00:01cwabbott: "gql.transport.exceptions.TransportServerError: 500 Server Error: for url: https://gitlab.freedesktop.org/api/graphql"
00:13robclark: cwabbott: probably fallout from the upgrade/migration? #freedesktop is a better channel for gitlab admin stuff
00:13cwabbott: ah yeah, wrong channel whoops
00:40ngcortes: anholt, thanks again for your help with deqp-runner. had another (hopefully) quick question; do you know if deqp-vk has some way to include errors that occur during test set up?
00:40ngcortes: deqp-vk: ../src/intel/isl/isl_surface_state.c:267: isl_gfx12_surf_fill_state_s: Assertion `isl_format_supports_sampling(dev->info, info->view->format)' failed.
00:40ngcortes: eg. ^
00:41ngcortes: seems like that error occurred, the test crashed, and it didn't end up getting written to the qpa file for that test
00:41anholt: that's output on stderr, so it won't be in the qpa. your deqp-runner log should have pointed you to where to look if you read it...
00:44ngcortes: right, so I see "Writing test log into /tmp/build_root/m64/opt/deqp/modules/vulkan/c94.r4.qpa"
00:45ngcortes: but that file doesn't seem to exist on the tester's fs, I guess that gets split out into logs for individual tests maybe?
00:46anholt: what did deqp-runner output tell you?
00:46anholt: start from there. it tries to hold your hand.
00:51ngcortes: anholt, https://paste.centos.org/view/9ae8cf8b here's a snippet from the log output from the job run
00:51anholt: if you read the log, up at where the failure was caught, it tells you the files to read.
00:53ngcortes: got it, thanks!
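[editor's note] anholt's point above is that an assertion message is written to stderr, which is a separate stream from whatever the test binary records in its own log. A minimal Python sketch of that separation follows. The child program here is a hypothetical stand-in that prints one line to each stream and exits non-zero; it is not deqp-vk, and this is not how deqp-runner itself is implemented.

```python
import subprocess
import sys

# Hypothetical stand-in for a crashing test binary (NOT deqp-vk):
# one line on stdout, one assertion-style line on stderr, non-zero exit.
CHILD = (
    "import sys\n"
    "print('test log line')\n"
    "sys.stderr.write('Assertion failed\\n')\n"
    "sys.exit(1)\n"
)

def run_and_split(cmd):
    """Run cmd and return (exit code, stdout text, stderr text) separately."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr

code, out, err = run_and_split([sys.executable, "-c", CHILD])

# The assertion text only shows up in the stderr capture.
print("exit code:", code)
print("stdout:", out.strip())
print("stderr:", err.strip())
```

A runner that wants the crash reason therefore has to keep the stderr capture next to the per-test log, which is why the runner output is the place to start looking.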
01:53mareko: jannau: yeah, it's expecting desktop GPU performance (~500 GB/s) to finish tests relatively quickly
02:06mareko: an option could be added to reduce the number of iterations and sizes
02:58FireBurn: robclark: I got things loading on my a840, looks like the issue was down to the library name; using plain old libvulkan_freedreno.so sorted things
02:59FireBurn: I am getting weird checkerboard style glitches but at least it loads now
03:06FireBurn: And the same goes for the a830 too
03:11robclark: I'd expect a830 to have more problems at least in gmem rendering because of missing params.. but if a840 is having issues I'd guess there is still something missing in the turnip kgsl layer.. I've only been testing against upstream kernel
03:17robclark: actually, tbh checkerboard glitches sound a bit like a gralloc mismatch type problem
03:20FireBurn: https://pasteboard.co/X6OT7yWSMyx7.jpg
03:22FireBurn: Same binary works great on my a740 and they're all Android 16
03:23FireBurn: OnePlus 11, 13 & 15
03:24robclark: android version != vendor version
03:27robclark: there are still some things I'm working on fixing on tu/gen8 mr.. but it is down to more esoteric vk features.. I generally wouldn't expect the driver to be fundamentally broken.. as far as what changed in qcom vendor stuff between bsp drops for various generations of devices... that is anyone's guess.. I have hw docs but not access to closed driver stuff
03:28robclark: this sounds (and looks) more like an android integration issue than a driver issue
05:11MoeIcenowy: well an interesting question on-list here to raise attention: https://lore.kernel.org/dri-devel/2183e580.8b98.19b5531263f.Coremail.andyshrk@163.com/
05:11MoeIcenowy: I created a DW-HDMI glue for T-Head TH1520 (which uses Verisilicon display controller IP), currently I just put the glue in drivers/gpu/drm/bridge
05:12MoeIcenowy: however Andy Yan says I should put it under synopsys/ (which I don't think is appropriate) or create a thead/ (T-Head already sold their AP SoC business)
05:13MoeIcenowy: I can't agree with either idea (or should I put it in drivers/gpu/drm/verisilicon/ ? Although the problem is that T-Head is also just a customer of VS DC)
05:15MoeIcenowy: if I create a thead/, the possible things there are the dw-hdmi glue, the dw-mipi-dsi glue and the mux for dsi/dpi (if it's made a dedicated driver)
06:15jannau: mareko: an M1 Ultra is in that range and I would even expect at least a factor of 20 better results from a 5 year old iphone. vk.bufbw finishes quickly after removing DEVICE_LOCAL_BIT from the disallowed flags in vk_find_heap
07:42tzimmermann: lumag, ping
08:25eric_engestrom: dcbaker: yes I did, and I see you did the same :)
09:45pq: emersion, that's probably true. As a developer, having a good test framework gives me confidence in modifying the code, and I can concentrate on the fun side rather than try to figure out if I'm breaking something. Using the community as a QA team is unavoidable, but I'd still like to not annoy users with bugs if I can catch them.
13:08alyssa: jannau: maybe there's something wrong in hk?
13:14jannau: alyssa: in gallium/asahi you mean. hk seems fine (ignoring device local memory)
13:40alyssa: or that
13:57mareko: jannau: why is DEVICE_LOCAL_BIT so slow?
13:59mareko: or rather host memory
14:00jannau: mareko: gl.bufbw is slow. that is unrelated to DEVICE_LOCAL_BIT. most vulkan tests fail to start because hk has no memory type without DEVICE_LOCAL_BIT
14:00mareko: I see
14:01mareko: it could be fixed-func HW being used for fills and copies
14:03mareko: I know that on early GCN, the fixed-func fill HW is so slow that the kernel thinks it's a timeout and triggers a GPU reset
14:06mareko: I think it runs at 4 bytes/clock, which is crazy slow, but that tells you that maybe the driver should use a compute shader instead
14:07eric_engestrom: mesa devs: reminder to write everything you would like to see mentioned in the release notes for 26.0 here: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14495
14:17karolherbst: ... LLVM added an OffloadArch::UNUSED enum variant and yeah...
14:40alyssa: mareko: there's no ff copy/blit on apple
14:40alyssa: it's all compute/3d
14:42alyssa: jannau: gl.bufbw seems cpu bound
14:44alyssa: yeah the whole benchmark is running on the CPU in util_resource_copy_region via bufferobj_copy_subdata
14:45alyssa: because the gl driver doesn't implement GPU based copies even though VK does
14:45alyssa: um.. awkward.
14:50alyssa: ditto for clears
15:50jannau: alyssa: I'll take a look
16:04karolherbst: mhh with llvm-22 the clc compiler doesn't find atomic_fetch_add ...which is weird...
16:10alyssa: jannau: cool, thanks
18:42alyssa: austriancoder: why does etnaviv call both lower_bool_to_int32 and lower_bool_to_bitsize?
18:42alyssa: Please drop the lower_bool_to_bitsize call, you don't need both
18:44alyssa: e63a7882a0a ("etnaviv: call nir_lower_bool_to_bitsize") looks like mixing up the passes
18:44alyssa: you want bool_to_int32 a second time (I guess)
18:44alyssa: bool_to_bitsize is something more complicated that only really Bifrost has a good reason to call
18:48glehmann: ugh, looks like I need to make sure every backend calls nir_opt_algebraic_late in a loop for fcanonicalize
18:49glehmann: or I have to remove the late fneg(fneg(a)) pattern, I guess
18:55alyssa: glehmann: how does your MR make this worse..?
18:55alyssa: you already had to loop for fneg(fneg(a)) squash
18:55alyssa: incidentally it would be cool and good if we just didn't get fneg(fneg(a)) chains late..
18:57glehmann: we end up with fcanonicalize in backend instruction selection for lima and nv50
18:58alyssa: oh, I see.
18:58alyssa: and probably also agx
18:59glehmann: maybe, but it's probably missing CI
18:59alyssa: well yes
18:59alyssa: we could probably do a shaderdb smoke test in CI for it, anyway
18:59alyssa: the late fneg(fneg(a)) pattern smells really off tbh.
19:00glehmann: I can try removing it
19:00alyssa: I guess it's cleaning up after the comparison-with-zero -> fneg patterns above
19:00glehmann: yeah that's my fear
19:00alyssa: yeah I guess this is a very complicated way of saying "turn a-b<0 into a<b"
19:01alyssa: incidentally im not totally sure why those lowerings are late either
19:01alyssa: is a-b<0 canonical form?
19:02glehmann: wouldn't think so
19:02alyssa: this whole thing deserves a :clown:
19:03lumag: tzimmermann: pong, I will take a look ASAP
19:03alyssa: the fundamental problem with the fneg(fneg) pattern is that algebraic is the wrong tool to do this with
19:03alyssa: the algebraic loop is O(n * chain length), backend copyprop is O(N) for any chain length
19:03alyssa: no looping
19:03alyssa: which is why AGX specifically *doesn't* loop opt_alg_late
19:04alyssa: and I think that was the right call
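[editor's note] alyssa's argument above is that a copy-propagation style pass can resolve an fneg chain of any length in one forward walk, with no looping. A minimal Python sketch of that idea follows, on a toy SSA list of `(dest, op, src)` tuples. The IR shape and the names `make_chain` and `fold_fneg_chains` are assumptions made for illustration; this is not NIR or any Mesa backend's code.

```python
# Toy SSA IR (an assumption for illustration, not NIR): a program is a list
# of (dest, op, src) tuples in definition order.

def make_chain(length):
    """Build v1 = fneg(a), v2 = fneg(v1), ..., v<length> = fneg(v<length-1>)."""
    prog, prev = [], "a"
    for i in range(1, length + 1):
        prog.append((f"v{i}", "fneg", prev))
        prev = f"v{i}"
    return prog

def fold_fneg_chains(prog):
    """Resolve fneg/mov chains in a single forward pass.

    For every value defined by fneg or mov we remember (root, negated), so
    each instruction is visited exactly once regardless of chain length.
    """
    resolved = {}  # value name -> (root value, is it negated relative to root)
    out = []
    for dest, op, src in prog:
        if op not in ("fneg", "mov"):
            # Any other op starts a new root; keep it unchanged.
            out.append((dest, op, src))
            continue
        root, negated = resolved.get(src, (src, False))
        if op == "fneg":
            negated = not negated
        resolved[dest] = (root, negated)
        out.append((dest, "fneg" if negated else "mov", root))
    return out

for inst in fold_fneg_chains(make_chain(5)):
    print(inst)
```

After the pass every value in the chain reads directly from the root, either as a plain copy or as a single negation, so a later dead-code pass can drop the unused intermediates.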
19:06glehmann: special shout out to midgard, which calls nir_opt_algebraic_distribute_src_mods, but exactly once
19:07glehmann: that has the same issue, I think
19:07glehmann: ugh
19:11glehmann: alyssa: removing the late fneg(fneg()) affects 4 shaders... but positively?
19:11glehmann: wtf
19:12alyssa: i love how many broken backends are my fault
19:14glehmann: ah no, it's actually negative, but somehow RA creates less movs
19:16glehmann: so it's either accepting small regressions, fixing up all the backends
19:16glehmann: or giving up
19:16alyssa: I vote accepting small regressions
19:16alyssa: ..
19:17alyssa: we shouldn't need to be looping here, and if someone wants to fix this "properly" it'd be by moving rules or adding rules or something, not by looping
19:17glehmann: that still leaves nir_opt_algebraic_distribute_src_mods in midgard
19:18glehmann: I don't think removing the pattern there is a good idea
19:18glehmann: but it also seems like looping will have vastly different results for midgard?
19:20glehmann: I have an idea
19:20alyssa: why does midgard call that differently from everyone else
19:21alyssa: glehmann: keep in mind I wrote that code when I was 16 and was just cargo culting other 2018-era NIR backends
19:21glehmann: we can just change both the late and the distribute_src_mods to use is_only_used_as_float
19:21glehmann: that should cover all of the use cases
19:21glehmann: and then we don't have to create new fcanonicalize
19:48glehmann: and now I get to rebase this on the opt_algebraic test MR :))))
19:53austriancoder: alyssa: it's a mess .. will be soon better
20:14glehmann: sigh, why are trace jobs not running when I use ./bin/ci/ci_run_n_monitor.sh --target ".*" --include-stage ".*"
20:14glehmann: do I really have to use marge for that when I know some are going to fail?
20:16glehmann: trying to do core NIR floating point work is already hard enough
21:13jannau: gl buffer bandwidth is now 350 GB/s fill and 625 GB/s copy on an M1 Ultra
21:43alyssa: <oé
21:44alyssa: wait wrong keyboard layout
21:44alyssa: \o/
21:47airlied: glehmann: does --force-manual help? I've no idea what it does
21:48glehmann: rebasing on main seemed to have helped
21:48glehmann: I think the issue was that the jobs weren't even in the pipeline, not that the script couldn't start them
22:08karolherbst: alyssa: any idea about errors like "../src/asahi/lib/agx_bg_eot.c:250:48: error: ‘LIBAGX_HELPER’ undeclared (first use in this function)" or "../src/asahi/lib/agx_helpers.h:300:8: error: ‘struct libagx_decompress_args’ has no member named ‘images’" (where libagx_decompress_args doesn't seem to exist at all)
22:09karolherbst: I _think_ that's somehow related to me trying to get llvm-22 working, but it's not apparent to me where things are breaking
22:15jannau: that sounds clc related
22:19jannau: karolherbst: arguments for the libagx_decompress kernel in src/asahi/libagx/compression.cl
22:22karolherbst: jannau: ahh and the tooling is supposed to generate the struct with the args, right?
22:22jannau: yes
22:22karolherbst: which file is it supposed to be generated into?
22:23karolherbst: "src/asahi/libagx/libagx.h" or somewhere else?
22:24karolherbst: anyway yeah.. no struct definitions, only entry points
22:26karolherbst: ohhh.. I wonder if something is up with the metadata...
22:28karolherbst: ehh yeah.. it looks different...
22:29jannau: karolherbst: in ${BUILD}/src/asahi/lib/libagx_shaders.h
22:29karolherbst: yeah... that file looks pretty empty
22:33karolherbst: okay.. it's a change in the translator...
22:39karolherbst: https://github.com/KhronosGroup/SPIRV-LLVM-Translator/commit/5458eb2511d618426a752bb7e986c5d6940a9bf8 yeah... that's gonna need some work