04:24orowith2os: karolherbst: was poking around the Mesa source tree, rusticl specifically, and noticed that rusticl is currently on Rust 2018. Have you not gotten around to updating to 2021 yet, or do you want to stay on 2018 for backwards compat, or?
04:27orowith2os: I'd love to give it a small attempt myself to see if I can manage it, test my skills, but want to check with you first.
05:07orowith2os: never mind, git logs were goofy 😛
05:07orowith2os: it seems like it uses Rust 2021 already? At least, it uses the stdlib of 2021.
07:35MrCooper: pq: "If FB_ID is non-0, solid_fill blob is ignored" is backwards compatible, isn't it?
08:22lordheavy: Any way to debug "Couldn't create Clang invocation." with rusticl ? already tried RUSTICL_DEBUG=clc
08:22pq: MrCooper, I suppose so, if FB_ID=0 is not the (only) way to disable the plane.
08:23MrCooper: hmm, good point, that may be the case
08:23pq: thinking about someone leaving a non-0 solid_fill blob behind
08:24tarceri: Anyone got a setup with nvidia binary drivers installed and can test a piglit shader_test for me?
08:25MrCooper: pq: maybe CRTC_ID=0 disables the plane as well
08:30pq: MrCooper, I guess, but what does userspace rely on?
08:30pq: maybe DRM core required both CRTC_ID and FB_ID to be 0 together right now? That would work.
08:32pq: OTOH, I don't even care about this kind of "new userspace left-overs" compatibility with old userspace, because it seems no-one else does either. And it gets exponentially more difficult when new things get added if done this ad hoc way.
08:37MrCooper: looks like drm_atomic_plane_check does enforce that both are (non-)0
08:42pq: it would need extending to solid_color blob, too
08:42pq: to be future-proof
08:54MrCooper: that would break currently existing user space though
09:03pq: MrCooper, yeah, OTOH when someone then adds a third way of putting content to planes, we would not be able to ad hoc make that backward-compatible, because the "all are guaranteed 0 for disable plane" card was already used and discarded.
09:03MrCooper: right
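The compatibility trade-off discussed above can be modelled in a few lines. This is only a toy sketch, not kernel code: `check_today` mirrors the "both zero or both non-zero" rule MrCooper reads out of drm_atomic_plane_check, while `check_extended` and the `solid_fill_blob` argument are hypothetical names for the stricter rule pq floats.

```python
def check_today(crtc_id: int, fb_id: int) -> bool:
    """Existing rule as described in the log: CRTC_ID and FB_ID are
    either both zero (plane disabled) or both non-zero (plane enabled)."""
    return (crtc_id != 0) == (fb_id != 0)


def check_extended(crtc_id: int, fb_id: int, solid_fill_blob: int) -> bool:
    """Hypothetical stricter rule: a disabled plane must have every
    content source cleared, an enabled plane needs at least one."""
    has_content = fb_id != 0 or solid_fill_blob != 0
    return (crtc_id != 0) == has_content


# Old userspace disables a plane the only way it knows (CRTC_ID=0, FB_ID=0)
# after a newer client left a solid-fill blob (id 42, made up) behind.
print(check_today(0, 0))         # accepted by the existing rule
print(check_extended(0, 0, 42))  # rejected by the stricter rule
```

The second call is the breakage MrCooper points out: old userspace has no way to clear a property it does not know about.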
09:20karolherbst: orowith2os: should be 2021
09:21karolherbst: atm we set the rustc req to 1.60, but I think there might be reason enough to bump that soon, now that the kernel is at 1.68.2 and firefox ESR at 1.65
10:57funtoomen: Hi, I have recently read Phoronix article about you switching to BLAKE3 instead of SHA1. If BLAKE3 is a cryptographic hash function wouldn't it be faster to use a non cryptographic hash function or even a checksum function? Do you need the benefits of cryptographic hash functions over other hash/checksum functions for the purpose of uniquely identifying Vulkan shaders?
10:58pendingchaos: yes, it needs to be cryptographic
10:58funtoomen: Why so?
10:59pendingchaos: because collisions aren't handled
11:01HdkR: Some internal hashing uses xxhash for things that don't need cryptographic
11:11funtoomen: pendingchaos: what do you mean? wouldn't collisions happen almost never?
11:12pendingchaos: with a cryptographic hash, yes
11:12pendingchaos: not with a non-cryptographic hash
11:13pq: But how much is not too much? There is a difference when you have an adversary that is intentionally attempting to cause a collision, and hitting one just by sheer bad luck.
11:14dottedmag: funtoomen: Any performance suggestions should come with benchmark information, thnks
11:16pendingchaos: pq: I'm not sure I understand the question
11:16pendingchaos: a single collision is bad, because then the shader cache will use the wrong shader binary
11:16funtoomen: dottedmag: i mean, im just asking. i dont know nothing about graphics driver development, but i know enough cryptology to know that cryptographic hash functions come but some drawbacks
11:17funtoomen: s/come but/come with/
11:17pq: All hash functions with hash shorter than input have collisions. With cryptographic hash functions it is just much harder to intentionally cause collisions, but they can still happen accidentally.
11:19pq: when does that theoretical concern become a practical concern, I have no clue
11:20pendingchaos: probably never
11:20pq: why?
11:22pq: Is the goal of making intentionally finding collisions as hard as possible equivalent to the goal of reducing the possibility that two shader texts collide by accident?
11:22pq: funtoomen, any idea?
11:23pendingchaos: because cryptographic hash functions are very good and have a relatively large output size
11:23pendingchaos: I would expect the former goal to help in the latter
11:25funtoomen: pq: in my opinion even checksum function should make the collision *very* unlikely, and would come with quite some performance. but as i said i know nothing about graphic driver development, im just cryptology enthusiast.
11:26pq: funtoomen, really nice to hear from that side :-)
11:28psykose: shader cache is trusted input so it does have the implication of someone wanting to intentionally collide it under some scenario
11:33pq: psykose, but if an adversary is able to run shaders, would they also not have write access to the cache files? Maybe not on WebGL perhaps?
11:33psykose: other way around, can't run shaders but can write to cache
11:34pq: how?
11:34pq: you mean the cache is poisoned whatever ways, and then a legit app falls prey?
11:34psykose: perhaps
11:34psykose: i mean it's obviously a very niche scenario
11:35psykose: hm, though it is possible how you say it too
11:35pq: in that case, why wouldn't the attacker just look at the original cache what hashes have been used, and simply replace their contents?
11:35pq: guaranteed hit the legit app starts the next time
11:36pq: regardless of hash functions
11:36psykose: that's true :)
11:38funtoomen: so you use cryptographic functions just because the make almost impossible thing (collision) even more impossible?
11:38funtoomen: s/because the/because they/
11:39HdkR: Targeted collisions are very much a concern
11:40pq: I suppose saving the original shader text in the cache for detecting collisions would be prohitive from both performance and legal perspective? and storage?
11:40pq: *prohibitive
11:41funtoomen: HdkR: but if someone could try and target a collision wouldnt be the machine already compromised?
11:42pq: not with WebGL, I suppose - unless you consider WebGL itself to compromise the machine in the first place
11:43HdkR: Remote execution doesn't necessarily mean the full system is compromised
11:43HdkR: Shutting down attack vectors is good :)
11:45karolherbst: yeah soo.. if it comes to hashes, the _correct_ way of using them is verify the key matches the data set, though that would blow up our disk cache and makes it more expensive, but... yeah, atm we have this current potential problem of a key loading a different cache item
11:45funtoomen: HdkR: Yeah, i guess you are right. Im curious what are the performance drawbacks, would be cool if someone more competent than me did a benchmark.
11:45dottedmag: Also Amdahl's law. How much time does hashing take after switching to blake3? That's why I asked for benchmarking: if hashing cost is now in the noise, then replacing the hash with the one that executes in no time won't improve anything.
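dottedmag's Amdahl's law argument can be put into numbers. The sketch below is only illustrative: the 2% hashing share and the 10x factor are made-up assumptions, not measurements from Mesa.

```python
def amdahl_speedup(fraction: float, factor: float) -> float:
    """Overall speedup when `fraction` of the runtime becomes `factor` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


# Hypothetical numbers: hashing is 2% of a run and a replacement hash is 10x faster.
print(f"{amdahl_speedup(0.02, 10.0):.3f}x overall")
```

With those assumed numbers the whole run gets under 2% faster, which is why a benchmark of the hashing share matters before swapping the hash function again.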
11:45karolherbst: but a cryptographic "safe" hash kinda mitigates that problem enough so people rely on it being sane
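What karolherbst calls the _correct_ way (checking that the key really matches the stored data set) could look roughly like this. It is a minimal in-memory sketch, not Mesa's disk cache: `VerifiedCache` and its methods are hypothetical names, and SHA-256 from the standard library only stands in for whatever hash is actually used.

```python
import hashlib
from typing import Callable, Dict, Optional, Tuple


class VerifiedCache:
    """Toy cache that keeps the full key material next to each value."""

    def __init__(self, hash_fn: Callable[[bytes], bytes] = lambda b: hashlib.sha256(b).digest()):
        self._hash_fn = hash_fn
        self._entries: Dict[bytes, Tuple[bytes, bytes]] = {}

    def put(self, key_material: bytes, value: bytes) -> None:
        self._entries[self._hash_fn(key_material)] = (key_material, value)

    def get(self, key_material: bytes) -> Optional[bytes]:
        entry = self._entries.get(self._hash_fn(key_material))
        if entry is None:
            return None
        stored_material, value = entry
        # A digest match is not enough: compare the full key material, so a
        # colliding key is treated as a miss instead of returning the wrong value.
        if stored_material != key_material:
            return None
        return value


# A deliberately terrible hash (input length only) to force a collision.
weak = VerifiedCache(hash_fn=lambda b: bytes([len(b) % 256]))
weak.put(b"shader A", b"binary A")
print(weak.get(b"shader A"))  # hit
print(weak.get(b"shader B"))  # same digest, different key material: miss
```

The cost is the one named in the log: every entry also stores the full key material, so the cache grows and each hit needs an extra comparison.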
11:46funtoomen: ok, i think i get it now.
11:47HdkR: Blake3 is actually quite quick for being a cryptographic hash which helps :)
11:47karolherbst: but anyway.. sha1 is already broken, soo...
11:48funtoomen: yeah, thats why they switched i guess
11:48karolherbst: I am wondering if any use actually hit a cache collision already...
11:48karolherbst: *user
11:48funtoomen: same
11:48karolherbst: *hash
11:49pq: they probably tried a different mesa version next - does that invalidate the whole cache?
11:49funtoomen: would love i someone calculated the probability based on the size of hash cache
11:49funtoomen: s/love i/love if/
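The probability funtoomen asks about can be estimated with the standard birthday-bound approximation, p ≈ 1 - exp(-n(n-1) / 2^(b+1)) for n entries and a b-bit hash. The sketch assumes uniformly distributed hash outputs and only covers accidental collisions, not an adversary. The entry count and bit widths are illustrative, not what Mesa uses.

```python
import math


def collision_probability(num_entries: int, hash_bits: int) -> float:
    """Birthday-bound estimate of at least one accidental collision
    among `num_entries` uniformly random `hash_bits`-bit hashes."""
    exponent = -num_entries * (num_entries - 1) / 2.0 ** (hash_bits + 1)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


# One million cached entries at several illustrative hash widths.
for bits in (32, 64, 160, 256):
    print(bits, collision_probability(1_000_000, bits))
```

Under these assumptions a 32-bit checksum collides almost certainly at a million entries, while at 160 bits and above the estimate is far below anything observable in practice.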
11:49karolherbst: pq: yes
11:49pq: so that would blow the problem away
11:50karolherbst: well... we use the build-id of the so files
11:50funtoomen: thank you all, i have to go
11:50psykose: dottedmag: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22387 is like 50% already just over sha1 for full cache hits
11:50psykose: not sure what that 2.4s ratio is of cache:everything-else
11:51psykose: but it's really just low territory at that point i guess
11:51karolherbst: I wonder if it's better to just depend on a lib implementing blake3 instead of shipping one ourselves....
11:51psykose: well, it is a lib
11:51psykose: it's blake3 reference copied into the tree
11:51psykose: :D
11:51karolherbst: yeah...
11:52psykose: just like sha1 was and xxhash also is i think
11:52karolherbst: could also just depend on libblake instead or something :P
11:52psykose: aye
11:53pq: good luck adding any dependency to mesa, or even bumping existing ones
11:53karolherbst: ohh seems like blake3 is so fast, because it actually bothered to keep SIMD in mind
11:54dottedmag: I remember maintaining libsha1 (a copy-paste from somewhere else, SHA1 only) in an embedded distro just to build kdrive (or was it fontconfig?). It wasn't that fun.
11:54karolherbst: so the speed mainly comes from the fact you can run stuff in parallel
11:55karolherbst: fun.. seems like the rust impl of blake3 is the "main" one
11:56psykose: eh, i'd say the C one is also maintained and meant to be used
11:56psykose: not like some thing someone threw in
11:56karolherbst: yeah, but you can't run it with multiple threads
11:56psykose: well, on the same input no
11:57psykose: but shaders are parallelisable per-shader anyway, no?
11:57psykose: i.e. multiple hash threads, each takes 1, ..
11:57karolherbst: it's not limited to one operation
11:57karolherbst: you can run hashing one thing in parallel
11:57karolherbst: (at least with the rust impl)
11:58karolherbst: `b3sum` e.g. does this
11:59psykose: yeah, i'm referring to the C version and what you can do anyway, abstractly for things that don't parallel on one input
11:59psykose: personally i always found that to be an easier model
11:59karolherbst: right
11:59karolherbst: it probably doesn't make much sense if compilation of shaders happens in parallel already
11:59karolherbst: but it seems like most of the speed actually comes from doing it multithreaded
12:00karolherbst: yeah...
12:01psykose: are you sure? with b3sum it's actually slower with more threads on a random thing i tested
12:01karolherbst: huh.. weird
12:02psykose: ah, no
12:02psykose: i was misreading
12:02karolherbst: I wonder how its speed compares to sha1sum? with 1 thread
12:02karolherbst: but yeah.. it looks like 8-16 threads is somehow the sweet spot
12:03karolherbst: at least according to the paper
12:03karolherbst: ohh.. the "5.2 Multi-threading" section explains it quite well
12:04karolherbst: makes it sound like you can actually also do it on the GPU fairly easy
12:04psykose: https://img.ayaya.dev/N3dIgJKYNQFu
12:04karolherbst: okay
12:04psykose: personally though i'd say the User: time is the most interesting one, and how it's 2x faster than sha1sum
12:04karolherbst: yeah, so it's not _that_ much faster single threaded
12:04karolherbst: still faster tho
12:04psykose: more threads is 'nice' but you can't in general rely on it because it is more actual cpu resource even if wall is lower
12:05karolherbst: yeah
12:05psykose: (compression algorithms usually have a scaling issue there, like if you do more than -T4 on zstd you'll get less time but at like -T16 you're not even halving the time and using over 4x the cpu, so it becomes very wasteful unless you have a dedicated reason to do it, etc)
12:06karolherbst: yeah so in blake3 you chunk the input and each thread can operate on those chunks, and then the main thread chains it all together in order
12:06psykose: yeah, it scales quite efficiently
12:06karolherbst: you could probably calculate all those chunks on a GPU and just chain it on the CPU :D
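The chunk-then-combine structure described above can be sketched with a toy tree hash. This is explicitly not BLAKE3: it borrows only the idea of independent 1 KiB chunks whose digests are combined afterwards, uses `hashlib.blake2s` as a stand-in compression step, and the function names and domain-separation bytes are made up for the example.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 1024  # BLAKE3 also uses 1 KiB chunks; the rest is a toy construction


def _leaf(chunk: bytes) -> bytes:
    # Each chunk is hashed independently, so any worker can take any chunk.
    return hashlib.blake2s(b"\x00" + chunk).digest()


def _parent(left: bytes, right: bytes) -> bytes:
    return hashlib.blake2s(b"\x01" + left + right).digest()


def tree_hash(data: bytes, workers: int = 4) -> bytes:
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)] or [b""]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        level = list(pool.map(_leaf, chunks))  # map() preserves chunk order
    # Combine digests pairwise, in order, until one root digest remains.
    while len(level) > 1:
        nxt = [_parent(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
    return level[0]


data = bytes(range(256)) * 40
print(tree_hash(data, workers=1) == tree_hash(data, workers=8))
```

The thread pool only demonstrates that the result does not depend on how many workers process the chunks; it is not a benchmark and says nothing about real BLAKE3 throughput.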
12:07karolherbst: but kinda cool
12:08karolherbst: being able to make full use of SIMD is nice
12:08karolherbst: at least it's a way more sane programming model they had in mind it seems
12:08psykose: nerd :p
12:08karolherbst: :P
12:09karolherbst: well.. most other things vectorize a linear operation which doesn't get you anywhere
12:31dottedmag: karolherbst: We put a shader into shader so you can hash while you hash?
12:33karolherbst: yes
12:37javierm: tzimmermann: nice cleanup series. I only read the cover-letter for now but agree on the direction. I'll try to review the patches tomorrow
12:38javierm: tzimmermann: btw, I rebased last night the RFC to split FB in FB_CORE and FB. After your recent fbdev cleanups, I could drop two patches and now there are only Kconfig and makefile changes :)
12:40tzimmermann: thanks, javierm
12:41tzimmermann: i wanted to use the firmware edid with simpledrm, but found that the rsp code is slightly chaotic. hence the cleanup
12:42javierm: tzimmermann: right. But having screen_info defined only for the arches that use it would be great. I remember having build issues due to some arches missing ifdefery for that
12:44tzimmermann: yes. and we can even do better, i think. i outline in the cover letter that we could enable it only when there are actual users. it's not in the patchset, but a follow-up would be straight forward.
12:44javierm: tzimmermann: yup, I read that. That's why I said agree on the direction :)
12:45tzimmermann: javierm, BTW have you seen https://gitlab.freedesktop.org/drm/amd/-/issues/2649
12:46tzimmermann: for some odd config, the fbdev console doesn't set up correctly. i'm trying to wrap my head around it, but it's confusing
12:46tzimmermann: if the primary display is off, the usb-attached monitors remain off as well.
12:46tzimmermann: it's a regression
12:46javierm: tzimmermann: hmm, no I haven't seen that bug before. Let me read it...
12:52MrCooper: tzimmermann: I suspect the lid being closed might just matter indirectly as well, e.g. via timing; AFAICT the fundamental issue is that the fbdev emulation doesn't correctly handle the hot-plugged DP MST connector
12:54tzimmermann: it does work if the old output_poll_changed callback has been set. i cannot get how this affects fbdev state. maybe i'll fill the code with printks and let the reporter run it before and after.
12:59javierm: tzimmermann, MrCooper: if amdgpu is using the generic fbdev emulation then setting .output_poll_changed should not be needed indeed
12:59javierm: I guess will have to dig that driver code
13:01MrCooper: "should" being the key word :)
13:01javierm: MrCooper: right :)
13:01tzimmermann: right. it "should" not be needed. so there's a bug somewhere
13:18pq: Has anyone else had problems that when program and Mesa are built with ASan, ASan itself segfaults on exit (radeonsi), or everything futex-deadlocks before the program finished (llvmpipe)?
13:19pq: the program is Weston, fwiw
13:21pq: That's Mesa build with glvnd. Without glvnd I get reports of leaks inside Mesa with radeonsi, and the same futex deadlock with llvmpipe.
13:22pq: and I'm pretty sure it's not Weston leaking anything, since eglTerminate + eglReleaseThread should clean up.
13:45lordheavy: Ok, i have more information about clang failure and rusticl - adding this simple patch https://paste.xinu.at/6CTg/
13:45MrCooper: pq: I think the EGLDisplay is inevitably leaked without EGL_KHR_display_reference (which Mesa doesn't support yet: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10118)
13:45lordheavy: gives me error: unknown argument: '-no-opaque-pointers'
13:46lordheavy: i think it's related to archlinux - so not a mesa bug - but this kind of information is useful instead of just the message 'Couldn't create Clang invocation'
13:47psykose: lordheavy: that looks like the mesa is using clang16 which doesn't recognise the no-opaque mode anymore
13:48psykose: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7468 intel clc is not ported yet, so only works with 15 afaict
13:48psykose: that said i'm not a developer for that, just what i know of that combination more generally :)
13:49psykose: code part is https://gitlab.freedesktop.org/mesa/mesa/-/blob/9ca1bb3cf8f2f4d9378ceb8ae39e6f853fb900b0/src/compiler/clc/clc_helpers.cpp#L787
13:49psykose: and you're right, error should be logged i think
13:50lordheavy: psykose: oh, thanks - now it's time to patch and test
13:50psykose: i don't think there will be a trivial patch unless it was accidentally left in there, as being opaque-pointers compatible usually takes a bunch more work
13:50psykose: good luck however
13:57pq: MrCooper, oh. In that case I'd kinda expect to see more leaks than I do. Would leaking the display also leak shader bits?
13:58MrCooper: that's been my assumption, not sure though
13:59MrCooper: valgrind always reports tons of leaks in Mesa code for me with a Wayland compositor or Xwayland; I've been assuming they're mostly due to this, might be wishful thinking though :)
14:00pq: https://gitlab.freedesktop.org/-/snippets/7648 are the leaks I see from the main thread. There were more in other mesa created threads.
14:00lordheavy: psykose: great, removing '-no-opaque-pointers' at least fixed my issue with clinfo ;)
14:00psykose: :D
14:23javierm: MrCooper, tzimmermann: after staring at the amdgpu code for a long time, the only thing that I can think of is that drm_fbdev_generic_setup() is only called when there are available connectors
14:23javierm: https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c#L2169
14:23javierm: if after probing the driver, a connector is added then the generic fbdev won't be set-up ?
14:24tzimmermann: javierm, indeed. that's some weird code
14:25tzimmermann: that whole condition should be removed IMHO
14:25javierm: which might cause the issue since the drm_fbdev_generic_setup() -> drm_fbdev_generic_client_hotplug() won't happen
14:26MrCooper: javierm: sounds plausible, if it wasn't for the eDP connector existing even with the lid closed here (even with connected status)
14:27MrCooper: also, fbdev emulation works on the internal panel if I open the lid
14:27javierm: MrCooper: that's not what the shared dmesg log says though... at least how I read it
14:27javierm: [drm:drm_helper_probe_single_connector_modes [drm_kms_helper]] [CONNECTOR:78:eDP-1] status updated from unknown to connected
14:27tzimmermann: javierm, for the output_poll_changed to work, you'd still need generic_setup
14:27javierm: tzimmermann: hmm, right
14:27tzimmermann: still, that branch should probably go
14:28javierm: tzimmermann, MrCooper: but just by reading the code, I can't see a reason why drm_fb_helper_output_poll_changed() is needed that drm_fbdev_generic_setup() doesn't already
14:28MrCooper: javierm: FWIW, the lid of this laptop (which is affected by the same or at least very similar issue) is currently closed, and drm_info says eDP status is connected
14:29javierm: MrCooper: yeah, my suspicion was wrong if drm_fbdev_generic_setup() is needed for mode_config->output_poll_changed
14:29tzimmermann: javierm, MrCooper. IMHO there's something with the handling of deferred_setup https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_fb_helper.c#L2329 as if it fails once and then cannot recover
14:29tzimmermann: but i cannot really point to the issue
14:33tzimmermann: at probe we call generic_setup()
14:33tzimmermann: it simulates a hotplug to initialize the display: https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_fbdev_generic.c#L343
14:34tzimmermann: this apparently worked, as there's no error message in that debug log
14:36javierm: tzimmermann: yeah, some debug log in drm_fbdev_generic_setup() when it succeeds would be useful
14:37tzimmermann: the output_poll_changed callback is only called by the two functions starting at https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_probe_helper.c#L691
14:37tzimmermann: it's immediately followed by client_dev_hotplug(), which calls our client code
14:38tzimmermann: and it should end up in the same place as output_poll_changed, namely drm_fb_helper_hotplug_event: https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_fbdev_generic.c#L272
14:39javierm: yeah
14:39javierm: that's my conclusion as well so I don't understand why it is failing...
14:40tzimmermann: and i'm pretty sure that we take this branch, because our initial simulated hotplug did not fail. so dev->fb_helper should be set at this point
14:40javierm: yes
14:43tzimmermann: javierm, i don't find this line in the debug logs: https://elixir.bootlin.com/linux/v6.1.35/source/drivers/gpu/drm/drm_fb_helper.c#L2085
14:43tzimmermann: grep for hotplug_event and there's only the sysfs stuff
14:44tzimmermann: maybe it fails in this condition: https://elixir.bootlin.com/linux/v6.1.35/source/drivers/gpu/drm/drm_fb_helper.c#L2077
14:44tzimmermann: that would still return an err of 0
14:47macromorgan: anyone have experience with tinydrm panel drivers? I'm confused about the rs pin and how I make it work with my SPI controller...
14:50javierm: tzimmermann: which would be !fb_helper->fb since it is unlikely that the mutex grabbing in drm_master_internal_acquire() would fail
14:50javierm: so it seems your intuition is correct and the problem is in the delayed outplug path
14:51javierm: *hotplug
14:51tzimmermann: javierm, there's x11 running! so drm_master_internal_acquire() should fail
14:51tzimmermann: x11 is the drm master already
14:52tzimmermann: after doing the described vt-switch (alt+f2), the code would run last_close IIRC
14:53tzimmermann: and last_close ultimately brings us to https://elixir.bootlin.com/linux/v6.1.35/source/drivers/gpu/drm/drm_fb_helper.c#L232
14:53tzimmermann: where delayed_hotplug is being handled
14:53tzimmermann: i still don't see the issue, though :/
14:53MrCooper: in my case, gnome-shell is most certainly not running yet at that point, plymouth might be though
14:58javierm: tzimmermann: yeah porque if (do_delayed) then drm_fb_helper_hotplug_event() should be called
14:58javierm: err, because. I don't know why I mixed spanish and english in the same sentence haha
14:58javierm: tzimmermann: https://elixir.bootlin.com/linux/v6.1.35/source/drivers/gpu/drm/drm_fb_helper.c#L262
15:00tzimmermann: de nada
15:00javierm: :D
15:01javierm: tzimmermann: so I'm also not seeing the issue... MrCooper maybe you can add some debug logs in drm_fb_helper_hotplug_event() and figure out whether it is called or not on switch to VT ?
15:03tzimmermann: maybe we have set deferred_setup https://elixir.bootlin.com/linux/v6.1.35/source/drivers/gpu/drm/drm_fb_helper.c#L1945
15:04javierm: tzimmermann: but wouldn't that be the case too for the drm_fb_helper_output_poll_changed() -> drm_fb_helper_hotplug_event() path ?
15:05MrCooper: can do
15:05tzimmermann: then we'd return early: https://elixir.bootlin.com/linux/v6.1.35/source/drivers/gpu/drm/drm_fb_helper.c#L241
15:09tzimmermann: MrCooper, thanks
15:10javierm: MrCooper: great. I'm out of ideas
15:17MrCooper: thanks for the brainstorming guys
16:21agd5f: javierm, we don't set up the fbdev code if the GPU doesn't have any display hardware on the GPU.
16:23agd5f: some GPUs may not have any display IPs at all, others may have display IPs, but no physical connectors on the board.
16:38javierm: agd5f: yeah, I think I understood the rationale of that logic. But that wasn't the issue anyways as tzimmermann mentioned. It's likely something in the deferred setup
16:38javierm: macromorgan: sorry, I missed your message before. What problem do you have, with what panel driver ?
17:01macromorgan: I'm trying to create a new panel driver for tinydrm based on this: https://github.com/FunKey-Project/linux/blob/FunKey_S/drivers/staging/fbtft/fb_st7789v.c
17:02macromorgan: I'm having issues understanding though how to handle the RS pin (which is hardwired to the MISO pin)
17:03macromorgan: if I define the pinctrl for the SPI bus, won't that block me from using the MISO pin as the RS pin? Is there a helper function that does that in Linux I'm missing?
17:08javierm: macromorgan: there's already a panel driver for this chip I see: drivers/gpu/drm/panel/panel-sitronix-st7789v.c
17:10macromorgan: that's for initing the panel via SPI but displaying it via DPI, if I'm not mistaken
17:10macromorgan: Mine is to both init and display via SPI
17:10macromorgan: I assumed that needed a different setup
17:13javierm: macromorgan: ah Ok. But still I wonder if it wouldn't be better to extend that driver to support both DPI and SPI transports
17:13javierm: mripard ^
17:14macromorgan: If that's the route we want to go. Honestly I'm just trying to get the panel to work, then I can worry about making it mainline conformant
17:15macromorgan: this is the first time I've worked with a pure SPI panel that used the MISO pin as a "switch" to note if we're sending data or commands
17:23javierm: macromorgan: yeah, that's normal for some of these SPI panels. It's usually called D/C (data or command) and not RS though (which sounds more like reset?)
17:24javierm: macromorgan: and what you do usually is to use a GPIO to toggle that pin
17:28javierm: macromorgan: it seems it is called D/CX in your chip datasheet, by looking at "8.4 Serial Interface" section in https://newhavendisplay.com/content/datasheets/ST7789V.pdf
17:29javierm: macromorgan: "In 4-lines serial interface, data packet contains just transmission byte and control bit D/CX is transferred by the D/CX pin"
17:29macromorgan: yep... sadly in my implementation it's hooked to the MISO pin
17:31javierm: macromorgan: I see... and you must use 4-wire, you can't support a 3-wire SPI setup?
17:32javierm: because with 3-wire you can have the D/C bit as a part of the 9-bit payload
17:32macromorgan: I honestly don't know
17:32macromorgan: new to SPI displays honestly
17:33javierm: macromorgan: look at the "8.4.2 Command write mode" section in the datasheet I shared
17:33macromorgan: okay will do
17:34javierm: you either can send a 9-bit payload (where the first bit is the D/CX to let the controller know whether the payload is data or a command) or a 8-bit payload (where the D/CX is out-of-band using a pin)
17:35macromorgan: so if I send an 8 bit payload over a 3-wire interface I should be golden right?
17:36macromorgan: I guess I can try that and see if it falls flat on its face or not
17:36javierm: macromorgan: yeah, that won't work because the chip won't know what you are sending it...
17:36macromorgan: ohh, wait, you mean send a 9 bit payload over a 3 wire interface
17:37javierm: macromorgan: yes
17:37macromorgan: okay, let's try that :-)
17:37javierm: the 9-bit is <1-bit D/CX, 8-bit payload>
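The 9-bit framing javierm describes (one D/CX bit followed by the 8-bit payload) can be illustrated by packing such words into a plain byte stream. This is a sketch, not driver code. `pack_9bit_words` is a hypothetical helper, the polarity (0 = command, 1 = data) follows the usual convention for these controllers and should be checked against the datasheet, and zero-padding the last byte is an assumption, since how to pad depends on the SPI controller.

```python
from typing import Iterable, Tuple


def pack_9bit_words(items: Iterable[Tuple[bool, int]]) -> bytes:
    """Pack (is_data, byte) pairs into 9-bit words, MSB first.

    The first bit of each word is D/CX (assumed: 0 = command, 1 = data),
    followed by the 8 payload bits. The final byte is zero-padded.
    """
    acc = 0    # bit accumulator
    nbits = 0  # number of valid bits currently in acc
    out = bytearray()
    for is_data, byte in items:
        word = ((1 if is_data else 0) << 8) | (byte & 0xFF)
        acc = (acc << 9) | word
        nbits += 9
        while nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
            acc &= (1 << nbits) - 1
    if nbits:
        out.append((acc << (8 - nbits)) & 0xFF)
    return bytes(out)


# One command byte (0x2C, the MIPI DCS memory-write opcode) followed by two data bytes.
print(pack_9bit_words([(False, 0x2C), (True, 0x12), (True, 0x34)]).hex())
```

An SPI controller that supports 9 bits per word natively would not need this packing at all; the helper only shows where the D/CX bit sits in the stream.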
17:37javierm: macromorgan: but your chip has to be configured to use that interface
17:37javierm: see "6.2 Interface Logic Pins" section
17:38javierm: pins IM[3-0] are used for that but I don't know whether those are accessible on your design
17:38javierm: you want those to be 0 1 0 1 according to that datasheet
17:39macromorgan: they are not available and I don't know how they are configured... let me check the panel display sheet in case it has something on it
17:39javierm: err, it seems 1 1 0 1. I misread
17:39macromorgan: https://cdn.hackaday.io/files/1649347056536256/ALIBABA_SAEF_SF-TC154B-8377A-N_annote.pdf
17:41javierm: macromorgan: I've to leave but I had to implement something like that for the ssd130x driver: https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/solomon/ssd130x-spi.c#L21
17:41macromorgan: okay, thank you for your help. Gives me something to look at more
17:41javierm: but if you can't use a GPIO, then that's not an option for you :(
17:42javierm: macromorgan: you are welcome. Now that I think about it 3-wire SPI should also be supported for the ssd130x, maybe I should implement that too
18:18anholt_: daniels: thanks. looks like I was already picking the radv runners, so I just need to --stress and we'll be good.
18:44daniels: nice
19:11hussam: Hello. Does OpenCL 3.0 work with mesa on a 620 intel hd?
20:29alyssa:glares at dEQP-EGL.functional.render.multi_context.*
20:29alyssa: I'm running it in a loop and it's passing reliably.
20:29alyssa: But I've SEEN it flake T_T
20:30karolherbst: hussam: yes
20:30karolherbst: well.. mostly
20:31alyssa: apparently my stress test isn't stressful enough
20:31karolherbst: you need help hitting flakes? tried running 200 threads in parallel?
20:31alyssa: That's an idea..
20:31hussam: karolherbst: What meson options do I need?
20:31karolherbst: that's how I fixed all the CL flakes I had left 🙃
20:31karolherbst: well.. maybe not 200, but...
20:32karolherbst: using "stress" to keep your CPU busy does help with finding CPU related flakes 🙃
20:32karolherbst: hussam: gallium-rusticl=true
20:33hussam: will that generate the icd file?
20:33karolherbst: yes
20:33hussam: Thank you. I will try now.
20:41karolherbst: jenatali: ever saw a "Attribute does not match Module context!" error?
20:42hussam: karolherbst: Done. hashcat says no devices found.
20:42jenatali: karolherbst: Sounds like you've got two LLVM contexts?
20:42karolherbst: jenatali: so I have a user seeing this: https://pastebin.com/eiQn4R21
20:42karolherbst: mhh but yeah..
20:42karolherbst: could be something silly inside gentoo again...
20:43karolherbst: but also makes no sense really..
20:43karolherbst: hussam: need to set RUSTICL_ENABLE=iris
20:43hussam: yay. that worked.
20:43karolherbst: it might or might not work correctly yet
20:43jenatali: Weird
20:44karolherbst: it's good enough to pass the CTS and run random stuff... but.. there are still issues I want to tackle before enabling anything by default :F
20:44karolherbst: jenatali: yeah...
20:44karolherbst: I'll ask for a LD_DEBUG=libs...
20:45karolherbst: jenatali: this "llvm::compression::zstd::compress" confuses me....
20:46psykose: how so
20:46karolherbst: why would zstd::compress be called when reporting an error?
20:47karolherbst: I'm sure it's just some optimized build and figuring out the symbols is screwed up
20:47psykose: hhmm
20:47psykose: yeah, you're right
20:48psykose: that does not track at all so it's wrong symbols or similar brokenness
20:49psykose: well, maybe there's some magic path where the function call inside report_fatal_error is so broken it jumps into zstd::compress which then aborts on something :D
20:49psykose: corrupted memory can do anything
20:49karolherbst: jenatali: https://pastebin.com/JcLzijD3 .... mhhhh
20:49karolherbst: I don't know, but having two llvms....
20:50jenatali: Yeah 100% that's the problem
20:50karolherbst: `libigdfcl` that's intel, no?
20:50karolherbst: yep...
20:50karolherbst: mhhh
20:51karolherbst: :pain:
20:51psykose: i thought having two llvms just aborted on init
20:51psykose: or does that not happen on glibc
20:51karolherbst: good question
20:51psykose: on musl at least the second one will abort due to some Options thing being double registered, inside llvms constructor
20:52karolherbst: yeah...
20:52karolherbst: that's usually what happens
20:52karolherbst: maybe something else happens now
20:52karolherbst: but anyway.... uhhh
20:52karolherbst: can we torch llvm? :D
20:53psykose: but cute dragon :(
20:53karolherbst: we keep the dragon
20:54karolherbst: I wonder if we really have to runtime load llvm....
20:54karolherbst: and just load it in a way it's only private to us
20:54jenatali: Like static linking...?
20:54karolherbst: but uhhh.....
20:54karolherbst: mnhhh
20:55karolherbst: I don't know if I'm in the mood for that kind of bikeshed comming to use
20:55karolherbst: *us
20:55karolherbst: _but_
20:55karolherbst: we could tell distributions if they support multiple llvm versions at once, then they either static link or we close all bugs
20:55psykose: it's a distribution issue to make sure everything in mesa path has the same llvm version, yes
20:56psykose: but it really is very obvious 99% of the time, you just get a load abort..
20:56karolherbst: ehh it's not inside mesa
20:56psykose: i don't know why this case is different
20:56karolherbst: mesa is fine
20:56karolherbst: well
20:56karolherbst: soo
20:56karolherbst: ever heard of the Vulkan ICD thing? so this was an idea they took from CL and CL does the same thing
20:56psykose: i am completely clueless about anything ICD :D
20:56karolherbst: the issue with CL is, that... all (like.. almost all) cl impls use LLVM
20:57karolherbst: loading multiple implementations at once
20:57psykose: yeah, all gotta match
20:57karolherbst: so the user has Intel's CL stack (on LLVM-15) and mesa (on LLVM-16)
20:57karolherbst: and the ICD dlopens those CL impls
20:57psykose: ah, i think that's the issue
20:58psykose: if you dlopen the conflicting llvm much later you get into this state
20:58karolherbst: yep, the user already confirmed it :)
20:58psykose: but yeah, it's a pain
20:58karolherbst: sooo.. the icd loads with `RTLD_LAZY|RTLD_LOCAL`
20:58karolherbst: _but_
20:59karolherbst: there is a `// | RTLD_DEEPBIND` in the code...
20:59karolherbst: and I wonder if that would fix it...
20:59karolherbst: but really.. this part is really broken on linux
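The flag combinations under discussion can be sketched roughly as follows. This is only an illustration: `icd_dlopen_flags` and `load_vendor_library` are hypothetical names, not functions from any ICD loader, and whether `RTLD_DEEPBIND` would actually avoid the crash is an open question in the conversation above. `RTLD_DEEPBIND` is a glibc extension, so the sketch falls back to 0 where Python does not expose it.

```python
import ctypes
import os

# RTLD_DEEPBIND is a glibc extension; fall back to 0 where Python does not
# expose it (for example on musl-based systems).
RTLD_DEEPBIND = getattr(os, "RTLD_DEEPBIND", 0)


def icd_dlopen_flags(deepbind: bool = False) -> int:
    """Flags a loader might pass to dlopen() for a vendor library (hypothetical helper)."""
    flags = os.RTLD_LAZY | os.RTLD_LOCAL
    if deepbind:
        # Ask the loaded object to prefer its own symbols over global ones.
        flags |= RTLD_DEEPBIND
    return flags


def load_vendor_library(path: str, deepbind: bool = False) -> ctypes.CDLL:
    """dlopen() a vendor CL implementation with the chosen flags (hypothetical helper)."""
    return ctypes.CDLL(path, mode=icd_dlopen_flags(deepbind))
```

Calling `load_vendor_library(path, deepbind=True)` with a real vendor library path would be the experiment hinted at by the commented-out flag; the path is left to the reader because none is given in the log.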
21:00psykose: i dunno, it sounds like it would just somewhat hide it more
21:00karolherbst: maybe
21:00karolherbst: but this is something we have to figure out I guess
21:00psykose: famously musl also does not have that, but distro side that's a 5 second patch for me so idc personally
21:00karolherbst: yeah so some distros support multiple LLVM versions
21:00karolherbst: like gentoo
21:00psykose: and similarly on distro side "match all the llvms" is just what i do
21:00karolherbst: and ubuntu
21:00airlied: and fedora :)
21:00psykose: e.g. in alpine mesa is 15 because blender is 15
21:00karolherbst: and I think fedora as well in theory
21:01karolherbst: I think dlopening llvm ourselves is probably the way to go here :/
21:01karolherbst: but llvmpipe devs will hate us
21:02karolherbst: but uhhh...
21:02karolherbst: why is it such a huge issue with llvm
21:02psykose: it would be the same with any multiple-abi-versions dep
21:02psykose: llvm is just the famous one here :)
21:02karolherbst: ahh right.. because symbol versioning is broken or something
21:03psykose: clearly what we need is more market fragmentation
21:03psykose: so others start using definitelynotllvm that doesn't conflict
21:03karolherbst: yeah, maybe we just have to make more people run into this issue so it finally gets addressed
21:04psykose: libc side i don't think there's anyone super interested in addressing this in some meaningful way that i've seen for the loader
21:04karolherbst: but honestly.. why couldn't the icd dlopen our libraries in a way that dependencies are private to the library or something :/
21:04psykose: could be wrong
21:06karolherbst: khronos' own loader uses RTLD_NOW mhhh
21:13karolherbst: so the user had used khronos loader which is using RTLD_NOW mhhh
21:13karolherbst: I wonder if the loader should be fixed here...
21:16karolherbst: does anybody know if "RTLD_LOCAL" gets applied to dependencies as well?
21:17karolherbst: so if one loads Intel's CL impl with RTLD_LOCAL, would its LLVM dep be only local to it?
21:24psykose: if you mean things that are DT_NEEDED on the cl object and not something it itself also later dlopens then i think so
21:24psykose: didn't test though
21:24psykose: this stuff is so broken though i wouldn't be surprised if it's actually the opposite of that :D
21:25karolherbst: yeah...
21:25karolherbst: sooo.. I'm checking if it works with how ocl-icd loads things
21:25karolherbst: and if it does, I just file a bug against khronos loader or change it to RTLD_LOCAL
21:26karolherbst: because on fedora intel's stack also installs with llvm-15 :')
21:26psykose: yeah idk how that works at all
21:26psykose: i imagine mesa being 16 on any distro isn't actually functional for any later deps
21:26psykose: but there's always magic
21:27karolherbst: what magic
21:27psykose: magic of magic, the unknown :D
21:27psykose: (it's bedtime for me)
21:27psykose: nini karol
21:27karolherbst: rude
21:27psykose: <3
21:27karolherbst: :3
21:37karolherbst: yeah sooo...
21:37karolherbst: with ocl-icd it just works it seems
21:38karolherbst: or at least I think it does... mhh
21:38karolherbst: ehh no, both are built against llvm-15.. uhhh
21:57karolherbst: yeah dunno.. on fedora it just works
21:57karolherbst: must be a gentoo bug then
22:05karolherbst: mattst88: so apparently on gentoo if a process loads llvm-15 and llvm-16 it crashes in weird ways. On fedora the same thing seems to work. Maybe fedora is doing something dodgy to make it work. Maybe gentoo also needs to do something dodgy. I have no idea, but just wanted to let you know
22:06karolherbst: this can happen if a user has intel's and mesa's CL impls installed and they are compiled against different LLVM versions
22:15mattst88: karolherbst: ugh :(
22:15mattst88: oh, separately, did you make an MR with the patch Gentoo is carrying? the clang resource dir one
22:26alyssa: anholt_: any pointers about dEQP-EGL.functional.render.multi_context*?
22:26anholt_: alyssa: nope
22:26alyssa: Wheeee
22:26anholt_: disable any job reordering?
22:26alyssa: We're hitting very rare flakiness for it on Asahi.. but I see it's also on freedreno flake list so I'm thinking it's not driver specific
22:27alyssa: disabling the shader disk cache seems to make the flakes all but disappear, though I think I still hit.. the flake once with cache disabled
22:29alyssa:should try to reproduce the flakiness on panfrost
22:33alyssa: looking through Mesa CI Daily Reports, I see a similar test (`wayland-dEQP-EGL.functional.color_clears.multi_context.gles1.rgba8888_window`) flaked(?) on llvmpipe
22:33alyssa: which seems like a harbinger
22:34alyssa: another related test is on the rpi3 flake list but I suspect the whole test group (not just that one) is flaky https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20452/diffs
22:35alyssa: I can't really imagine what could be going wrong for the multi_context tests (which AFAICT are still single-threaded?) to flake so rarely across... every driver that's running them in CI, seemingly?
22:36karolherbst: mattst88: I planned to do so tomorrow
22:53karolherbst: uhh multi_context?
22:54karolherbst: I think I was also seeing flakes in nouveau with that...
22:55karolherbst: but yeah, multi_context is single threaded
22:58alyssa: karolherbst: I don't understand how a single threaded test can flake on every driver
23:01alyssa: https://rosenzweig.io/flaker.xml if anyone is curious
23:02alyssa: I notice some interesting rectangular corruption
23:02alyssa: IDK if tile boundaries but.. doesn't seem natural
23:03alyssa: it just flakes /so rarely/ that I don't know how to debug this monster
23:03anholt_: MSAA only?
23:04alyssa: not sure, will try to capture more qpa's
23:05alyssa: also sometimes see GPU timeouts (though not faults). this seems a distinct symptom from the fails.
23:06anholt_: wonder how much happier everything would be if we did multisample render to single sampled in the winsys.
23:06alyssa: heh
23:08alyssa: this would be a lot easier if I could actually reproduce the damn flake
23:09alyssa: got another fail, this time config 48, EGL_SAMPLES=4
23:12alyssa: 47, EGL_SAMPLES=2
23:13alyssa: another 47
23:13alyssa: a lot of 47
23:18alyssa: 47 twice more
23:18alyssa: while that's a little odd
23:18alyssa: it's testing 24 configs, only 3(?) of which are EGL_SAMPLES=0
23:19alyssa: so while I have not observed a non-MSAA failure, that's not wholly unexpected
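The reasoning here can be checked with a little arithmetic. A minimal sketch, assuming (hypothetically) that a flake is equally likely to land on any of the 24 configs and that failures are independent; the function name is made up for illustration, and the count of 3 non-MSAA configs is the uncertain figure from the log.

```python
def prob_all_failures_msaa(total_configs: int,
                           non_msaa_configs: int,
                           observed_failures: int) -> float:
    """Chance that every observed failure lands on an MSAA config,
    if failures were spread uniformly over all configs."""
    msaa_fraction = (total_configs - non_msaa_configs) / total_configs
    return msaa_fraction ** observed_failures


# With 21 of 24 configs using MSAA, a handful of MSAA-only failures is
# quite likely even if MSAA had nothing to do with the bug.
for failures in (1, 3, 7):
    print(failures, prob_all_failures_msaa(24, 3, failures))
```

Under those assumptions, seeing only MSAA configs fail a few times is weak evidence on its own, which is why a dedicated run with MSAA disabled is the more convincing test.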
23:22alyssa: hacked up the driver to pretend not to support MSAA, let's go
23:28alyssa: without MSAA, so far no fails observed after over 4000 iterations
23:29alyssa: going to let this keep going just in case, but I think this is indeed solid evidence that yes, it's MSAA related.
23:29alyssa: anholt_: nice one :+1:
23:35alyssa: ok, 10k iterations with no fail
23:35alyssa: yeah, I'd say this is indeed MSAA only.
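How convincing a clean run is depends on how often the flake normally shows up, which the log does not state. A rough sketch, assuming independent iterations and a purely hypothetical flake rate of 1 in 1000:

```python
def prob_no_failure(flake_rate: float, iterations: int) -> float:
    """Chance of seeing zero failures in `iterations` independent runs
    when each run fails with probability `flake_rate`."""
    return (1.0 - flake_rate) ** iterations


# Hypothetical flake rate; the real rate is not given in the log.
HYPOTHETICAL_RATE = 1 / 1000

for runs in (4000, 10000):
    print(runs, prob_no_failure(HYPOTHETICAL_RATE, runs))
```

If the real rate is anywhere near that hypothetical value, 10k clean iterations would be very unlikely by chance; a much rarer flake would need a correspondingly longer run to rule out.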
23:59airlied: src/compiler/nir/nir_opt_algebraic.c: In function ‘nir_opt_algebraic’:
23:59airlied: src/compiler/nir/nir_opt_algebraic.c:1374082: note: ‘-Wmisleading-indentation’ is disabled from this point onwards, since column-tracking was disabled due to the size of the code/headers
23:59airlied: 1374082 | nir_foreach_function_impl(impl, shader) {
23:59airlied: nothing to see here, only 1.4M loc file :-P