00:06Venemo: karolherbst: if I knew that I wouldn't be asking you
00:07Venemo: karolherbst: I also can't help but wonder if we can get away with just not supporting ARB_compute_variable_group_size on GFX6
00:21zmike: it's optional, so why not
06:42tzimmermann: mripard, mlankhorst, shall i do the PR for drm-misc-next this week?
07:04mripard: tzimmermann: yeah, please, I won't have the time today
07:12tzimmermann: ok
07:57sima: mripard, dim: ERROR: 68a7c52fa9e7 ("drm/dp: clamp PWM bit count to advertised MIN and MAX capabilities"): Mandatory Maintainer Acked-by missing., aborting
07:57sima: false positive on the drm-misc-next pr?
07:57sima: honestly no idea why it's complaining ...
07:58sima: I also get this one
07:58sima: ./scripts/get_maintainer.pl: file '&STDIN' doesn't appear to be a patch. Add -f to options?
07:59sima: tzimmermann, processing that rn, in case you're wondering
08:06sima: mripard, https://paste.debian.net/hidden/b010fec0/ quite a bunch of hits really and most looks like they really shouldn't have happened
08:09tzimmermann: sima, no idea about these, but i regularly see such issues. i usually use -f to ignore them. dim's checks are not very exact, i think.
08:09sima: yeah this is not great, either we'll drop the check because too much noise, or we'll fix the check (or maybe MAINTAINERS entries need to be fixed)
08:10sima: but training people to ignore checks is no bueno at all
08:11sima: tzimmermann, it also fired for fbdev work and that should have a MAINTAINERS entry for drm-misc
08:12sima: so maybe that needs to be fixed somehow or the check is just too brittle
08:16sima: mripard, but yeah that check defo misfires a lot, for whatever reasons
08:21Venemo: zmike: is it also optional for CL?
08:29tzimmermann: sima, IIUC you're currently merging the last PR from drm-misc-next?
08:29mripard: sima: 1beee8d0c263 has been properly applied by Thierry, but by a mail address that is neither in mailmap nor MAINTAINERS
08:30mripard: e3f4bdaf2c5b and c08c931060c7 had no ack from the reported maintainers
08:31mripard: same for 6a1977472f6b
08:31mripard: well, all of them, really
08:32mripard: I don't know what we can do about people not using a documented address to commit
08:33mripard: the fbcon, ALSA, and dt-bindings ones are all legit
08:34mripard: the only ones where there's an issue I guess are the shmem and drm/dp ones
08:35mripard: I can't find a good reason why they would report an issue, because it's committed on drm-misc
08:42karolherbst: Venemo: I have no idea what uses that extension
08:43Venemo: karolherbst: did you say it was necessary for rusticl or did I misunderstand you?
08:43karolherbst: well rusticl needs it supported, yes, but then the question is how important is CL support on gfx6
08:43karolherbst: what GPUs are gfx6 btw? 🙃
08:43karolherbst: however
08:44karolherbst: the issue is less problematic with CL because the driver can just say "only 32 threads supported"
08:44karolherbst: the issue is GL with its 512 threads requirement
08:44Venemo: karolherbst: GFX6 are the first generation GCN cards. AFAIU rusticl already works well on these so I'd hate to take it away from users
08:45karolherbst: right....
08:45karolherbst: rusticl calls into si_get_compute_state_info to get the information and I suspect this is the reason it works well for cl
08:47karolherbst: could maybe set max_variable_threads_per_block to 256 or something for GFX6. Rusticl never reads that one out and I think the GL ext gets disabled if it's below 512
08:47Venemo: yes, afaiu we can support up to 256 sized workgroups without hitting the HW bug
08:47Venemo: if that's enough for rusticl then that sounds like the right approach, unless something important uses this gl ext.
08:47karolherbst: that cap is irrelevant for rusticl because it uses get_compute_state_info to get more reliable information anyway
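(For context, a minimal C sketch of the clamp being suggested here; the enum and function names are made up for illustration, not the actual radeonsi code.)

```c
/* Illustrative sketch only; names are hypothetical, not radeonsi's code. */
enum gfx_level { GFX6 = 6, GFX7, GFX8 };

static unsigned
max_variable_threads_per_block(enum gfx_level level)
{
   /* GFX6 hits the HW bug above 256 threads with variable group sizes;
    * reporting 256 makes the GL frontend drop
    * ARB_compute_variable_group_size (which needs >= 512), while rusticl
    * is unaffected because it queries per-kernel limits instead. */
   return level == GFX6 ? 256 : 1024;
}
```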
08:47Venemo: mareko: do you have any thoughts on this?
08:48sima: the alsa and dt ones seem at least reviewed by maintainers, which I thought was good enough for the script?
08:48sima: or am I blind
08:48sima: tzimmermann, yeah just pushed it out
08:49sima: the fbcon is probably because the drm-misc entry isn't wide enough
08:49mripard: sima: reviewed-by doesn't mean a maintainer agrees with merging the patch through another tree
08:49sima: mripard, for the tegra one I guess more mailmap entries needed?
08:50sima: mripard, yeah but if that means tons of false positives we just train people to ignore dim
08:50mripard: then let's just remove that check
08:50mripard: if we don't want to enforce anything and people will just ignore it anyway
08:50mripard: what's the point?
08:51sima: it's not a matter of not enforcing, but fixing things so it's at a point where we can enforce
08:51mripard: but also, let's not complain about people being pissed about how liberal drm-misc is when merging stuff
08:52sima: so most likely that means chasing MAINTAINERS and .mailmap entries for a while until the s/n is a lot better
08:52mripard: sima: the vast majority of those warnings are legit except for three, and then everybody thought that it was worth ignoring
08:52sima: at least what I've checked all the hits looked like process was followed, it's just dim being a bit too dense
08:52mripard: (and nobody thought about reporting an issue to dim or whatever)
08:53sima: yeah, you by default train people to just ignore dim
08:53sima: that's the reality of this stuff
08:53sima: which means maintainers need to chase this a bit proactively, at least at first
08:57mripard: if you feel like this check is a net loss and is harming how people use dim in general, then as I said, let's just revert it. I can't do hand-holding right now, and it won't matter anyway because we'll still get people who just choose to ignore it and we'll discover it after the fact
08:57mripard: it's a failed experiment
08:57mripard: let's just wait for gitlab ci so we can properly fix this mess.
08:57sima: the thing is, gitlab ci won't fix this either
08:57sima: because looking through that entire list, in my maintainer eyes the entire pile of patches is fine
08:58sima: so it's either a case of the tool being too strict or .mailmap/MAINTAINERS not reflecting reality
08:58sima: and yeah if there's no one pushing to sort this fallout out, we have to revert the check until someone volunteers
08:59sima: adding checks without sorting out the fallout does not work
08:59sima: no matter how fancy your ci ui
09:00tzimmermann: sima, mripard, do these errors actually show up on 'dim apply-branch'? i seem to only hit them on pull-request, when it's already too late
09:00sima: and I do still think this is worth it as an experiment if, but only if, someone does the fallout handling until it's at a good state
09:02sima: tzimmermann, quick test says no
09:03sima: mripard, ^^ I guess that's your explanation
09:04sima: but also given how many hits this has I think we can't roll this out until there's much, much less noise
09:04tzimmermann: sima, mripard, another observation: i just had "dim: WARNING: afce57a71dd7 ("drm/kmb: Switch to drm_atomic_get_new_crtc_state()"): Mandatory Maintainer Acked-by missing., but continuing dry-run" from pull-request
09:05tzimmermann: that commit was made by maxime and reviewed by vsyrjala. i'd say that these two give enough legitimacy to the commit to not require a maintainer ack.
09:05tzimmermann: can this be addressed somehow?
09:06mripard: sima: you insist on calling it noise, but again, there's three patches in there that should be sorted out
09:06tzimmermann: maybe by adding more reviewers to drm-misc?
09:06sima: mripard, which three?
09:06mripard: so it really doesn't look like it would be good enough for you, even if we were to sort out the actual issues
09:07sima: and even if it's hitting 3 real issues, there's way too much noise for that rn still
09:07mripard: c08c931060c7, e3f4bdaf2c5b and 3a9cf301794c
09:08sima: c08c931060c7 is shmem which defo is maintained in drm-misc in practice
09:08mripard: tzimmermann: I think those are the cases that should be rightfully ignored
09:08sima: same for dp helpers
09:08mripard: it really is supposed to be maintained outside of drm-misc, but maintainers are ignoring those patches
09:08mripard: sima: they are, and there's something to fix in dim for those
09:09sima: mripard, oh I thought you've found 3 legit cases where dim rightfully complained
09:10mripard: sima: no, these three are the bugs
09:10sima: even ignoring that MAINTAINERS/.mailmap are not reflecting reality
09:10mripard: all the others are legit
09:10sima: yeah, but only at a "dim is right, the world is wrong" understanding of tooling
09:10sima: which doesn't work
09:10mripard: I mean, yeah, if you want to frame it that way sure
09:11mripard: I guess it also could be "the kernel is right, drm-misc is wrong"
09:11mripard: it's how *everybody* else operates in the kernel, it's documented as such, but somehow, dim should be the source of truth
09:11sima: like one reality thing is that people often don't bother to record the ack if there's an r-b there already
09:12sima: then there's mismatches that aren't captured in .mailmap
09:12mripard: and again, I'm fine with declaring it unfixable and reverting it
09:12sima: and then there's MAINTAINERS not being complete
09:12mripard: but then, people *should* complain about how drm-misc operates
09:13mripard: I don't buy the r-b vs a-b thing
09:13mripard: we manage to handle it just fine within drm
09:13mripard: I get ping'd by intel maintainers to give my ack to merge stuff on a regular basis for example
09:13sima: it's a matter of training a ton of people to really follow it
09:13mripard: it's how we should operate, and we do operate that way internally
09:14sima: which is inertia and needs someone to handle the fallout and train people
09:14sima: my point is that adding the check in dim and debugging it is probably 5% of the total effort to roll this all out
09:14sima: at least judging by that one drm-misc-next pr and how much it hits
09:16sima: the same with making sure that MAINTAINERS entries really capture all files and don't misfire on e.g. headers that are not included
09:16sima: and that people dutifully record all their aliases in .mailmap
09:16mripard: sima: then if I was expected (and still am) to do it all by myself, I'm sorry I didn't understand it, I'm sorry I wasn't able to do that task in the last few months and it created chaos and reduced the trust in dim, and I won't be able to do it in the next couple of months (hopefully) either, so we should acknowledge that and just remove the thing.
09:17sima: and only if we have that (and fix whatever bugs the current version seems to have) do we get to a level where the noise isn't drowning out the real stuff
09:17sima: mripard, yeah I guess that's what it boils down to :-(
09:17sima: also to make it clear: I've been on this not to give you a hard time, not at all
09:18sima: but to make sure that I still think it's something we should strive for, ideally
09:18sima: just turned out a lot more work than I thought
09:18sima: since ideally this helps to keep MAINTAINERS up to date and all that
09:21sima: mripard, also since this seems to have only fired for maintainers I don't think you reduced trust in dim for committers
09:24mripard: it fired every time someone applied a patch, and they chose to ignore those warnings instead of asking on irc, by mail, or reporting a bug to dim.
09:32sima: mripard, I didn't debug it, but I just applied one of the patches to drm-fixes and it didn't fire
09:32sima: so at least on a quick test it seems to not fire for apply-branch somehow
09:34sima: same result on drm-misc-fixes
09:35sima: bit too lazy to reset drm-misc-next back and try there, don't want to risk something stupid
09:40mripard: it's being executed in checkpatch_commit_push, which runs in dim_push_branch and dim_apply_pull
09:40mripard: ... but then dim_apply_pull isn't used
09:41mripard: or is it?
09:41mripard: but yeah, apply_pull doesn't seem to run any of the push checks
09:42sima: hm just tried a dry-run push-branch, also doesn't fire
09:42sima: no idea what's going on
09:42mripard: with "bad" patches?
09:43sima: yeah the alsa one that fired in the drm-misc-next pr
09:43sima: so defo outside of all drm trees, just in case it's a drm.git vs drm-misc.git difference
09:45sima: mripard, https://paste.debian.net/hidden/d97d6043/ trying to summarize all the points, if you still want to just revert for now
09:45sima: I guess could also just disable the check, but not sure that's better since it might bitrot either way
09:46mripard: ack
09:54sima: mripard, there's also pending mr from narmstrong ...
09:57narmstrong: I mean, it could be a warning or moved to a command like `dim check-acks`, but blocking the push is not good
09:58sima: narmstrong, https://gitlab.freedesktop.org/drm/maintainer-tools/-/merge_requests/93
10:00mripard: narmstrong: this experience shows that a warning or an extra command doesn't work.
10:01narmstrong: agree
10:03mripard: sima: there's also https://gitlab.freedesktop.org/drm/maintainer-tools/-/merge_requests/92
10:03sima: mripard, yeah I've added a comment to that in the mr, want me to include it in the commit message?
10:04sima: let me just do that
10:06sima: demarchi, I guess dim doesn't have marge yet?
10:08sima: narmstrong, so also ack from you on the revert?
10:10narmstrong: sima: yes ack
10:15sima: narmstrong, quick comment on the mr for the record?
11:51molinari: mairacanal, Hi Maíra. Just wanted to let you know that I've just sent a patch series involving V3D that you might be interested in ("[PATCH v5 00/12] drm: Reduce page tables overhead with THP"). It ends up adding THP support to Panfrost and Panthor like in V3D so I added GEM shmem helpers to simplify huge mountpoint handling in drivers.
12:19tzimmermann: tomba, ack on https://patchwork.freedesktop.org/patch/679745/?series=155688&rev=1 for omapdrm?
13:11demarchi: sima: why would we need marge?
13:55tzimmermann: tomba, thanks
13:56tomba: tzimmermann: np
13:56sima: demarchi, adds Link: tag back to mr, which might be useful for discussions
13:56sima: I think that's about the only thing that'd be useful for dim
14:25mripard: tzimmermann: thanks a lot for the PR
14:40Venemo: karolherbst, mareko: so, about those variable workgroup sizes. do you think rusticl would be fine if I lower the maximum workgroup size to 256 on the GPUs affected by the hw bug I mentioned above? or would you prefer if we still supported the current minimum 512? are there any compute apps that users are running that need 512?
14:41tzimmermann: mripard, np
14:42karolherbst: Venemo: rusticl doesn't care about the variable workgroup size cap
14:43Venemo: karolherbst: awesome, thanks
16:56alyssa: Venemo: how do I pronounce nlwgs
16:56Venemo: alyssa: nlwgs = nir_lower_workgroup_size
16:57alyssa: yes I know
16:57alyssa: how do I pronounce it in my head
16:57Venemo: I pronounce it as en-el-vee-gee-es
16:58alyssa: hmm.. quite a bit longer than "nlaw-giss"
16:58Venemo: hahaha
16:58karolherbst: it's wigs obviously as the nl is silent
16:58dwfreed: I would just spell it out
16:58dwfreed: n l w g s
16:58alyssa: "nil-wigs" is pretty good
16:58mlankhorst: haven't heard of a silent NLAW, but maybe that's just me. :-)
16:59alyssa: "why is he bald?" "nil wigs" "oh ok"
16:59Venemo: :D
17:00Venemo: I see I made the right choice when naming those functions
17:00alyssa: Venemo: anyway I am deffo not reviewing that pass but it looks really slick. great work :)
17:00Venemo: thanks
17:00Venemo: why not? I thought the laughs would inspire you to review it
17:02alyssa: I've got a lot on my plate right now
17:02Venemo: I see
17:02alyssa: and by that I mean I meant to review Lionel's MRs, like, a month ago
17:04Venemo: I understand
17:56pac85: Venemo: awesome work! I have a question: I had scratched my head about this emulating-wgs problem, and my thinking was that any cf that wraps barriers needs to be uniform across the whole workgroup (by spec), so it doesn't need any manipulation, and any cf that doesn't contain barriers could just be replicated. So why do you manipulate cf wrapping barriers?
18:00Venemo: pac85: that's exactly how it works, any CF that doesn't have barriers is replicated. CF that contains barriers needs special handling because all logical subgroups need to reach the barrier at the same time. so basically all work they do before the barrier should go before the barrier, and vice versa
18:03Venemo: pac85: when a barrier is inside CF, that's tricky, but the same thought applies recursively. parts that don't have a barrier are just replicated, and the part that does have the barrier is split again
18:06Venemo: pac85: you are correct that the spec mandates that all workgroup barriers need to be in workgroup-uniform CF, but that fact isn't helping the job of the pass. though you are also right that some of the emitted code could be simplified if we had a workgroup divergence analysis. I opened an MR for that years ago which unfortunately isn't reviewed yet. I plan to revisit it soonish.
18:07Venemo: so, in this pass, I didn't want to rely on the divergence info not only because we don't have it but also because some game and app developers have a very loose understanding of the spec
18:08Venemo: pac85: does that answer your question?
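(A hand-wavy sketch of the recursive strategy Venemo describes above, as C-style pseudocode; every type and helper here is made up, this is not the actual nir_lower_workgroup_size code.)

```c
/* Pseudocode sketch with hypothetical types/helpers. */
static void
lower_range(struct cf_range r)
{
   if (!contains_barrier(r)) {
      /* No barrier inside: each logical subgroup can run this code
       * independently, so just replicate it per logical subgroup. */
      replicate_for_each_logical_subgroup(r);
      return;
   }

   /* A barrier splits the range: every logical subgroup must finish the
    * pre-barrier work before any of them continues past the barrier. */
   struct cf_split s = split_at_first_barrier(r);
   lower_range(s.before); /* replicated, or split again recursively */
   emit_hw_barrier();     /* one real barrier for the whole workgroup */
   lower_range(s.after);  /* recurse on the remainder */
}
```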
18:50pac85: Venemo: thanks, yeah it does make a lot of sense that you split the cf recursively like that. What I meant is that in `nlwgs_augment_if` you do this manipulation where you or together the conditions for all subgroups, then re-emit ifs per subgroup inside of it. I felt like, since `nlwgs_augment_if` is only called for ifs wrapping barriers, all the predicates in the recursion stack to each call would have to be uniform, and so this manipulation wouldn't do anything.
18:54Venemo: pac85: first, thanks for looking at the code. regarding the augment_if, you are correct that it wouldn't be necessary for just workgroup-uniform CF. (and likely will be optimized back to how it was before). however I'd like to not break apps that do stupid stuff and put barriers inside divergent CF.
18:55Venemo: pac85: for example, having a workgroup barrier inside "if (elect())" would work in practice, even though it's technically not in uniform CF. I don't want to break that.
18:59Venemo: pac85: a secondary motivation for writing the code that way is that I wanted it to be correct even without the "just replicate everything" strategy, and I prefer it to work with divergent CF as well for the sake of testing.
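(Roughly what the augment_if manipulation being discussed looks like, written out as commented pseudocode; the shape is inferred from the conversation, not taken from the actual nlwgs_augment_if output.)

```c
/* Before (one if wrapping a barrier; cond may differ per logical subgroup):
 *
 *    if (cond) { A; barrier(); B; }
 *
 * After, roughly: the outer if takes the OR of the condition across all
 * logical subgroups, so every subgroup that needs the barrier reaches it:
 *
 *    any = cond[0] | cond[1] | ... | cond[N-1];
 *    if (any) {
 *       for each logical subgroup s: if (cond[s]) A(s);
 *       barrier();
 *       for each logical subgroup s: if (cond[s]) B(s);
 *    }
 *
 * With workgroup-uniform conditions this should optimize back to the
 * original shape, but it keeps patterns like a barrier inside
 * "if (elect())" working. */
```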
19:00pac85: Venemo: aah ok I see, fair enough! it's not much complexity anyway. And no problem I read it out of curiosity! Nice stuff
19:01Venemo: pac85: thank you. would you like to add a Rb?
19:01pac85: Sure!
19:06glehmann: how is intel even vk/gl conformant with all of their txd lowering? not having aniso for tex arrays sounds really bad, for example
19:17Venemo: thank you, pac85 this is much appreciated :)
19:19pac85: Venemo: No problem at all! Was an interesting convo!
22:18mareko: Venemo: OpenGL requires 512 as the minimum
23:02Venemo: mareko: wdym?
23:07Venemo: mareko: my understanding is that it should be fine to reduce max_variable_threads_per_block to 256. in this case, st_init_extensions will automatically drop support for ARB_compute_variable_group_size but otherwise everything should be fine, right? and based on my conversation with karolherbst above, this is also fine for rusticl. please correct me if I'm wrong
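(A tiny sketch of the gating Venemo is relying on here; the helper name is made up, and the 512 minimum is the figure quoted in this conversation, not checked against the spec text.)

```c
/* Hypothetical helper mirroring what st_init_extensions is said to do. */
static bool
can_expose_variable_group_size(unsigned max_variable_threads_per_block)
{
   /* ARB_compute_variable_group_size requires workgroups of at least 512
    * invocations, so a driver cap of 256 simply disables the extension. */
   return max_variable_threads_per_block >= 512;
}
```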
23:07karolherbst: I think that just opens up the question if support for that ext should be dropped on gfx6 or if it should be fixed
23:08karolherbst: which isn't my call to make anyway
23:09airlied: well do we also want CL on gfx6?
23:09Venemo: I can fix it to support 512, but I would be more motivated to fix it if something or someone actually needs it
23:09Venemo: airlied: as far as I understand CL will be happy with 256 too
23:10Venemo: karolherbst: did I misunderstand you earlier? will it regress rusticl if the maximum variable workgroup size is 256?
23:15karolherbst: Venemo: rusticl doesn't care about that limit
23:15karolherbst: it's really only affecting GL_ARB_compute_variable_group_size
23:15Venemo: does this mean that rusticl doesn't use variable workgroup size, or that rusticl will happily create larger workgroups than the limit?
23:16karolherbst: it uses the `get_compute_state_info` function to query the maximum workgroup size and drivers are free to report whatever they want given a compiled compute shader
23:16karolherbst: the pain point with GL is that it only has a global limit that has to work across all shaders
23:17karolherbst: in CL those limits are per compiled kernel and don't have any minimum values a driver needs to support
23:17karolherbst: so the cap just doesn't make any sense to use
23:17karolherbst: the driver can already say "for this given shader, you only get 128 tops" and that's fine
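(A hedged sketch of the per-kernel query karolherbst is referring to, against gallium's get_compute_state_info; the callback signature and field name are from memory and may be inexact.)

```c
/* Hedged sketch: how a frontend like rusticl gets per-kernel limits. */
static unsigned
query_kernel_max_threads(struct pipe_screen *screen, void *kernel_cso)
{
   struct pipe_compute_state_object_info info = {0};
   screen->get_compute_state_info(screen, kernel_cso, &info);
   /* The largest workgroup size valid for *this* compiled kernel; the
    * driver may answer 256, or even 128, independent of any global cap. */
   return info.max_threads;
}
```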
23:17Venemo: so, if si_get_max_workgroup_size tells rusticl that the max is 256, is that fine? or is that going to limit some apps in some ways?
23:18karolherbst: well it does limit apps as they can only use 256 big workgroups, but they do have an API to query this
23:18karolherbst: no idea if there are applications that strictly need more
23:18karolherbst: but they can also compile with a fixed workgroup size anyway
23:18karolherbst: in which case they might even be able to run bigger workgroups
23:20Venemo: alright. so my takeaway from that is still that it's fine to reduce that limit to 256 then
23:20karolherbst: in regards to rusticl yes
23:22Venemo: karolherbst: when rusticl uses variable workgroup sizes, is there a way for it to know the maximum size that the app may use?
23:22karolherbst: nope
23:22Venemo: ok
23:22karolherbst: well..
23:22karolherbst: not more than reported tho :D
23:23karolherbst: but it wouldn't know what the application expects
23:23Venemo: you said, "in CL those limits are per compiled kernel" - hence my question
23:24karolherbst: well apps either specify the size in the source code or they have to deal with whatever they get
23:24Venemo: I assume when you say specify in the source code you mean that's an exact workgroup size then and not variable
23:24karolherbst: correct
23:25karolherbst: well...
23:25karolherbst: they can also specify it as a hint
23:25karolherbst: which I have no idea if many do that
23:25Venemo: what happens when they specify it as a hint?
23:26karolherbst: driver could make use of that information to create an optimized variant
23:26karolherbst: which rusticl does
23:26karolherbst: so rusticl compiles a variable shader and a fixed one in that case
23:26Venemo: I see
23:27Venemo: okay, I think it's clear now. thanks karolherbst
23:28karolherbst: from an API perspective the difference is that with a required workgroup size, the application can only use that specific workgroup size to launch the kernel, while with a hint it can still use whatever it wants, it just tells the runtime that the hint size is most likely used, but might not always be
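(A short OpenCL C illustration of the two attributes being contrasted here; the kernels themselves are made-up placeholders.)

```c
/* reqd_work_group_size: the app may ONLY launch this kernel with a local
 * size of exactly (64, 1, 1). */
__kernel __attribute__((reqd_work_group_size(64, 1, 1)))
void scale_fixed(__global float *buf, float f)
{
   buf[get_global_id(0)] *= f;
}

/* work_group_size_hint: any local size is still allowed; the runtime may
 * pre-compile an optimized (64, 1, 1) variant, as rusticl does. */
__kernel __attribute__((work_group_size_hint(64, 1, 1)))
void scale_hinted(__global float *buf, float f)
{
   buf[get_global_id(0)] *= f;
}
```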
23:28Venemo: got it
23:28Venemo: mareko: so, is it ok if I limit the maximum variable workgroup size on gfx6 to 256? afaiu we then only lose GL_ARB_compute_variable_group_size and everything else will be fine.