IRC Logs of #wayland on irc.freenode.net for 2023-12-12

00:27 JoshuaAshton: riteo: Without commit_queue, FIFO is completely broken which results in this behaviour, there is no overlap
00:28 JoshuaAshton: That's why we are shipping it in SteamOS so early, it's required in order to not regress massively. You just stall in QueuePresent (before present happens) if your last frame has not been presented yet
00:28 JoshuaAshton: That's how "FIFO" is implemented right now :P
00:29 JoshuaAshton: It also depends on the compositor only sending `frame` once it has presented a new buffer for your surface, and not every vblank
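The stall described above can be sketched as a toy timeline. This is only an illustrative model, not Mesa's actual WSI code: the function name, the microsecond clock, and the rule that a commit is latched at the next vblank strictly after it are all assumptions made for the example.

```python
def simulate_fifo_stall(render_us, refresh_us, frames):
    """Toy model: QueuePresent blocks until the previous frame was presented.

    Returns a list of (frame, render_done, stalled, present) tuples, all in
    microseconds. Hypothetical helper, not part of any real API.
    """
    timeline = []
    now = 0            # app clock
    last_present = 0   # when the previous frame reached the screen
    for i in range(frames):
        render_done = now + render_us
        # The stall: if the previous frame is still pending, wait for it.
        commit = max(render_done, last_present)
        stalled = commit - render_done
        # Assumption: the compositor latches at the next vblank strictly
        # after the commit, and only then sends the frame event.
        present = (commit // refresh_us + 1) * refresh_us
        timeline.append((i, render_done, stalled, present))
        now = commit   # no overlap: the next frame starts only after the commit
        last_present = present
    return timeline


if __name__ == "__main__":
    for frame in simulate_fifo_stall(render_us=10_000, refresh_us=16_000, frames=5):
        print(frame)
```

With a 10 ms render time and a 16 ms refresh interval, this model stalls for 0, 0, 2, 6 and 6 ms over the first five frames, because the application cannot start the next frame while it waits in the present call.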
00:30 JoshuaAshton: > If it works fine on XWayland but not Wayland then something is really wrong
00:31 JoshuaAshton: Yes it works fine on XWayland, as it has its own internal queue in the XServer -- Mesa Wayland WSI does not do such, I suggested that in the past as a stop-gap and daniels told me it was not possible as there are applications that expect QueuePresent to do a wayland commit, and that's basically ABI at this point, which makes sense but is also bleh
00:36 JoshuaAshton: Anyway, the best way to check this sort of thing is GPUVIS
00:36 JoshuaAshton: Highly recommend checking that out if you haven't, it's also very easy to use on the SteamOS Devkit Client.
01:06 riteo: JoshuaAshton: Thanks a lot, now I realize what you meant all this time
01:06 riteo: also thanks a lot for the tool, I definitely need better tooling if I want to start making more sense out of the black magic of graphics
01:08 riteo: so uh, this is pretty much a lost battle from the start without a proper queue. I'll consider my 2000Hz hack more than enough for the time being then :P
01:11 riteo: once that works out I'm sure that we can find a better solution. It's hard to measure any difference if the frame times are that messy and I seriously don't feel like going "blind" on these kinds of optimizations, especially as I don't have much experience
01:11 riteo: I might lower the tickrate though, let's see
08:29 MrCooper: Company: KMS is designed for bare metal, not for nested passthrough, something else such as Wayland is better for the latter
08:31 Company: MrCooper: that still leaves the question on how you do a good VM interface
08:34 Company: I mean, you obviously can get by with the simple swap API, but at that point you need to copy asap on the host
08:36 Company: but if you can figure out an API like GL, you're pretty close to a swap API but you have multibuffer
08:37 Company: I suppose that's hard though because the client allocates the buffers?
08:38 wlb: weston/main: Philipp Zabel * libweston-desktop: Work around crash when opening popup menu https://gitlab.freedesktop.org/wayland/weston/commit/b72785e1f651 libweston/desktop/seat.c
08:38 wlb: weston Issue #853 closed \o/ (Segfault on right-clicking weston-terminal https://gitlab.freedesktop.org/wayland/weston/-/issues/853)
08:38 wlb: weston Merge request !1416 merged \o/ (libweston-desktop: Work around crash when opening popup menu https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/1416)
08:50 Company: MrCooper: btw, we talked about software rendering and using 565 instead of RGBA8 - I tried that recently to see if it does anything with GTK and ran my benchmark with LP_NUM_THREADS=1 LIBGL_ALWAYS_SOFTWARE=1 with both
08:50 Company: I got 35fps with RGBA8 and 30fps with RGB565
08:52 Company: so there's not much of a benefit to be found inside GTK itself I guess
08:52 MrCooper: heh, I guess 565 isn't as optimized as 8888 in llvmpipe
08:53 Company: I suspect the bottleneck isn't memory with GTK
08:53 Company: though most of the work we do is blitting glyph masks
08:55 Company: but my assumption is that setup overhead is more relevant than the actual blitting
08:59 Company: my main concern with going with 565 is the overhead of setting it up
08:59 Company: the whole fbconfig selection machinery inside GTK needs to choose it - and it very much doesn't ever want to right now
09:00 Company: plus you run into the issue that you have no alpha channel, so you need to restyle your UI and get rid of rounded corners and shadows
09:02 Company: so my gut says it's not worth doing, but I might change that opinion with profiling from gnome-shell that shows that its compositing gets loads faster
09:03 Company: I think the time is way better spent on minimizing damage regions and optimizing shaders
09:15 kennylevinsen: Company: re: kms commit, you don't need to copy ASAP, just before the next completed pageflip
09:16 Company: kennylevinsen: I have no control over the buffer if I wl_surface_attach() it
09:16 Company: so I need to copy before
09:17 kennylevinsen: the virtio-gpu driver decides when the pageflip completes, so who says it couldn't wait until buffer release? :)
09:17 Company: it could - but at that point you run into all the problems you have with a 2-buffer swapchain in a regular app
09:22 kennylevinsen: Not quite - the two buffers locked there are not part of a two-buffer swapchain
09:23 kennylevinsen: it's two locked buffers out of an N-buffer swapchain owned by some KMS client
09:24 Company: I was more thinking about the host side
09:25 Company: virt-manager or gnome-boxes
09:26 kennylevinsen: yeah, but the need for 3 buffers in the swapchain is to account for the next render buffer - which exists somewhere inside the VM
09:27 kennylevinsen: the need for 4 is a bit hairier, I'd have to draw stuff to understand what would happen :P
09:27 Company: you're also assuming that all apps in the chain immediately release the buffer when they don't need it anymore
09:28 kennylevinsen: No I was assuming they'd release it some time after it was superseded, with pageflip delayed until that point
09:29 Company: I was thinking that one buffer is stuck in qemu, one in boxes, one in the host's compositor and suddenly that's quite a few
09:31 kennylevinsen: this is all hypothetical of course, but only the host compositor should be holding buffers if this was to support scanout
09:31 kennylevinsen: qemu and boxes would just be forwarding a dmabuf
09:31 Company: sure
09:32 Company: but once your system is under load, that forwarding might take a bit
09:32 pq: JoshuaAshton, "games are not event driven", yeah, that is getting old. Maybe one day someone designs a game engine in a more modern design that separates screen updates from all the rest. I did read the game-loop pattern article, and I think it stops where we would only begin.
09:33 kennylevinsen: Company: well, if load is causing forwarding that message to take significant time, then I think it's fair to consider the system unresponsive and performance to be best effort
09:33 kennylevinsen: significant time being some large fraction of the refresh cycle
09:34 Company: sure - but that's the 3/4 buffer cases in a regular app, too
09:34 kennylevinsen: no, the 3/4 buffer case is due to latching
09:35 kennylevinsen: 2 of those buffers are stuck waiting exactly a refresh cycle, the 3rd is the next buffer in the compositor, 4th the next render buffer
09:35 Company: actually, you're assuming that doesn't happen in boxes or qemu
09:35 Company: GTK syncs to the frame cycle, so you have latching there, too
09:36 kennylevinsen: there could possibly be issues there - it would be up to the client to send off the dmabuf as soon as possible of course to not delay things
09:36 pq: JoshuaAshton, "games are discrete simulations" - yes, but what I and emersion imagine does not conflict with that at all. That article you mentioned already has all the pieces. The only thing missing is to take the rendering out of the fixed course of the event loop and stop doing blocking platform function calls that cannot be woken up arbitrarily.
09:36 kennylevinsen: my idea would fall apart if it decides to wait entire refresh cycles with it
09:37 Company: I think it's unreasonable to assume clients don't do that
09:37 kennylevinsen: well, it's not an arbitrary client here, it's specifically a qemu viewer client designed for the task
09:37 kennylevinsen: using qemu-specific APIs, with contracts to uphold
09:38 kennylevinsen: could just say "don't grab the buffer until you're ready to show it" for example
09:38 Company: I don't think that works
09:39 kennylevinsen: I don't see why it wouldn't, but we're also arguing over a half-baked idea
09:39 Company: you want your viewer to use a toolkit so that you can manage your vm window
09:39 kennylevinsen: honestly, a much bigger issue in scanout would be that the buffers from the client might not be scanout capable and getting dmabuf hints working right would be a pain
09:41 Company: it would be pain to get it right for all drivers I suppose
09:41 Company: but for the big 3 drivers almost everything is scanoutable, no?
09:41 Company: (intel, amd, nvidia)
09:41 kennylevinsen: Company: tbh, a toolkit-less VM viewer would be fine for me, but there's a difference between something being impossible and it being hard to get Gtk to adhere...
09:42 kennylevinsen: Company: hahahaha, no it's much safer to assume that almost nothing is scanoutable
09:42 Company: I read the code and it looked like the Vulkan wsi doesn't even look at the scanout flag
09:42 kennylevinsen: if the buffer doesn't match the *current* buffer format you might need a modeset for example
09:42 Company: yet it was directly scanouted fine
09:43 Company: so I'm not too worried
09:44 emersion: for these drivers, the scanout flag is ignored, but some modifiers are scanout-capable and some not
09:44 Company: it's probably an 80/20 thing where 80% of the stuff just works and 20% are so hard to get right that people just give up
09:45 emersion: historically, the scanout flag has been ignored when using modifiers, because supplying modifiers to mesa would mean "if KMS supports this modifier, assume scanout"
09:45 emersion: not in mesa we properly propagate that flag, but drivers still ignore it
09:45 emersion: now in mesa*
09:45 Company: in GL
09:45 Company: but not in Vulkan
09:46 emersion: in vulkan we could use the private mesa ext
09:47 Company: I'd much rather the flag went away and it'd be encoded in modifiers
09:48 emersion: the flag cannot be encoded in modifiers, the modifiers only represent the tiling
09:48 emersion: buffer layout
09:48 emersion: not things like placement, alignment, etc
09:48 Company: sure
09:49 MrCooper: I did mention that before :)
09:50 Company: people aren't even using modifiers properly yet, I don't think anyone's gonna deal with yet another thing in the near future
09:50 kennylevinsen: whether a modifier can scanout also depends on the format and plane and such
09:50 MrCooper: pq: yeah, it's mind-blowing that games relying on rendering to have certain performance characteristics for correctness is considered state of the art in 2023
09:51 kennylevinsen: e.g., most modifiers on my old intel can scanout ARGB8888 but some can also scanout NV12 and YUYV
09:51 kennylevinsen: and some planes only accept some modifiers
09:51 Company: kennylevinsen: to me as a dmabuf consumer modifiers are per-fourcc anyway
09:51 kennylevinsen: with e.g. my last primary plane supporting fewer modifiers than the first two
09:52 kennylevinsen: so even if an app can scanout on my first screen it might not work on my third
09:52 Company: basically a dmabuf comes with a 96bit GUID that identifies it and I have to carry that thing around
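The "96bit" figure presumably refers to the 32-bit DRM fourcc plus the 64-bit format modifier. A minimal sketch of that pairing follows. The two packing helpers mirror the `fourcc_code` and `fourcc_mod_code` macros from `drm_fourcc.h`, while `dmabuf_format_id` is a hypothetical helper invented for this example, not an existing API.

```python
import struct


def fourcc_code(a, b, c, d):
    """Pack four ASCII characters into a little-endian 32-bit DRM fourcc."""
    return ord(a) | (ord(b) << 8) | (ord(c) << 16) | (ord(d) << 24)


def fourcc_mod_code(vendor, value):
    """Build a 64-bit modifier: vendor in the top 8 bits, value in the low 56."""
    return (vendor << 56) | (value & 0x00FFFFFFFFFFFFFF)


# Values as defined in drm_fourcc.h.
DRM_FORMAT_ARGB8888 = fourcc_code("A", "R", "2", "4")
DRM_FORMAT_MOD_VENDOR_INTEL = 0x01
DRM_FORMAT_MOD_LINEAR = fourcc_mod_code(0, 0)
I915_FORMAT_MOD_X_TILED = fourcc_mod_code(DRM_FORMAT_MOD_VENDOR_INTEL, 1)


def dmabuf_format_id(fourcc, modifier):
    """Hypothetical helper: 12 bytes = 32-bit fourcc followed by 64-bit modifier."""
    return struct.pack("<IQ", fourcc, modifier)


if __name__ == "__main__":
    ident = dmabuf_format_id(DRM_FORMAT_ARGB8888, I915_FORMAT_MOD_X_TILED)
    print(len(ident) * 8, ident.hex())
```

Nothing in these 96 bits says how the buffer was allocated (placement, alignment), which is why a scanout flag does not fit into the identifier.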
09:52 kennylevinsen: fair, I'm looking at the modifier and format identifiers directly
09:53 MrCooper: think of scanout as the 97th bit
09:53 Company: MrCooper: that doesn't work, because all the APIs want just 96bits
09:53 Company: well, the good ones, v4l still just wants 32
09:53 emersion: the scanout flag is only relevant for allocation
09:54 kennylevinsen: even if you had that bit, such buffer is not necessarily scanoutable
09:54 emersion: it's not relevant for import
09:54 kennylevinsen: as mentioned, it might work for scanout on one monitor but not the next, or might not match the output configuration and require modeset
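The point that scanout capability depends on the plane as well as on the format and modifier can be illustrated with a small lookup. The plane names and the supported pairs below are made up for the example. Real values would come from each KMS plane's advertised format/modifier list, and even a match only makes scanout possible, not guaranteed.

```python
# Hypothetical per-plane tables of (format, modifier) pairs that can be
# scanned out. These entries are illustrative, not real hardware data.
PLANES = {
    "primary-0": {
        ("ARGB8888", "LINEAR"),
        ("ARGB8888", "X_TILED"),
        ("ARGB8888", "Y_TILED"),
        ("NV12", "Y_TILED"),
    },
    "primary-2": {
        ("ARGB8888", "LINEAR"),
        ("ARGB8888", "X_TILED"),
    },
}


def can_scanout(plane, fmt, modifier):
    """True if this plane lists the exact (format, modifier) pair."""
    return (fmt, modifier) in PLANES[plane]


def usable_planes(fmt, modifier):
    """Names of all planes that list the pair, sorted for stable output."""
    return sorted(name for name, pairs in PLANES.items() if (fmt, modifier) in pairs)


if __name__ == "__main__":
    # The same buffer may be a scanout candidate on one output but not another.
    print(can_scanout("primary-0", "ARGB8888", "Y_TILED"))
    print(can_scanout("primary-2", "ARGB8888", "Y_TILED"))
    print(usable_planes("ARGB8888", "LINEAR"))
```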
09:55 Company: I'm aware of all of that - but it doesn't change the fact that I carry 96bit GUIDs and that's all I can do, because nobody supports anything else
09:56 kennylevinsen: what would you use the bit for if you had it?
09:56 Company: to give it to whoever needs it
09:57 kennylevinsen: ... does anyone need it? the compositor doesn't, and no one else can scanout
09:57 Company: whoever allocates the buffers
09:58 Company: vapi or qemu I guess
09:59 kennylevinsen: why not just feed it the dmabuf hints?
10:00 kennylevinsen: i guess qemu could also just mimic the host GPU capabilities
10:00 Company: because there's no "dmabuf hints" - if there was a libdmabuf with a dmabuf_hints_t and I could just hand that to them, that'd be neat
10:00 Company: but that's not how the world works - at least not today
10:01 kennylevinsen: huh? I'm talking about https://wayland.app/protocols/linux-dmabuf-unstable-v1#zwp_linux_dmabuf_feedback_v1
10:01 Company: yeah, that's a Wayland thing
10:01 Company: but nobody else has that
10:02 Company: Wayland also has https://wayland.app/protocols/linux-dmabuf-unstable-v1#zwp_linux_buffer_params_v1:enum:flags - and nobody else has that either
10:02 kennylevinsen: that's what qemu would have to be able to absorb if you want any scanout success
10:02 Company: any guaranteed scanout success
10:03 Company: so far I'm going for "accidentally works"
10:03 kennylevinsen: allocating scanoutable buffers that never scanout is not a good idea
10:03 kennylevinsen: before dmabuf hints, direct scanout was more of a "if the stars and moons aligned on a day that started with the same letter as your name" ordeal...
10:04 pq: pac85, usually there are multiple necessary conditions to kick off a new frame rendering, not any single event.
10:04 Company: it seems to work so well that Valve hasn't complained about Wayland games not getting scanout
10:04 Company: or have they?
10:04 Company: s/Wayland/Vulkan/
10:11 MrCooper: Xwayland was getting lucky for hysterical raisins
10:11 ofourdan: the best ones, always.
10:12 MrCooper: (always allocating scanout capable buffers, which can be bad as well when scanout isn't possible anyway for other reasons)
10:12 Company: ofourdan: btw, you know you've got a Youtuber fanboi?
10:13 ofourdan: do I?
10:13 ofourdan: I fear the worst…
10:13 Company: https://www.youtube.com/watch?v=h-b1hvsmG6Y
10:14 Company: he did another video about your recent blog post
10:14 ofourdan: oh, that looks actually positive (although I haven't actually looked or listened…)
10:15 Company: I was serious with that fanboi term
10:15 Company: you should go on his podcast and talk about XWayland
10:15 ofourdan: heh
10:16 emersion: though don't listen to the one about wayland-protocols…
10:16 ofourdan: that "internet" thingy scares me, ya know… ;)
10:16 Company: https://www.youtube.com/c/TechOverTea is the podcast, but I don't listen to it
10:17 Company: or rather: rarely, only when the guests are interesting
10:17 Company: emersion: yeah, you have an opinion, he doesn't like you at all
10:17 ofourdan: heh
10:21 emersion: it does seem like i have a special place in their heart, but also, it's not just about me
10:21 emersion:shrugs
10:22 kennylevinsen: you have other fanboys, don't worry
10:22 emersion: lol
10:23 Company: at least now you know someone reads your comments in merge requests
10:53 pq: Company, KMS does not have buffer release events, it only has page flip events, which means what happens in the host side of a VM situation cannot affect how many buffers the guest could benefit from.
10:55 pq: Company, the only way how direct scanout from a VM guest through host to hardware can work is to delay the page flip event to the guest app until the new framebuffer has been forwarded *and* the previous framebuffer has been made idle, which tends to decimate framerate as there is no room to pipeline through all the layers.
11:07 pq: ManMower, riteo, kennylevinsen, the wl_event_queue has unbounded storage to hold incoming events. The "tiny wayland buffer" that overflows is actually in the compositor side.
11:26 alatiera: speaking of wayland-protocols
11:26 alatiera: I was thinking of making a joke MR about adding ext_keylogger_v1
11:26 alatiera: but I am not sure if that would be taken jokingly or it would be annoying
11:28 pq: There is no joke that everyone would take as a joke, so I think a flame war or at least bad PR would be guaranteed.
11:30 Company: alatiera: you have 4 months to come up with good phrasing for that joke
11:30 pq: please don't
11:30 pq: it can only hurt us
11:30 pq: there was another incident this year, but I've fortunately forgotten what it was about.
11:31 Company: I don't think ext_keylogger is funny, because Wayland security is not meme status
11:31 alatiera: "do everything x11 does" is the meme though
11:32 Company: ext_x11_v12 would be funnier
11:33 pq: If you file a MR, maintainers must take it seriously regardless if it's a joke. That takes effort. If maintainers shrugged it off as a joke, that could be used as a weapon against them "they don't even bother telling why not".
11:33 kennylevinsen: pq: Yeah, so as long as the client can read_events it should stay alive even if it never dispatches...
11:34 pq: kennylevinsen, yup. All the way until OOM.
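The difference between the two buffers can be sketched with a toy model. It is not libwayland code: the class, its method names, and the capacity of four events are assumptions chosen to keep the example short. It models a small bounded buffer on the compositor side and an unbounded event queue on the client side.

```python
from collections import deque


class ToyConnection:
    """Toy model of one client connection (all names are hypothetical)."""

    COMPOSITOR_BUFFER_CAPACITY = 4  # assumed tiny, bounded compositor-side buffer

    def __init__(self):
        self.in_flight = deque()    # sent by the compositor, not yet read
        self.event_queue = deque()  # client-side queue, no upper bound
        self.connected = True

    def compositor_send(self, event):
        if not self.connected:
            return
        if len(self.in_flight) >= self.COMPOSITOR_BUFFER_CAPACITY:
            # The bounded side overflows: the compositor drops the client.
            self.connected = False
            return
        self.in_flight.append(event)

    def client_read_events(self):
        # Moves events into the client queue without running any handlers.
        while self.in_flight:
            self.event_queue.append(self.in_flight.popleft())

    def client_dispatch(self):
        # Runs handlers (here: just counts) and empties the queue.
        handled = len(self.event_queue)
        self.event_queue.clear()
        return handled


def run(reads, events=100):
    """Send events to a client that never dispatches; optionally let it read."""
    conn = ToyConnection()
    for i in range(events):
        conn.compositor_send(i)
        if reads:
            conn.client_read_events()
    return conn.connected, len(conn.event_queue)


if __name__ == "__main__":
    print(run(reads=True))
    print(run(reads=False))
```

In this model a client that keeps reading stays connected while its queue grows with every event, and a client that stops reading is dropped as soon as the bounded side fills up.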
11:35 Company: alatiera: it'd probably be funnier as a github gist though, because that's how that meme goes
11:35 alatiera: Company: it's now a proper repo!
11:36 Company: you could file an MR for ext_keylogger_v1 against it!
11:36 alatiera: yes but then I'd have to interact with those people
11:39 soreau: I couldn't wait for an extension so I implemented printer support in the compositor..
11:39 soreau: a keybinding to run a script that screenshots the screen and calls lp. done.
11:40 Company: pq: yeah, I think with the current API you pretty much always want to copy the buffer, which is fine for desktop apps I guess, but you won't get good results in GL benchmarks
11:40 davidre: I wonder if they expect compositors to implement those protocols (if there ever will be any)
11:40 Company: though who runs GL benchmarks in a VM?
11:42 Company: davidre: "they" is composed of too many groups of people to have a clear answer for that
11:43 Company: like, a lot of "they" is people who want to stop Wayland from happening because they like their X11 setup and don't want to change - they don't care at all what happens, as long as it doesn't make X11 go away
11:45 soreau: 'they' are too comfortable writing questionable code in the wm and just putting it on a loop in a script, so 'who cares if the wm dies?' :P
11:46 pq: Company, indeed. IIRC there was even a Weston issue about running: host, weston, qemu, weston; and complaining that framerate is half of what it should be without copying in qemu.
11:46 pq: and you could make it run at full rate if you adjusted each Weston's timings carefully
11:47 pq: (repaint-window weston.ini option)
11:47 soreau: hax
11:47 pq: yes
11:48 soreau: sounds kinda like wlroots max_frame_time or whatever it's called
11:48 pq: possibly
11:48 Company: it should really have some kms interface to do a swapchain
11:48 soreau: well, not a wlroots-specific concept but impl. in many wlroots compositors
11:48 Company: to do this well
11:49 pq: yeah, well... that's also not what KMS is for, really
11:51 kennylevinsen: soreau: the implementation differs a bit, but same concept yeah
11:51 pq: it's just really convenient for VMs to offer a KMS driver, because most userspace software already works on KMS, right up until it becomes inconvenient when the VM behaviour differs from hardware behaviour. Like cursor hotspots and swapchains.
11:51 Company: it's a kernel module, it could just swap the memory behind the fd!
11:51 Company: yeah, it gets tricky at some point
11:51 pq: Company, then it would also need to make sure the other memory has the same contents as the original, because guest userspace does incremental drawing...
11:51 Company: I imagine people want rootless clients in a VM
11:53 emersion: pq, and then scaling and multi-output layout and …
11:53 pq: thankfully I think rootless clients is something no-one would try to shoehorn through KMS, but there are lower hanging culprits. Like the user resizing the VM viewer window and expecting the VM to resize accordingly in it.
11:54 Company: yeah
11:56 pq: the cursor hotspot thing was implemented in KMS recently, along with a way to allow VM viewers to hijack the cursor plane position without breaking userspace by having userspace promise to not use cursor planes for anything else than the cursor of the one... err, aggregate of all mouse/pointer devices exposed to the guest.
11:57 pq: a really common setup, but ugh the special-casing just to get KMS working well enough
12:13 kennylevinsen: I wonder what happened to virtio-wl...
12:14 davidre: Company: I was trying to refer to the fact that the repo is now the thing, not a GitHub gist anymore
12:15 davidre: Where the specific repo I know of is empty and I doubt mainstream compositors would implement anything
12:38 vaxry: I don't know if this has been posted already, but probonopd is being probonopd again https://github.com/probonopd/wayland-x11-compat-protocols lmao
12:43 bl4ckb0ne: at last, wleyes
12:49 davidre: Yes, that's what I was thinking about
15:11 wlb: weston Issue #854 opened by Pekka Paalanen (pq) Name struct owning field `owner` https://gitlab.freedesktop.org/wayland/weston/-/issues/854
15:19 wlb: weston Issue #322 closed \o/ (Use wl_global_remove in libweston https://gitlab.freedesktop.org/wayland/weston/-/issues/322)
19:08 pitust: is there a protocol for asking the compositor to draw my surface on top of a screen (for a notification or something like that)
19:10 pitust: hmm, maybe zwlr_layer_shell_v1 would work
19:43 riteo: pq: re big client-side buffer, nice to know
23:24 stooov: ola people. I noticed that in wayland, chrome|brave|webkit's pointerlock hides the pointer but doesn't hold it in place - and it can hit the edge of your screen. have a go: https://svelte-dj-knob.netlify.app/
23:25 stooov: it works fine (pointer reappears where it began the lock) on firefox, X11, or if your pointerdown event is within about 20px of the edge of the page.
23:44 soreau: stooov: it might be that the compositor you're using doesn't implement pointer lock correctly? have you tried others?
23:45 soreau: the web app you linked to seems to be working correctly in wayfire
23:46 ifreund: yeah, works in river too