09:04wlb: weston Merge request !1389 opened by HeYong (HeYong) replace weston_signal_emit_mutable with wl_signal_emit_mutable https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/1389
09:42pq: kchibisov, could it be a case of https://gitlab.freedesktop.org/wayland/wayland/-/issues/10 ?
09:44pq: kchibisov, you can see multiple wl_seat and wl_output globals. Each global refers to a different underlying thing, but since an output is an output, all outputs use the wl_output interface each.
09:46pq: Globals are things, interfaces are APIs, and wl_proxies are instances of an interface referring to a thing.
09:49kchibisov: pq: could, but I'm not quite sure how I end up binding the display.
09:51kchibisov: Like from the log it looks like a client bug https://github.com/neovide/neovide/issues/2091#issuecomment-1813101342 ?
09:51kchibisov: but in the code we reliably remove outputs.
09:54pq: kchibisov, this is issue 10.
09:54kchibisov: so it's a kwin bug?
09:54kchibisov: I read in 10 that you need to send things at nearly the same time.
09:55pq: Originally it is a protocol mis-design, making clients vulnerable to global removal races, and losing that race triggers the error here.
09:55kchibisov: but given that I got add/remove as 2 consecutive events, won't it be fine?
09:56pq: It has been solved by adding API (links in the issue) to send the global removal event first, and then much later actually destroying the global, to give time for clients to handle the removal. This is something the compositor needs to implement.
09:56kchibisov: Hm, I think I get it: the issue is that it's removed on the kwin side.
09:56pq: between global_remove event and actual removal, a client binding to the removed global would still work.
09:57pq: yes, this is something only kwin can fix
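
The fix pq describes (send the removal event first, destroy the global much later) can be illustrated with a toy model. This is only a sketch: the `Compositor` class, its methods and the `ProtocolError` exception are hypothetical names made up for illustration and are not libwayland API; the 5 second grace period used in the note below is the value zamundaaa[m] quotes for KWin later in this log.

```python
# Toy model of the global removal race (wayland issue 10).
# All names here are hypothetical; this is not libwayland API.


class ProtocolError(Exception):
    """Stands in for the fatal error a client gets for binding a dead global."""


class Compositor:
    def __init__(self, grace_period):
        # Seconds between sending global_remove and really destroying the global.
        self.grace_period = grace_period
        self.globals = {}          # name -> interface, still bindable
        self.pending_destroy = {}  # name -> time at which it is destroyed

    def advertise(self, name, interface):
        self.globals[name] = interface

    def remove(self, name, now):
        # The global_remove event would be sent here (not modelled);
        # the global itself stays bindable until the deadline.
        self.pending_destroy[name] = now + self.grace_period

    def tick(self, now):
        # Destroy every global whose grace period has expired.
        for name, deadline in list(self.pending_destroy.items()):
            if now >= deadline:
                del self.pending_destroy[name]
                del self.globals[name]

    def bind(self, name, now):
        self.tick(now)
        if name not in self.globals:
            raise ProtocolError(f"invalid global (name {name})")
        return self.globals[name]
```

With `grace_period=0`, a client that binds one millisecond after the removal hits `ProtocolError`; with `grace_period=5`, the same bind succeeds, but a client that was blocked for longer than 5 seconds still loses the race.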
09:58pq: Also, it's curious how within the same millisecond, the compositor manages to advertise a wl_output, remove it, and advertise another wl_output.
10:00pq: that hints at some kind of software design problem, because no-one can unplug and re-plug a monitor within a millisecond. Even connector load detect cannot work that fast, right?
10:00emersion: there are a bunch of cases where something like this can happen
10:01emersion: docks, flaky connectors, suspend/resume with different screens
10:01pq: would dock and flaky connectors really do it within a millisecond?
10:02emersion: i don't know
10:02emersion: maybe also changing a setting in the screen config
10:02pq: from the log, it seems the wl_output lives about 50 microseconds
10:03emersion: maybe this happens when the output is disabled, but output config is managed by another program?
10:04emersion: (would be better to wait for the external program before doing anything with the output, in that case)
10:04pq: maybe - I'd call that a software design problem, this is just teasing clients
10:05kchibisov: it's a KDE virtual desktop thing.
10:05pq: ok
10:06pq: in any case, issue 10 needs a compositor workaround, and that would stop your app from crashing
10:06pq: regardless of how many and how fast the compositor is adding and removing wl_outputs
10:07kchibisov: it also sends a configure with 8K x 8K while doing so....
10:08pq: :-p
10:09pq: kchibisov, in the client, do you drain the Wayland events before actually doing the window changes? Or was this not possible because of library API design?
10:11kchibisov: it could be wayland-rs? Or maybe mesa drained a bit.
10:12pq: I mean, draining would be good, so the last window configure you see is not the 8k one
10:12kchibisov: but I don't remember it batching.
10:13pq: as in, you don't needlessly draw an 8k window, only to see in the next event that that was outdated
10:13kchibisov: I think we read all.
10:13pq: cool
10:13pq: just an aside :-)
10:14kchibisov: I know that winit itself batches.
10:15kchibisov: so you won't see this thing with winit.
10:27kchibisov: oh, I got what you've meant with batching. a toolkit side batching. yes, we do it, but I plan on moving it into users side.
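
The draining pq asks about amounts to reading every queued event first and acting only on the last window configure. A minimal sketch, assuming events arrive as hypothetical `(kind, payload)` tuples (this is not the wayland-rs or winit event type, and the sizes in the note below are made up for illustration):

```python
def latest_configure(events):
    """Return the size from the last configure in a drained batch, or None."""
    size = None
    for kind, payload in events:
        if kind == "configure":
            # A later configure supersedes any earlier one in the same batch.
            size = payload
    return size
```

For a batch such as `[("configure", (8192, 8192)), ("output_removed", 56), ("configure", (960, 480))]` this yields `(960, 480)`, so the outdated 8K size is never drawn.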
12:36zamundaaa[m]: <pq> "in any case, issue 10 needs a..." <- We have that workaround. KWin waits for 5s before deleting globals that were removed
12:36emersion: 5s sounds a bit risky under heavy load
12:36zamundaaa[m]: <kchibisov> "it's a KDE virtual desktop thing..." <- KWin doesn't do anything with wl_outputs when it comes to virtual desktops
12:38pq: well, the protocol log shows everything happening in one millisecond. An old version of kwin, perhaps?
12:39zamundaaa[m]: Would have to be very old
12:39kchibisov: zamundaaa: you could check the issue I've linked.
12:40zamundaaa[m]: Isn't the debug printing on the client side? These timestamps aren't necessarily real time if the client can't keep up, right?
12:40zamundaaa[m]: kchibisov: it's 5.27.9, that's the latest version
12:40kchibisov: zamundaaa: well...
12:41kchibisov: it's all client side log, but the size to 8K res is also really weird.
12:41zamundaaa[m]: True
12:42kchibisov: maybe they have some weird plugin, idk, I just got pinged because my toolkit got involved.
12:44zamundaaa[m]: Oh, the 8k thing is the stupid placeholder output
12:44kchibisov: A kind of big placeholder.
12:45pq: good point about timestamps
12:45kchibisov: yeah, but I'm pretty sure the client can keep up within 5 seconds?
12:46pq: maybe the app contains code that blocked it for 10 seconds
12:46kchibisov: Hm.
12:46zamundaaa[m]: kchibisov: yes, to not resize clients to be smaller than they were on a real output. It's a relic from the past and can def be made smaller now
12:46pq: like weird plugin
12:48zamundaaa[m]: The placeholder gets created when all real outputs get hotunplugged fyi, so "normal desktop usage" can't trigger this to happen
12:48kchibisov: Regular KDE desktop usage with changing virtual desktops or minimizing and restoring
12:48kchibisov: from the issue.
12:49kchibisov: Though, I usually don't trust users...
12:50emersion: zamundaaa[m]: do you really need to advertise a placeholder wl_output?
12:50kchibisov: And their logs from app don't have timestamps.
12:50zamundaaa[m]: emersion: we don't do that. The wl_output is from a real output
12:50pq: ...maybe the app uses eglSwapInterval=1 or equivalent, and ended up stuck in eglSwapBuffers or equivalent while the user was virt-desktop-switched away?
12:51kchibisov: pq: I think it actually does so.
12:51kchibisov: because it was a released version of neovide.
12:51kchibisov: And they moved to frame callbacks only on master.
12:51emersion: zamundaaa[m]: ah i see, the placeholder output is only "leaking" through the 8K configure event
12:52kchibisov: Hm, I guess I need a full log.
12:52zamundaaa[m]: emersion: yeah
12:52kchibisov: 1GB wayland debug log compressed to 3M with zip, impressive.
12:56kchibisov: Well, the issue I raised about timestamps in libwayland sort of showed itself.
12:56kchibisov: Since I have no idea how to read them.
12:56pq: they are the dispatch time
12:57pq: I wouldn't have remembered the units though without you asking.
12:58kchibisov: The problem is that it's not really monotonic
12:58kchibisov: [3535002.325] -> wl_surface@27.commit()
12:58kchibisov: [3535002.325] -> wl_surface@27.commit()
12:58kchibisov: [ 71665.818] wl_buffer@40.release()
12:58kchibisov: Like, these are 2 adjacent ones.
13:00kchibisov: Anyway, they run render with swap buffers on a separate thread.
13:02pq: curious what happened with those timestamps there
13:02kchibisov: http://ix.io/4LDX
13:02kchibisov: that's the log.
13:03kchibisov: pq: given that WAYLAND_DEBUG uses realtime and does u64 -> u32 via cutting it could probably do a thing like that?
13:03kchibisov: there're a lot of cases like that in the log though.
13:05pq: but 3535002.325 is 12.7 minutes before wraparound
13:06pq: maybe nothing really happened for ~14 minutes
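
The "12.7 minutes" and "~14 minutes" figures can be checked with a little arithmetic. This sketch assumes, as kchibisov suggests above, that the bracketed value is `milliseconds.microseconds` taken from a microsecond counter truncated to 32 bits; the function names are made up for illustration.

```python
WRAP_US = 2**32  # a 32-bit microsecond counter wraps after about 71.6 minutes


def parse_us(stamp):
    """Convert a 'milliseconds.microseconds' stamp to whole microseconds."""
    ms, us = stamp.strip().split(".")
    return int(ms) * 1000 + int(us)


def seconds_to_wrap(stamp):
    """Seconds left until the counter wraps around to zero."""
    return (WRAP_US - parse_us(stamp)) / 1e6


def gap_across_wrap(before, after):
    """Seconds between two stamps, assuming at most one wraparound between them."""
    return ((parse_us(after) - parse_us(before)) % WRAP_US) / 1e6
```

`seconds_to_wrap("3535002.325")` comes to roughly 760 seconds, and `gap_across_wrap("3535002.325", "71665.818")` to roughly 832 seconds, which matches the two estimates in the discussion.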
13:06kchibisov: maybe.
13:07kchibisov: They also have 3535002 -> 71665 -> 112210
13:07kchibisov: And that code was in _mesa_
13:07kchibisov: where it manipulates with buffers in swap_buffers.
13:08kchibisov: (the said place is easy to find from the bottom of the log).
13:09pq: eglSwapBuffers could easily be blocked for 14 minutes waiting for a frame callback, if the compositor so decides
13:09pq: e.g. during virtual desktop switched away
13:11kchibisov: Yeah, it could still be the case, I've asked whether they do on the same thread as event loop, but I remember that they don't.
13:12kchibisov: At least it's not the case with what they do now.
13:12pq: maybe their threads are sync'd somehow
13:13kchibisov: yeah, I'll just continue to poke them.
13:23kchibisov: pq: apparently they never used a thread, and they indeed block on vsync.
13:23kchibisov: they used a thread on some separate branches, and the maintainer sort of _forgot_ about what they had in a particular version.
13:23pq: heh
13:23pq: well, that can explain everything
13:24kchibisov: zamundaaa[m]: sounds like it's not your bug :p but the user still hasn't done any hotplugs, going by their words.
13:24kchibisov: yeah, mesa blocking here is an issue.
13:25kchibisov: It should probably send frame callbacks when output change happens, because otherwise it'll crash all the games using vsync.
13:25pq: no, that's part of a much bigger debate
13:26kchibisov: Like I know that there was a surface suspension stuff.
13:26kchibisov: but mesa can't really use it.
13:26pq: yeah, that's part of the debate
13:27kchibisov: yeah, but I'm pretty sure mesa will still block in the end of the day.
13:28kchibisov: unless it somehow starts communicating that _I'll block for likely indefinite time_.
13:28pq: IMO, using frame callbacks for eglSwapBuffers throttling was a mistake, though no better mechanism existed. There would need to be a new mechanism that ticks at "refresh" rate somehow regardless of window visibility or even mappedness.
13:29kchibisov: Sounds like wp_presentation could kind of work.
13:29pq: how to make that also power efficient is another question
13:30pq: no, presentation is very much about being visible
13:30kchibisov: Can't you estimate how much to block based on that?
13:30kchibisov: like chrome uses it for its requestAnimationFrame.
13:30pq: I think this would need something new, like some of the vblank timing update extensions.
13:31pq: you can estimate, yeah, but the more time passes, the more it drifts
13:32pq: you would need the first presentation to succeed to initialize
13:32pq: and you would want to sync the timer somehow to actual presentation when the window is visible
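
The estimate-and-drift point can be made concrete. A minimal sketch, assuming the client remembers the timestamp and refresh period from its last successful presentation feedback (both in nanoseconds); the function names and the 100 ns period error used in the note below are made up for illustration.

```python
def next_tick(last_present_ns, refresh_ns, now_ns):
    """Estimate the first refresh instant strictly after now_ns."""
    elapsed = now_ns - last_present_ns
    cycles = elapsed // refresh_ns + 1
    return last_present_ns + cycles * refresh_ns


def accumulated_drift_ns(period_error_ns, elapsed_ns, refresh_ns):
    """Error built up when every assumed period is off by period_error_ns."""
    return (elapsed_ns // refresh_ns) * period_error_ns
```

With a nominal 60 Hz period of 16,666,667 ns, a period error of only 100 ns grows to about 5 ms after 14 minutes without a fresh presentation timestamp, roughly a third of a frame. That is why the timer would need to be re-synced to actual presentation whenever the window is visible.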
13:32zamundaaa[m]: kchibisov: well, hotplugs from a user perspective and hotplugs from the compositor perspective aren't necessarily the same
13:33kchibisov: pq: I guess doing something stupid as subscribing to vsync event could sort of solve it.
13:33zamundaaa[m]: Random displays trigger "hotplugs" when they go to standby
13:33pq: kchibisov, part of it at least, yeah
13:34pq: for me, turning a DisplayPort monitor off is the same as unplug from hardware perspective
13:36kchibisov: tbf, I'm more concerned about the other mesa issue with latching buffers, but I kind of don't have time to try to fix it myself.
13:38kchibisov: Though, a perfect size of 960x480 for initial buffer could kind of save us.
19:37wlb: weston Merge request !1390 opened by Leandro Ribeiro (leandrohrb) Add support to some color curves in weston_color_curve https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/1390
22:39Company: random question of the day: If I zwp_linux_buffer_params_v1::create() twice with the same dmabuf fd, what's supposed to happen?
22:39Company: do I get 2 wl_buffers for the same dmabuf?
22:40emersion: yes
22:40emersion: that's a perfectly fine thing to do
22:41emersion: need to be careful around buffer re-use and such
22:41emersion: if any of these is held by the compositor you can't write to the buffer
22:41Company: that's not the issue actually
22:42Company: I have a static image here while offload-testing and it flickers when resizing the window - and GTK just creates a new wl_buffer every time things move
22:42Company: and I suspect mutter is confused about getting a new wl_buffer with the same dmabuf
22:43Company: it doesn't flicker in weston