03:15DavidHeidelberg: let's assume vblank_mode=0 glxgears is pretty CPU bound. I tried to do a PGO build, which should be optimized just for glxgears, but the perf is just slightly WORSE. Same goes for glmark2-wayland, any idea why that may happen?
03:18airlied: DavidHeidelberg: are you running against a gpu?
03:19DavidHeidelberg: yup.. good point, I should try llvmpipe
03:33airlied: DavidHeidelberg: though PGO won't do much for the CPU bound bits of llvmpipe
03:33airlied: since it's all JIT code
03:33airlied: with mesa I've a hard time seeing where PGO might really make a useful difference
03:39DavidHeidelberg: hmm, 2600 fps to 3900 fps... though one gear color is broken
03:54airlied: DavidHeidelberg: interesting, I wonder what it optimised -)
03:54airlied: :-)
07:28MrCooper: DavidHeidelberg: FWIW, on this system "vblank_mode=0 glxgears" isn't CPU-bound in the glxgears process, but in Xwayland
07:39alice: try vkcube-wayland :P
08:08emersion: daniels: thanks!
08:22daniels: np :)
09:32kevinmitnick: technically lenght is invariant so when you pass through the length to dependency it would be enough... it's only colliding once you do not have dep for it possibly so async alus could yield incorrect results cause there might be same values involved to remove, as in this case it would forward the smallest possible length with constant subtraction but only for duplicates, in other words
09:32kevinmitnick: what i understood it's possible to connect them over fake dependencies.to enforce order, so the smallest length would depend on the smallest pc, for example, as the smallest possible length w ould yield a biggest value the fake dependency needs to pass through a length to smallest value, the smallest value , so next biggest however forwards length to the next smallest, in other words
09:32kevinmitnick: async instructions are duplicated and designed with fake operands around all the async alus they appear at. BTW HOW DO YOU LIKE MY NEW NAME zombie sweethearts? kevinmitnick is kinda sexy nick imo i am in rage if you ban this one too.
09:59mlankhorst: So in ttm, the units of TT/sysmem are in pages, but vram/stolen memory in bytes?
13:19Mis012[m]: Company: actually, am I not completely wasting my time here? Even if I manage to render into a dmabuf with EGL, it doesn't look like I can do the same with Vulkan...
13:21daniels: Mis012[m]: you can - https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/VK_EXT_external_memory_dma_buf.html
13:22Mis012[m]: but I need a surface
13:23Mis012[m]:sent a C code block: https://matrix.org/_matrix/media/v3/download/matrix.org/krZWGpwqoJGMoBsyChTKHDxc
13:23Mis012[m]: currently I have this
13:27zamundaaa[m]: You don't need a surface to render into a dmabuf
13:28Mis012[m]: Company: I guess all I really lack currently from Gtk with making my own wayland/x11 subsurfaces is a signal for the widget moving on-screen and the magic to make the corresponding widget transparent all the way through rather than just showing what's behind it
13:28Mis012[m]: zamundaaa: the surface is the part that I do need, dmabuf is just what GtkOffload currently allows me to plug in
13:29daniels: dmabufs aren't a VkSurfaceKHR type, they're a VkImage
13:30daniels: so you need to go up to whatever abstraction layer is forcing you to have a surface, and punch through so you can just render to an image
13:30Mis012[m]: well, EGL has EGL_PLATFORM_GBM_MESA
13:30daniels: fundamentally it's the same thing, you're 'just' skipping vkQueuePresentKHR (it's a no-op, you're presenting to yourself), and vkAcquireNextImageKHR (you already know what the next image is, because you have the dmabufs)
13:30daniels: GBM is a workaround for EGL/GLES not having an explicit image allocation API like Vulkan does
13:31Mis012[m]: I'm working on https://gitlab.com/android_translation_layer/android_translation_layer
13:31Mis012[m]: so I really can't change the code that is insisting on using surfaces
13:32Mis012[m]: again, I have stuff working currently, but idk if Gtk is willing to expose subsurfaces
13:33Mis012[m]: and I can't really solve the two problems I mentioned with currently public APIs
13:47Mis012[m]: but if Vulkan doesn't have an off-screen surface crutch then I guess convincing Gtk maintainers to expose a subsurface API is the only option
13:52DemiMarie: Mis012: I suggest a Gtk MR.
13:54Mis012[m]: a long time ago I was asking about a subsurface API and it was basically not going to happen, but I guess now that it exists for a different reason maybe making it public is a more reasonable request?
13:54DemiMarie: It’s a reasonable request, but I don’t know if the GTK developers would do the work. I think they would be much more likely to accept an MR.
13:55Mis012[m]: I'd be willing to even throw in X11 support depending on how painful that would be
13:56DemiMarie: I don’t know if they will require that, since GTK is moving Wayland-first.
13:56DemiMarie: I would start with an MR for Wayland.
13:59Mis012[m]: ok, guess I'll get on that
14:21zmike: mareko: do you want to ack https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29841
14:26belongstoheaven: invariant means no other variants , the meaning is typically likely unique with only possible duplicates , but yielding the correct variant. It does not look like anyone was interested, but duplicating the length of async operand in compact format means only one entry at maximum of 300 decimal digits range or so it forwards as said length from input pc. The code has some complexity in
14:26belongstoheaven: form of just thoughtfullness, but what you do is like so complex that you are suicidal terrorist annoyment to people, cause you lack any strategy or thought you are like random number generators. Performance of your crap is not even , i mean does not compare to any critics even yet, it's just so friggin bad. We could say that 50line java trace is a spectacular FLOSS beef trace. Anyone
14:26belongstoheaven: who did not know, Karol on parole looked at java traces, when he was becoming a fecalist, it's like what an excitement it was for a living excrement! We had one bitch with her crankgrangsters in cambodia who would do similar things going in crowded masses to talk how big wanker i am and how stupid freak schizo, she was captured later among with many of his cranks who started to bother
14:26belongstoheaven: tourists with all that in our hotels, the abuse bitches teeth were bunched in and lip was marked after their actions in life, they likely do the same with you it is called a rabbitlip in estonian you can recognize abusers having such marking done, totally braindead ghost wants to have more beefs but they get more brutal treatment this time.
14:30FL4SHK[m]: Can I write a conforming Vulkan driver for an integer only machine?
14:30FL4SHK[m]: floats eat up quite a few logic cells
14:31FL4SHK[m]: maybe soft floats would make sense
14:31FL4SHK[m]: With some partial support for some float related operations like clz
14:32FL4SHK[m]: my thinking is that maybe I can fit more cores if I go integer only
14:32FL4SHK[m]: And then maybe there'd actually be a win in performance to do floats in software
14:33FL4SHK[m]: An FPU for just 32-bit floats could eat up 3000-4000 LUTs. That is a large amount for my 500k LUT FPGA for the number of cores I want to fit
14:53MrCooper: daniels: just bisected wrong colours (red & blue swapped) in glxgears to 5ca85d75c05d ("dri: Fix BGR format exclusion")
14:53daniels: MrCooper: \o/
14:53MrCooper: see also https://gitlab.freedesktop.org/mesa/mesa/-/issues/11398
14:54daniels: LE or BE?
14:54MrCooper: LE
14:54daniels: good lord
14:54MrCooper: I stopped using BE HW about a decade ago :)
14:54daniels: yeah, sensible
14:55daniels: still, good to know that none of our GLX testing makes sure that we deliver correct colours on screen
14:55MrCooper: yeah
15:01mattst88: MrCooper: do I recall you using an Apple laptop as a terminal machine or similar once upon a time?
15:02MrCooper: my main machine was a PowerBook for about 15 years, starting around the millennium
15:03FL4SHK[m]: dang, 15 years with one computer?
15:03MrCooper: not missing the fight against the endianness bug wind mills though
15:03FL4SHK[m]: I've never done that
15:03FL4SHK[m]: heh
15:03MrCooper: 3 different PowerBooks
15:04FL4SHK[m]: ah
15:04FL4SHK[m]: 15 years ago I believe I was just starting to learn how to program
15:04FL4SHK[m]: and now I'm a computer hardware/software engineer professionally and as a hobby
15:07MrCooper: welcome to the club :)
15:09MrCooper: well, I'm not a HW engineer though
15:14MrCooper: daniels: the new code doesn't filter out PIPE_FORMAT_R8G8B8A8_UNORM at least
15:15FL4SHK[m]: MrCooper: my hardware dev is mainly with FPGAs, which the coding languages for I'd argue are like software languages a lot anyway
15:15FL4SHK[m]: I think it is feasible to get software engineers to learn the coding parts of FPGA development
15:16FL4SHK[m]: Because the model isn't too too different from low level software anyway
15:16FL4SHK[m]: I won't get into the details here, as that'd be a bit outside the scope of this channel
15:17mattst88: FL4SHK[m]: is there something you'd recommend reading to get started in FPGA development?
15:17FL4SHK[m]: mattst88: there's a lot of online resources. I learned mostly from online resources.
15:17FL4SHK[m]: I suggest using an HDL like SpinalHDL
15:18FL4SHK[m]: I do have a degree in electrical engineering, but I mostly self taught myself FPGA dev
15:19FL4SHK[m]: SystemVerilog is a good HDL too; if you want to use open source tooling with SV, check out `sv2v`
15:19mattst88: thanks
15:19FL4SHK[m]: Verilator is a really good simulator
15:19FL4SHK[m]: you can practice with it
15:20FL4SHK[m]: no problem
15:20mattst88: oh, neat
15:20FL4SHK[m]: are you talking about mine?
15:21mattst88: I'll add learning FPGA programming to my endless todo list :P
15:21FL4SHK[m]: What's designware?
15:21FL4SHK[m]: haha
15:21mattst88: I was saying 'oh, neat' to the fact that there's a simulator that you can use for development
15:22FL4SHK[m]: Yeah there's more than one simulator
15:22mattst88: I hope you're not asking me what designware is :P
15:22FL4SHK[m]: I just highly suggest Verilator
15:22mattst88: gotcha
15:22FL4SHK[m]: no not you
15:24FL4SHK[m]: nah I don't plan on using those
15:24FL4SHK[m]: If I use non open IP, it'll be hard IP built into my FPGA
15:24FL4SHK[m]: but the CPU/GPU should be entirely open
15:25FL4SHK[m]: Right
15:25FL4SHK[m]: so... I don't know much about developing ASICs
15:25FL4SHK[m]: I know that liberty files exist
15:26FL4SHK[m]: and that usually the low level blocks are not designed by people doing the HDL development
15:26FL4SHK[m]: mostly ASICs just reuse existing low level blocks
15:27FL4SHK[m]: There exists synthesis
15:27FL4SHK[m]: logic synthesis is mostly a solved problem
15:27FL4SHK[m]: apparently, anyway
15:28FL4SHK[m]: Most likely I'd use yosys to do the logic synthesis
15:28FL4SHK[m]: So I'm aware that there's analog parts to a lot of ASICs.
15:29FL4SHK[m]: it's hard for something so exorbitantly expensive to be fully open
15:30FL4SHK[m]: if only it weren't so expensive
15:30FL4SHK[m]: Google Skywater exists at least
15:37MrCooper: daniels: AFAICT the big/little cases are flipped, e.g. PIPE_FORMAT_R8G8B8A8_UNORM has B/A at shift 16/24
15:42MrCooper: I wonder if fixing that would reintroduce the s390x issue though
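The shift values MrCooper quotes can be reproduced with a small sketch. It assumes, as an illustration only, that PIPE_FORMAT_R8G8B8A8_UNORM stores its channels as four consecutive bytes R, G, B, A in memory; the helper name `channel_shifts` is hypothetical and not part of Mesa.

```python
def channel_shifts(byteorder):
    """Return the bit shift of each channel when the 4 bytes R, G, B, A
    (in memory order) are read as one 32-bit word with the given byte order."""
    shifts = {}
    for index, name in enumerate("RGBA"):
        pixel = bytearray(4)
        pixel[index] = 0xFF                      # light up only this channel
        word = int.from_bytes(pixel, byteorder)  # read as a packed 32-bit value
        # Position of the lowest set bit is the channel's shift in the word.
        shifts[name] = (word & -word).bit_length() - 1
    return shifts


little = channel_shifts("little")
big = channel_shifts("big")
print("little-endian:", little)
print("big-endian:   ", big)
```

Under that assumption, the little-endian word has B and A at shifts 16 and 24, and the big-endian word has them at 8 and 0, so swapping the two cases in a format filter would swap which channels appear where.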
17:34rebelatwork: mattst88: first off i wrote very little VHDL, this standard was most powerful in it's initial release in the 80's, but back times i did not get used to ada syntax, while i can do it now i have lost the interest, second point, trust me start with asics, laying out the structures i described, and move to FPGA's cause there they run the best. 3. icarus verilog or verilator is not doing the
17:34rebelatwork: best job, i found severe port and instance coercion bugs from them, if you want something on the bar with cadence and synpsys simulators and and open source as well as with permissive licenses choose tachyon verilog simulator, so it supports all verilog standards and the high ones much later in the evolution compared to VHDL initial at nasa times, are on the bar with VHDL. Trust me i
17:34rebelatwork: wrote all the time verilog and system verilog back times, so verilog is with pascal syntax.
18:01rodrigovivi: airlied: sima: while preparing drm-xe-next pr here I noticed something odd... because I did a backmerge recently I see some accel/habanalabs patches when I $tig drm/drm-next..drm-xe/drm-xe-next .. was it something you pulled and then dropped?
18:01rodrigovivi: if so, I believe I also need to drop it here to ensure that you don't pick that up again when pulling our PR...
18:01sima: oops
18:02sima: yeah airlied merged a bad habanalabs patch and ditched it again because the pr was too busted
18:02FL4SHK[m]: So I have a question: could Vulkan use fixed point arithmetic?
18:02sima: rodrigovivi, I guess apologies for the fallout, and airlied owes you one for ignoring the dim checks :-)
18:24FL4SHK[m]: also, maybe I could make a 32-bit floating point unit that implements only rounding towards zero and turning subnormals into zero
18:24FL4SHK[m]: I made a bfloat16 FPU like that once
18:24FL4SHK[m]: would Vulkan be compatible with a floating point format like that?
18:36HdkR: With enough extensions anything can be compatible with Vulkan
18:38FL4SHK[m]: oh that's cool
18:38FL4SHK[m]: So I'm thinking of maybe making a state machine based FPU
18:38HdkR: With 80-bit floats i hope
18:39HdkR: Need that additional precision
18:39FL4SHK[m]: 80-bit floats like x86?
18:39FL4SHK[m]: I need to fit a lot of them
18:40HdkR: The other question then ends up being, if no one uses the extension, does it matter?
18:41HdkR: The whole "If you build it, they will come" mentality doesn't happen quite as often in Vulkan land :)
18:45pendingchaos: unless you want to support the float controls extension, I don't think Vulkan SPIR-V requires subnormals or any particular rounding mode: https://registry.khronos.org/vulkan/specs/1.3-extensions/html/chap52.html#spirvenv-precision-operation
18:45pendingchaos: RTNE might be less surprising to applications than RTZ, though
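To make the RTNE versus RTZ difference concrete, here is a rough software model of the kind of FPU FL4SHK describes (round toward zero, subnormals flushed to zero). The function names are hypothetical, it only handles finite inputs within float32 range, and it relies on CPython's `struct` packing of `"f"` using the default round-to-nearest-even conversion.

```python
import struct


def f32_bits_rtne(x):
    """Bit pattern of x rounded to binary32 with round-to-nearest-even."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_f32(bits):
    """Interpret a 32-bit pattern as a binary32 value (returned as a Python float)."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def f32_rtne(x):
    return bits_to_f32(f32_bits_rtne(x))


def f32_rtz_ftz(x):
    """Round x to binary32 toward zero, then flush subnormal results to zero."""
    bits = f32_bits_rtne(x)
    # If nearest-even rounded away from zero, step one ULP back toward zero.
    # binary32 is sign-magnitude, so subtracting 1 shrinks the magnitude.
    if abs(bits_to_f32(bits)) > abs(x):
        bits -= 1
    # Exponent field of 0 means zero or subnormal: keep only the sign bit.
    if (bits >> 23) & 0xFF == 0:
        bits &= 0x80000000
    return bits_to_f32(bits)


print(f32_rtne(0.1), f32_rtz_ftz(0.1))      # differ by one ULP
print(f32_rtne(1e-40), f32_rtz_ftz(1e-40))  # subnormal vs flushed zero
```

For 0.1 the two modes land on adjacent float32 values on either side of the exact value, and for 1e-40 the flush-to-zero model returns 0.0 where RTNE keeps a subnormal.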
18:50FL4SHK[m]: <pendingchaos> "unless you want to support the..." <- EXCELLENT
19:01rodrigovivi: sima no problem... it looks like only a few patches on top, so I'm going to reset to the previous commit before the merge, then do a new clean backmerge and re-apply and sign-off the 9 patches we have on top...
19:05rodrigovivi: hmm.. sima dave daniels could one of you please give the owner role in the gitlab/xe/kernel so I can change the protection option temporarily and force-push to drm-xe-next...
19:19alyssa: pendingchaos: even if you support float controls, denorms are optional
19:19alyssa: float controls just lets you advertise that you don't have denorms in that case
19:19alyssa: (honeykrisp does this, no denorms in hw)
19:23FL4SHK[m]: Ooh
19:36FL4SHK[m]: that's even better
19:43DemiMarie: alyssa: that’s why I think languages should consider the behavior of denormals to be unspecified
19:49FL4SHK[m]: I think that would be helpful
20:35glehmann: d3d12 requires fp16 and fp64 denorm support, fp32 too for higher feature levels
20:36airlied: mlankhorst: in theory ttm core has no units, each mgr decides what units to manage in
20:39airlied: rodrigovivi: is it protected or just the git push block?
21:05statsaresoclear: you have never learned to think in any real way, and MrCooper appears to be even not the worse, so i interacted some in the channel in 2005 that was very tiny bit, but problems started at overseas starting from 2008...i was one year in australia , one year and three months in new zealand, then 1month in thailand and 3 months in cambodia having last moments of my youth that was about
21:05statsaresoclear: to be ruined by meds, but they did not have sane drivers nor sane people involved in development, and cause it upset me, i got into issues with people stalking and abusing me when i said negative things on irc, and they were so nasty people that next occasions twice in cambodia where dad had some businesses they did not give up on harassing me and my people, so i am considering with
21:05statsaresoclear: delay a brutal way to finish them, all the estonian sports elite will help me i suppose and underground as well, very strong people, technically we have no problems to make them beg for mercy, however this all was cultural shock to me still as to what idiotic people you have on earth.
21:07DemiMarie: jenatali: so that means that d3d12 on Vulkan on Asahi will be broken in this way
21:11DemiMarie: Mis012: looks like you will need to either patch GTK or use a different framework
21:15Company: gfxstrand: re https://gitlab.freedesktop.org/mesa/mesa/-/issues/11383 - why is that not a problem with GL?
21:15Company: is that because GL doesn't need to enumerate all devices?
21:28airlied: Company: yes, GL only deals with one device, though not sure if you have glvnd and nvidia installed if that wakes things up
21:34Company: I'm wondering if there's something one could do to avoid the problem in GTK
21:34Company: because it's highly likely people are going to be unhappy, and I'm not sure new kernel API is gonna land quickly enough to fix things in all distros by September
21:34airlied: don't think so, the vulkan loader's gonna load
21:35Company: would it help just enumerating one device?
21:35Company: or is the loader gonna load anyway?
21:40airlied: vulkan doesn't really have an enumerate one device interface I don't think
21:45jenatali: Ideally Vulkan wouldn't need to actually load the drivers to report the physical device info
21:45jenatali: But that ship's already sailed unless a new interface gets added
21:46Company: airlied: I can vkEnumeratePhysicalDevices(1, ...)
21:47Sachiel: and how do you know that's the one you want?
21:47Company: because I take the first one that has the features I need
21:47Company: and if it isn't, I can vkEnumeratePhysicalDevices(2, ...) and check the next one
21:49airlied: Company: underneath it has to enumerate them all anyways
21:49airlied: since they aren't ordered
21:49airlied: we have layers to provide the order to the app
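airlied's point can be pictured with a toy model. Everything here is hypothetical (`ToyLoader`, the driver names, the priorities) and none of it is real Vulkan loader or Mesa API; it only sketches why asking for one device still touches every driver when the list has to be ordered first.

```python
class ToyLoader:
    """Toy stand-in for a loader; not real Vulkan or Mesa API."""

    def __init__(self, drivers):
        # drivers: list of (name, priority) pairs in arbitrary discovery order
        self.drivers = drivers
        self.woken = []  # records every GPU that had to be probed

    def enumerate_devices(self, max_count):
        probed = []
        for name, priority in self.drivers:
            self.woken.append(name)          # probing a driver wakes its GPU
            probed.append((priority, name))
        probed.sort()                        # ordering needs the full list
        return [name for _, name in probed[:max_count]]


loader = ToyLoader([("discrete", 1), ("integrated", 0)])
first = loader.enumerate_devices(1)
print(first, loader.woken)
```

In this model the caller receives a single device, yet `woken` lists both GPUs, which is the behaviour Company was hoping to avoid.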
21:49alyssa: glehmann: fortunately agx has denorms for fp16, just not fp32
21:49alyssa: and also d3d12 on agx is impossible to do properly anyway so..............
21:50alyssa: no sampler filter minmax hw and that one can't be emulated in any reasonable way (:
21:50Company: then that idea won't work
21:50airlied: there are some kernel changes that might accidentally help alleviate it but not 100% sure
21:51jenatali: FWIW D3D requires vendors to store enough information in the Windows registry that we can populate an adapter list, at least with vendors / device names, without having to load any drivers into the process or wake up GPUs
21:52jenatali: But Vulkan physical devices have so many more properties that the driver really needs to get involved
21:52Company: theoretically, this just enumerates the devices
21:53Company: it's not querying them for details
21:54Company: that's done one-by-one later, and GTK just picks the first in 99% of cases (and so does anyone probably), so we only query one device
21:55karolherbst: could always add a custom mesa vulkan extension you can use to say "give me the phys device of this GPU I want to use"
21:56karolherbst: because that's really what you want to do
21:56karolherbst: not pick the first
21:56Company: I think the Vulkan spec says to pick the first?
21:56karolherbst: electron based applications already have bogus device selection, because they don't care what the compositor renders on
21:56karolherbst: so they often end up using the discrete GPU as well
21:57airlied: we reorder the list in the mesa layer already
21:57karolherbst: ahh, I see
21:57airlied: the problem is layers and exts still mean enumeration
21:57Company: because I remember thinking about this and not having any idea what I should pick
21:57airlied: pick the first will give you the same one as GL usually
21:57karolherbst: okay, that's good enough then (hopefully)
21:58Company: I haven't heard any complaints yet - but it's still early
21:58karolherbst: but yeah.. ultimately the kernel should cache the information we need in nvk
21:58airlied: anyways it's technically a driver bug and we should work out how to fix it there
21:58airlied: but also lspci will wake up your GPU
21:59karolherbst: I _kinda_ have the same issue with rusticl, because I also have to create pipe_screens on every device for enumeration and...
21:59airlied: I found someone put a profile.sh using lspci into /etc/ once
21:59karolherbst: yeah.. but lspci is just being lspci...
21:59airlied: I wonder if it's still there
21:59Company: yeah, I was just looking for easy workarounds
21:59airlied: every time you started a shell it wakes up the gpu
21:59Company: so discrete nvidia laptop users don't get sad
21:59karolherbst: though the fault is really sysfs waking up the GPUs on actually non-relevant things
21:59karolherbst: I think it parses the PCI config thing because sysfs is lacking interfaces
22:00airlied: well you either cache it or read it from pci config space
22:00karolherbst: right, but the config space wakes up the device
22:00airlied: and if you ask to read it from pci config space it has to wake it up
22:00karolherbst: yeah, but the things lspci needs aren't in sysfs, so it has to use the config thing
22:01karolherbst: though the issue of course is, that lspci also parses everything even if it's not relevant for displaying according to the flags used
22:01airlied: I think Ben mentioned in passing his work to split nouveau via aux bus has some effect
22:01airlied: but I haven't enumerated what yet, esp if it was GL or VK paths
22:02karolherbst: definitely not GL
22:02karolherbst: we do command submission when creating the pipe_screen, so whatever you end up doing, if it involves a pipe_screen you are screwed
22:02Company: airlied: the issue for GTK of course is a GTK 4.16 flatpak being run on Debian stable or RHEL because that will use Vulkan, too, and on an old kernel
22:02airlied: if you submit cmds then it'll suck alright
22:03karolherbst: but we also allocate the channel + sub-channels there
22:03karolherbst: context creation is just brutally expensive on nvidia
22:04karolherbst: when I do rusticl + zink run on nvidia, the runtime increases by x5
22:12rodrigovivi: airlied: it is protected against force-push... we can't push even with dim push-branch -f... I also need to push a rebase on the topic/xe-for-CI...
22:17airlied: rodrigovivi: dim -f doesn't work for force pushes for me
22:17airlied: not due to protection but due to it not sending the correct git cmd lines
22:18airlied: I usually sh -x dim push-branch -f then cut-n-paste the last line and fix it :)
22:18airlied: but let me go check the protection status
22:19airlied: rodrigovivi: okay I've clicked the allow to force push for drm-xe-next
22:20rodrigovivi: thank you
22:28DemiMarie: alyssa: does that mean that Windows games will be broken on Asahi, or is there an unreasonable way to emulate it?
22:29jenatali: Demi: I'm not aware of anyone that actually depends on fp32 denorms, but if there's other problems then those are probably bigger
22:29jenatali: Apple has their "game porting toolkit" and got real titles running on it so presumably at least those games don't use the problematic features
22:29DemiMarie: jenatali: looks like the hardware sampler can't do min or max
22:30jenatali: Yeah that seems somewhat niche
22:55FL4SHK[m]: <glehmann> "d3d12 requires fp16 and fp64..." <- ... good thing I'm only interested in Vulkan
22:58gfxstrand: Company: Yeah, it's a bit of a problem but there's not much we can do about it at the moment.
22:58gfxstrand: Like, the only real workaround in userspace would be to have a tiny little NVK server process that gets forked the first time you start NVK and hands out cached device info.
22:59gfxstrand: And that seems kinda mean, TBH
23:00Company: yeah, that seems more work than useful
23:00alyssa: jenatali: gptk is a gigantic pile of hacks, as far as I can tell
23:00gfxstrand: The real solution is for nouveau.ko to create a context at boot, scrape the info, and cache it for us.
23:00jenatali: alyssa: Yeah that tracks
23:00gfxstrand: Company: It also wouldn't work inside your flatpak scenario because you'd end up with one per flatpak, defeating the point.
23:01Company: yeah, we only start once per flatpak
23:01Company: people will have to live with it
23:01Company: or backport whatever patch you guys come up with
23:01gfxstrand: Or it would end up being a zombie that messes with flatpak. "OMG! Why does it say my app is still running?!?"
23:02gfxstrand: We just need to fix it in nouveau.ko
23:02Company: like, 24.04 LTS will probably be worst off
23:02gfxstrand: skeggsb has a plan but IDK if it's back-portable
23:02gfxstrand: nouveau.ko is kind-of a mess on the inside and that makes things like this way harder than necessary
23:03gfxstrand: It's probably possible to do today, though, if someone really tried.
23:04Company: it really depends how many people are affected and how bad it is for them
23:53DavidHeidelberg: airlied: after rebuild & new PGO, colors are ok.. problem with benchmarking is my laptop with TDP and small fan