IRC Logs of #dri-devel on irc.freenode.net for 2023-02-08

03:35 airlied: okay hasvk h264 support is https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21183
03:35 airlied: just need to give it some testing
03:54 ccr: meh. lynne's ffmpeg vulkan branch wants libvulkan 1.3.238 and Debian testing has only .236.
03:54 HdkR: bleeding edge video extensions need bleeding edge things :P
03:55 HdkR: Waiting for the bloody edge of Snapdragon Vulkan Video
03:56 ccr: :)
04:00 airlied: ccr: just comment out the bit in the configure file
04:00 airlied: it doesn't actually need it really
04:00 airlied: since the loader can handle ext it doesn't know about
04:01 ccr: ahh.
04:12 ccr: fails with undeclared constants/enums/something ( VK_KHR_VIDEO_DECODE_H264_EXTENSION_NAME ) .. which seem to be defined in libvulkan headers v1.3.239 available in unstable. in some cases transplanting packages from unstable may work, perhaps I'll try that later.
04:14 airlied: ah you need updated headers, not an updated runtime
04:14 airlied: yeah headers should be fine to transplant
04:42 ccr: hmm. the suggested command "ffmpeg -init_hw_device "vulkan=vk:0,debug=1" -hwaccel vulkan -hwaccel_output_format vulkan -i S01E01.mkv -loglevel debug -filter_hw_device vk -vf hwdownload,format=nv12 -c:v rawvideo -an -y ~/OUT.nut" resulted in ffmpeg: ../src/intel/vulkan_hasvk/genX_query.c:826: gfx75_CmdResetQueryPool: Assertion `!"Unsupported query type"' failed.
04:43 airlied: ccr: cool, I hadn't had a chance to test hasvk yet
04:44 ccr: np. I have a few haswell machines, so thought to test :)
04:45 airlied: ccr: one minute I'll push a fix
04:46 ccr: airlied, ok :)
04:46 airlied: ccr: okay possible fix pushed to the branch
04:48 ccr: checking.
04:50 ccr: [AVHWDeviceContext @ 0x555ddb730340] Validation Error: [ VUID-vkCmdBeginQuery-commandBuffer-cmdpool ] Object 0: handle = 0x7f1bfc0784b0, type = VK_OBJECT_TYPE_COMMAND_BUFFER; | MessageID = 0x8dadbdcc | vkCmdBeginQuery(): Called in command buffer VkCommandBuffer 0x7f1bfc0784b0[] which was allocated from the command pool VkCommandPool 0x70000000007[] which was created with queueFamilyIndex 1 which doesn't
04:50 ccr: contain the required VK_QUEUE_GRAPHICS_BIT or VK_QUEUE_COMPUTE_BIT capability flags. The Vulkan spec states: The VkCommandPool that commandBuffer was allocated from must support graphics, compute, decode, or encode operations (https://www.khronos.org/registry/vulkan/specs/1.3-extensions/html/vkspec.html#VUID-vkCmdBeginQuery-commandBuffer-cmdpool)
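
The error above names only VK_QUEUE_GRAPHICS_BIT and VK_QUEUE_COMPUTE_BIT, while the spec sentence it quotes also allows decode and encode queues. A minimal sketch of that mismatch as two bitmask tests follows. It is an illustration only, not validation-layer or Mesa code: the flag values are the VkQueueFlagBits values from the Vulkan headers, while `layer_check`, `spec_check`, and the flags assumed for queue family 1 are hypothetical.

```python
# VkQueueFlagBits values as defined in the Vulkan headers.
VK_QUEUE_GRAPHICS_BIT = 0x00000001
VK_QUEUE_COMPUTE_BIT = 0x00000002
VK_QUEUE_VIDEO_DECODE_BIT_KHR = 0x00000020
VK_QUEUE_VIDEO_ENCODE_BIT_KHR = 0x00000040


def layer_check(queue_flags: int) -> bool:
    """Hypothetical: the test implied by the error message (graphics or compute only)."""
    return bool(queue_flags & (VK_QUEUE_GRAPHICS_BIT | VK_QUEUE_COMPUTE_BIT))


def spec_check(queue_flags: int) -> bool:
    """Hypothetical: the test implied by the quoted spec sentence."""
    allowed = (
        VK_QUEUE_GRAPHICS_BIT
        | VK_QUEUE_COMPUTE_BIT
        | VK_QUEUE_VIDEO_DECODE_BIT_KHR
        | VK_QUEUE_VIDEO_ENCODE_BIT_KHR
    )
    return bool(queue_flags & allowed)


# Assumption: queue family 1 is decode-only; the log does not list its flags.
decode_only_family = VK_QUEUE_VIDEO_DECODE_BIT_KHR

print(layer_check(decode_only_family))  # False: the layer's message would fire
print(spec_check(decode_only_family))   # True: the quoted spec text is satisfied
```
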
04:51 ccr: (gdb) bt
04:51 ccr: #0 0x00007ffff4527764 in anv_h264_decode_video (cmd_buffer=0x7fffd4078670, frame_info=0x7fffebffda00) at ../src/intel/vulkan_hasvk/genX_video.c:130
04:51 ccr: #1 0x00007ffff291fd86 in DispatchCmdDecodeVideoKHR (commandBuffer=0x7fffd4078670, pDecodeInfo=<optimized out>) at ./layers/generated/layer_chassis_dispatch.cpp:5685
04:51 ccr: #2 0x00007ffff28914e0 in vulkan_layer_chassis::CmdDecodeVideoKHR (commandBuffer=0x7fffd4078670, pDecodeInfo=0x7fffd41e6d18) at ./layers/generated/chassis.cpp:6534
04:52 airlied: ccr: and drop the debug=1
04:52 ccr: pardon the spam, I can paste these elsewhere if so desired
04:52 airlied: not sure the validation layers are up to much here
04:53 ccr: segfaults without debug=1
04:53 ccr: (gdb) bt
04:53 ccr: #0 0x00007ffff4527764 in anv_h264_decode_video (cmd_buffer=0x7fffd8074830, frame_info=0x7fffd81bba58) at ../src/intel/vulkan_hasvk/genX_video.c:130
04:53 ccr: #1 0x000055555640133f in ?? ()
04:53 airlied: okay guess I better kick a test video
04:53 ccr: heh. if you need anything, just holler. I'll probably be around.
05:14 Lynne: the validation layer is sadly currently pretty useless
05:14 airlied: anyone care to give me a ack on https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21184 for anv? brown paper bag fix
05:14 airlied: dj-death, Kayden, gfxstrand ^?
05:15 Lynne: it doesn't properly detect queries and spams errors, and the spec is being unreasonable
05:20 Sachiel: airlied: acked
05:26 airlied: Sachiel: thanks
06:41 ccr: airlied, ooh .. it works now :)
06:45 airlied: ccr: I'm still seeing the odd hang with more frames
06:45 airlied: but it's a lot better
06:47 ccr:cheers at airlied and Lynne
07:03 Lynne: vulkan video is happening, soon I'll be able to watch videos without crashing because the recent radeonsi *fix* actually broke stuff
07:55 Lynne: why do descriptor set templates require a pipeline layout?
07:56 Lynne: they already have the descriptor layout, what possible use can they have for a pipeline layout other than make my/everyone's life more miserable?
08:00 airlied: Lynne: push constants has something to do with it
08:00 airlied: sorry push descriptors
08:01 airlied: at least in radv it's the place I see it being used
08:02 Lynne: you mean internally on a very low level or?
08:03 airlied: yeah, radv_CreateDescriptorUpdateTemplate has a comment about it
08:04 airlied: also "
08:04 airlied: pipelineLayout is a VkPipelineLayout object used to program the bindings. This parameter is ignored if templateType is not VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_PUSH_DESCRIPTORS_KHR
08:04 airlied: "
08:04 airlied: so you might not need to supply it
08:16 Lynne: I'm using update templates, so I do
08:19 Lynne: oh, nevermind
08:20 Lynne: why would you want to update push descriptors via a template anyway?
08:20 Lynne: I like the low-level DIY layout of the regular pushconst updates
12:22 mairacanal: danvet, when you have a chance, would you mind taking a look at https://lore.kernel.org/dri-devel/20230131195825.677487-1-mcanal@igalia.com/?
12:42 zmike: Lynne: would recommend checking out EXT_descriptor_buffer
12:42 zmike: it's much simpler to work with
12:53 Lynne: "This extension is primarily intended to aid in crash postmortem"
12:54 Lynne: what's the usage like?
12:55 emersion: there's a blog post it seems https://www.khronos.org/blog/vk-ext-descriptor-buffer
13:00 Lynne: "The descriptor buffer API effectively removes VkDescriptorPool and VkDescriptorSet"
13:00 Lynne: I hate descriptor sets, this is looking good
13:00 emersion: yup, same here :P
13:01 jenatali: That'll be a fun one to implement in Dozen :)
13:02 dj-death: you don't get to do away with the layouts though
13:02 zmike: if you've only gotta juggle layouts (and not pools+sets additionally) then they're not so bad
13:04 Lynne: my issue is that I want to support multiple submissions to multiple queues
13:04 Lynne: which meant I needed one descriptor set per each queue (you can't reuse them), and a template updater
13:05 zmike: hm
13:05 zmike: don't remember if there's a restriction against submitting descriptor buffers repeatedly
13:06 dj-death: there is a restriction you don't modify entries used by one queue
13:06 dj-death: but yeah with descriptor indexing you can modify after recording
13:06 dj-death: and also modify stuff that is not accessed while the GPU is reading the descriptor set
13:07 zmike: 🙏 descriptor 🙏 indexing 🙏
13:07 jenatali: Aka my current bane
13:07 dj-death: jenatali: do you have to support that?
13:08 jenatali: Yeah, I think so. Apps we care about require it
13:08 dj-death: everybody will indeed
13:08 zmike: I can hear airlied crying over a pint now
13:08 zmike: lavapipe descriptor indexing remains an extreme challenge
13:08 dj-death: a pint 🤤
13:09 jenatali: WARP mostly handles it, UBO indexing is busted though
13:10 jenatali: My problem is that we have restrictions in place on descriptor heaps which give us limits way lower than what VK needs. Thanks to NVK though I have a viable path forward
13:10 zmike: the limits are the hard part
13:10 jenatali: Yeah. 500,000 samplers in a set when some hardware has 12-bit sampler indices...
13:11 dj-death: it's kind of ridiculous that up to dg2 you could have more samplers than surface descriptors on intel gpus ;)
13:12 jenatali: But then I saw that you only need to have up to 4000 unique samplers alive at a time. Except D3D's limit in a heap is 2048...
13:12 dj-death: technically it's still more (since it's 32bytes vs 64bytes both in a 4Gb heap)
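
The numbers in the exchange above can be checked with a little arithmetic. This is a rough sketch under stated assumptions: "4Gb" is read as 4 GiB, the 32-byte entries are taken to be samplers and the 64-byte entries surface descriptors (inferred from "more samplers than surface descriptors"), and `entries_in_heap` is a hypothetical helper.

```python
GIB = 1 << 30


def entries_in_heap(heap_bytes: int, entry_bytes: int) -> int:
    """Number of fixed-size entries that fit in a heap of the given size."""
    return heap_bytes // entry_bytes


# A 12-bit sampler index can address 2**12 distinct samplers.
addressable_samplers = 1 << 12
print(addressable_samplers)  # 4096

# Figures quoted in the log.
vk_unique_samplers = 4000   # "up to 4000 unique samplers alive at a time"
d3d_heap_limit = 2048       # "D3D's limit in a heap is 2048"
print(d3d_heap_limit < vk_unique_samplers <= addressable_samplers)  # True

# Assumed reading of "32bytes vs 64bytes both in a 4Gb heap".
heap_bytes = 4 * GIB
print(entries_in_heap(heap_bytes, 32))  # 134217728
print(entries_in_heap(heap_bytes, 64))  # 67108864
```
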
13:12 Lynne: is it really a good idea to represent descriptors as vkbuffers?
13:13 Lynne: what if the driver doesn't flag coherent memory, then you need to flush after copying
13:13 dj-death: I'm looking forward to every app shooting itself in the foot
13:13 jenatali: For future-looking hardware? I'd say yes. For current hardware, I'd say no
13:13 dj-death: Lynne: there are synchronization bits you need to put in your barriers
13:14 dj-death: I think there is still no validation for all that right?
13:15 Lynne: "If you’re updating descriptors from the CPU, you get the implicit host -> device synchronization on vkQueueSubmit."
13:16 Lynne: that's so un-vulkan, you'd get a citation from the house of un-vulkan activities for that
13:17 Lynne: I do like that you can modify descriptors from the GPU itself
13:18 Lynne: with that, along with the NV extension for gpu command buffers, you could realistically abandon vendor-specific compute suites
13:19 ccr: "are you, or have you ever been updating descriptors from the CPU .."
13:22 Lynne: so you don't allocate descriptors at all with descriptor buffers?
13:23 Lynne: ah, I see, so you allocate them, bind them, and then look them up?
13:29 dj-death: Lynne: you allocate buffers yourself, bind them as descriptor buffers in the command buffers
13:29 dj-death: Lynne: you do all the writing in them yourself
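
The workflow described above (allocate a buffer, write descriptors into it yourself, look them up by offset) can be modelled in a few lines. This is a toy model only, not the Vulkan API: the 16-byte descriptor size, the (address, size) encoding, and the helper names are all hypothetical. With VK_EXT_descriptor_buffer, the layout size and binding offsets come from vkGetDescriptorSetLayoutSizeEXT and vkGetDescriptorSetLayoutBindingOffsetEXT, and the descriptor bytes themselves from vkGetDescriptorEXT.

```python
import struct

# Hypothetical fixed descriptor size; a real application queries sizes and
# offsets from the driver instead of hard-coding them.
DESCRIPTOR_SIZE = 16


def binding_offset(binding: int) -> int:
    """Toy layout: binding N lives at byte offset N * DESCRIPTOR_SIZE."""
    return binding * DESCRIPTOR_SIZE


def write_descriptor(buf: bytearray, binding: int, address: int, size: int) -> None:
    """Write a made-up (address, size) descriptor directly into the buffer."""
    struct.pack_into("<QQ", buf, binding_offset(binding), address, size)


def read_descriptor(buf: bytearray, binding: int) -> tuple:
    """Read back what a shader-side lookup would see in this toy model."""
    return struct.unpack_from("<QQ", buf, binding_offset(binding))


# The application owns the memory: no pool, no set allocation.
descriptor_buffer = bytearray(4 * DESCRIPTOR_SIZE)
write_descriptor(descriptor_buffer, 2, 0x1000, 256)
print(read_descriptor(descriptor_buffer, 2))  # (4096, 256)
```

Because the application owns the bytes, updating a single binding is just another write at its offset, subject to the synchronization caveats raised earlier in the discussion.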
13:36 Lynne: that's so much simpler!
13:41 Lynne: this also removes the need for having sets of descriptors, doesn't it? since you can now choose to update whichever one you want at any time
13:51 zmike: that's architectural and depends how you're using them
13:56 Lynne: are there any advantages left for "layout (set = %i" other than 0?
13:57 emersion: i use multiple sets
13:57 emersion: ah
13:57 emersion: okay, that was a question related to the ext, sorry for the noise
13:57 zmike: it can be easier for managing buffer sizing
13:57 zmike: but it's architectural
14:10 Lynne: hmm, what about push consts/descriptors?
14:10 Lynne: internally, are they just the same as regular descriptors, so it's pointless to use them if using descriptor buffers?
14:22 zmike: again that comes down to architecture
14:23 zmike: either you need them or you don't
14:23 Lynne: I don't know if I need them or don't need them, but not needing them would save me work from not having to abstract them, so I'd like to not need them, if possible :)
14:24 Lynne: push consts are afaik generally cheap, but are they cheap enough to still beat descriptor buffers?
14:26 zmike: implementation dependent
14:32 Lynne: fair enough
14:50 zmike: eric_engestrom: what's the git cherry-pick --option I'm supposed to use again?
14:56 zmike: hm maybe -x
14:59 DPA: Apparently, my monitor supports HDR. I don't think X11 can do that yet, but DRM/KMS can, right?
14:59 DPA: Are there any demos to see what that looks like?
15:06 heat: DPA, *if* (I'm not sure, I'm no expert in HDR) it requires explicit X11 support, X11 will never be able to do HDR
15:07 heat: due to it being in maintenance mode
15:08 emersion: DPA: simplest is probably mpv or kodi
15:08 emersion: running directly with their KMS backends
15:08 DPA: It doesn't need to be X. I'm OK with something that just directly talks to /dev/dri/. I just want an example to see what it would look like.
15:09 DPA: Oh, mpv can do it? I'll try that then.
15:10 emersion: ah, no nvm, mpv support is not merged yet: https://github.com/mpv-player/mpv/pull/10762
15:15 DPA: I think I'll just build that pr myself. It'll be good enough to get an idea of what HDR looks like.
15:41 eric_engestrom: zmike: yup, -x :)
15:41 zmike: ok then I got it
16:22 eric_engestrom: zmike: ah, I see that you pushed directly on the staging branches; thanks for the custom backport :)
16:22 zmike: I remembered -x this time!
16:23 eric_engestrom: I see :D
16:26 ajax: heat: enh. if anyone really wanted to wire it up for Xwayland it's pretty trivial. i wouldn't expect it to ever work for Xorg though
16:36 ajax: or at least. it's pretty trivial for Xwayland to expose Visuals for arbitrary depths, since they map directly to wayland image formats and the compositor is the one responsible for making them display correctly
16:36 emersion: HDR is much more than just high-bit depth buffers
16:37 ajax: getting the hdr metadata plumbed through is then a typing exercise, but then you'd need to go update every x11-supporting media engine
16:37 ajax: yes, i'm aware
16:38 ajax: so like. you could make Xwayland do it, but at that point why are you still using X for this
17:56 danvet: mairacanal, I didn't get around to replying, but debugfs vs accel is essentially a case of "landed together"
17:56 danvet: might be good to chat with ogabbay on this and make sure we don't accidentally diverge further :-)
17:58 mairacanal: danvet, do you have any thoughts on the generalization of the debugfs api?
17:58 dcbaker: PSA: if you're going to merge things to the stable branch, please ping me on IRC
17:59 dcbaker: I've force pushed over things that I didn't know were merged to the branch twice this week
17:59 kisak: mareko ^
17:59 dcbaker: fortunately I later saw them and got them back
18:00 DavidHeidelberg[m]: mattst88: Hey! As we've been looking into the SSE2 linking issues, could you give me some basic feedback on https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21180 ? One thing: I believe that "-msse2" should be applied to both C and C++ code, since it's linked together. Another bit: I would like to see what you think about also sharing `-mfpmath=sse` and `-mstackrealign`
18:01 mattst88: sure thing! thanks DavidHeidelberg[m]~
18:05 ogabbay: mairacanal: I haven't yet gotten to read the email thread to understand the change, but I can already say that I don't see any reason accel won't align to drm code in this issue
18:06 ogabbay: We have just one driver in (iVPU) and we are now working internally to change the habanalabs driver to use accel, so this is a good point in time to do this alignment
18:19 danvet: mairacanal, replied, but maybe be extra cautious because my brain is not in well working order at thinking stuff through properly right now :-)
18:21 danvet: zackr not around?
18:52 bl4ckb0ne: is there a way to choose a GPU when using eglGetPlatformDisplay with EGL_PLATFORM_SURFACELESS_MESA and EGL_DEFAULT_DISPLAY ?
19:13 Sachiel: eric_engestrom: I failed to tag 58ababdee6cd6b1e08604033602e4a5f9d5ab7a3 for stable and it'd be nice to have it in 22.3, what do you want me to do about it?
19:30 zmike: is there a way to trim an apitrace to a select group of frames? gltrim can do a single frame, but I want to start at e.g., frame 1000 and keep 50 frames
19:30 zmike: do we have that technology
19:36 airlied: zmike: does trim not take a range?
19:36 zmike: the help claims it takes a single frame
19:37 airlied: apitrace trim takes a range
19:37 zamundaaa[m]: <ajax> "or at least. it's pretty trivial..." <- On Xorg, some apps severely misbehave or even crash when it's set to 10 bpc. I guess that's caused by Xorg then *only* supporting 10bpc, and wouldn't affect Xwayland?
19:38 zmike: oh this is different from gltrim
19:38 zmike: baffling
19:40 zmike: hmm I tried passing 1000-1050 as a frame range and it generated something that doesn't seem to play back correctly
19:40 airlied: yeah that seems about right for trimming :-P
19:40 zmike: ah
19:41 airlied: dj-death: sorry about that, though not sure we'd have found that bug if I'd left the env var in place, though I think we should just have set v_count to 0 and use the queue overrides instead of reintroducing a variable
19:42 airlied: dj-death: where did it show up? intel mesa CI? I'd like to try and fix the queue picking, looks like anv is totally missing any blit queue picker
19:45 zmike: oh ok gltrim takes a range
19:48 dj-death: airlied: yeah, I think it might not be ready to enable by default just yet ;)
19:48 dj-death: airlied: I just ran gfxbench
19:49 dj-death: airlied: tbh it looks like a wsi issue
19:49 dj-death: airlied: it shouldn't even try to use a video queue family for blits
19:50 dj-death: airlied: I've also seen a CTS issue, but I don't have the latest
19:50 dj-death: airlied: I guess you need to run this on a secondary GPU
19:51 dj-death: airlied: it'll force a blit to linear shared buffer with compositor
19:53 airlied: dj-death: so you were using DRI_PRIME=1 setup?
19:53 dj-death: no
19:53 dj-death: just VK_ICD_FILENAME=intel_icd
19:53 dj-death: my primary gpu is an amd card
19:53 airlied: ah okay
19:53 dj-death: but yeah that should repro as well
19:54 airlied: I'll go re-enable my integrated gpu, I think it's a blit queue picking callback, but I'll chase it down
20:01 Kayden: anv-transfer-queue in my tree has a blit queue if you need one
20:01 Kayden: but only for dg2+
20:03 airlied: Kayden: you seem to be missing the wsi callback though
20:03 airlied: though I only glanced at the MR
20:04 Kayden: ah, thanks!
20:13 eric_engestrom: Sachiel: cherry-picked, it will be in tonight's release
20:14 Sachiel: thanks!
20:23 airlied: though presenting from the video queue should be fine, it's just making it not blit
20:27 ajax: bl4ckb0ne: there's DRI_PRIME= and such, but at that point maybe use platform_device instead of platform_surfaceless
20:27 bl4ckb0ne: yeah, i figured i was better off with a platform_device
20:51 x512: Is it possible to do EGL window rendering with platform_device other than EGLStreams?
20:54 emersion: i don't believe so
20:54 emersion: not unless you reimplement the whole WSI
20:54 emersion: s/not//
20:55 x512: It may be a good idea to implement EGLStreams in Haiku...
20:55 DPA: Is there a way to check if HDR is being used or not? I downloaded some HDR test videos and pictures, but don't see any difference between the mpv with HDR patch and the one without.
20:56 emersion: i wouldn't recommend that
20:56 emersion: everyone is trying to get rid of EGLStreams
20:56 emersion: DPA: drm_info should tell
20:56 x512: I don't understand why EGLStreams is not welcomed in Linux.
20:56 emersion: x512: it has many shortcomings
20:56 x512: It seems a good idea.
20:57 x512: Not being able to select EGL device by API is A BIG shortcoming.
21:01 emersion: that isn't unfixable
21:01 emersion: create an EGL-Registry issue to discuss, if you care about this
21:01 jenatali: All versions of GL have that problem
21:02 jenatali: WGL and GLX are no different
21:02 ajax: it's not just not unfixable, it's fixed
21:02 ajax: https://github.com/KhronosGroup/EGL-Registry/pull/157
21:03 emersion: oh, i missed that one!
21:03 ajax: no idea why that isn't showing up on the registry website tho
21:03 emersion: it's missing from the EGL registry website for some reason
21:03 jenatali: Ooh, cool
21:04 ajax: glad kyle pushed that over the line for me / embarrassed to have left it fallow so long. still.
21:04 emersion: ajax: do we have mesa support for this already?
21:05 ajax: no, though i think i have a Very Old Branch for it somewhere
21:05 emersion: maybe linked from the old PR
21:06 emersion: hm, no, nothing
21:06 emersion: well, shouldn't be too hard to type
21:08 ajax: https://gitlab.freedesktop.org/ajax/mesa/-/tree/egldevice-1115
21:08 ajax: quite old, probably needs massaging, feel free to plunder whatever you need from it
21:11 ajax: also the surfaceless and device platform code is entirely too much copypasta, someone please unify that
21:13 x512: But can somebody tell why EGLStreams is not welcomed in Linux?
21:14 x512: Only because it is different from Mesa/DRM way of doing things?
21:15 x512: Historic background only problem?
21:15 zmike: update: gltrim CAN trim ranges of frames, and it can do it at a blistering pace of about 75 frames/hour
21:16 emersion: x512: because it has intrinsic deficiencies
21:16 x512: Details?
21:17 x512: I don't think that industly leader Nvidia will design a bad API. They probably know what they are doing.
21:17 HdkR: zmike: Spicy!
21:18 x512: industly -> industry
21:18 HdkR: x512: Everyone makes mistakes
21:18 x512: But can someone tell details?
21:18 HdkR: NVIDIA has been around long enough to make some :)
21:18 ajax:stares in GL_NV_path_rendering
21:19 x512: What is mistake if not considering Linux historical background and compatibility?
21:19 zmike: can't wait to do zink support for that one
21:19 emersion: sorry, i've argued way too much about this topic in the past, not interested in bringing that up again
21:19 ajax: did you know nvidia's driver has arb assembly program extensions through sm5?
21:19 ajax: sm6 even i think
21:19 jenatali: :O
21:19 HdkR: It's great, they only stopped adding new ARB extensions in the past few years
21:20 anarsoul: ajax: but why?
21:20 HdkR: I was kind of hoping for insane Mesh and RT ARB assembly
21:20 ajax: i will give them credit for supporting the mistakes they ship but jeezey petes
21:21 ajax: anarsoul: that is an excellent question that i cannot answer
21:22 ajax: i assume it's something along the lines of "we promised some bit of middleware that this would keep working"
21:22 jenatali: "keep working" doesn't mean adding new features to it though
21:22 HdkR: It is also used by some emulators because it is still the fastest way for their driver to JIT shaders out
21:23 ajax: jenatali: maybe "at parity with glsl" was also implied?
21:24 jenatali: Oof
21:24 ajax: HdkR: that's sad enough to be plausible
21:24 HdkR: ajax: Even SPIR-V compiling on their driver isn't as fast :(
22:08 DrNick: does Cg still exist?
22:08 DrNick: and compile to ARB assembly
22:16 jenatali: Ugh. I dislike LLVM
22:30 Sachiel: hakzsam, gfxstrand: what's holding https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16742 ?
23:34 anholt_: airlied: hsw runner should be back now, I'll update crocus baseline state. and I'm also kicking off a "how bad would some hasvk coverage be?" branch.