00:26 zmike: uhhh
00:27 zmike: never seen that before
00:28 zmike: hmmm I think I can imagine how it might happen
00:30 zmike: too bad I'm at the pub
00:31 karolherbst: have fun, but also, why are you checking in on IRC while in the pub
00:37 Ristovski: one does not waste optimization opportunities, no matter the occasion
00:39 zmike: I got a ping, I checked the ping
13:48 zmike: jenatali: try https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33285
13:50 jenatali: zmike: will do
16:29 jenatali: zmike: Yep, fixed it
16:30 zmike: jenatali: cool
16:30 zmike: does it work for whatever you were testing too?
16:30 jenatali: It exhibits the same bug I was seeing with the d3d12 backend and softpipe
16:30 jenatali: I.e. not my bug :)
16:30 zmike: cool
16:30 zmike: wait softpipe?
16:31 zmike: or llvmpipe
16:31 jenatali: Softpipe. This is an arm64 laptop and I didn't feel like building llvm
16:31 zmike: coward
16:31 jenatali: Guilty
16:31 jenatali: FWIW this is the bug: https://github.com/telegramdesktop/tdesktop/issues/28905
16:32 zmike: gross
17:06 alyssa: how is host_image_copy supposed to work with sparse? o_O
17:06 zmike: it's not
17:07 alyssa: then why is there CTS for it :clown:
17:07 zmike: or maybe it is and I misremembered
17:09 alyssa:returns not_supported and CTS shuts up
17:09 alyssa: deeply unserious
17:10 zmike: smh how will anyone use this driver
17:11 alyssa: holding it wrong
17:13 HdkR:aims graphics driver directly at foot
17:30 jenatali: ... how do you do sparse with host access at all?
17:31 zmike: very carefully
17:33 alyssa: jenatali: that is my question
17:50 alyssa: I guess lavapipe could do it
18:08 memleak: hello, i've been working with X for many years and trying to help a guy out with a really obscure problem and i'm stuck.
18:09 memleak: he's using an intel board with a i915 GPU that is too new for the 5.4 LTS kernel so I helped get his system to use fbdev first with simplefb, that didn't work (no suitable framebuffer found) even though /dev/fb0 was there
18:10 memleak: next i tried to get vesa going for him by uninstalling the xorg xserver fbdev driver to help shut vesa up, but now he gets: V_BIOS address 0x0 out of range
18:10 memleak: vesa doesn't want to start if fbdev is available and if /dev/fb0 is present so i satisfied those dependencies for him
18:11 memleak: never in my life saw this V_BIOS address 0x0 out of range problem before and no idea how to fix.
18:11 Ermine: is upgrading to newer kernel an option? There are newer LTS kernels
18:12 memleak: it has to be 5.4 because we're using RTAI
18:12 memleak: (real-time kernel, preempt_rt not fast enough)
18:13 memleak: i maintain the rtai repository for linuxcnc. my only other option looks to be backporting support for meteorlake into 5.4 if i can't get vesa going
18:13 Ermine: Well.. I'd try to blacklist any fbdev modules that show up and try to get xorg up with modesetting ddx
18:15 Ermine: (so simpledrm driver is in charge on kernel side)
18:15 memleak: checking if simpledrm is in 5.4...
18:16 memleak: nope, introduced in 5.14
18:16 Ermine: ugh
18:17 jenatali: alyssa: Do you know how lavapipe does sparse?
18:18 Ermine: just in case, there's also #xorg channel
18:22 memleak: thanks Ermine checking there too :)
18:22 memleak: never saw this memory error come up before
18:29 memleak: i'm going to try backporting simpledrm to 5.4 and see how much work it is
18:29 alyssa: jenatali: I assume mmap(MAP_FIXED)?
18:29 jenatali: Ah yeah, makes sense
18:31 alyssa: yeah, more or less https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29408
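A rough illustration of the mmap(MAP_FIXED) idea mentioned above, for readers unfamiliar with it: reserve a large inaccessible address range, then map real memory over individual pages at fixed addresses inside it. This is only a sketch of the general technique, not lavapipe's code (see the linked merge request for that). It assumes a 64-bit Linux-like system; `reserve`, `bind_page` and `demo` are hypothetical names, and the `MAP_FIXED` value is hardcoded because Python's `mmap` module does not export it.

```python
import ctypes
import mmap
import os

# Call the C library's mmap/munmap directly, since Python's mmap module
# cannot map at a caller-chosen address.
libc = ctypes.CDLL(None, use_errno=True)
libc.mmap.restype = ctypes.c_void_p
libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                      ctypes.c_int, ctypes.c_int, ctypes.c_long]
libc.munmap.restype = ctypes.c_int
libc.munmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t]

PROT_NONE = 0          # assumption: standard POSIX value
MAP_FIXED = 0x10       # assumption: value used on Linux
MAP_FAILED = ctypes.c_void_p(-1).value
PAGE = mmap.PAGESIZE


def _mmap(addr, length, prot, flags):
    ptr = libc.mmap(addr, length, prot, flags, -1, 0)
    if ptr is None or ptr == MAP_FAILED:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return ptr


def reserve(num_pages):
    """Reserve address space only; touching it before binding would fault."""
    return _mmap(None, num_pages * PAGE, PROT_NONE,
                 mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)


def bind_page(base, index):
    """Back one page of the reserved range with real, writable memory."""
    return _mmap(base + index * PAGE, PAGE,
                 mmap.PROT_READ | mmap.PROT_WRITE,
                 mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS | MAP_FIXED)


def demo():
    num_pages = 16
    base = reserve(num_pages)
    try:
        addr = bind_page(base, 3)
        page = (ctypes.c_ubyte * PAGE).from_address(addr)
        page[0] = 0xAB                      # host write into the bound page
        result = (addr == base + 3 * PAGE, page[0])
        del page                            # drop the view before unmapping
        return result
    finally:
        libc.munmap(base, num_pages * PAGE)
```

Only the bound page is touched here; pages that were reserved but never bound stay PROT_NONE, which is roughly what an unbound sparse region looks like to the host.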
18:38 lumag: sima, vsyrjala: any resolution on the https://lore.kernel.org/dri-devel/it2puzcitkui2inz4tmvkpig47jyz2efeq3udzffnqwomf3r3v@5sylpgnvqdxk/ ?
18:42 alyssa: does dEQP-VK.dynamic_rendering.primary_cmd_buff.local_read.depth_stencil_mapping_to_no_index_depth_clear crash for anyone with recent CTS?
18:42 alyssa: seems to be a CTS regression
18:42 alyssa: but I find it .. challenging to find things in gerrit (:
18:42 zmike: I've literally never seen a CTS crash in my life
18:43 alyssa: dies inside
18:43 alyssa: #0 0x0000000002500d4c in vk::createShaderModule(vk::DeviceInterface const&, vk::VkDevice_s*, vk::ProgramBinary const&, unsigned int) ()
18:46 HdkR: zmike: Closing your eyes when the CTS runs doesn't count
18:46 zmike: lavapipe too strong, too handsome to crash
18:47 alyssa: oh that's https://gitlab.khronos.org/Tracker/vk-gl-cts/-/issues/5565 I guess
18:47 alyssa: I guess I can just .dynamicRenderingLocalReadDepthStencilAttachments = true, it's not like anything uses DRLR right hahaha? *sweats*
18:48 memleak: nope.. backporting simpledrm is not feasible
18:50 vsyrjala: lumag: i guess i don't really care what, as long as it doesn't add some assumptions that trip up when drivers trigger modesets that don't need any extra state checks
19:18 DemiMarie: memleak: why is preempt_rt too slow?
19:19 memleak: latency needs to be lower for this guy to control stepper motors
19:19 memleak: rtai is always lower
19:36 DemiMarie: Have you considered using a completely separate RTOS for the motor control tasks? Staying on Linux 5.4 isn’t going to be feasible in the long term.
19:37 DemiMarie: This could be done with either dedicated cores hidden from Linux or with a separate microcontroller.
19:42 zmike: tarceri: pls make sure to assign your glsl fix today before the branchpoint
19:54 memleak: Yeah, the real fix is to port LinuxCNC to EVL/Dovetail for lower latency than what you'll get with PREEMPT_RT and it supports the later kernels, there's just no support for it in LinuxCNC
19:55 memleak: I'm trying a different approach with the vesa driver, going to do it in gentoo, maybe something screwy is happening with debian
19:55 memleak: I've never seen vesa not work in gentoo, not in my builds anyway
19:56 memleak: brb
20:29 airlied: karolherbst: moving here from other channel, could we move the intel subgroups into a faster to compile header?
20:29 karolherbst: I'm working on pch support
20:29 karolherbst: that should speed it up a lot
20:29 karolherbst: the intel spirv files are generated roughly 8 times faster with pch
20:30 Ristovski: damn, quite a speedup
20:30 karolherbst: soo the plan is to simply precompile the opencl-c.h file
20:30 karolherbst: and then it doesn't matter
20:31 karolherbst: I have it working on the cli, but it's kinda more painful to get it working with libclang...
20:42 karolherbst: it works.. nice
20:45 karolherbst: with PCH: time ninja src/intel/shaders/intel_gfx_shaders.pch src/intel/shaders/intel_gfx{80,90,110,120,125,200,300}_shaders.spv => real 0m0,482s
20:47 karolherbst: without PCH: time ninja src/intel/shaders/intel_gfx{80,90,110,120,125,200,300}_shaders.spv => real 0m1,471s
20:47 karolherbst: the difference in user is even more impressive
20:47 karolherbst: user 0m7,584s => user 0m1,205s
20:47 karolherbst: guess it's waiting more on IO than burning through the CPU
20:48 karolherbst: though the time spent is kinda variable
20:48 psykose: think it's just parallelism (slowest file went from 1,2 => 0,4, bit random), but they're all faster so the user goes way down
20:49 karolherbst: mhhh yeah fair
20:49 karolherbst: I mean, the PCH stuff needs to parse the header once single threaded
20:49 karolherbst: so the actual compile jobs start later
20:49 karolherbst: anyway...
20:49 psykose: means they're even faster than that 0,482 then for the slowest :)
20:49 karolherbst: yeah..
20:49 karolherbst: it's a small win
20:50 psykose: lines up with the -ftime-trace i saw for a lot of the smaller <5s files, they were mostly 50+% frontend
20:50 psykose: the huge 5 minute ones i think won't be impacted that much
20:50 karolherbst: it's just the intel shader stuff, but we do have a couple of more jobs using mesa_clc
20:50 psykose: er, not 5 minutes, 50 s
20:50 karolherbst: we should port the intel raytracing stuff over to mesa_clc as well
20:52 karolherbst: anyway.. best case without PCH: real 0m1,450s user 0m7,324s
20:52 karolherbst: best case with PCH: real 0m0,480s user 0m1,342s
20:52 karolherbst: dj-death, airlied: ^^
20:53 karolherbst: it won't make CI that much faster, but it's something :D
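For reference, the speedups implied by the timings quoted above can be worked out with a few lines of Python. This is just arithmetic on the numbers from the log; `parse_time` and `speedup` are hypothetical helper names, and the parser assumes the `0mS,mmms` format (comma as decimal separator) shown in the pasted `time` output.

```python
def parse_time(text):
    """Convert a `time`-style string such as '0m1,471s' to seconds."""
    minutes, seconds = text.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds.replace(",", "."))


def speedup(before, after):
    """How many times faster `after` is compared to `before`."""
    return parse_time(before) / parse_time(after)


# Numbers copied from the log: (without PCH, with PCH).
runs = {
    "first run, real": ("0m1,471s", "0m0,482s"),
    "first run, user": ("0m7,584s", "0m1,205s"),
    "best case, real": ("0m1,450s", "0m0,480s"),
    "best case, user": ("0m7,324s", "0m1,342s"),
}

if __name__ == "__main__":
    for label, (without_pch, with_pch) in runs.items():
        print(f"{label}: {speedup(without_pch, with_pch):.2f}x")
```

Wall-clock time improves by roughly 3x in both runs, while CPU (user) time drops by roughly 5.5x to 6.3x, which fits the observation that the PCH has to be built once, single threaded, before the parallel compile jobs can start.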
20:53 psykose: amd stuff has a lot of such files too if you wanna try too
20:53 psykose: i think the c++ especially
20:53 karolherbst: this is just for OpenCL C files
20:53 psykose: yea
20:54 karolherbst: host side PCH is kinda a build system mess I don't want to get involved in yet :D
20:54 karolherbst: the rules are funky, even for CL, but the environment is way more controlled
20:55 karolherbst: like at least with clang you can only have a single pch (though you can chain them by linking a new header pulling in a pch)
20:55 karolherbst: anyway.. it's kinda a mess
20:55 karolherbst: now... I need to clean that mess up
21:06 karolherbst: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33292
21:10 dj-death: karolherbst: we'll just delete this stuff soon, don't bother too much ;)
21:10 karolherbst: heh
21:10 karolherbst: though I wanted to have the pch interfaces anyway
21:14 alyssa: karolherbst: what's this PCH thing?
21:14 karolherbst: precompiled header
21:14 alyssa: I see
21:14 karolherbst: tldr: the AST gets dumped to a file and reused
21:17 alyssa: I guess I should review this
21:17 alyssa: Probably tomorrow though, my brain is too fried rn I think
21:18 karolherbst: in theory we could reuse the same PCH file across all of mesa_clc users as long as they use the same CL version and stuff
21:19 karolherbst: could also add more header files into the PCH, but not sure it matters much as the opencl-c.h is kinda the big one here
21:22 alyssa: karolherbst: i'm kinda confused, why is opencl-c.h not already pulled in?
21:23 karolherbst: what do you mean?
21:23 alyssa: what does this have to do with intel_subgroups
21:23 alyssa: aren't we already #include'ing this?
21:23 jenatali: alyssa: Clang has most CL support in a builtin
21:23 karolherbst: depending on supported extensions we either pull in opencl-c-base.h or opencl-c.h
21:23 alyssa: jenatali: ..Cute
21:23 karolherbst: and the latter is like 15x the size or so
21:23 karolherbst: and it slows down compilation speed by a lot
21:23 alyssa: Ok, I see
21:23 airlied: I wonder if we could just move more things to the builtins
21:24 karolherbst: in llvm? yeah, certainly
21:24 alyssa: ^ or just copy paste the subset we care about and include that instead?
21:24 jenatali: Yeah but that introduces a dependency for new LLVM
21:24 alyssa: that might also require llvm changes
21:24 jenatali: And that's... sometimes painful
21:24 karolherbst: yeah, but then we have to maintain our own file
21:25 alyssa: seems .. fine?
21:25 karolherbst: not sure I want to deal with clang internals there
21:25 alyssa: also I feel like I'm missing context, why are we interested in cl_intel_subgroups?
21:25 karolherbst: you enable it in mesa_clc
21:25 alyssa: Uhoh
21:25 karolherbst: but there are other extensions which might pull in the big header file as well
21:25 alyssa: I think I cargo culted that from the old intel_clc
21:26 alyssa: Not sure anything is using it now that GRL is ogne
21:26 karolherbst: yeah.. the raytracing one will need it for sure
21:26 alyssa: gone
21:26 karolherbst: could also disable it...
21:26 alyssa: I guess grl is still in tree..
21:26 karolherbst: let me check how much pch has an impact with intel_subgroups disabled
21:27 alyssa: yeah I think this is just there because of grl
21:27 alyssa: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32472
21:28 karolherbst: but I want to enable intel_subgroups in rusticl sooner or later, so I kinda want PCH support anyway
21:28 alyssa: that's fair
21:28 alyssa: pch applies to rusticl online compilation too?
21:28 karolherbst: not yet
21:28 karolherbst: would need to wire it up
21:29 karolherbst: it's a bit more complicated in online compilation, because you need one per combination of arguments I'm not sure yet which matter
21:29 karolherbst: I know that the CL version matters...
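One way to picture the "one PCH per combination of arguments" problem is a cache keyed on whatever arguments affect the parsed header. The sketch below is purely illustrative and is not how rusticl or mesa_clc do it: `PchKey`, `PchCache` and the builder callback are hypothetical, and treating the set of enabled extensions as part of the key is an assumption (the log only confirms that the CL version matters).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PchKey:
    cl_std: str                # the log confirms the CL version matters
    extensions: frozenset      # assumption: enabled extensions matter too


class PchCache:
    """Build a precompiled header at most once per distinct key."""

    def __init__(self, build):
        self._build = build    # hypothetical callback: PchKey -> PCH blob
        self._entries = {}
        self.builds = 0

    def get(self, key):
        if key not in self._entries:
            self._entries[key] = self._build(key)
            self.builds += 1
        return self._entries[key]


def fake_build(key):
    # Stand-in for actually invoking the compiler to produce a PCH.
    return f"pch for {key.cl_std} with {sorted(key.extensions)}"


if __name__ == "__main__":
    cache = PchCache(fake_build)
    a = PchKey("CL3.0", frozenset({"cl_intel_subgroups"}))
    b = PchKey("CL2.0", frozenset({"cl_intel_subgroups"}))
    cache.get(a)
    cache.get(a)   # reused, no rebuild
    cache.get(b)   # different CL version, so a separate PCH
    print(cache.builds)
```

The hard part raised in the discussion is deciding which arguments belong in the key: too few and a stale PCH gets reused, too many and the cache rarely hits.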
21:32 alyssa: ok, right
21:32 alyssa: nothing uses intel_subgroups with mesa_clc
21:33 alyssa: but. as a design decision, mesa_clc just enables everything to make it easy for mesa drivers to use
21:33 alyssa: and since mesa supports intel_subgroups, .. yeah
21:33 alyssa: probably doesn't make sense to special case disable this ext in mesa_clc, either
21:34 karolherbst: especially if it's not that much slower with PCH, though I'll do some testing on it
21:46 karolherbst: uhh.. after rebase that's going to be a bit more painful to support.. well.. still possible
21:54 karolherbst: yeah.. so intel_subgroups disabled gives us real 0m0,279s and user 0m1,195s
21:57 karolherbst: intel_subgroups disabled with PCH is in the same ball park
21:57 karolherbst: slightly less user, slightly more real
21:57 karolherbst: so I think it will become faster with more users of the PCH
21:58 alyssa: karolherbst: so what should we do?
21:59 karolherbst: well.. I want the clc bits anyway. I think it also makes sense for intel_subgroups disabled, because the compile time with the pch not changed is massively faster in either case
21:59 karolherbst: => faster development
22:00 karolherbst: but it's opt-in anyway, so might as well keep it
22:01 karolherbst: let me remember which extension also needed it...
22:02 karolherbst: ahh yeah
22:02 karolherbst: kernel_clocks
22:02 karolherbst: needs the big file as well
22:03 karolherbst: I'm sure the coop matrix stuff will also land there
22:04 karolherbst: anyway.. I think it's good to have it in place in case we have stronger needs for it
22:06 alyssa: sounds good
22:06 alyssa: will review tomorrow
23:10 mareko: daniels: debian-ppc64el still has LLVM 15; it's OK to stop building radeonsi and radv for it, right?
23:17 mareko: jenatali: I wonder if it's better to build windows drivers with -Dllvm=disabled, that should be fine for AMD drivers
23:17 mareko: in CI
23:18 jenatali: mareko: We build CL components there
23:18 jenatali: Those need LLVM
23:20 jenatali: mareko: I'm attempting to upgrade my local build from 15 to 19. Assuming it looks fine we should be able to bump CI too