|
What’s the per-clock difference between Rocket Lake and Alder Lake outside of AVX-512, where Intel apparently decided that because the little Alder cores couldn’t run it, no Alder cores should run it? Just kills me, is all. I’m wondering if it’d be worthwhile for a machine chiefly concerned with crunching video to skip Alder this time and save a little money with Rocket Lake.
|
# ? Aug 20, 2022 05:52 |
|
|
rocket lake was a very bad generation - it was in some cases worse than its predecessor, and alder lake was a large improvement for production workloads as well as gaming. i would be very surprised if there's any value to be had in opting for it over alder lake unless you really really need avx-512 for some niche use, in which case waiting for zen 4 would still probably seem like a better idea. i think there were technical issues regarding scheduling with the avx-512 instructions only being available on some cores, which was one of the reasons they disabled it?
|
# ? Aug 20, 2022 07:33 |
|
The hilarious part about the "we can't schedule cores right for AVX-512 so we have to disable it" excuse is that it's not a difficult problem to solve in a scheduler.

Step 0: You already have a #UD exception handler in the kernel to trap undefined instructions, so you don't have to add anything there!
Step 1: Check in the #UD handler if the offending instruction is an AVX-512 instruction, and if so, whether it's on a core that doesn't support AVX-512.
Step 2: Reschedule the offending thread to a performance core and set the thread affinity rules to not use any of the efficiency cores, since you're the kernel and you know which ones are which.

This isn't hard! Kernels have been doing this kind of "trap undefined instructions and find some way to make the thread work anyways" thing for fifty loving years! The 8086 had a built-in method to trap floating-point instructions when no coprocessor was installed so they could be emulated in software. There's no excuse for this other than Intel engineers genuinely believing Microsoft, Apple, the Linux kernel team, and the various BSDs to be stupid enough to not know how to do something their kernels already more or less do.
|
# ? Aug 20, 2022 07:54 |
|
lih posted:rocket lake was a very bad generation - it was in some cases worse than its predecessor, and alder lake was a large improvement for production workloads as well as gaming. i would be very surprised if there's any value to be had in opting for it over alder lake unless you really really need avx-512 for some niche use, in which case waiting for zen 4 would still probably seem like a better idea. the least niche use case i've heard for avx-512 so far is ps3 emulation so yeah i don't feel i'm missing out on anything not having support for it
|
# ? Aug 20, 2022 08:02 |
|
Kazinsal posted:The hilarious part about the "we can't schedule cores right for AVX-512 so we have to disable it" is that it's not a difficult problem to solve in a scheduler.

You're way too confident that this would work well in practice. Linus Torvalds himself is well aware of the idea, and has dismissed it. Here's one of the issues I saw brought up in the discussion thread where he poo poo on the idea.

As an application programmer, how do you even detect that it's a good idea to try executing AVX512 instructions? On x86 platforms (including Linux), the long-standing standard (if you want to take advantage of optional instructions, that is) has been to use the CPUID instruction to directly query the CPU about its capabilities, then set one or more global variables to select which subroutine to call when the program wants to crunch vector math. But if you're running on a hypothetical Alder Lake where some cores have AVX512 enabled and others do not, what you get back from CPUID depends on which core you happen to be running on at the moment. Your application could wrongly conclude that the processor doesn't support AVX512 at all just because it ran on a small core when it did its feature check.

Another: applications which want lots of vector FP throughput usually want to run on as many cores as possible. Given that many of Intel's hybrid CPUs have a lot more small cores than big cores, and depend on utilizing them for high compute throughput, you'd have to recode your app to be fully aware of the split and run/pin some threads in AVX512 mode and others in AVX256 mode. This has the potential to get real messy, and doesn't work at all for fielded applications (which are an important concern, like it or not). There's more that I don't remember.

Intel was originally going to ship CPUs the way you claim is better, but changed plans late in the game. It's not hard to guess that this was because Intel's partners - most likely Microsoft in particular - tested it on engineering sample CPUs and told Intel it caused way too much trouble and there was no clean way to work around it. Intel never should have released hybrid chips with cores so different they can't report the same capabilities in CPUID. Even with AVX512 disabled, there's still some pain due to other less prominent capability differences. Every Arm platform with varying core sizes that tried similar asymmetric ISA feature support has run into problems, so everyone knew it wasn't a good idea. But Intel needed to rush something to market to stay competitive, so here we are.
|
# ? Aug 20, 2022 08:26 |
|
I have been having good experiences with PS3 emulation on a 12900K that does not have AVX-512 enabled. This particular processor predates them lasering it off, but I'd have to run old microcode on my motherboard and frankly I don't even know how to do that, or whether it would mean going back to an older BIOS - I had some instability at my memory clocks prior to a more recent BIOS version, well after the microcode stopped supporting AVX-512. Plus, I use the E-cores and wouldn't want to turn them off just to make PS3 emulation go faster anyway. Not saying it wouldn't be cool if it were just able to do it, but given what I don't know and the necessary trade-offs I'm personally just waiting for a generation that supports it better rather than worrying about it now.
|
# ? Aug 20, 2022 08:26 |
|
Kazinsal posted:The hilarious part about the "we can't schedule cores right for AVX-512 so we have to disable it" is that it's not a difficult problem to solve in a scheduler. BobHoward posted:You're way too confident that this would work well in practice. Linus Torvalds himself is well aware of the idea, and has dismissed it. ??? quote:By: Linus Torvalds (torvalds.delete@this.linux-foundation.org), August 28, 2021 12:15 pm The problem of launching an appropriate topology of threads of course exists too, the result of a CPUID call is (or was) essentially undefined on Alder Lake processors unless user-space programs specifically manage their affinity and scan the cores to understand what's going on. But Linus absolutely does not think that trapping exceptions and pinning thread affinity is a functional problem at all, and he's in fact said the complete opposite, specifically. Paul MaudDib fucked around with this message at 09:40 on Aug 20, 2022 |
# ? Aug 20, 2022 08:45 |
|
my understanding is that Rocket Lake was "bad" as far as:

* the Rocket Lake i9 having two fewer cores than the Comet Lake i9 (i.e. eight versus ten)
* in parallel, the Rocket Lake i9 having just-as-many cores (i.e. eight) as the Rocket Lake i7, with the only differentiator being a number of gated features on the i9 to juice its clocks a little higher

but otherwise the rest of the products were "fine" if you knew what you were getting into:

* the i5 was capable of trading blows against the Ryzen 5 5600X, for something that was cheaper
* and it was straight-up faster than the Ryzen 5 3600, for something with comparable cost
* the i7 was capable of trading blows against the Ryzen 7 5800X, again for something that was cheaper

and yes, motherboard cost and the upgradeability of an AM4 board versus having to get a new* LGA1200 board would influence this to a degree, and getting a Rocket Lake CPU today might be a little off given that Alder Lake already exists, but my impression was that the "waste of sand" comments really only extended to the i9 and don't really deserve to be projected onto the rest of the parts

* I'll also throw in that Rocket Lake was so close to Comet Lake that if you already had a Comet Lake it wouldn't really be worth upgrading even in the i5 and i7 tiers
|
# ? Aug 20, 2022 09:19 |
|
"Trading blows" is a very generous way to phrase what really happened. It's true that those parts weren't too far behind AMD, but they were still consistently and measurably behind. So Intel did the only thing they could do and tried to win on value. And it was kind of effective (I know some people who bought Rocket Lake CPUs, and they're fine).
|
# ? Aug 20, 2022 10:30 |
|
Paul MaudDib posted:??? You know what's really funny? The detection on a #UD for AVX-512 in 64-bit mode is four instructions. Assembly code off the top of my head: code:
e: In 32-bit mode, 0x62 followed by a modR/M byte is BOUND r16/32, which isn't wide enough for 64-bit, and since nobody really used it Intel just decided the BOUND instruction should simply not be valid at all in x86-64. Kazinsal fucked around with this message at 11:29 on Aug 20, 2022 |
# ? Aug 20, 2022 11:20 |
|
lol ok paul he was saying "yeah obviously we could do this annoying clumsy thing in the scheduler, but here's why avx512 in some cores and avx256 in others sucks anyways", and you cut out the latter part for mysterious reasons
|
# ? Aug 20, 2022 12:01 |
|
That’s all opinion. Why not build a minmaxed scale design with 12c instead of 8+6? Why not 0+30? Because Intel isn’t building a processor for one workload they’re building a general purpose processor that splits the difference. Yes, that isn’t optimal for either case, and there are design challenges with pushing big.little into client environments that haven’t encountered that before. But alder lake is a pretty good performer across a variety of workloads. That’s irrelevant to the point though which was “Linus doesn’t see problems with making it not crash at a kernel level”. He’s actually specifically said the kernel side is annoying but feasible.
|
# ? Aug 20, 2022 12:21 |
|
BobHoward posted:You're way too confident that this would work well in practice. Linus Torvalds himself is well aware of the idea, and has dismissed it. The standard CPUID query could only return the features that are common for all cores. If an application wants to use the fancy features it should do a special query to return a per-core feature set and be required to tell the scheduler which cores the thread is allowed to use.
|
# ? Aug 20, 2022 12:39 |
|
IIRC there was some Intel-internal drama, where one team (design?) fully expected avx512 to be enabled and another team (executive?) thought it made the ecores look less capable, and the latter disabled it by fiat.
|
# ? Aug 20, 2022 13:39 |
Saukkis posted:The standard CPUID query could only return the features that are common for all cores. If an application wants to use the fancy features it should do a special query to return a per-core feature set and be required to tell the scheduler which cores the thread is allowed to use. ARM already tried it with their HMP implementation, and it loving sucked. I'm not convinced HMP is here to stay, to the point that I'm tempted to avoid it when buying my next PC - just because it's still such a new way of thinking about compute that there are still a lot of open questions nobody seems interested in attempting to answer.
|
|
# ? Aug 20, 2022 13:39 |
|
Saukkis posted:The standard CPUID query could only return the features that are common for all cores. If an application wants to use the fancy features it should do a special query to return a per-core feature set and be required to tell the scheduler which cores the thread is allowed to use. They could do that, but if Meteor Lake is going to make the feature set consistent between the big cores and small cores then it's probably not worth the trouble of pushing a new public interface that would be useful for one generation
|
# ? Aug 20, 2022 13:47 |
|
repiv posted:They could do that, but if Meteor Lake is going to make the feature set consistent between the big cores and small cores then it's probably not worth the trouble of pushing a new public interface that would be useful for one generation The rumour is that AVX512's 512b vectors will run over multiple cycles on the next generation -mont cores, yeah? Not necessarily worthwhile for throughput, but it allows a homogeneous feature set with the big cores, with the added instruction flexibility of AVX512 even at smaller vector sizes. It'd be funny if they do go with the 2 LP cores in the SOC die and they end up causing issues by not having AVX512. Has anyone confirmed if Zen4 will run full rate for 512b vectors? I saw rumours of it being split over two cycles like Zen originally did for 256b AVX2.
|
# ? Aug 20, 2022 14:36 |
|
surprised nobody's brought up the Samsung phone that would re-schedule you in between the CPUID check and NEON instructions
|
# ? Aug 25, 2022 18:27 |
|
oh yeah that was a good one https://www.mono-project.com/news/2016/09/12/arm64-icache/ it was cacheline size changing under your feet, not neon support, unless it happened twice
|
# ? Aug 25, 2022 18:42 |
I share an IRC channel with some of those folks involved (I think, although it might be a different HMP related bug) and reading the backlog from when they figured things out was.. something else.
|
|
# ? Aug 25, 2022 19:15 |
|
repiv posted:oh yeah that was a good one That's really neat!
|
# ? Aug 25, 2022 19:35 |
|
repiv posted:oh yeah that was a good one There's this one and another where the big and LITTLE cores had different feature sets and things would randomly crash as the scheduler moved threads over to a CPU that lacked features the code relied on, which had suddenly disappeared. I can't find the writeup for it anymore but it also had something to do with one of the Android OEMs making a custom scheduler that broke it even worse.
|
# ? Aug 25, 2022 20:54 |
Number19 posted:There's this one and another where the big and LITTLE cores had different feature sets and things would randomly crash as the scheduler moved threads over to a CPU that relies on features that suddenly disappeared. I can't find the writeup for it anymore but it also had something to do with one of the Android OEMs making a custom scheduler that broke it even worse. Still not convinced HMP schedulers will be solved any time within a good number of years. The scheduler needs to be aware not just of the frequency differences, any microarchitectural and cacheline differences, and so on and so forth - it also needs to be aware of the energy consumption based on whether the process expects to run for a short time or a long time, and the only real way to figure that out is by some build-time flag that sets a value somewhere in the binary. That'll only happen once all toolchains have caught up, unless someone decides, when "80%" of software has been converted, to simply always schedule the remaining 20% on the energy efficient cores. BlankSystemDaemon fucked around with this message at 00:14 on Aug 26, 2022 |
|
# ? Aug 26, 2022 00:09 |
|
https://videocardz.com/newz/intel-13th-gen-core-raptor-lake-s-specifications-leaked-14-skus-up-to-16-cores-and-5-8-ghz A leaked SKU lineup for Raptor Lake, with specs. The article speculates that the 13400 could just be a rebadged Alder Lake CPU, and it honestly looks that way - like a 12600K in disguise. Same cache, core config, and stock memory support, but a 300 MHz slower max boost clock. Even if it's not a rebadge, it still seems much slower than the rest of the Raptor Lake offerings, which would be disappointing since the 12400 was such a good sub-$200 CPU. The 12700 non-K was secretly one of the best values in the Alder Lake lineup, and I'm pleased to see the 13700 non-K being similarly good here. It's the same as the K variant but with a 200 MHz lower boost, and no overclockability (which nobody does anymore). The 12700 non-K also had a power limit that was 20W lower, which barely hurt it. The new power figures Intel created for Alder Lake aren't on this chart, so we can't see how those compare. Dr. Video Games 0031 fucked around with this message at 21:46 on Sep 3, 2022 |
# ? Sep 3, 2022 21:43 |
|
Thanks, looks pretty good but I'd want to see how they work IRL. I almost went for the 12700 before realizing I couldn't get a new GPU to go with it anyway so didn't bother. Also looks like the launches will more or less line up with the Zen4 this time so I wouldn't even have to go through "but if I wait 6-8 months, the new thing might be even better!". Especially if nVidia doesn't keep jerking us around I might end up with an actually new pc for the first time since Kentsfield.
|
# ? Sep 5, 2022 08:59 |
|
Dr. Video Games 0031 posted:https://videocardz.com/newz/intel-13th-gen-core-raptor-lake-s-specifications-leaked-14-skus-up-to-16-cores-and-5-8-ghz Assuming the price point stays the same, that seems.. pretty good? +200 peak boost and adding 4 ecores to what is already the gaming value king doesn't sound awful, unless AMD is planning to launch something really good at $200.
|
# ? Sep 5, 2022 15:18 |
|
VorpalFish posted:Assuming the price point stays the same, that seems.. pretty good? +200 peak boost and adding 4 ecores to what is already the gaming value king doesn't sound awful, unless AMD is planning to launch something really good at $200. That seems like an incredibly small incremental gain over what we already have to me. Every other SKU is a significant boost over the 12th gen except for the 13400. We'll have to see what the reviews say, but the 13400 could end up losing the price-to-performance argument. If the only argument for it is "well, there aren't any other budget options," then that would be disappointing. edit: it would also be competing against discounted alder lake and zen 3 parts. Dr. Video Games 0031 fucked around with this message at 21:39 on Sep 5, 2022 |
# ? Sep 5, 2022 21:35 |
|
Dr. Video Games 0031 posted:That seems like an incredibly small incremental gain over what we already have to me. Every other SKU is a significant boost over the 12th gen except for the 13400. We'll have to see what the reviews say, but the 13400 could end up losing the price-to-performance argument. If the only argument for it is "well, there aren't any other budget options," then that would be disappointing. Should be a substantial boost to productivity/multicore, and not much for per core performance. Not totally unreasonable for a single gen update. I wouldn't expect discount alder lake to stick around for a long time after launch either, they're both on 10esf no? Lowest end i5 going to 6+4 theoretically opens the door for maybe 4+4 or 6+0 at the 100-140 price point, don't think I saw any i3s mentioned, don't know if there have been any rumors there. That could be quite interesting, but I guess probably unlikely.
|
# ? Sep 6, 2022 01:36 |
|
If anyone is an expert at 12th gen Intel CPUs, please weigh in on the following situation--and correct me if I somehow misunderstand how core boost is supposed to work. I have a new prebuilt HP Omen system with the i5-12400F. It performs OK in multicore benchmarks, but is way behind where a 12400F CPU should be in single-core benchmarks. Using utilities like HWMonitor, CPU-Z, CoreTemp etc. none of the cores will boost above 3.99 ghz, even in single-threaded loads and benchmarks, or light to moderate office work or browsing. Benchmark performance (single core) is about 10% below par, which corresponds to all cores being limited to 3.99 ghz instead of the Intel spec of 4.4 ghz. Does anyone know a way I can get around whatever the gently caress HP did to this thing, to keep it from boosting to the Intel-spec level of 4.4 ghz? There are no BIOS settings for anything like core boost or power limits. Omen Gaming Hub allows you to set XMP for memory, but has no direct control over the CPU. I've also tried tweaks to the windows power plans but that hasn't helped either. I don't want to overclock, but I just want to get the individual core max boost back to the Intel spec. Thermals are absolutely not a problem as temps are surprisingly low for a small air cooler. Love that HP is selling "gaming" systems that don't even reach the standard, non-overclocked frequencies that the CPU is supposed to support.
|
# ? Sep 6, 2022 05:09 |
|
You won't be able to alter clock speeds directly since the 12400(F) doesn't support overclocking. Are you never reaching 4.4GHz? Or can you reach 4.4GHz for like 30 seconds before it drops down to 4GHz? Intel has things called power limits, and motherboards can be configured so you're only allowed to use the max power level in limited bursts. With the 12th gen, Intel changed it so, by default, processors can boost to their max boost indefinitely in lightly threaded loads. But that's only for custom-built systems. OEMs like HP may still use their own power management configurations. This is typically user-configurable, but it's possible that HP is hiding those options from you. If they're loving with the clock speeds without giving you a way to change them, then that's even worse. I would just return the computer if there's no way to change this, as that's downright unacceptable. HP is a garbage company, and this is just example number ten billion of this.
|
# ? Sep 6, 2022 05:34 |
|
I would probably also return the computer, but if you really want to try making it work you could investigate Intel's Extreme Tuning Utility. I no longer have a current enough Intel system to run it but the last time I did, it let me tweak power limits and turbo timers on a laptop dual-core. It might have some way to affect an artificial limit placed on the system by HP.
|
# ? Sep 6, 2022 07:09 |
|
Number_6 posted:If anyone is an expert at 12th gen Intel CPUs, please weigh in on the following situation--and correct me if I somehow misunderstand how core boost is supposed to work. I have a new prebuilt HP Omen system with the i5-12400F. It performs OK in multicore benchmarks, but is way behind where a 12400F CPU should be in single-core benchmarks. Using utilities like HWMonitor, CPU-Z, CoreTemp etc. none of the cores will boost above 3.99 ghz, even in single-threaded loads and benchmarks, or light to moderate office work or browsing. Benchmark performance (single core) is about 10% below par, which corresponds to all cores being limited to 3.99 ghz instead of the Intel spec of 4.4 ghz. Many name brand prebuilts are garbage, including HP. https://www.youtube.com/watch?v=4OZGmWZyhac
|
# ? Sep 6, 2022 07:32 |
|
Is it possible there is just enough bloatware on the thing to keep it from ever being truly a "single" core load? OEMs used to be real bad about that.
|
# ? Sep 6, 2022 12:24 |
|
Dr. Video Games 0031 posted:You won't be able to alter clock speeds directly since the 12400(F) doesn't support overclocking. Are you never reaching 4.4GHz? Or can you reach 4.4GHz for like 30 seconds before it drops down to 4GHz? Intel has things called power levels, and motherboards can be configured so you're only allowed to use the max power level in limited bursts. With the 12th gen, Intel changed it so, by default, processors can boost to their max boost indefinitely in lightly threaded loads. But that's only for custom-built systems. OEMs like HP may still use their own power management configurations. This is typically user-configurable, but it's possible that HP is hiding those options from you. It never reaches 4.4, on any core. All six cores individually max at 3.99 regardless of the number of threads active or the type of load (web browsing, office, benchmarking, games). On an individual basis, the cores do show different speeds depending on load, so they aren't all locked together in sync. But monitoring shows that none of the cores ever go past 3.99. VorpalFish, I've eliminated some of the bloatware, and it doesn't seem like there is any significant background load on the CPU. I bought this because I needed something quick and in-stock to replace a 9 month old Lenovo prebuilt that just died on me, and I heard that the 40L & 45L line of Omens were "better" now. I really don't want to return it, as I've already got it set up for work, and I may not even game on it enough for this issue to matter all that much. It's just really annoying to find out that I'm not getting the performance that I paid for, and that HP has the balls to sell "gaming" systems that are misconfigured like this, whether intentionally or through incompetence. Maybe I can get something going with Intel XTU, but I don't want to overclock, I just want the standard turbo mode to frakkin' work at something close to the represented maximum frequency.
I wish I knew a real technical contact at HP or even Intel, because I know if I try to complain about this to regular HP customer support it will probably get nowhere. Edit: Screenshot from HWMonitor, showing how even after a long period of office & gaming use, and various benchmarks, each core maxes at 3991. Other utilities show the same thing (as do single-core benchmark results, which are down 10% from expected values.) Edit 2: FWIW, I am on the "Balanced" power plan in Windows, as HP has disabled or deleted the High Performance power plan. But max CPU state is still set to allow 100%, so it still should allow full turbo boost. Number_6 fucked around with this message at 00:46 on Sep 8, 2022 |
# ? Sep 7, 2022 06:25 |
|
It's strange. If this were a power limit thing, then you wouldn't be getting the same clock speeds in single-core tests as multi-core tests. It's like they're limiting the max boost speed to 4 GHz, which would be a real BS move on a chip that can't be overclocked, especially since HP advertises the CPU as capable of boosting to 4.4 GHz on their spec sheets. I guess I would go through the laborious process of contacting customer support about this, then. Maybe there's some kind of setting being overlooked here.
|
# ? Sep 7, 2022 08:03 |
|
Considering for the better part of a decade Intel was selling "up to 4.X GHz" parts that would immediately dump 200 MHz on every single core the nanosecond any one of them prefetched an AVX instruction I'm not sure what anyone is expecting. "Up to" is a term that is so heavily composed of legalese you can't even consider it a technical specification.
|
# ? Sep 7, 2022 08:26 |
|
Kazinsal posted:Considering for the better part of a decade Intel was selling "up to 4.X GHz" parts that would immediately dump 200 MHz on every single core the nanosecond any one of them prefetched an AVX instruction I'm not sure what anyone is expecting. "Up to" is a term that is so heavily composed of legalese you can't even consider it a technical specification. A stock 12400 can boost to 4.4GHz on one or two cores indefinitely, and it holds that clock speed pretty well under different kinds of loads. So you should probably expect that, and it's a rip-off if a prebuilt doesn't provide it.
|
# ? Sep 7, 2022 08:52 |
|
Even a 35W i5-12400T is supposed to be able to hit a single-core turbo of 4.2 GHz. How peculiar
|
# ? Sep 7, 2022 08:55 |
|
Kazinsal posted:Considering for the better part of a decade Intel was selling "up to 4.X GHz" parts that would immediately dump 200 MHz on every single core the nanosecond any one of them prefetched an AVX instruction I'm not sure what anyone is expecting. "Up to" is a term that is so heavily composed of legalese you can't even consider it a technical specification. Yeah, up to is pretty reasonable given how modern boost algorithms works, and by and large the chips can hit peak clocks under specified conditions. Dropping clocks during avx workloads is also completely reasonable - they consume more power than just about anything else, and thus represent a constraint.
|
# ? Sep 7, 2022 12:49 |
|
|
https://www.techpowerup.com/298676/key-slides-from-intel-13th-gen-raptor-lake-launch-presentation-leak The only differences are +1 USB 3.2 2x2 port and they're moving 8 PCIe lanes from 3.0 to 4.0. So basically... get a discounted Z690 board instead of a Z790 board if you're building a new system for Raptor Lake. There may be some improvement to memory overclocking, but many Z690 boards can already reach DDR5-7000, and I have a feeling the economical "sweet spot" for DDR5 will be somewhere around 5600 or 6000. (though any 5600 kits will OC to 6000 easily anyway) edit: And those max turbo power figures for the 13700K and 13600K, woof. The 13700K now consumes as much power as the 12900K (250W), while the 13600K will consume as much power as the 12700K (180W). What will 14th gen bring next? 180W 14400? Dr. Video Games 0031 fucked around with this message at 10:08 on Sep 8, 2022 |
# ? Sep 8, 2022 09:51 |