The Intel 12th Gen Core i9-12900K review: hybrid performance brings hybrid complexity

Thom Holwerda 2021-11-04 Intel 34 Comments

Overall though, it’s no denying that Intel is now in the thick of it, or if I were to argue, the market leader. The nuances of the hybrid architecture are still nascent, so it will take time to discover where benefits will come, especially when we get to the laptop variants of Alder Lake. At a retail price of around $650, the Core i9-12900K ends up being competitive between the two Ryzen 9 processors, each with their good points. The only serious downside for Intel though is cost of switching to DDR5, and users learning Windows 11. That’s not necessarily on Intel, but it’s a few more hoops than we regularly jump through.

Competition is amazing.

34 Comments

2021-11-05 5:01 am

Luke McCarthy
I’m not that impressed with Alder Lake. It gets Intel back in the game with AMD probably earlier than most suspected they would, and manages to just beat the year-old 5950X at the expense of massively more power usage. The Gracemont cores are pretty impressive for their size.
2021-11-05 5:43 am

Brendan
More performance (and more efficiency) is always nice; but I’m more interested in the “hybrid P + E cores” aspect (as it’s the first time I know of that mainstream 80×86 attempted it). Specifically; what Intel’s “thread director” actually is (the manuals don’t describe any new hardware at all despite news articles claiming it’s new hardware); and whether there’s any way an OS (that supports it) can “undisable” AVX-512 on the P cores.

2021-11-05 7:18 pm

Xanady Asem
AVX512 is fused off in the P-cores.
As it stands there is no support for different revisions of the x86 ISA on the same system.

2021-11-06 9:35 pm

Brendan
AVX512 is fused off in the P-cores.

Apparently not (Intel said they were fused off, but were wrong). See: https://www.anandtech.com/show/17047/the-intel-12th-gen-core-i912900k-review-hybrid-performance-brings-hybrid-complexity/2

The thing is, the firmware isn’t necessarily special, and if the firmware can enable AVX-512 (when E cores are disabled) then maybe an OS can use the same method to enable AVX-512 (maybe when E cores aren’t disabled).

As it stands there is no support for different revisions of the x86 ISA on the same system.

Being pedantic (sorry); it’s not that simple. Multi-socket systems have always had the potential to support slightly different chips; and Intel’s ancient (late 1990s) MultiProcessor Specification even gave explicit warnings/advice to OS developers (“Operating system writers should factor in processor variations, such as processor type, family, model, and features, to arrive at a configuration that maximizes overall system performance. At a minimum, the MP operating system should remain operational and should support the common features of unequal processors.” in section “B.8 Supporting Unequal Processors”).

The problem was always that operating systems mostly didn’t bother supporting it (even though hardware allowed it in some cases); and motherboard manufacturers weren’t too enthusiastic about validating all the potential permutations either.

Of course I’m not saying it’d be trivial for an OS to do more than the bare minimum (support the common features of unequal processors). For “with or without AVX-512” there’d be a major difference in the task state area used for switching between tasks that (at least for normal operating systems like Windows and Linux) would complicate task switching and migrating tasks from one class of CPU to another. However; recently (for Linux 5.15) Linux developers added support for dissimilar CPUs (mostly systems where only some ARM cores support 32-bit), so they’re already part of the way towards supporting a hypothetical “only some 80×86 cores support AVX512” situation.
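
To put a rough number on the “major difference in the task state area”, here is a back-of-the-envelope sketch in Python. It only counts architectural SIMD register bytes in 64-bit mode; it is not the real XSAVE layout (which adds a legacy area, a header and alignment padding), and the function name is made up for illustration.

```python
# Rough register-state arithmetic only; NOT the exact XSAVE area layout.
def vector_state_bytes(avx512: bool) -> int:
    """Bytes of SIMD register state an OS would need to save per task."""
    if not avx512:
        # AVX/AVX2: 16 YMM registers, 256 bits (32 bytes) each.
        return 16 * 32
    # AVX-512: 32 ZMM registers, 512 bits (64 bytes) each,
    # plus 8 opmask registers (k0-k7), up to 64 bits (8 bytes) each.
    return 32 * 64 + 8 * 8

avx2_bytes = vector_state_bytes(avx512=False)
avx512_bytes = vector_state_bytes(avx512=True)
print(avx2_bytes, avx512_bytes)  # 512 2112
```

So a task that has touched AVX-512 carries roughly four times as much vector state, which is why a scheduler would need to know what kind of save area a task has before moving it to a core without those registers.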

2021-11-06 11:13 pm

Alfman
Brendan,

Apparently not (Intel said they were fused off, but were wrong).
…
Of course I’m not saying it’d be trivial for an OS to do more than the bare minimum (support the common features of unequal processors.). For “with or without AVX-512” there’d be a major difference in the task state area used for switching between tasks that (at least for normal operating systems like Windows and Linux) would complicate task switching and migrating tasks from one class of CPU to another.

Yes, that’s interesting. I don’t think many will want to develop features around functionality that is officially unsupported, but in principle it shouldn’t be that difficult to achieve.

What normally happens when an unsupported opcode gets detected is that the CPU raises an invalid opcode exception. The OS can then handle it however it needs to. It could terminate the program with an error, but it doesn’t have to if it corrects the fault and executes the instruction again. In this case it could automatically reschedule the thread on a regular full performance core that supports AVX-512.
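
The trap-and-migrate idea described above can be sketched as a toy simulation in Python. Nothing here is real OS code: `Core`, `Thread`, `InvalidOpcode` and `run` are hypothetical names, and an “instruction” is just the name of the feature it needs. The point is only that the fault leaves the program counter where it was, so the instruction can be retried on a capable core.

```python
from dataclasses import dataclass, field

class InvalidOpcode(Exception):
    """Stands in for the fault a core raises on an instruction it does not know."""

@dataclass(frozen=True)
class Core:
    name: str
    features: frozenset

@dataclass
class Thread:
    instructions: list          # each entry names the feature that instruction needs
    pc: int = 0                 # index of the next instruction to execute
    migrations: list = field(default_factory=list)

def step(core, thread):
    need = thread.instructions[thread.pc]
    if need not in core.features:
        raise InvalidOpcode(need)   # pc is not advanced, so the fault is restartable
    thread.pc += 1

def run(thread, cores, start):
    """Run to completion, migrating on a fault instead of killing the thread."""
    core = start
    while thread.pc < len(thread.instructions):
        try:
            step(core, thread)
        except InvalidOpcode as fault:
            capable = [c for c in cores if fault.args[0] in c.features]
            if not capable:
                raise               # no core can run it; a real OS would terminate
            core = capable[0]       # "reschedule" and retry the same instruction
            thread.migrations.append(core.name)
    return core

p0 = Core("P0", frozenset({"base", "avx2", "avx512"}))
e0 = Core("E0", frozenset({"base", "avx2"}))
worker = Thread(["base", "avx2", "avx512", "base"])
final = run(worker, [e0, p0], start=e0)
print(final.name, worker.migrations)  # P0 ['P0']
```

A real kernel would also have to remember that the thread now needs a P core, otherwise it could be moved back to an E core and fault again.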

The page you linked to kind of alludes to this, and it’s probably what Intel themselves were planning on doing before deciding to shut off AVX-512 for whatever reason. It’s kind of a mystery as to why Intel would decide to drop something that’s evidently working already such that motherboard manufacturers are already enabling it. The article seems to suggest the decision was not technical but politics or marketing.

2021-11-07 10:30 am

Brendan
What normally happens when an unsupported opcode gets detected is that the CPU raises an invalid opcode exception. The OS can then handle it however it needs to. It could terminate the program with an error, but it doesn’t have to if it corrects the fault and executes the instruction again. In this case it could automatically reschedule the thread on a regular full performance core that supports AVX-512.

For better or worse, a lot of Windows software (especially if compiled with Intel’s ICC) does “dynamic dispatch”. What this means is that if the executable is started on an E core its initialization code will detect the E core’s features, find that AVX512 isn’t supported, and then auto-configure itself (set function pointers, etc) to not use AVX512.
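
A minimal sketch of that kind of dynamic dispatch, in Python for readability. The kernel names and feature strings are assumptions for illustration; real code would query CPUID and set C function pointers, but the shape is the same: the choice is made once at start-up and then sticks.

```python
def dot_scalar(a, b):
    """Baseline path: runs on any core."""
    return sum(x * y for x, y in zip(a, b))

def dot_avx512(a, b):
    """Placeholder for a hand-tuned AVX-512 kernel (plain Python here)."""
    return sum(x * y for x, y in zip(a, b))

def select_dot(cpu_features):
    """Pick an implementation once, from the features visible at start-up."""
    return dot_avx512 if "avx512f" in cpu_features else dot_scalar

# Hypothetical feature sets for the two core types.
E_CORE = {"sse4_2", "avx2"}
P_CORE = {"sse4_2", "avx2", "avx512f"}

# The program happens to start on an E core, so the scalar path is chosen...
dot = select_dot(E_CORE)
# ...and that choice is kept even if the thread later runs on a P core.
print(dot.__name__, dot([1, 2, 3], [4, 5, 6]))  # dot_scalar 32
```

The reverse case is the dangerous one: a process that initializes on a P core would select the AVX-512 path and then fault if it were ever scheduled on an E core.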

With this in mind; for Windows; I think the best approach (in theory) would be for the executable file’s header to include 2 sets of flags – one indicating which CPU features the executable requires and the other indicating which CPU features the executable can benefit from (but doesn’t require). That way the OS can decide if the process should use P cores or E cores when the process is first started (and avoid “wrong dynamic dispatch” and avoid the cost of migrating all threads when one thread uses an unsupported instruction; while also allowing the OS to detect that no CPU supports the executable when the program is installed).
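
The two-sets-of-flags proposal can be sketched as a small placement function. This is a hypothetical illustration, not an existing executable format or OS API: `place`, `CORE_CLASSES` and the feature names are all invented here.

```python
def place(required, beneficial, core_classes):
    """Return the core class a new process should start on, or None.

    required     -- features the executable cannot run without
    beneficial   -- features it can exploit but does not need
    core_classes -- mapping of class name to the features that class offers
    """
    # A class is usable only if it offers every required feature.
    usable = [(name, feats) for name, feats in core_classes.items()
              if required <= feats]
    if not usable:
        return None   # nothing can run it: report this at install time
    # Prefer the class offering the most beneficial features.
    # Ties go to whichever class is listed first (E cores in this example).
    best_name, _ = max(usable, key=lambda item: len(beneficial & item[1]))
    return best_name

CORE_CLASSES = {
    "E": frozenset({"avx2"}),
    "P": frozenset({"avx2", "avx512"}),
}

print(place(frozenset({"avx2"}), frozenset({"avx512"}), CORE_CLASSES))  # P
print(place(frozenset({"avx2"}), frozenset(), CORE_CLASSES))            # E
print(place(frozenset({"amx"}), frozenset(), CORE_CLASSES))             # None
```

Because the decision is made before the first instruction runs, the process never initializes its dispatch tables on the wrong kind of core.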

It’s kind of a mystery as to why Intel would decide to drop something that’s evidently working already such that motherboard manufacturers are already enabling it.

I think it’s a combination of Intel needing Windows to add special support for their chip (and “more support” making it harder for Intel to convince Microsoft to make the changes before release day); and Intel wanting parallelizable code to use all the cores in parallel (and AVX2) for efficiency rather than only using the P cores (and AVX512) to get less work done less efficiently.
2021-11-07 11:42 am

Alfman
Brendan,

With this in mind; for Windows; I think the best approach (in theory) would be for the executable file’s header to include 2 sets of flags

Certainly, although my point was that the OS can auto-detect when an existing thread uses it in real time even without being told.

I think it’s a combination of Intel needing Windows to add special support for their chip (and “more support” making it harder for Intel to convince Microsoft to make the changes before release day); and Intel wanting parallelizable code to use all the cores in parallel (and AVX2) for efficiency rather than only using the P cores (and AVX512) to get less work done less efficiently.

Even so though, what Microsoft does is on Microsoft. If they don’t support it, then that’s their loss. I don’t see why Intel would be pressured to deny the feature’s existence because of that; there is a world that exists beyond Microsoft after all. Anyways even though I don’t follow their behind-closed-doors reasoning, it makes for interesting trivia that it’s present in the BIOS & silicon!

2021-11-07 2:17 pm

Xanady Asem
I should have been more specific: by “system” I was referring to the actual silicon part, not the OS.

Some BIOSes seem to support AVX512 when E-cores are disabled. The tests I’ve seen were people being able to see the wider registers manually, but I don’t know if precompiled codepaths were getting a CPUID that allowed them to use the full AVX512.

I’ve been told the issue seems to be at the memory and fabric controllers level, it doesn’t seem to support asymmetric capability operation on a per-core granularity.

2021-11-07 6:50 pm

Alfman
javiercero1,

Some BIOSes seem to support AVX512 when E-cores are disabled. The tests I’ve seen were people being able to see the wider registers manually, but I don’t know if precompiled codepaths were getting a CPUID that allowed them to use the full AVX512.

I’ve been told the issue seems to be at the memory and fabric controllers level, it doesn’t seem to support asymmetric capability operation on a per-core granularity.

Are you talking about asymmetry between completely different CPUs? The type of asymmetry we’re talking about here is some cores supporting AVX512 and others not on the same CPU, which shouldn’t be impacted by the memory fabric.
2021-11-09 10:51 pm

Xanady Asem
Asymmetric/heterogeneous load/store requests fundamentally impact the ring fabric and memory controller, especially the shared L3.

I assume Alder Lake either doesn’t implement the functionality or the HW is too buggy in this release.
2021-11-10 1:08 am

Alfman
javiercero1,

Asymmetric/heterogeneous load/store requests fundamentally impact the ring fabric and memory controller, especially the shared L3.
I assume Alder Lake either doesn’t implement the functionality or the HW is too buggy in this release.

If you can dig up evidence for this, then I’d have to accept it. But as of right now it seems to be working already in terms of AVX512 and it may only be a matter of scheduling the binaries to execute on the right cores, just like the author discusses.
2021-11-10 2:52 am

Xanady Asem
It’s only working with the E-cores disabled. Thus there’s no supported heterogeneous ISA within the chip.

The ring/mem controller is mostly a HW scheduling problem, not an OS one. In the L3 there seem to be no core tags, so passing a wide stride load/store request to a core that does not support it seems to be a consistency risk. So at least in this iteration the controller doesn’t seem to support asymmetric operation.
2021-11-10 6:25 am

Alfman
javiercero1,

It’s only working with the E-cores disabled. Thus there’s no supported heterogeneous ISA within the chip.

It seems like they disabled those cores to compensate for OS limitations rather than hardware ones. While I could be wrong, my gut feeling is that E cores don’t care what the P cores are doing. AVX512 uses the full cache line mechanics that the CPU is already using even when it’s not executing AVX512 instructions. So it would be reasonable to think that the E cores are already “compatible” with AVX512 instructions running on neighboring cores. Nevertheless obviously it needs to be tested before we can say anything conclusive about it.
2021-11-10 8:47 am

Xanady Asem
The E-cores are not compatible with AVX512, that’s the whole point. And that’s why asymmetric operation is not currently supported in this HW revision.
It’s both a HW and SW issue. Even if the SW scheduler is aware of the ISA affinity for each cluster, the wide strides are live on the ring and L3. So a Gracemont read on a Golden Cove commit could be problematic, since the Gracemont controller does not implement that type of packet. Or if it does, it’s probably too buggy for Intel to support it on this release.

As it seems Intel is not officially supporting AVX512 on Alder Lake.
2021-11-10 2:10 pm

Alfman
javiercero1,

The E-cores are not compatible with AVX512, that’s the whole point.

You are misconstruing what I said: “it would be reasonable to think that the E cores are already ‘compatible’ with AVX512 instructions running on neighboring cores.”

The author has already proven that AVX512 is compatible with those neighboring cores. Those AVX512 instructions may never reach the E core decoder such that it would even be aware other cores are running AVX512 instructions. If the AVX512 algorithm only touches the other compatible cores, it’s quite plausible that the E core will happily chug along running its own code completely oblivious of the fact that an AVX512 algorithm is running on other cores. So I will not be surprised if someone manages to fix the OS scheduler and get it working.

And that’s why asymmetric operation is not currently supported in this HW revision.

Given the author’s reporting, it may still work despite not being “supported”. I wish I had one to test with here to gather evidence one way or the other. I concede that I could be wrong, but given that you face the same lack of evidence you should concede that your assumptions could be wrong as well. So until we get more evidence it doesn’t seem like we can definitely answer the question here.
2021-11-10 4:19 pm

Xanady Asem
Yes, we can answer the question easily: Alder Lake doesn’t support AVX512 in hybrid mode. I’m simply giving you the reasons from the HW side of things why Intel may not be supporting it.
2021-11-10 4:35 pm

Alfman
javiercero1,

Yes, we can answer the question easily: Alder lake doesn’t support AVX512 in hybrid mode.

Where is your evidence though? Because if you don’t have any then it’s an open question.
2021-11-10 5:06 pm

Xanady Asem
The evidence is in the article.
2021-11-10 6:35 pm

Alfman
javiercero1,

The evidence is in the article.

The article doesn’t back what you are saying though. The author explains why the E cores cannot run AVX-512 and we are all in agreement that they can’t. But there’s no evidence to suggest that, with proper BIOS & OS support, AVX-512 applications could not run on the P cores while having other non-AVX-512 applications running on the E cores.

Part of the issue of AVX-512 support on Alder Lake was that only the P-cores have the feature in the design, and the E-cores do not. One of the downsides of most operating system design is that when a new program starts, there’s no way to accurately determine which core it will be placed on, or if the code will take a path that includes AVX-512. So if, naively, AVX-512 code was run on a processor that did not understand it, like an E-core, it would cause a critical error, which could cause the system to crash. Experts in the area have pointed out that technically the chip could be designed to catch the error and hand off the thread to the right core, but Intel hasn’t done this here as it adds complexity. By disabling AVX-512 in Alder Lake, it means that both the P-cores and the E-cores have a unified common instruction set, and they can both run all software supported on either.

If you disagree, then please cite the specific text that proves that there’s a fundamental incompatibility between E & P cores even when AVX512 applications are restricted to P cores by the OS.

Not for nothing, but the article claims that Intel engineers actually did have AVX512 working since the beginning and Intel only later disabled AVX512 in the BIOS to their annoyance. It seems quite plausible that E cores and P cores were engineered to be fully compatible at the cache & memory interface. Hypothetically it might just be a matter of changing the OS so that AVX512 applications are not scheduled to run on cores that don’t support it.

Without new evidence one way or the other, it remains an open question.
2021-11-10 8:15 pm

Xanady Asem
No. The article only claims that AVX512 is present on the P-cores, and it’s not fused off as it was initially claimed. There’s no mention of AVX512 supported on a hybrid system with the E-cores enabled.

The evidence is that the shipping systems do not support hybrid ISA operation. Once again we’re stuck in a debate cycle in which you don’t consider reality as representative of what is going on.
2021-11-10 9:05 pm

Alfman
javiercero1,

No. The article only claims that AVX512 is present on the P-cores, and it’s not fused off as it was initially claimed. There’s no mention of AVX512 supported on a hybrid system with the E-cores enabled.

I know, that’s why it’s an open question, sheesh.

The evidence is that the shipping systems do not support hybrid ISA operation.

That’s just an assertion, where is your evidence though?!? The author did not test AVX512 with E cores enabled because it was an easy way to limit AVX512 to the P cores that support it. That does NOT prove it could not work with proper BIOS & OS support. This was what Brendan’s posts were about and I’m in agreement with him that it’s possible that it could work.

Once again we’re stuck in a debate cycle in which you don’t consider reality as representative of what is going on.

Either you have the evidence or you do not. I am genuinely curious but so far you’ve only provided conjecture with no evidence. It makes the discussion extremely tedious if you’re going to continue not providing evidence while still maintaining that you have it. Open that damn box Schrödinger and let’s see what you’ve got 🙂
2021-11-10 9:23 pm

Xanady Asem
You can’t prove a negative.
All that we know is that Intel does not support Hybrid ISA operation on this system. The evidence is in the actual systems.

I’m simply adding extra context on why it is not just a simple OS scheduling issue, since the controllers for the ring fabric, L3 and LS queues are managed by HW and mostly opaque to the programmer. AVX affects the stride and therefore also the consistency, so that HW has to be aware. So the best we can guess is that either the HW support is not there or it is not validated for this iteration.

Really nothing more. Just a polite contextualization that some may or may not find helpful. But as usual, you have to make it weird.
2021-11-10 10:03 pm

Alfman
javiercero1,

You can’t prove a negative.
All that we know is that Intel does not support Hybrid ISA operation on this system. The evidence is in the actual systems.

I’m simply adding extra context on why it is not just a simple OS scheduling issue, since the controllers for the ring fabric, L3 and LS queues are managed by HW and mostly opaque to the programmer. AVX affects the stride and therefore also the consistency, so that HW has to be aware. So the best we can guess is that either the HW support is not there or it is not validated for this iteration.

Ah good, now we’re getting somewhere. I agree it’s a guess, that’s a great way to put it. Testing of the actual hardware should reveal more answers. I hear what you’re saying about the fabric, but I still think that intel engineers who didn’t know that AVX512 was going to get disabled would have made sure the core’s IO interfaces were compatible regardless of AVX512 instructions on the P cores. I’m happy to concede that it’s just a guess and that we need more evidence!
2021-11-10 11:32 pm

Xanady Asem
I was already there, you’re the one catching up dude.

The POR for Alder Lake was set in stone well before it was taped out. There is no way the AVX issue was an afterthought that caught any of the engineering teams off guard, or created any drama.

The engineering teams for Golden Cove were doing the features as requested, since it is a core being shared with the Xeon lines which expects to have complete AVX functionality.

This happens all the time. I’ve worked on IPs where their functionality was only used partially in different designs. Validation these days takes longer than design. So more and more of the original functionality is enabled after several iterations of the road map, as more of it is validated.

Doing asymmetric HW and SW is too much of a PITA for a niche feature like AVX512. So it makes sense that at least in this iteration Intel is not bothering with it. Even if most of the HW is there, it’s probably not validated as that would have delayed the roadmap.

On top of that, Alder Lake even without AVX512 is hitting 220+ Watts when all clusters are on full turbo. So with AVX512 on + E-Cores it would be an even bigger thermal disaster. Which is not a good thing IMO.

AMD is kicking Intel’s butt in the consumer space without supporting AVX512. So Intel just wants to get their big.LITTLE arch out there ASAP before Zen4.

Perhaps for Raptor Lake the HW for full asymmetric operation will be validated and enabled. But right now, the fact that you have to turn off the E-Cores is a clear indication that asymmetric operation is either not implemented or not fully validated.
2021-11-11 12:22 am

Alfman
javiercero1,

I was already there, you’re the one catching up dude.

I don’t care how we got here, it’s just good that we’re in agreement that the hardware itself is the ultimate evidence. Intel will say what it wants officially, but it’s interesting that early testing is finding support for AVX512 that intel didn’t want us to see.

The POR for Alder Lake was set in stone well before it was taped out. There is no way the AVX issue was an afterthought that caught any of the engineering teams off guard, or created any drama.

Well, according to the article, engineers had the feature working early on and it’s only disabled in BIOS. It’s not unrealistic to think this feature is already working in silicon and Intel’s marketing wanted to take it away in order to help distinguish it from a future SKU. Again, we’re just guessing like you said.
2021-11-12 12:53 pm

Xanady Asem
You’re still not understanding my point.

AVX512 is not working in hybrid mode. That’s all.
2021-11-12 1:55 pm

Alfman
javiercero1

You’re still not understanding my point.

AVX512 is not working in hybrid mode. That’s all.

That’s fine, but you’re still assuming that it won’t work. It’s quite possible if not likely that intel has code to make this work internally. It would not be a stretch to think a linux hacker could fix the OS to support it. So I think it is completely fair to ask about what evidence exists for the assumption. It’s totally fine that you don’t have evidence that it can’t work, after all I have no evidence that it does work either! That’s my point, until we get more evidence, we cannot definitively say whether the hardware can do this or not.
2021-11-13 1:53 pm

Xanady Asem
It’s impossible to prove a negative.
I’m simply telling you that it’s not just an OS scheduling issue, as HW support is also needed for asymmetric operation.
Since Intel has stated they don’t intend to support that feature in this product, we can have an educated guess that the feature is either not implemented or not fully validated.
2021-11-13 5:16 pm

Alfman
javiercero1,

It’s impossible to prove a negative.

It depends. When you have all the data you can prove that there are no positives through the process of elimination; mathematically and algorithmically we do this. Sometimes this is not practical. In this case Intel may have already let the cat out of the bag with its board partners who are unlocking features that Intel wanted them to keep secret.

Since Intel has stated they don’t intend to support that feature in this product, we can have an educated guess that the feature is either not implemented or not fully validated.

I expect it’s not validated. However that won’t necessarily stop people from doing it, just like overclocking. Some people stick to supported configurations, others don’t care. To each their own.

2021-11-07 10:53 am

zdzichu
Scheduler is probably ready to schedule processes on capable cores. The process needs AVX? It gets the P-core.
I wrote “probably” because I last heard about this capability a year ago – https://lwn.net/Articles/838339/. The article is about ARM systems, but it’s a generic feature for every platform.
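
For comparison, the manual user-space version of “the process gets the P-core” can be sketched in Python. This is not the in-kernel mechanism the LWN article describes; it just restricts a process to a chosen set of CPUs. `os.sched_setaffinity` is a real Linux-only call, while the idea that the P-core CPU list can be read from a path such as `/sys/devices/cpu_core/cpus` is an assumption about hybrid systems that should be checked on the actual machine.

```python
import os

def parse_cpu_list(text):
    """Parse a kernel-style CPU list such as '0-3,8,10-11' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        # A single number like '8' has no upper bound, so reuse the lower one.
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus

def pin_to(cpus, pid=0):
    """Restrict a process (0 means the caller) to the given CPUs. Linux only."""
    os.sched_setaffinity(pid, cpus)

# Assumed example: logical CPUs 0-15 are the P-core threads.
p_core_cpus = parse_cpu_list("0-15")
print(sorted(p_core_cpus))
# pin_to(p_core_cpus)  # uncomment on a Linux system where these CPUs exist
```

Pinning like this is a blunt tool: it keeps AVX-512 code off the E cores, but it also stops the scheduler from using them for that process at all.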

2021-11-05 11:39 am

Alfman
Thom Holwerda,

Competition is amazing.

Yes it is. We now have 3 fairly strong CPU vendors between Intel, AMD, and Apple, and that’s great news! The whole industry is constantly improving and competition is what drives it. In healthy markets it’s very normal and expected for competitors to leapfrog each other’s specs because each vendor will be adjusting their targets to be competitive in relation to the rest of the market. This is why it’s best to keep a cool head and not get caught up in the news hyperbole that doesn’t fit reality. OSNews is guilty of this too. It was only 18 days ago you said that nothing from Intel or AMD even comes close; well, here we are less than 3 weeks later and the M1 Max’s ST performance is beaten by 14% and MT performance is beaten by even more because it doesn’t have enough cores. This is why it’s more important to look at the trends than the fact that vendors are leapfrogging each other every product generation, which shouldn’t be surprising when they’re close.

Looking at the trends, we see that ARM and M1 processors may not have the best performance on the market every quarter, but what they do have is exceptional power efficiency. Apple’s presence is probably what motivated intel to invest in hybrid P+E cores. Regardless of which companies one prefers, everyone should agree this is great for choice and competition compared to a few years ago when x86 was the only serious contender.
It’s AMD’s turn to show what they have for us next 🙂
2021-11-05 5:59 pm

Anonymous
None of this diverts me from questions about abuse of market power and forced obsolescence and unnecessary environmental waste among other things. Standard modular and repairable and upgradeable designs and power efficiency and longevity of use interest me more than feature creep and gimmicks. I don’t go shopping in a Ferrari.
2021-11-06 4:28 am

franko
“and users learning Windows 11”.
It is early days but it seems to run pretty well on Linux. So Linux users can stick with what they know and get the benefits of the hybrid cores.
https://www.phoronix.com/scan.php?page=article&item=intel-12600k-12900k&num=1

2021-11-06 7:12 am

Xanady Asem
Linux, via Android, has been doing big.LITTLE scheduling in ARMland for years now.