Linked by Thom Holwerda on Thu 18th Aug 2005 16:46 UTC, submitted by Nicholas Blachford
Intel "At next week's Intel developer forum, the firm is due to announce a next generation x86 processor core. The current speculation is this new core is going to be based on one of the existing Pentium M cores. I think it's going to be something completely different."
Thread beginning with comment 19750
rayiner Member since:

Okay, now, let me outline why it'd be suicide for Intel to pursue a design anything like the one outlined in this article.

Basically, it's safe to assume that Intel will follow the Pentium-M design philosophy, at least for its next generation of processors. First, let's figure out exactly what Intel needs out of its next processor.

1) Integer performance. For the time being, GPUs cover most of the FPU needs of PCs. As long as the CPU is fast enough to encode/decode HD media formats (e.g. MPEG4-HD, WMV-HD, etc.), it's fast enough. For people who really need lots of general-purpose FPU performance, namely the scientific processing market, the Itanium has them covered, and has a decent and stable niche there. And of course, the all-important server market couldn't care less about FP performance.

2) Low power usage. Laptops are outselling desktops now, so low power usage is a must.

3) Scalability to reasonable multi-core designs. For the near-future, multicore is going to mean being able to run a couple of apps at once without bogging the machine down. In the desktop market, it'll be at least a few years before you start to see apps that can scale reasonably to 2-way and 4-way designs, and a decade or more before you see apps that can scale well to a dozen cores or more. In the server market, you already have apps that can scale to 64-way+ designs, but in that space, a 400mm^2 chip is entirely reasonable.
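To put rough numbers on that scaling argument, Amdahl's law gives the ceiling. The parallel fractions below are illustrative guesses, not measurements:

```python
# Amdahl's law: speedup is capped by the serial fraction of a workload.
# The parallel fractions below are illustrative guesses, not measurements.
def speedup(parallel_fraction, cores):
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)

# A desktop app that's (say) 50% parallelizable barely gains past 2 cores,
# while a (say) 95% parallelizable server workload keeps scaling.
for label, p in [("desktop", 0.50), ("server", 0.95)]:
    for n in (2, 4, 64):
        print(f"{label}, {n} cores: {speedup(p, n):.2f}x")
```

Even with 64 cores, the hypothetical 50%-parallel desktop app never reaches 2x, which is why a dozen desktop cores is a hard sell for now.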

Now, the overriding requirement for Intel is this:

4) Low risk. Intel cannot afford to release another Itanium or another Prescott. It wouldn't kill them, but would easily give AMD the 25% market share number that they are aiming for by 2009.

The above 4 requirements naturally lead to the following design directions, all of which suggest a design based on the Pentium-M:

1) A short pipeline. Anything silly like the SPE's 18-stager is out. FPU code doesn't care so much about pipeline length, but integer code does, and integer performance per watt is going to be the main criterion for the next-gen Pentium.

2) A large, low-latency cache. Integer code is cache-happy, but it likes low-latency cache. That means a cache of "several megs" is out, because of the latency hit. It also means that wasting half the cache for translated code is out, because it'd be much better to just have half the cache at a lower latency.

3) Superlative branch prediction. The Pentium-M's branch prediction is great, and good branch prediction gets you a whole lot of integer performance for a relatively small amount of die space (relative to a meg of cache for holding VLIW translations!)

4) 32-bit x86 ISA. In the desktop market, Intel will probably push the 32-bit ISA quite hard. The next Intel CPU might be natively 64-bit, but it also might not. It'll likely support x86-64, but it might be handled as multiple 32-bit operations, just like current P4s handle x86-64 code as multiple 16-bit operations (the P4's integer pipeline is strange...)

5) No dependence on external technology advancements. Intel cannot afford to again risk releasing a product that requires the market to catch up to it. It won't release a product that requires particularly good compilers (because most vendors won't bother), or a product that requires developers to shift suddenly to massively-parallelized code. This is the primary reason why Intel's next processor will be a multi-core Pentium-M. The design doesn't care too much about scheduling, and can scale comfortably to 2 to 4 cores. That's all Intel needs, and really all they can afford to gamble on.
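Points 1 and 3 can be put in rough numbers with a back-of-the-envelope CPI model: every mispredicted branch flushes the pipeline, so the penalty grows roughly with pipeline depth. All figures below are assumptions for illustration, not measured values:

```python
# Back-of-the-envelope effective CPI for branchy integer code:
# each mispredicted branch flushes the pipeline, so the penalty
# grows roughly with pipeline depth. All numbers are assumptions.
def effective_cpi(base_cpi, branch_freq, mispredict_rate, flush_penalty):
    return base_cpi + branch_freq * mispredict_rate * flush_penalty

short_pipe = effective_cpi(1.0, 0.20, 0.05, 12)  # Pentium-M-like depth
deep_pipe  = effective_cpi(1.0, 0.20, 0.05, 31)  # Prescott-like depth
better_bp  = effective_cpi(1.0, 0.20, 0.02, 12)  # same depth, better predictor
print(short_pipe, deep_pipe, better_bp)
```

Under these assumed numbers, the deep pipeline loses roughly 17% of its integer throughput to flushes, and a better predictor claws most of the loss back - which is exactly why a short pipeline plus strong branch prediction is the cheap win.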

Reply Score: 4

nimble Member since:

Spot on. Rayiner for Inquirer Processor Editor! ;)

The above 4 requirements naturally lead to the following design directions, all of which suggest a design based on the Pentium-M:

1) A short pipeline.

Oh well, at least there's one thing the Pentium 4, and particularly Prescott with its 31 stages, has succeeded in: changing pipeline terminology.

On its debut the Pentium Pro's 12-stage pipeline was considered excessively long, when other processor designs had at most eight stages. The PPC604, for example, had six.

current P4's handle x86-64 code as multiple 16-bit operations

That's almost deliberate obstruction. But surely it does have a 64-bit unit for address arithmetic, doesn't it?

Reply Parent Score: 1

rayiner Member since:

Ha ha, I think I lied about the 16-bit thing. The P4 had 16-bit "fast" ALUs (each handled a 32-bit operation in two passes). They ran at 2x the clock speed, so they were like two regular 32-bit ALUs. But I just realized that no 64-bit P4 ever came out. The Prescott (P5?) has two 32-bit ALUs. They can handle 32-bit operations in one clock cycle (but with some limitations that mitigate much of their advantage), but still handle 64-bit ops in multiple clock cycles. The "slow" ALU and the AGUs are all 64-bit.

Intel's basic problem WRT 64-bit support is that they've got the double-speed ALUs. Just getting those ALUs to 32 bits was quite a trick of engineering. They weren't in a position to just make them 64 bits wide to support x86_64 code.
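As an illustration of that staggered idea, here's a sketch of a 32-bit add done as two 16-bit half-adds with a carry between them - a software model of the trick, not Intel's actual circuit:

```python
# Software model of a 32-bit add carried out as two 16-bit half-adds
# with a carry between them - the staggered trick a narrow double-pumped
# ALU uses. An illustration only, not Intel's actual circuit.
MASK16 = 0xFFFF

def add32_via_16bit_alu(a, b):
    lo = (a & MASK16) + (b & MASK16)                               # first pass: low half
    hi = ((a >> 16) & MASK16) + ((b >> 16) & MASK16) + (lo >> 16)  # second pass plus carry
    return ((hi & MASK16) << 16) | (lo & MASK16)                   # wraps at 32 bits

print(hex(add32_via_16bit_alu(0x0000FFFF, 1)))  # prints 0x10000
```

The dependent second pass is what makes widening painful: a 64-bit op done this way needs four staggered passes, which is why the double-speed ALUs couldn't simply be stretched.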

Reply Parent Score: 1

Nicholas Blachford Member since:

You make a lot of good points, and in many cases you're right, but you're looking at it from the point of view of an engineer, not of a company whose sole purpose is $$$.

SPE speed
I worded this badly; I meant they are fast at the sort of things SPEs are designed for, not at everything. As for their integer performance, I don't think they've released any benchmarks.

Based on Itanium.

Itanium is only one implementation of a VLIW architecture; Transmeta and Elbrus are others. The Elbrus designers didn't like Itanium and claimed they could do a lot better than Merced - and that was 10 years after their first VLIW design.

I think Itanium will be very useful for Intel to learn from, but I don't think a new VLIW design will be Itanium-based. It'll be closer to the Transmeta designs (which Elbrus thought were much better).

Single threaded performance.

Most PCs are bought by corporations or home users to run Word, surf the Web and read email. Single-threaded performance really isn't that important to those people.

The entire point is that going down this road will mean smaller, cooler cores; if Intel can put more cores on a chip, they'll sell more. Yes, the enthusiasts will all think this sucks and go and buy AMD, but that hardly matters, since they do that anyway and in any case are only a small part of the market.

Intel will be able to get more sales by marketing the number of cores and selling variations such as 16, 14, 12 and 10 cores; if AMD are only doing 4 or 8 cores, they're in trouble. Benchmarks won't help - there'll be sets provided by both sides - the difference is that Intel has more marketing clout.

A low-latency cache is good of course, but Intel shows a clear trend of moving quickly to ever larger cache sizes. A smaller, low-latency cache has to be balanced against a larger, higher-latency cache that avoids going to memory more often - and memory latency is positively massive in comparison. If they manage 16 cores at 65nm, I'd expect them to include a huge external L3 / L4.
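That balance can be put in rough numbers with the standard average-memory-access-time formula. The cycle counts and miss rates below are assumed round numbers, purely for illustration:

```python
# Average memory access time: hit latency + miss rate * miss penalty.
# Cycle counts and miss rates below are assumed round numbers.
def amat(hit_latency, miss_rate, miss_penalty):
    return hit_latency + miss_rate * miss_penalty

# With 200-cycle memory, the small fast cache wins...
small_fast = amat(3, 0.05, 200)    # 3-cycle cache, 5% miss rate -> 13.0
large_slow = amat(14, 0.02, 200)   # 14-cycle cache, 2% miss rate -> 18.0

# ...but if memory is slower and the big cache cuts misses enough, it flips.
large_wins = amat(14, 0.005, 400)  # -> 16.0, vs amat(3, 0.05, 400) == 23.0
print(small_fast, large_slow, large_wins)
```

Which side wins depends entirely on how much extra miss rate the small cache pays versus how much extra hit latency the big one costs - there is no universally right answer.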

Risk taking
Intel *can* afford to take risks; they have the money to do several new core designs simultaneously. It's AMD who can't take too many risks: if they get it wrong, they could be in trouble.

However, Transmeta have proven the technology works. It's not radical new technology.

That said, AMD have experimented with this sort of design as well: one of the alternatives to the K8 was VLIW-based, and even the G5 contains VLIW-like technologies.

Interestingly, IBM and AMD are getting very cosy these days, and IBM have exactly the kind of technology I was talking about in their R&D labs.

Everyone is going multi-core now, multithreading may not be easy but it's not exactly a new idea.

BTW - I'm not the first to have predicted this; someone on Ace's Hardware pointed out a similar prediction from 2002 - for AMD.

Reply Parent Score: 1

RE[2]: Rayiner
by re_re on Sat 20th Aug 2005 02:31 in reply to "RE: Rayiner"
re_re Member since:

I believe that AMD will typically be the innovator and Intel will follow. Generally, the smaller company innovates and the bigger company takes advantage of it.

Anyway, I hope Intel messes this up; I would really like to see AMD hit the 25% market share mark. I would also like to see AMD stay second, because generally the company with the larger market share does not have the best product - they don't need to.

I see AMD as the Innovator, the one who keeps Intel on its toes.

Reply Parent Score: 1