Intel will release software later this year designed to dramatically improve how well its Itanium chips run programs written for its Pentium or Xeon processors, CNET News.com has learned.
Intel Plans Itanium Course Correction
2003-04-23 Intel 27 Comments
I have been sick of hearing the backwards compatibility crap for some time already. Probably no one will be using 32-bit computers by the end of the decade. Why then should we stick to extensions of the old 8086 instruction set? Intel did the right thing and came up with a revolution. I think the idea behind EPIC (explicitly parallel instruction computing) is extraordinary: it would put the inherent problems with instruction execution aside and harness more of the potential of the silicon. And if Itanium weren’t eventually going to be darn fast, the engineering team at Intel would have realized that long ago. That, I suppose, would either force AMD out of business or make them do the same transition eventually. Especially when in a couple of years Itaniums reach benchmark results incomparable to Opteron’s, for code optimized to run on each platform.
.. especially if the Itaniums could run 32-bit code almost as well as the Xeon, that would undermine the whole point of Opteron. And Linus is going to swallow his comments and (hopefully) wither & die with Transmeta…
This is nothing more than a press release. For all we know, Intel made this press release to draw some attention away from Opteron. We should wait and see if Intel can deliver.
Hm, all the performance-oriented interpretation methods I know use jump arrays in one form or another, which would give horrible results on Itanium, and JIT code generation looks like a heavy, complicated, and time-consuming thing because of the EPIC design. So I have a bad feeling about this solution. (Sorry for my English.)
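For anyone unfamiliar with the "jump array" technique mentioned above, here is a minimal sketch of a table-dispatched interpreter. The opcodes and the tiny stack machine are hypothetical (they are not x86 and say nothing about Intel's actual emulator), and a Python list of functions only stands in for a native jump table.

```python
# Hypothetical toy opcodes for a stack machine; not x86.
PUSH, ADD, MUL, HALT = range(4)

def op_push(stack, program, pc):
    stack.append(program[pc + 1])  # the operand follows the opcode
    return pc + 2

def op_add(stack, program, pc):
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)
    return pc + 1

def op_mul(stack, program, pc):
    b, a = stack.pop(), stack.pop()
    stack.append(a * b)
    return pc + 1

# The "jump array": a handler's position equals its opcode number.
HANDLERS = [op_push, op_add, op_mul]

def run(program):
    stack, pc = [], 0
    while program[pc] != HALT:
        # Indirect jump through the table; the target depends on data,
        # which is what makes it hard to schedule statically.
        pc = HANDLERS[program[pc]](stack, program, pc)
    return stack[-1]

# Computes (2 + 3) * 4
result = run([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT])
```

Every iteration of the loop goes through a data-dependent indirect jump, which is the part the comment above expects to hurt on a statically scheduled design.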
I was already beginning to fear i had to go to opteron for my new system (if and when i build it). i think ia64 is great, but it is reality that you can’t migrate everything at the same time. so 32 bit compatibility is nice.
my reasons for wanting intel:
-i have always used intel: 486 dx2, p133, p200, pIII450, and now a dual PIII 1GHz. i have nothing but good experiences with intel.
-intel cpu’s produce way less heat than amd’s.
i don’t mind 10-15 % price difference. for me it is quality and stability that counts.
And that’s why you want an Itanium? Bwahah..!!!
Yeah right. The p4 3,06 is the most power hungry x86 on the market. And intel expects to reach the 100 watts soon. Ok the Athlon 3200+ also eats a nice 65 watts, but take the 2400+ for example, 35 watts, significantly less than the P4 2,4…
stop banking on software optimizations for their CPU to perform well. Bring it back to hardware.
That’s the point of EPIC. It assumes you’ll be using a professional compiler (like icc) that generates code optimized for your hardware. Optimizing at compile time (with far more resources and time available) is much easier than optimizing in hardware.
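To make the compile-time idea concrete, here is a toy sketch of static scheduling: a greedy packer that groups consecutive instructions with no read-after-write or write-after-write dependence so that they could issue together. The `schedule` function, the register names, and the three-wide limit (echoing IA-64's three-slot bundles) are illustrative assumptions; this is not how icc or any real IA-64 compiler works.

```python
def schedule(instrs, width=3):
    """Greedily pack (dest, sources) instructions into issue groups.

    A new group starts when an instruction reads or rewrites a register
    written earlier in the current group, or when the group is full.
    """
    groups, current, written = [], [], set()
    for dest, sources in instrs:
        depends = dest in written or any(s in written for s in sources)
        if depends or len(current) == width:
            groups.append(current)
            current, written = [], set()
        current.append((dest, sources))
        written.add(dest)
    if current:
        groups.append(current)
    return groups

# Hypothetical instruction stream: r3 needs r1 and r2, r5 needs r3 and r4.
instrs = [
    ("r1", ["r10", "r11"]),
    ("r2", ["r12", "r13"]),
    ("r3", ["r1", "r2"]),
    ("r4", ["r14"]),
    ("r5", ["r3", "r4"]),
]
groups = schedule(instrs)
group_sizes = [len(g) for g in groups]
```

The point of the sketch is only that the grouping is decided before the program runs; the hardware does not have to discover it at execution time.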
was one of the worst moves Intel could have made. MMX and SSE were more widely accepted, but it is for this reason that they are falling short (comparatively speaking, against AMD) on FPU. You HAVE to have an SSE2-optimized app to get the performance out of it, whereas AMD’s FPU is just overflowing with raw power. That long pipeline just kills the CPU. I could care less how high the MHz are. If the damn thing has a poorly thought out arch then it’s worthless. 20-stage pipeline!! The mispredictions just wreck performance.
My AMD stock has been on the rise solid for the past few days ..
> intel cpu’s produce way less heat than amd’s.
Last time I checked the specs, Itanium 2 dissipates 135 watts at 1GHz. Opteron dissipates a maximum of 80 watts. At current speeds, some have speculated that it’s more like 40.
No, Itanium isn’t cool. It’s hot (not in a good way :-).
Umm…so you expect intel to write compilers for all of your languages? What happens when intel decides to release itanium3? Will they reintroduce hardware optimization stuff as being some newly discovered processor technology?
You can’t tell me that intel won’t start putting optimization back into the hardware to maintain backwards compatibility. Either that or you’ll have to have software recompiled for each successive generation, since your itanium1 software is likely to run like even worse crap on itanium3.
This whole thing about requiring the compiler having all the smarts smacks to me of a more creative way to shake down customers for money long term.
How much attention the Itanium gets for how dismally it’s selling. Intel moved what, a little over 3000 units in 2002?
Why aren’t people talking about how horribly the Itanium is selling?
Itanium is an interesting architecture, as far as ISA theory goes. However the ISA really isn’t a major limitation for any modern architecture. There’s much more interesting work being done in the fields of SMT, something Intel hasn’t begun to touch with Itanium.
As far as price/performance goes, the Itanium simply isn’t there. The dismal sales of the Itanium just go to show that price is still important, even in the high-end market.
Furthermore, as far as high-end processors go, EPIC just isn’t that interesting from a design perspective. All the real innovation in parallel execution is occurring in the Power and SPARC architectures. Sun’s upcoming multicore Niagara processor promises SMT with support for 32 hardware threads, and cache coherency.
> Why aren’t people talking about how horribly the Itanium is selling?
That was Itanium. Itanium 2 has been out for some time already and must have sold far more than that. And the new Itanium 2 (Madison) is coming soon. The next generation, Montecito, will be pin compatible with these.
> This whole thing about requiring the compiler having all the smarts smacks to me of a more creative way to shake down customers for money long term.
The instruction set is settled already. Therefore one can produce a build for Itanium 2 which will work on the next generation of Itaniums. Small additional optimization tricks always come with new generations (gcc has so many flags for Athlon, Athlon XP, K6, etc…)
Do you mean SMP? How did you get the idea that Itanium is not good at SMP? Check this out:
> Furthermore, as far as high-end processors go, EPIC just isn’t that interesting from a design perspective. All the real innovation in parallel execution is occuring in the Power and SPARC architectures.
Power and Sparc are RISC, nothing more. EPIC is theoretically better. And practically it would be much better because Intel has a larger consumer base and more advanced production plants than IBM (though close) and Sun. Itanium is designed for scaling and servers. How did you deduce it is not good at parallel execution? If EPIC is not interesting, what is? If you are obsessed about multi-cores on the same die you’ll have it with Itanium soon…
> Do you mean SMP?
No, I meant SMT, Simultaneous Multithreading.
> Power and Sparc are RISC, nothing more. EPIC is theoretically better. And practically it would be much better because Intel has a larger consumer base
Itanium is Intel’s first foray into the high-end market. Their customer base in this market is virtually nonexistent. As I said, they moved a little over 3000 units in 2002. Sales of the Itanium are dismal.
> and more advanced production plants than IBM
No, they do not. IBM recently built the world’s most advanced fabrication plant.
> (though close) and Sun.
Sun does not own a fabrication plant. They outsource production of their processors to TI, which is in part responsible for the high cost of SPARC processors.
> If EPIC is not interesting, what is?
Cache coherent processors supporting SMT, which will greatly increase performance of any application where several threads are manipulating a similar set of data.
> If you are obsessed about multi-cores on the same die you’ll have it with Itanium soon…
Yes, and how many hardware threads is that going to support, two? By the time this happens Sun will move its big iron to UltraSPARC IV and V, and IBM will have ramped the clock speed of the Power5 up higher.
I was reading up on Itanium recently and found out some interesting things. On floating point they rule the roost: the 1GHz model beats everything else.
On integer, however, it’s the opposite: they lag nearly everyone else. A 1.2GHz P3 gives it a run for its money.
However, the most interesting bit was the difference between the 900MHz and 1GHz versions. You would think these would be similar, but this is far from the case, especially on floating point, where the 1GHz leads by far but the 900MHz has its arse kicked by a P4.
The difference is that the 900MHz model has a 1.5MB cache whereas the 1GHz has a 3MB cache. The Itanium 2 rules the floating point SPEC marks not because of its advanced architecture or compiler technology but because it includes a huge cache.
They spent billions on the Itanium and in the end it all comes down to a bigger cache!
This cannot have gone unnoticed – indeed Intel plan to boost the cache to 6MB.
But what would happen if you did this to a POWER4 or UltraSPARC or even an Opteron? Where would that leave Intel’s advantage?
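One way to put rough numbers on the cache argument above is the textbook average memory access time formula, AMAT = hit time + miss rate × miss penalty. The sketch below uses made-up cycle counts and miss rates purely for illustration; they are assumptions, not measurements of any Itanium, POWER4, UltraSPARC, or Opteron part.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time, in cycles."""
    return hit_time + miss_rate * miss_penalty

# Hypothetical values, not measured on any real chip.
small_cache = amat(hit_time=5, miss_rate=0.10, miss_penalty=200)
large_cache = amat(hit_time=5, miss_rate=0.05, miss_penalty=200)

# How much faster the average access gets with the lower miss rate.
improvement = small_cache / large_cache
```

With these invented figures, halving the miss rate takes the average access from about 25 cycles to about 15, and the same arithmetic would apply to any processor given a bigger cache, which is the question the comment raises.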
Eheh, if that was completely true then everyone would care about adding more cache and make the processor dumber
I think HP, SGI, Dell and all the rest of the bigass companies know something to support IA-64. I still think that to anyone who knows how processors work (not to a business executive who probably only cares for price/performance) EPIC is the most amazing thing for decades in processor history if it’s not as amazing as asynchronous processors.
I think Itanium is in early stages of proving itself. And I hope its only advantage won’t be the huge cache.
I have been reading the interesting discussion between atici and bascule.
I have a question about the Opteron. Could you mention something about this processor that has been done right? E.g. how effective their architecture for multiprocessor support is vs. Itanium 2.
Also, why did Intel bother with x86 emulation? Couldn’t they sell Xeon daughterboards for those who need it?
The Itanium’s higher clock speed allows faster access to the cache, reducing latency, but if you put 3MB of cache on an Opteron, you would have latency problems, not to mention a HUGE die.
> I think HP, SGI, Dell and all the rest of the bigass companies know something to support IA-64. I still
> think that to anyone who knows how processor works (not to a business executive who probably only
> cares for price/performance) EPIC is the most amazing thing for decades in processor history if it’s not
> as amazing as asynchronous processors.
I agree that Itanium looks nice on paper from a conceptual point of view: lotsa registers, rotating register stack, potential for parallel processing etc. But the acid test is how well the hardware runs. As far as I can tell, it seems to be lagging. That a big cache helps makes one wonder whether CPUs are beginning to hit the wall because of memory subsystems. If that’s the case, then if you can reduce the working set of programs as they execute (i.e. by using CISC) you are likely to get better performance. Perhaps the x86, with its small number of registers and CISC encoding, just happens to find a sweet spot because of that and performs better than IA-64 because its instructions are smaller. This issue may also be a bonus for x86-64.
On the emulator issue (which is what the original topic was), that is an extremely bold claim to make. I have rarely read of an x86 emulator getting better than one tenth of native performance (Bochs is one hundredth).
First of all, Itanium 2 (McKinley) beats Itanium 1 (Merced) in performance hands down. I have heard by as much as a factor of 2. Those HP folks in Colorado working on Itanium 2 took a completely different approach than those Intel folks in Santa Clara who worked on Itanium 1. They learned from all of Itanium 1’s mistakes. It also has very good FPU resources. I would not completely count out IA-64 yet.
One of the things with Itanium is that the compiler plays a very large role in scheduling instructions in the pipeline. Each generation of Itanium will have different processor characteristics which the compiler has to optimize for. Even though code compiled for one Itanium machine will work on all IA-64 machines, the performance might suffer. So as the compiler for Itanium 2 matures, the performance may get much better.
One advantage of IA-64 over Athlon and P4 is that in-order pipelines are possible because the compiler (in theory) can schedule code in advance. This is a server chip, so single application performance is not their main goal. These in-order cores can be very small, and putting 8, 16, or 32 cores on a chip is not too far out there. SMT out-of-order chips require a large amount of complexity, which has the disadvantage of restricting processor speed and bloating the size of the core.
It definitely remains to be seen what IA-64 can do as it matures. Many people ruled out P4 when it first came out and now it is giving AMD a run for their money.
Maybe when AMD is done with the x86-64 architecture and realises it is time to move on, Intel and AMD should partner up to design the next-generation processor that will meet consumer needs exactly.
Both sides have great engineers who, if they worked together on the initial design, could produce something sweet. Then the competition can continue with the additions they usually put on processors, e.g. SSE, MMX, 3DNow, etc.
Anyway that is my dream.
Opteron captures the top spots on Microsoft’s Exchange 2000.
Not bad for a brand new CPU running unoptimized binaries.
While Itanium will have higher performance than an individual Opteron, there is no way Itanium can compare on price/performance.
> These in-order cores can be very small and putting 8, 16, or 32 cores on a chip is not too far out there. SMT out-of-order chips require a large amount of complexity
Not really. You just need a separate decoder for each hardware thread. The Sun Niagara, for example, supports 4 hardware threads for each core, with 8 cores allowing for a total of 32 hardware threads, all with cache coherency.
> which has a disadvantage of restricting processor speed and bloating the size of the core.
Large die sizes aren’t really a problem in the high end arena, where processors can cost upwards of $5000 each. Niagara’s place is in machines like the current Sun Fire 15k and so forth, except it will be taking the place of 32 existing UltraSPARC III processors. With just four Niagara processors you could have a 128-way machine, which isn’t even possible with the existing Sun Fire crossbar architecture.
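The thread counts quoted above follow from simple multiplication. The short sketch below just spells that out using the figures given in this comment (4 hardware threads per core, 8 cores per chip, 4 chips); the variable names are my own.

```python
threads_per_core = 4   # figure quoted for Niagara in the comment above
cores_per_chip = 8     # likewise
chips = 4              # the four-processor machine being imagined

threads_per_chip = threads_per_core * cores_per_chip
threads_total = threads_per_chip * chips
```

So `threads_per_chip` is the 32 hardware threads per processor, and `threads_total` is the 128-way figure.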
Ok so i was wrong.
the latest CPUs i bought were my PIII 1GHz cpus.
i knew the P4 is a hot one, but i had assumed they would try to lower the power consumption.