SGI today announced that its SGI Altix 3000 servers and superclusters deliver world-record performance on the next-generation Intel Itanium 2 processor (Intel code name Madison). Preliminary results of 64-bit application tests reveal that “the SGI Altix 3000 family running on Madison will once again provide record-shattering performance, price/performance and scalability in a standard Linux OS environment”.
Since it’s made by SGI, it will cost two to three times what the same hardware/software would cost if manufactured by another vendor.
>Since it’s made by SGI, it will cost two to three times what the same hardware/software would cost if manufactured by another vendor.
You make the blithe presumption that another vendor could manage to create “the same hardware/software”…
Yet another lamer who thinks his overclocked Athlon XP machine is equal to SGI or Sun hardware.
Learn a bit about memory bandwidth. There is a reason these are so expensive.
In a standard Linux OS environment …
I really like the sound of that!
Preliminary results for a prototype machine using engineering samples of an unreleased processor, running a selected set of benchmarks compiled with an unspecified compiler and running an unspecified version of Linux with an unspecified amount of RAM.
Yep, it really sounds good to me.
People will always wait for MS Windows.
“Learn a bit about memory bandwidth.”
Guess what: I did learn a bit about it. Just enough to tell you that this has been debunked so many times that it is not even remotely funny any more. Bandwidth cannot be viewed in isolation. And if the MHz speed is so much faster, it can make up for constraints in bandwidth, even surpass them. That’s what happened a long time ago… never mind.
I personally think Itanium is one of the boldest initiatives ever in the history of processor making. Usually the market leaders don’t take such bold steps and stick to the conventional. It is very good news for all of us that we’ll be rid of the x86 crap soon. Itanium’s engineering principles make a lot of sense to me and it will rock AMD and Transmeta (oh, I almost forgot they’re one of the contenders; hi, Linus) a** soon enough.
SGI and Itanium is a very good combination. It is the MIPS/IRIX boxes that are very expensive; the Altix boxes are not cheap, but they do not cost as much by comparison.
When I first saw the HP Itanium offerings, they were identical to the Dell Itanium offerings, since the boards and even the cases were designed and built by Intel. Has that changed with this? Is there any reason to believe we will not shortly be seeing HP systems (in their Superdome configuration) that post exactly the same benchmarks?
“And if the MHz speed is so much faster, it can make up for constraints in bandwidth, even surpass them. That’s what happened a long time ago… never mind.”
This has to be one of the most uninformed posts I have read in a looooong time. If your bandwidth is constrained, higher MHz ratings do not make up for it; in fact, they make things worse, as the processor will have a lower utilization rate!
I.e., a piece of data arrives every second. What difference does it make whether you can process each piece in 0.5 or 0.1 seconds (higher MHz)? The difference is that your processor will sit idle 50% versus 90% of the time. You are still constrained by the rate at which your data comes in, so both processors achieve the same throughput, even though one is way faster than the other.
So yeah, bandwidth does actually matter. The toy calculation below spells out the arithmetic.
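A minimal sketch of that arrival-rate arithmetic, using the made-up numbers from the post above (one item per second; 0.5 s vs. 0.1 s of compute per item):

```python
# Toy model of a bandwidth-bound workload (numbers are illustrative only).
def throughput_and_utilization(arrival_interval_s, compute_time_s):
    """One item arrives per interval; the CPU cannot outrun the data."""
    effective_time = max(arrival_interval_s, compute_time_s)
    return 1.0 / effective_time, compute_time_s / effective_time

for label, compute in [("slow CPU (0.5 s/item)", 0.5),
                       ("fast CPU (0.1 s/item)", 0.1)]:
    tput, util = throughput_and_utilization(1.0, compute)
    print(f"{label}: {tput:.1f} items/s, {util:.0%} busy")
# Both CPUs deliver 1.0 items/s; the faster one just sits idle 90% of the time.
```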
And no, no one is offering what SGI is; this is a system with huge local and inter-node bandwidth. It offers NUMA and cache coherency, and thus it supports a single system image for a machine with many processors. This is not a cluster!
“Is there any reason to believe we will not shortly be seeing HP systems (in their Superdome configuration) that post exactly the same benchmarks?”
Well, Chris, you tell us; you are working for HP.
BTW, HP just posted their transaction results for their Superdome with 64 Itanics running Windows…
Hmmm… Are you flamebaiting, trolling, or do you sincerely believe what you wrote?
1) Itanium dates back to 1996 and in fact is even older than that, since it is supposed to maintain compatibility with both the HP PA-RISC architecture and the x86, exactly because both Intel and HP did NOT want to get rid of the previous “crap”.
2) The present Itanium 2 has a bug that forces its users to downclock the processors to 800 MHz to avoid crashes.
3) The present Itanium 2 die is 400 mm² and dissipates 130 W.
4) At 800 MHz, the Itanium 2 running x86 binaries is slower than a 300 MHz Pentium II.
5) There were < 500 Itanium systems sold in 2001, < 4,000 sold in 2002, and the estimates for the first half of 2003 are < 5,000.
6) In fact, Intel has lost so much money developing the Itanium and trying to market it, with such dismal sales, while technology has evolved so much in the same period, that many people are questioning whether the whole project still makes sense or should be dropped immediately.
So Itanium may rock for some people, but others have reasons to think it stinks.
This is not your average desktop machine. These are machines that are sold in systems of 4 to 128 processors! It’s true that most single-processor PC-class machines can keep up with single-processor RISC workstations, because individual x86 CPUs are so fast. However, once you get into multi-processor scalability, the PC architecture can’t touch high-end machines from SGI and Sun. The architecture of this machine (NUMAflex) is designed from the ground up for large-way multi-processor operation. PC-based architectures just cannot compete with machines like these.
PS> The Itanium2 does rock for this kind of work. Right now, the Altix 3000 with the older 1 GHz Itanium2 processors is quite competitive with the IBM p690 with 1.7 GHz Power4+ processors. This Altix machine has the 1.5 GHz Madison Itanium2’s, which have double the cache. As a result, these processors should have a significant lead over IBM’s Power4+’s.
“I personally think Itanium is one of the boldest initiatives ever in the history of processor making. Usually the market leaders don’t take such bold steps and stick to the conventional. It is very good news for all of us that we’ll be rid of the x86 crap soon. Itanium’s engineering principles make a lot of sense to me and it will rock AMD and Transmeta (oh, I almost forgot they’re one of the contenders; hi, Linus) a** soon enough.”
Please refer to:
i432 -> Intel’s attempt to “revolutionize” processors in the late ’70s/early ’80s. Yeah, everybody runs it today, right? 32-bit, OO support, I mean this is bold… this is the future! The x86 was just a backup in case the 432 did not make it. Which was impossible, I mean it made lots of sense!!!
i860 (80860) -> Intel’s attempt to “revolutionize” processors: it is RISC, it is fast, everybody is going to design for it. x86 will soon be dead!! Trust me this time, this makes lots of sense… the 486 is just a safety design, and the x86 will soon be replaced by this super-duper 860… it makes a lot of sense!
Oh, yeah. Intel R0X0rZ!!!!
“is quite competitive with the IBM p690 with 1.7 GHz Power4+ processors. This Altix machine has the 1.5 GHz Madison Itanium2’s, which have double the cache. As a result, these processors should have a significant lead over IBM’s Power4+’s.”
What? An HP Superdome with 64 Itanic 2s running at 1.5 GHz barely outperformed an IBM Regatta with 32 POWER4’s. So this significant lead for the Itanium 2 is what, almost half the speed of a comparable POWER4? Huh?
The Itanium has gone much further than the i860 ever did; I really don’t see that comparison as valid anymore… It was valid for some time, though.
The Altix scales much better than the Superdome.
“The Itanium has gone much further than the i860 ever did; I really don’t see that comparison as valid anymore… It was valid for some time, though.”
Care to explain? How has the Itanium gone much further than the i860?
The comparisons are quite obvious:
i860: RISC, the new buzzword of the ’80s -> IA64: VLIW, er, I meant EPIC, the new buzzword of the ’90s
i860: a bitch to code for -> IA64: another bitch to code for
“The Altix scales much better than the Superdome.”
That one is good. You get 1st place.
“An HP Superdome with 64 Itanic 2s running at 1.5 GHz barely outperformed an IBM Regatta with 32 POWER4’s.”
Nice try. You get 2nd place.
“I personally think Itanium is one of the boldest initiatives ever in the history of processor making.”
Hors concours (beyond competition)!
Look at you, talking about a measly, zonked-out architecture that everybody knows is slow.
Macs beat SGI machines flat out in any test, and their sheer speed and incredible memory bandwidth make even the fastest SiliconGraphSux seem like the Stone Age.
I run both a Mac and an SGI at work, and the Mac runs circles around it, especially with the new AltiVec technology, which makes a processor up to FOUR times faster than equivalent x86 (or any other architecture, MIPS, etc.).
Hmmm, that’s true, but you have to qualify that statement.
The IA64 is **not** a bitch to code for if you are coding in C or any other high-level language, because the compiler is supposed to take care of the EPIC/VLIW thingy.
So the truth is that unless you are coding a new compiler or some other low-level application, the IA64 is just as easy or hard to code for as any other microprocessor.
—People will always wait for MS Windows.
Yes, only people like you.
Go home and sleep with your lovely widows.
It is the sad TRUTH: computers like this are no longer necessary!
Look at GOOGLE, perhaps one of the biggest consumers of computing power. What do they use?
Off-the-shelf PCs.
The future is not big machines like this, with lots of power-hungry processors all in one big piece, so that if something breaks you have to take the whole thing apart to fix it.
The future is clustering of normal off-the-shelf PCs.
These processors suck: they are 1.5 GHz. Even if they are Itanium EPIC, that cannot make up for the clock speed of the P4, especially in terms of price/performance. Can you really justify paying thousands of dollars for each CPU? The performance advantage just isn’t there. Can you really tell me that ten P4 Xeons, which cost the same as one Itanium, are really slower? I don’t think so.
Personally, I see no future for this computer, especially since it only runs Linux. Who will want to use it? Nobody, that is who.
Why run this when you could have 20 quad-Xeon systems for the same price?
Well, the older SGIs, like say an O2, run with a MIPS R5k or R10k at up to around 300 MHz. It’s been a while since Apple had 300 MHz systems, and those systems were designed around the G4, which is probably faster than a MIPS clock for clock, but I do not know. What I do know is that AltiVec is just another name for SIMD, which gives up to a 30% boost in performance for optimized apps, of which Apple has a lot. But a 300 MHz G4 would never come close to the benchmarks of a 1.2 GHz AMD.
An Itanium 2 can ray trace at least twice as fast as any chip it was put up against. That’s one fast, expensive FPU.
The problem is it sucks for almost everything else. Use the best tool for the job, and be sure to keep an eye on that price/performance ratio.
“The IA64 is **not** a bitch to code for if you are coding in C or any other high-level language, because the compiler is supposed to take care of the EPIC/VLIW thingy.
So the truth is that unless you are coding a new compiler or some other low-level application, the IA64 is just as easy or hard to code for as any other microprocessor.”
I was talking about compiler development. Of course, when you use C it is the same level of complexity; that is the whole point of a high-level language, duh!
The i860 was hard on the compiler because of its scheduling constraints, so C or Fortran compilers could never generate code that was anywhere near the peak performance quoted by Intel (“a Cray on your desktop”). The same goes for the IA64: VLIW is really dependent on compiler technology, and neither Multiflow nor Cydrome was able to deliver that compiler technology. And the same goes for HP/Intel… sadly, compiler technology always runs a couple of generations behind the hardware. The sketch below shows the scheduling problem in miniature.
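Since the argument turns on scheduling, here is a minimal sketch of the problem a VLIW compiler faces, assuming a made-up three-slot bundle and a greedy list scheduler (not the real IA-64 template rules):

```python
# Toy VLIW bundler: pack mutually independent ops into fixed-width bundles.
# When the compiler cannot find enough parallelism, slots are wasted as NOPs.
def conflicts(earlier, later):
    """True if 'later' must wait for 'earlier' (RAW, WAW, or WAR hazard)."""
    return (earlier["w"] in later["r"] or earlier["w"] == later["w"]
            or later["w"] in earlier["r"])

def schedule(prog, width=3):
    done, pending, bundles = set(), list(range(len(prog))), []
    while pending:
        bundle = []
        for i in pending:
            if len(bundle) == width:
                break
            # Ready only if every earlier conflicting op has already retired;
            # ops placed in *this* bundle don't count, since they issue together.
            if all(j in done for j in range(i) if conflicts(prog[j], prog[i])):
                bundle.append(i)
        bundles.append(bundle)
        done.update(bundle)
        pending = [i for i in pending if i not in bundle]
    return bundles

prog = [
    {"w": "r1", "r": {"r2", "r3"}},  # op0: r1 = r2 + r3
    {"w": "r4", "r": {"r5", "r6"}},  # op1: r4 = r5 * r6 (independent of op0)
    {"w": "r7", "r": {"r1", "r4"}},  # op2: needs op0 and op1
    {"w": "r8", "r": {"r7", "r2"}},  # op3: needs op2 (serial chain)
]
for n, b in enumerate(schedule(prog)):
    print(f"bundle {n}:", [f"op{i}" for i in b] + ["nop"] * (3 - len(b)))
# bundle 0: op0, op1, nop   <- only two of three slots filled
# bundle 1: op2, nop, nop   <- the dependence chain wastes two slots
# bundle 2: op3, nop, nop
```

Finding enough independent ops to fill every slot is exactly the part that Multiflow, Cydrome, and now HP/Intel have found so hard.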
“I was talking about compiler development.”
For nearly 100% of desktop computer users, the code complexity to develop a compiler is irrelevant.
The 68000 had a very nice architecture and was particularly easy to code for in 68K assembler. By comparison, the 386 was a bitch.
But x86 processors historically outsold 68K ones by a ratio of 30:1 for desktop applications.
So IA-64 code complexity is largely irrelevant to its success or failure on the desktop/workstation/server markets.
“So IA-64 code complexity is largely irrelevant to its success or failure on the desktop/workstation/server markets.”
LOL, yeah… riiiiiight. x86 had a legacy code base to support despite its technical shortcomings vs. the 68K series. The 386 was not that bad; granted, the reduced set of registers made things interesting, but it was a clear improvement over the 8086’s segmented memory and the 286… (of course it was a POS when you reverted to the old modes, though). Basically, the 386 was used as either a really fast 8086 or a really fast 286 for most of its early life. However, the IA64 cannot be used as a very fast Pentium IV.
IA64 does not have the same kind of installed code base to support its shortcomings, so it had better be good, or there is no compelling reason to choose IA64 over other 64-bit offerings. This is the reason why IA64 machines have been selling like hotcakes, right? Right?
Some points:
# In testing the BLAST® (Basic Local Alignment Search Tool) application from the National Center of Biotechnology Information, a suite of tools designed to identify similar protein and DNA sequences within genomic databases, SGI Altix ran 57%, or 2.3 times, faster than the IBM system. In tests of HTC-BLAST, SGI’s high-throughput driver program for BLAST, SGI performed more than 72%, or 3.5 times, faster than IBM.
If this percent math was performed on an Itanium, then the processor is really not ready for scientific tasks. If the Altix runs 57% faster, then IMHM (in my humble mathematics) it is 1.57 times as fast as the p690, and 72% faster is 1.72 times. The quick check below spells it out.
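Spelled out as plain arithmetic (the last line anticipates the clock speeds discussed next):

```python
# "X% faster" vs. "N times as fast": the conversion the press release fumbles.
def pct_to_factor(pct):
    return 1 + pct / 100.0        # 57% faster -> 1.57x

def factor_to_pct(factor):
    return (factor - 1) * 100.0   # 2.3x -> 130% faster

print(pct_to_factor(57))     # 1.57, not 2.3
print(factor_to_pct(2.3))    # 130.0: "2.3 times" would mean 130% faster
print(pct_to_factor(72))     # 1.72, not 3.5
print(round(1.7 / 1.1, 2))   # 1.55: the POWER4+ clock headroom cited below
```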
Current evidence indicates that these numbers are not expected to vary much when applied to the new POWER4+™ processor.
Sorry, but this is nonsense. First, as the linked PDF states, the p690 was equipped with 1.1 GHz processors. So the newest POWER4+ processors at 1.7 GHz are already 54% faster than the POWER4, or 1.54 times as fast. Additionally, as can be seen at http://www-1.ibm.com/servers/eserver/pseries/hardware/datactr/p690_… the performance increased even more due to larger caches and an optimized architecture.
So Madison may still be an impressive CPU, it may even be on par with the POWER4+, but it is definitely not faster.
However, it would be interesting to see a TPC-C benchmark on the Altix with Linux and Oracle. I don’t think SGI will be doing that, because the Altix is optimized for technical and scientific tasks, not for commercial database software.
Greetings from Anton
Always remember that the Power 4 in this test is a dual-core processor. The fact that 64 single-core Itaniums outperformed 32 dual-core Power 4’s is significant.
Intel will be going multi-core soon enough, at which point I dare say they will reign supreme.
“This is the reason why IA64 machines have been selling like hotcakes, right? Right?”
No, wrong again. You don’t seem to be able to perceive the point here.
Let me try to put it in simple words for you: people don’t buy desktops/workstations/servers based on a particular CPU because that CPU presents a simple assembly language model. That’s irrelevant. Buyers concentrate on price, compatibility with previous software, availability of new software, brand, technical reviews, etc…
Or in more technical terms: the complexity of the ISA for a particular processor has little relevance for its success in the market for desktop/workstation/server applications.
So your argument that the Itanium is doomed because “it’s a bitch to code for”, just like the i860, is wrong.
Whether the Itanium succeeds in the market will never have anything to do with the difficulty or ease in programming in assembler for it. And that´s a fact, whether you understand it or not.
# Always remember that the Power 4 in this test is a dual-core processor. The fact that 64 single-core Itaniums outperformed 32 dual-core Power 4’s is significant.
I know, I know, this is very confusing. But for the last time: the p690 has 4 MCMs with 4 processors on each, and each processor has 2 cores, so there are 32 cores in the system (counted out below). So the only significant fact is that a POWER4 system with half the cores is at least as fast as the 64-processor Altix. And there are plans for a 64-processor machine. The “Squadron” with POWER5 will have as many as 128 cores.
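The counting, spelled out with the figures from this post:

```python
# p690 core count vs. the 64-processor Altix (pure arithmetic).
mcms, chips_per_mcm, cores_per_chip = 4, 4, 2     # POWER4 chips are dual-core
p690_cores = mcms * chips_per_mcm * cores_per_chip
altix_cores = 64                                  # Itanium 2 is single-core
print(p690_cores, altix_cores)   # 32 vs. 64: IBM keeps up with half the cores
```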
Greetings from Anton
@Rowel: The two scenarios are completely different! Google’s situation is that it has lots of processing requests, but each request requires little processing. This is a very parallelizable problem, which is why a cluster of computers makes sense. However, there are many important computing problems that just don’t run well on clusters. Either the problem is hard to parallelize or the algorithms have memory access patterns that play havoc with a cluster’s slow inter-node communication. In these cases, which crop up a lot in scientific computing, you need a large shared-memory machine like this (a rough sketch of the tradeoff follows below).
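A back-of-the-envelope sketch of that tradeoff; every number below is an illustrative assumption, not a measurement of any real interconnect:

```python
# Toy latency model: time per item = compute + remote accesses * latency.
def time_per_item(compute_s, remote_accesses, remote_latency_s):
    return compute_s + remote_accesses * remote_latency_s

COMPUTE = 1e-6    # assume 1 microsecond of arithmetic per work item
for name, latency in [("commodity cluster (~100 us/message)", 100e-6),
                      ("shared-memory NUMA (~1 us/remote load)", 1e-6)]:
    for accesses in (0, 10):   # Google-style lookups vs. tightly coupled code
        t = time_per_item(COMPUTE, accesses, latency)
        print(f"{name}, {accesses:2d} remote accesses: {t * 1e6:7.1f} us/item")
# With 0 remote accesses the two are identical, so cheap clusters win.
# With 10, the cluster is roughly 100x slower per item: hence big NUMA boxes.
```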
Also, clock speed has little to do with this. The original 1.0 GHz Itanium2’s are faster, in floating-point code, than a 3 GHz P4. The 1.5 GHz Itanium2’s will be significantly faster. Itanium2’s are expensive, but not as much as the Sun and SGI processors usually used in such machines, and they will get cheaper as HP, IBM, and SGI start shipping more machines and production volumes increase.
Lastly, nobody really cares that it only runs Linux. People using this machine aren’t going to complain that KDE’s menus are too cluttered. Hell, most such machines will probably run headless (without a GUI) anyway! The high end market is still dominated by UNIX, and Linux is simply another UNIX from the point of view of sysadmins.
@Kevin: 100% of desktop users have absolutely no use for machines like these. These machines don’t go anywhere near a desktop. These machines are used for scientific, financial, and other sorts of data-crunching applications. There is a huge body of these applications, written in a wide variety of languages. It is important that good compilers for these languages can be written so these applications can perform well.
First off, I do not need your patronizing, Mr. Rasmussen, thank you very much; you operate under the assumption that your own perception is somehow the law.
The fact that the IA64 is a bitch to code for has nothing to do with the complexity of the ISA, but rather with the constraints imposed by the VLIW programming model, so I do not know where you are getting that from. The ISA itself is nothing but a RISCy approach, so it ain’t that complex really. The pairing of the sub-instructions to generate the VLIW macro-instruction is what makes things interesting. This has a direct effect on the real-world performance of the machine. Since IA64 does not have the same software code base as the x86, the same sort of assumptions cannot be applied to IA64. I was using this complexity in the programming model to compare the IA64 to the i860, a chip that, too, was going to sweep the marketplace 15 years or so ago. I never claimed that this is the main reason why a chip does or doesn’t sell… Try to actually read my posts, thank you very much. We can go in a circular argument all day if you want; I am afraid that you operate under the assumption that if you repeat the same argument enough times, it somehow becomes true. Again, my main point is that you cannot take the same assumptions for the success of the x86 and apply them to the IA64.
Then you diminish my claim that the Itanium so far is not selling according to expectations, as if that somehow had nothing to do with the success of the platform. Oh, so if sales volume has nothing to do with the success of a product, I do not know what does. Care to elaborate? What is it then, pixie dust?
Whaa, can you try making posts without simultaneously insulting everyone you reply to? It makes it rather difficult to get your point when it’s nested in between the flamebaiting.
However, it would be even nicer if it didn’t cost an arm and a leg. Whether Intel likes it or not, they cannot keep this “two architecture” product line up forever; one day they will need to decide whether to continue with x86 or go with the Itanium.
The Itanium on paper lines up to be a very good chip; however, pricing them at $4,000 to $5,000 per CPU when your closest competitors are still cheaper and performing better makes one wonder whether Intel has got the message.
If it were me, however, I would never have developed the Itanium, and instead would have worked to create a hybrid Alpha/PA-RISC combination that would take the positives from both architectures. Then at least we would have ended up with an architecture familiar to most programmers, rather than the situation we have now, with operating systems and compilers still not mature enough.
“100% of desktop users have absolutely no use for machines like these.”
I agree. The SGI Altix machines cater to the very specialized market which you have described.
I was writing about the Itanium 2, for which I believe there is a market in desktops, workstations and servers, if Intel can get its marketing mix right, which it seems they are having some difficulty executing.
When it comes to Itanium compilers, both Intel and HP have been working on optimizing compilers for the IA-64 architecture for a long time, because compiler availability has been part of the strategy for the then-Merced since the very beginning of the project.
This is why I keep insisting with <whaaa> that the fact that the IA-64 architecture is “a bitch to code for” is totally irrelevant to its acceptance in the market. It was designed as such, and Intel/HP tackled the compiler availability issue as an integral part of their strategy a long time ago.
“…my main point is that you cannot take the same assumptions for the success of the x86 and apply them to the IA64.”
I am left to wonder how you can reach this conclusion from anything I wrote above.
Also, you seem to be confusing the ISA with the register set. The ISA, or Instruction Set Architecture, according to Patterson and Hennessy, “includes anything programmers need to know to make a binary machine language program work correctly, including instructions, I/O devices and so on.” That of course includes any instruction pairing rules required for VLIW.
>>There is a reason these are so expensive.<<
I think the major reason is economies of scale: SGI sells very few systems, hence each system has to sell for a lot (a toy model below).
I have no problem with SGI quality; SGI makes very good products. But have you seen their financial statements? A total disaster. I see no way SGI can stay in business. The price/performance just isn’t there.
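A toy amortization model makes the point; all the dollar figures are invented for illustration:

```python
# Unit price must cover marginal cost plus R&D spread over the units shipped.
def breakeven_price(fixed_rnd, units_sold, marginal_cost):
    return marginal_cost + fixed_rnd / units_sold

FIXED_RND = 100e6    # assume $100M to develop the system line
MARGINAL = 20_000    # assume $20k of parts and assembly per box
for units in (1_000, 100_000):
    print(f"{units:>7,} units -> ${breakeven_price(FIXED_RND, units, MARGINAL):,.0f} each")
#   1,000 units -> $120,000 each
# 100,000 units -> $21,000 each  (same hardware, commodity-scale price)
```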
SGI expensive? If you have to ask about the price, you shouldn’t even be looking at SGI systems; go home and play with your Dell.
SGI made two mistakes:
1. Rested on their laurels while the market evolved and made some formerly proprietary SGI tech mainstream and cheap.
2. Tried to enter the low-end market, where they couldn’t make a profit without becoming just another generic cloner, which apparently they didn’t want to do.
If SGI goes back to developing ridiculously high-end systems, they’ll recover from the current losses. There are plenty of opportunities in VR/sim, military, government, science/medical apps, etc.
Personally, I would love to see them put out an O2-like system using the Itanium and port IRIX to it. IRIX, unlike Linux, is a mature workstation solution.
As for the cost of porting it, talk to Intel; they have a large fund set up for companies wanting to port their applications to Itanium. If Intel sees that it can push Itanium into the ultra-super-duper high-end workstation market using a well-recognised operating system, I can’t see why they wouldn’t grant some money.
I know SGI released, under the GPL, a whole set of compilers for use on the Itanium platform; they are based on the set they use on IRIX. It’s called Open64:
http://open64.sourceforge.net/
As for Madison, there are going to be at least four versions.
The initial release will have one with 6 MB of on-die cache; there will also be a 3 MB cache version of this chip.
After that we’ll see a lower-end model with 1.5 MB of cache.
The final version of the family will have 9 MB of on-die cache. The problems regarding heat dissipation should go down a good bit with the lower-end models (3 MB and 1.5 MB), as well as from the overall shrink to the 0.13 µm process.