Linked by Thom Holwerda on Sat 17th Jun 2006 22:22 UTC
A new paper from a group at Lawrence Berkeley National Laboratory, "The Potential of the Cell Processor for Scientific Computing" [.pdf], explores the performance of IBM's Cell processor on some specific types of code commonly found in high-performance computing applications. The paper compares Cell's performance on these kernels to the performance of the Cray X1E, AMD Opteron, and Intel's Itanium2. The idea here is that Cell will be a commodity processor (at least that's what the authors and IBM hope), so it'll be a viable HPC alternative for the cost-sensitive academic research market. This paper represents the first formal academic attempt to decide if Cell hardware is something that researchers will want to invest in. So how does Cell stack up in comparison to these three competitors? In a word, it screams.
Publication quality
by theine on Sat 17th Jun 2006 22:28 UTC
theine
Member since:
2005-09-29

What a hideous-looking research paper. I wonder whether they used OpenOffice or Word.

Reply Score: 0

RE: Publication quality
by Thom_Holwerda on Sat 17th Jun 2006 22:34 UTC in reply to "Publication quality"
Thom_Holwerda Member since:
2005-06-29

What a hideous-looking research paper.

I think it looks clean. I prefer two columns, as I really detest reading long lines of text. I start reading at the wrong line when my eyes move from the end of a line to the next one; this is because I read at literally lightning speed (it's scary sometimes, seriously).

Reply Score: 1

RE[2]: Publication quality
by Alex Forster on Sun 18th Jun 2006 23:23 UTC in reply to "RE: Publication quality"
Alex Forster Member since:
2005-08-12

No way, literally?

Reply Score: 1

RE: Publication quality
by rayiner on Sat 17th Jun 2006 22:43 UTC in reply to "Publication quality"
rayiner Member since:
2005-07-06

Heh. Look at the abstract. Every other line is hyphenated. That crappy H&J didn't come from TeX, I'm sure of that.

Reply Score: 2

RE[2]: Publication quality
by chemical_scum on Sat 17th Jun 2006 23:49 UTC in reply to "RE: Publication quality"
chemical_scum Member since:
2005-11-02

Yeah, superficially it looks like a typical academic paper prepared in TeX, not a word processor. But you are right, there are a lot more hyphenated lines than you would expect.

To check this I downloaded a random recent paper from astro-ph on arXiv. I made sure that it had been prepared in TeX by downloading the source as well as the PDF. It had a lot fewer hyphenated lines than the LBNL paper (though the astro-ph paper had more than I expected) and looked indefinably better.

Anyone got an idea what it was really prepared in?

Reply Score: 1

RE[3]: Publication quality
by whenney on Sun 18th Jun 2006 02:35 UTC in reply to "RE[2]: Publication quality"
whenney Member since:
2005-07-06

Anyone got an idea what it was really prepared in?

Obviously LaTeX - it uses CM fonts for a start. Yes, it is ugly, but that is the fault not of the authors but of whoever designed the ACM macros (http://www.acm.org/sigs/pubs/proceed/template.html).

Edited 2006-06-18 02:36

Reply Score: 1

RE[4]: Publication quality
by Lakedaemon on Sun 18th Jun 2006 07:39 UTC in reply to "RE[3]: Publication quality"
Lakedaemon Member since:
2005-08-07

The research article looks quite okay-ish to me.

It was obviously written with (La)TeX.

The many hyphens come from the fact that lines are very short due to the 2 columns.

What bothers me more is the sentences sticking out of the columns, and the lack of vertical space, which makes the paragraphs very hard to digest.

It seems from their templates here:
http://www.acm.org/sigs/pubs/proceed/template.html
that they are concerned about the number of pages in their papers (which isn't surprising for research papers)...

So, in my opinion, this paper isn't as ugly as was said, if you consider that they traded aesthetics for a small number of pages.

The author could have worked a bit more on his sentences to avoid (weird) hyphenation... that's true, but research paper authors sometimes only care about facts.

Lakedaemon

Edited 2006-06-18 07:40

Reply Score: 1

RE[5]: Publication quality
by theine on Sun 18th Jun 2006 12:55 UTC in reply to "RE[4]: Publication quality"
theine Member since:
2005-09-29

It was obviously written with (La)TeX.

Indeed, this is what pdfinfo gives:

Title: CF06_final1.dvi
Creator: dvips(k) 5.95b Copyright 2005 Radical Eye Software
Producer: AFPL Ghostscript 8.51

I have to say I'm really surprised that somebody managed to make LaTeX output look this bad.

Edited 2006-06-18 12:57

Reply Score: 1

RE: Publication quality
by netpython on Sun 18th Jun 2006 10:39 UTC in reply to "Publication quality"
netpython Member since:
2005-07-06

What a hideous-looking research paper. I wonder whether they used OpenOffice or Word.

They clearly preferred function(s) above fashion.

Reply Score: 1

RE[2]: Publication quality
by theine on Sat 17th Jun 2006 22:41 UTC
theine
Member since:
2005-09-29

Two columns are fine and standard -- I was more referring to line/paragraph spacing and fonts.

Reply Score: 1

Ehh .. yes .. let's discuss beauty ...
by poohgee on Sun 18th Jun 2006 01:14 UTC
poohgee
Member since:
2005-08-13

.. & aesthetics 8) - Cool.

But yeah, that paper won't win any beauty contests IMO ;)

The headline makes all this sound like a surprise - isn't this exactly what this multi-CPU Cell thing was designed for?

Are there any other processors of similar design to Cell out there? Because - I don't know about Cray - but AMD & Intel, I guess, have the problem of having to be backwards-compatible x86 chips, so they can't just go mad on advances in hardware development.

The Cell is brand new, without any backwards compatibility, PC software, etc. that it has to support.

Reply Score: 1

fffffh Member since:
2006-01-04

Cell uses a modified Power Architecture core.

Reply Score: 1

platforms
by transputer_guy on Sun 18th Jun 2006 01:58 UTC
transputer_guy
Member since:
2005-07-08

There will be a lot of Opteron server systems already headed to HPC land, and the Cell/PS3 has not yet proved itself to be a viable or cost-effective platform; who knows what the yield will be.

Recently we saw the introduction of FPGA coprocessors for the Opteron socket, a brilliant idea I believe, made possible by the recent opening up of the HT bus.

I could see a similar opportunity for a Cell as coprocessor to Opteron in that same second socket. I am not sure if that makes complete sense, or even if IBM has a compatible HT link. Such a module would have just the Cell CPU, some Rambus XDR DRAM for which it has special interfaces, and possibly an FPGA HT bridge if needed.

At least such a solution would allow Opteron systems to continue on, and it would much reduce the risk of investing in the Cell platform. This makes the same sense as not designing special-purpose Opteron+FPGA boards the way Cray/OctigaBay did, and it allows the customer to choose from various Opteron server boards.

The same will likely happen with ClearSpeed as the other FPU candidate.

Reply Score: 3

RE: platforms
by SamuraiCrow on Sun 18th Jun 2006 02:39 UTC in reply to "platforms"
SamuraiCrow Member since:
2005-11-19

Most high-end supercomputing platforms run some Unix variant so the processor used would only be as critical as its need to run the software: function over form.

The Cell processor contains a fully functional PowerPC processor core (substituting IBM's brand of hyperthreading for out of order execution) so it will already run Linux. Why use Opteron as a host processor when the PowerPC Cell is self-hosted?

It is more likely that ClearSpeed will come out as a similarly self-hosted Opteron spin-off as a competitor to the Cell. Due to endian issues the two systems cannot easily be mixed.

Here's an article on ClearSpeed for the uninitiated:
http://www.reed-electronics.com/electronicnews/article/CA6316147?ni...

Reply Score: 1

RE[2]: platforms
by Wes Felter on Mon 19th Jun 2006 18:16 UTC in reply to "RE: platforms"
Wes Felter Member since:
2005-11-15

Why use Opteron as a host processor when the PowerPC Cell is self-hosted?

Because the performance is very different.

Reply Score: 1

RE: platforms
by John.Gustafsson on Sun 18th Jun 2006 10:37 UTC in reply to "platforms"
John.Gustafsson Member since:
2005-08-08

Hmm, one quad Opteron + 3 co-processors with 16 "pipelines" each gives us 4 massively fast CPUs with 48 coprocessor units. Compare that to 4 Cells which gives us 4 slow CPUs and 32 coprocessor units (possibly 28).

The Cell is not, and will not be, a silver bullet, but rather the PS2 on steroids (and we all know what a pain the PS2 is to code for...).

Reply Score: 0

v RE[2]: platforms
by MediaSex on Sun 18th Jun 2006 11:22 UTC in reply to "RE: platforms"
RE[2]: platforms
by ceo1 on Sun 18th Jun 2006 23:31 UTC in reply to "RE: platforms"
ceo1 Member since:
2006-02-02

Only one way to find out:
Let's see if it stands the test of time. If IBM/Mercury/etc manage to make the Cell a 'commodity' processor, then perfect.

As it stands now, the GFLOPS/$ ratio isn't terribly attractive, and although I welcome diversity, I do not see the Cell succeeding on a large scale (e.g. medical imaging, oil and gas, research/scientific community).

It will be equally interesting to see the GPGPU approach take off. The GPU is already a commodity.

-CEO

Reply Score: 1

RE[3]: platforms
by rayiner on Mon 19th Jun 2006 02:01 UTC in reply to "RE[2]: platforms"
rayiner Member since:
2005-07-06

Actually, the GFLOPS/$ ratio should be excellent with the 65nm shrink, which will come sometime next year. The chip will be small (~120mm^2), produced in large quantities, and have its development cost subsidized by the PS3.

Reply Score: 1

Not-to-be-exceeded-numbers
by Cloudy on Sun 18th Jun 2006 02:50 UTC
Cloudy
Member since:
2006-02-15

Back in the old days in supercomputing, we called results like these "not to be exceeded" numbers, because they always assume the best performance possible from the system.

It's funny to see them in the future perfect tense though.

Reply Score: 2

Not that impressive
by Marcellus on Sun 18th Jun 2006 07:21 UTC
Marcellus
Member since:
2005-08-26

One problem with the Cell is that it is only single-precision, which is not too useful in HPC applications.
A problem with the results from the paper itself is that the Cell tests were hand-tuned to extract the best numbers possible, something that you're not likely to spend that much time on in a real-world application, where you have time constraints (time to implement) to take into account as well.

Reply Score: 1

v RE: Not that impressive
by MediaSex on Sun 18th Jun 2006 11:26 UTC in reply to "Not that impressive"
RE[2]: Not that impressive
by Marcellus on Sun 18th Jun 2006 15:56 UTC in reply to "RE: Not that impressive"
Marcellus Member since:
2005-08-26

Ok, so I forgot to mention that Cell does have double-precision support as well... Maybe I forgot because the double-precision performance stinks.

You mean all of us console companies, defense contractors, medical computing, media companies are all wasting our time on Cell based systems???

I'm only aware of a single console company (Sony) that is playing with the Cell.
I'm not aware of any defense contractors, medical computing or media companies that are building anything around Cell.

Reply Score: 1

RE[3]: Not that impressive
by rayiner on Sun 18th Jun 2006 16:57 UTC in reply to "RE[2]: Not that impressive"
rayiner Member since:
2005-07-06

1) The article was about Cell's double-precision performance. While the paper tested a simulation rather than hardware, it showed that Cell's double-precision performance could be quite usable as well.

2) Raytheon is working with IBM to use Cell in defense applications, Mercury Computer Systems is releasing a Cell-based blade server for industrial and medical computing, and Toshiba is going to use Cell in HDTVs.

Reply Score: 1

RE[3]: Not that impressive
by Alex Forster on Sun 18th Jun 2006 23:28 UTC in reply to "RE[2]: Not that impressive"
Alex Forster Member since:
2005-08-12

Maybe I forgot because the double-precision performance stinks.

If by "stink" you mean "only marginally above average as opposed to eye-wideningly above average."

Reply Score: 1

Fast in 3d too
by datadevil on Sun 18th Jun 2006 09:20 UTC
datadevil
Member since:
2006-03-03

Not that a CPU is purely for 3D, but it does show its potential; I saw a demo of a Cell-powered machine running a 3D flightsim at a conference this week, pitched against a PPC, and the framerate was 8-10 for the PPC against 50 or so for the Cell, pretty impressive.

Reply Score: 2

RE: Fast in 3d too
by ceo1 on Sun 18th Jun 2006 23:36 UTC in reply to "Fast in 3d too"
ceo1 Member since:
2006-02-02

Are you by any chance referring to the prerecorded flight sim video of the Cell vs the PPC at EAGE in Vienna?

Reply Score: 1

I'm still missing something...
by tamlin on Sun 18th Jun 2006 12:34 UTC
tamlin
Member since:
2006-06-18

... and that is a CPU, or even a well-documented FPGA (though a CPU, even something like Cell, is better known and understood by most) on a PCI card, with tools included to program and use it.

If this Cell CPU is capable of over 200 GFLOPS, as the linked Ars Technica article reports, why not throw just a few MB of SRAM onto a PCI card, put a Cell chip on it, and have a darned Cray-killer-on-a-PCI-card (sounds almost like the "pogo-on-a-stick" from SpaceQuest 3, doesn't it :-) )?

I know I could most certainly have made very good use of such a thing (had it been priced correctly, of course) when I wrote and ran some RSA-576 factoring code a few years back. Heck, I even considered buying a Xilinx (sp?) kit because general-purpose CPUs are so horribly slow at large-integer math (they have to do it sequentially, at that time 32 bits at a time, while an FPGA clocked at a mere 100MHz could do it all in parallel and literally do a 576-bit * 576-bit multiplication in 2-3 clock cycles).
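To make the "sequentially, 32 bits at a time" point concrete, here is a minimal sketch of schoolbook multiplication on 32-bit limbs. It is only an illustration, not the factoring code mentioned in the comment; the names `LIMB_BITS`, `to_limbs` and `schoolbook_mul` are made up for this example, and real bignum libraries use faster algorithms for large operands.

```python
LIMB_BITS = 32                      # word size assumed in the comment
MASK = (1 << LIMB_BITS) - 1


def to_limbs(n, count):
    """Split a non-negative integer into `count` little-endian 32-bit limbs."""
    return [(n >> (LIMB_BITS * i)) & MASK for i in range(count)]


def schoolbook_mul(a, b, bits):
    """Multiply two `bits`-wide integers limb by limb.

    Returns (product, number_of_limb_multiplications).
    """
    count = -(-bits // LIMB_BITS)   # ceiling division: limbs per operand
    x, y = to_limbs(a, count), to_limbs(b, count)
    result = [0] * (2 * count)
    steps = 0
    for i in range(count):
        carry = 0
        for j in range(count):
            # One 32x32 -> 64-bit multiply plus accumulate and carry.
            t = result[i + j] + x[i] * y[j] + carry
            result[i + j] = t & MASK
            carry = t >> LIMB_BITS
            steps += 1
        result[i + count] = carry
    product = sum(limb << (LIMB_BITS * k) for k, limb in enumerate(result))
    return product, steps


a = (1 << 576) - 1
b = (1 << 575) + 12345
product, steps = schoolbook_mul(a, b, 576)
print(product == a * b, steps)
```

With 576-bit operands each number is 18 limbs, so the inner loop runs 18 * 18 = 324 times, one after another; that serial chain of multiply-and-carry steps is what the comment contrasts with doing all partial products in parallel in hardware.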

What did worry me about the Cell, however, was this from the Ars article: "... and Cell still manages to trounce the other guys at performance/watt". This was in reference to the "fact" that the Cell was (according to those numbers) able to churn out 204.7 GFLOPS, while the Cray X1E stopped at 29.5.

I don't know about you, but if you multiply the power consumption of a Cray X1E by just 6 (not even reaching the Cell's alleged capability) and the Cell still comes out on top for power/FLOPS, it could in theory mean the Cell sucks more than 6x the power of a Cray X1E. We're talking about shipping a CPU with an integrated power plant and cooling solution in a container now - just for one CPU! :-)
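The arithmetic behind this worry can be sketched in a few lines. The two GFLOPS figures are the ones quoted in the comment; no wattages are given anywhere in the thread, so the sketch normalizes the X1E's power to 1.0 (an assumption made purely for illustration) and the helper name `cell_power_upper_bound` is hypothetical.

```python
CELL_GFLOPS = 204.7   # figure quoted in the comment
X1E_GFLOPS = 29.5     # figure quoted in the comment

# How many times faster the Cell is on these peak numbers.
speed_ratio = CELL_GFLOPS / X1E_GFLOPS


def cell_power_upper_bound(x1e_power):
    """Largest Cell power draw at which Cell still wins on GFLOPS per watt.

    Cell wins when CELL_GFLOPS / cell_power > X1E_GFLOPS / x1e_power,
    i.e. when cell_power < speed_ratio * x1e_power.
    """
    return speed_ratio * x1e_power


x1e_power = 1.0  # normalized placeholder, not a measured wattage
print(round(speed_ratio, 2), round(cell_power_upper_bound(x1e_power), 2))
```

So a performance/watt win on its own only says the Cell draws less than roughly 6.94 times the X1E's power; it sets a ceiling, not a floor, and says nothing about how far below that ceiling the real figure sits.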

I hope someone knowing more about this can put my worries to rest.

Reply Score: 2

RE: I'm still missing something...
by nimble on Mon 19th Jun 2006 09:38 UTC in reply to "I'm still missing something..."
nimble Member since:
2005-07-06

an FPGA clocked at a mere 100MHz could do it all in parallel and literally do a 576-bit * 576-bit multiplication in 2-3 clock cycles.

No way. Perhaps at 10 MHz, if you had a big enough FPGA. But even the fattest Virtex-4 (in a >1000-pin package) has "only" 512 dedicated 18-bit multipliers (forget about doing this in regular FPGA fabric). Please correct me if I'm wrong, but to do all the partial multiplications in parallel, you'd need (576/18)^2 = 1024 of those multipliers, and then you still need enough LUTs to add them all up.
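The multiplier count in this reply can be checked with a short sketch. It follows the comment's own simplification of treating each hardware multiplier as a full 18-bit unit, and takes the 512 figure straight from the comment; the function name `partial_products` is made up for this example.

```python
import math


def partial_products(operand_bits, multiplier_bits):
    """Multiplier blocks needed to form every partial product of a
    schoolbook operand_bits x operand_bits multiply at the same time."""
    chunks = math.ceil(operand_bits / multiplier_bits)
    return chunks * chunks


needed = partial_products(576, 18)   # (576/18)^2
available = 512                      # multiplier count quoted in the comment
print(needed, available, needed / available)
```

Each 576-bit operand splits into 32 chunks of 18 bits, and every chunk of one operand has to meet every chunk of the other, which gives 32 * 32 = 1024 partial products, twice the 512 multipliers mentioned, before any of the adder logic is counted.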

Reply Score: 1

RE[2]: Publication quality
by theine on Mon 19th Jun 2006 13:48 UTC
theine
Member since:
2005-09-29

They clearly preferred function(s) above fashion.

This sounds like you somehow have to make a compromise between the two -- which you clearly do not.

Reply Score: 1