Hannibal of Ars Technica writes: “Last week, OS News published an analysis of IBM’s Cell-related patents. This article presents some of the information in the patents in an easily digestible format, but it has some serious flaws, as well. And I’m not talking about Cell-specific flaws, though there are those, but what appear to be problems with the author’s understanding of basic computer architecture.” Read on. In the meantime, Nicholas Blachford has explained his points a bit more in a sixth Cell article, and has also written a rebuttal to Ars’ article.
Mr. Blachford,
Trying to pass guesses and hypothetical ramblings off as real technical journalism is bad form. The article was written in a tone suggesting you knew what you were talking about because you had studied the patents and had knowledge of the PS3. In reality, however, you were simply making guesses from made-up (yes, imaginary numbers that came from your head) specifications extrapolated from a patent application. Remember that REAL theoretical performance is going to be different, and not once do you make that apparent in your article. Instead you spout off incredible numbers as if they were true. Please admit to that.
I’ll defer to Hannibal on all matters PPC…
His “pedantic” rant? Your far-out claims cried out for a hard-hitting fact check. You’re wrong, he’s right, life goes on.
As an artist, I am subjected to criticism on a daily basis – it took a while to get used to, but it is now something I look forward to. If you’re to take on a “tech analyst” role (just what we all need!), you sure as hell better have your facts straight – we avid readers of such matters are tired of misinformed hacks – please don’t allow yourself to fall into that category.
Someone should get these guys some boxing gloves and have them go at it. A battle of the egos is all I can see here.
This part:
Furthermore, the article is chock full of wild-eyed and completely unsubstantiated claims about exactly how much butt, precisely measured in kilograms and centimeters squared, that the Cell will kick, and how hard, measured in decibels, that the Cell will rock.
Made me giggle.
I’m of the opinion that, considering the specs will be officially released in February, a speculative article is totally unnecessary and prone to be full of errors that will be exposed the following month.
When I read Blachford’s article, many of his statements didn’t make much sense to me. Aren’t storage locations in a processor either caches or registers? So when you go around saying the Cell, which is a processor, won’t have any caches but will have RAM embedded in it, it will certainly raise many eyebrows.
Quoting the conclusion of the article: “The first Cell based desktop computer will be the fastest desktop computer in the industry by a very large margin. Even high end multi-core x86s will not get close. Companies who produce microprocessors or DSPs are going to have a very hard time fighting the power a Cell will deliver. We have never seen a leap in performance like this before and I don’t expect we’ll ever see one again, It’ll send shock-waves through the entire industry and we’ll see big changes as a result.”
Come on Mr. Blachford, how could Hannibal’s critique come as a surprise given such speculative claims? Even if it turned out to be true, it still wouldn’t be good journalism to state something like that long before actual products have seen the light of day (i.e. not all readers may have enough knowledge of the subject in question to differentiate between fact and fiction). While the intention of the article may have been good, you should’ve left all that crap out, and maybe then it would’ve been much closer to your objective of summarizing the Cell in an easy-to-understand article. I understand it hurts a lot to be beaten up in a front-page article at the #1 tech news site, and maybe that was a bit over-the-top on Hannibal’s part, but you can’t really argue with his objections to your article. In fact, without the many speculative claims in your article, I am sure he would’ve let any factual errors (whether real or merely in his opinion) slip through without comment.
That said, don’t let this discourage you; I am looking forward to reading other articles by you in the future, just leave the pure speculation out of them – or at least write something like “imagine if…”
While I think Hannibal makes good points, he could have put them across in a more diplomatic manner. But it does serve as a good wake-up call.
Could someone please let me know the patent numbers (IDs) relevant to the ‘Cell’ processor if you have any ideas (I didn’t spot any quoted)? I would certainly like to assess them myself.
I would further assume that patents are freely accessible to the public, and I hope in electronic form. Is this so?
That said, this technology does indeed exist; try the following exact phrases in Google:
“auto-parallelizing” compiler
“auto vectorizing” compiler
Such compilers do exist today. However, they are nowhere near as efficient as is claimed in the article. Hannibal does make good points here. Many of the tasks we do are serial and are practically impossible to parallelize. GPUs can do it since they are operating on a very constrained problem domain.
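To make the serial-versus-parallel point concrete, here is a minimal C sketch (my own illustration, not from either article) of the two kinds of loop:

    /* Independent iterations: an auto-vectorizing compiler (e.g. ICC,
     * or GCC with -ftree-vectorize) can process several elements per
     * instruction. */
    void scale(float *dst, const float *src, float k, int n)
    {
        for (int i = 0; i < n; i++)
            dst[i] = src[i] * k;
    }

    /* Loop-carried dependency: a[i] needs a[i-1], so a straightforward
     * auto-vectorizer or auto-parallelizer has to leave this serial. */
    void running_sum(float *a, int n)
    {
        for (int i = 1; i < n; i++)
            a[i] += a[i - 1];
    }

No matter how many execution units you throw at it, the second loop advances one step at a time.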
I’m glad the follow-up article did address this somewhat (kudos to Nicholas Blachford), but he should have made things clearer. In the follow-up article, the author mentions that you need to break the problem up into software cells. This is the hard part and should have been emphasized more. This alone means that the theoretical peak performance of the Cell architecture will be hard to attain, if not impossible. Adding more cells won’t help, as they will just be sitting there twiddling their thumbs.
But I still think Hannibal was overly harsh 🙂
One has to remember, this processor has been under design for the past several years. From the outset, it was described as a “radical” architecture. And with transistor budgets going up, it becomes much easier to put “embedded RAM” in many devices.
I work in the “embedded systems” arena, and there is a BIG move here to have fast, local, easily accessible data for all sorts of processing. In fact, most DSPs run from NOTHING BUT internal “embedded RAM” in order to keep their high throughput rates up. The same goes for most NPUs (network processor chips that have programmable “engines”). The problem with registers is that they have a “structure” (i.e. size) and are fixed. Their access is also controlled by the instruction set (i.e. only certain instructions work on registers). Having registers is good (i.e. RISC architectures), but having too many has its problems too. The AMD 29K architecture had 192 registers for the compiler to use any way it wanted. When you’ve called down N levels in your software, you end up having to “spill” them to the stack and then “fill” them back. That takes time (even with block move instructions).
Cache is also “structured” (Y-size lines, N ways, etc.) and is primarily used to store recently used instructions/data. Caches require tags and are based upon addresses, which have to be matched, checked, etc. – logic that takes time. Simply having “scratch RAM” in some local address space is easier to deal with; you can use it any way you want.
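As a hedged illustration (the buffer name and size here are invented, not from any Cell patent), the usual DSP-style scratch-RAM pattern in C looks roughly like this:

    #include <string.h>

    #define SCRATCH_WORDS 1024      /* invented size of on-chip RAM */

    /* On a real DSP the toolchain would place this buffer in internal
     * RAM; the mechanism varies, so read this as "fast local memory". */
    static int scratch[SCRATCH_WORDS];

    /* Assumes n <= SCRATCH_WORDS. */
    void process_block(int *ext_data, int n)
    {
        /* 1. Pull a block in from slow external memory (real hardware
         *    would typically use a DMA engine for this step). */
        memcpy(scratch, ext_data, n * sizeof scratch[0]);

        /* 2. Crunch on it with predictable fast access: no tags to
         *    match, no cache misses to stall on. */
        for (int i = 0; i < n; i++)
            scratch[i] = scratch[i] * 3 + 1;

        /* 3. Push the results back out. */
        memcpy(ext_data, scratch, n * sizeof scratch[0]);
    }

The programmer (or compiler) takes on the job the cache tags would otherwise do, and in exchange gets completely predictable timing.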
I worked on an SoC design which had “micro-engines” doing some network processing on data flowing in/out of the chip. You’d be surprised how quickly one can chew up a few K (yes, kilobytes) of precious RAM doing protocol processing.
So, yes, having embedded RAM might raise some eyebrows, but this is a “radical, out of the box” architecture. The real key, IMHO, will be how to handle the programming and being able to “easily” harness all of that raw horsepower. We will know very soon…
Blachford’s piece is typical of amateur-written articles that make little distinction (due to lack of knowledge or over-enthusiasm) between PR technobabble and objective fact. In this case the author happened to be called on it in a very public way by someone – Hannibal – who makes his living writing articles the way they should be written. Research and deep, intuitive knowledge of the subject matter are not optional.
The internet is full of the type of marginally useful tracts that Blachford produced. The fact that OSNews saw fit to publish it (did they pay money for it?) says more about the editors and aspirations of OSNews than anything else.
OSNews will only become a serious resource for technology news when it learns to distinguish between journalism and oversimplified verbose trash.
Come on Mr. Blachford, how could Hannibal’s critique come as a surprise given such speculative claims?
None of the people who have written to me have complained, and some of these are *seriously* well qualified people. Aside from the odd OSNews and slashdot troll, I simply haven’t seen this kind of reaction anywhere else.
Look at the performance achieved with GPUs. These have been shown to outgun general purpose CPUs many times over. They are already approaching a 10X performance difference – that’s not theoretical, this is measured performance in real applications.
Cell is very similar technology aimed at more general applications but built in a much more aggressive manner.
When it comes to the first Cell beating a PC, I am not making speculative claims.
Perhaps I’ve made the error of assuming that people already know what GPUs are capable of. If you are already familiar with them, these figures do not seem off the wall; if not, they must sound like fantasy.
Cell should launch sometime this year, I think a lot of people are in for a big surprise.
Look at the performance achieved with GPUs. These have been shown to outgun general purpose CPUs many times over. They are already approaching a 10X performance difference – that’s not theoretical, this is measured performance in real applications.
I’m sorry but I don’t see the leap in reasoning. GPUs perform much better than CPUs because GPUs are working on a very small problem domain. And it is a problem domain that can be fairly easily vectorized/parallelized.
The same cannot be said about the general purpose CPU. Sure, they have some vector processing capability like Altivec and SSE, but their application is limited to only a certain number of problems, since many problems are serial in nature.
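To be concrete, here is a tiny sketch (my own, illustrative only) of the kind of SSE code that pays off when the data really is parallel – four floats per instruction:

    #include <xmmintrin.h>  /* SSE intrinsics */

    /* Adds two float arrays four elements at a time.
     * n is assumed to be a multiple of 4 to keep the sketch short. */
    void add_arrays(float *dst, const float *a, const float *b, int n)
    {
        for (int i = 0; i < n; i += 4) {
            __m128 va = _mm_loadu_ps(&a[i]);            /* load 4 floats */
            __m128 vb = _mm_loadu_ps(&b[i]);
            _mm_storeu_ps(&dst[i], _mm_add_ps(va, vb)); /* 4 adds at once */
        }
    }

When each iteration depends on the result of the previous one, those four lanes simply have nothing to chew on.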
I’m sorry but I don’t see the leap in reasoning. GPUs perform much better than CPUs because GPUs are working on a very small problem domain. And it is a problem domain that can be fairly easily vectorized/parallelized.
…I couldn’t have said it better myself!
That said, assuming Nicholas is correct in his speculation, there should be certain applications – like Viro points out – that would benefit from an increase (possibly massive, if Nicholas is correct) in vector processing capacity, e.g. media encoding/processing. Basically, you could list the problem areas where the Mac == PowerPC and its Altivec units are allowed to shine, and I guess you’d end up with a pretty good summary. However, adding “DSP” power (to use a popular term) to a PC isn’t anything new; it has just never caught on in the mainstream. But obviously I’d love to have it in my box too some day…
There weren’t that many big mistakes in the article as a whole; the whole point is that the Cell is going to be a big PowerPC chip with many coprocessors built in.
That point is made; then he guesses what the real-life performance could be by comparing it to what we know about similar technology. You may argue that maybe he’s expecting too much, but that’s up to you to decide.
As Blachford pointed out, he’s not a professional journalist, and I do agree with him that many of the mistakes he made came more from poor writing than from lack of technical understanding.
I think the important thing here is that before Blachford no one attempted to explain how the Cell architecture works, and I truly think that the people at Ars Technica should have written a better description rather than criticizing someone gratuitously.
And I thought I would read some serious journalism about the Cell architecture and what Blachford might have gotten wrong, until I actually clicked the link and saw the Ars Technica logo.
I agree with ucedac.
I’m sorry but I don’t see the leap in reasoning. GPUs perform much better than CPUs because GPUs are working on a very small problem domain. And it is a problem domain that can be fairly easily vectorized/parallelized.
A lot of applications can be vectorised, not just graphics.
There’s a lot of research into the area of using GPUs for general purpose processing.
I don’t know if you read part 3; it covers the areas which Cell will be able to accelerate. Most high-compute tasks will benefit:
http://www.blachford.info/computer/Cells/Cell3.html
A lot of applications can be vectorised, not just graphics.
There’s a lot of research into the area of using GPUs for general purpose processing.
All usage of GPUs for general-purpose processing has been limited to scientific analysis like fluid simulation, linear algebra, ray tracing, cryptography, media – or pretty much anything with heavy vector math calculation.
Find one example of a general usage of GPU programming outside of 3d and media.
It WON’T help out your standard applications on your PC, but it is certainly suited to a video game console or a media player/encoder. Not understanding this, and not realizing why it won’t be able to “emulate” an x86, is why people wonder about your claims and your understanding of basic computing. Vectorization won’t help Office run any faster, speed up AI in games, or help you process XML any faster.
You acknowledge this in Part 3 but then forget about it during the next section “Cell vs PC” and even later in section 3: “But generally PCs either don’t need much power or they can be accelerated by the Cell, Intel and AMD will be churning out ever more multi-core’d x86s but what’s going to happen if Cells will deliver vastly more power at what will rapidly become a lower price?”
You also provide no reference for any information as to how and why you expect it to be cheaper, and why you expect “software and price will not be a problem”. Even the PS2 had significant supply issues at launch and even during the 2004 Christmas season. Past precedent is no guide to future success – even nVidia and ATI have supply issues with their latest GPUs.
Where is the basis for “The Cell includes at least a PowerPC 970 grade CPU”? The only thing Sony/IBM/Toshiba have said is that it will have “a 64-bit Power processor core”.
You claim that GPUs “generate large amounts of heat” yet fail to speculate that this could be a problem for the CELL as well.
You also don’t address why Sony needs to work with nVidia to create a custom GPU for the PS3, considering the “power of the CELL”. This news has been out since 12/7/04.
I fully understand your points – and your motivation for writing your “analysis”. Good work!
(It was obvious that you were making some optimistic guesses)
I’m on your side…
> Aren’t storage locations in a processor either caches or registers?
I disagree: local RAM managed by software is not a normal cache, because the associated hardware logic is absent – normally caches are transparent from the software’s point of view, which is not the case here. Nor is it seen as a register by the CPU, because you have to use load/store instructions to access it.
That said, having small, fast local RAM in a specialised vector coprocessor is nothing new…
The shortages this past holiday season can be attributed to the redesigned PS2. The old models were essentially EOL, and Sony was not going to manufacture more once the old supply dried up. Every retailer I talked to knew this. The Christmas demand ensured they would clear out all the old stock, which makes sense from a business standpoint.
To be fair, the counterpoints by others seemed legitimate, but the tone of them came off as a little insulting. I think the nature of analyzing something still in development will lead to speculation on the author’s part. It is not like the author is personally testing the Cell processor. Speculative journalism tends to contain a fair amount of sensationalism. I thought it was done in good faith after studying patents to discuss theoretical capabilities.
I don’t think the purpose of the articles was to mislead people into thinking the Cell would be better in EVERY respect than EVERY processor currently available. I think the point was to show how exciting a rethinking of the traditional CPU and architecture design paradigm can be. Let’s cut him some slack; human beings are incapable of being impartial observers. It is hard to come by just the facts.
Nathan
Dear Blachford, if you can’t write factual articles, then please at least don’t use so much rhetoric and name promotion in the article. It requires no special factual knowledge to avoid unnecessary name promotion, such as the Amiga vs. PC rant in your article. You could easily improve many of your articles by avoiding that. In short: avoid rhetoric, rants, politics and advertisements.
I found Blachford’s article quite good, and it was clear to me which parts of it were facts and which were speculations/opinions of the author.
On the contrary, I didn’t like Hannibal’s article at all: instead of spending so much time criticizing someone else’s work (in such an arrogant way), he could have written another piece on the Cell processor putting the facts the way he finds correct.
But criticizing is always easier than doing, isn’t it?
Anyway thanks, Mr Blachford.
I am with Blachford on this. Kick Hannibal’s a$$.
I found Mr Blachford’s article fascinating: easy to understand and clearly speculative, in that the only available source is an opaque patent filing. Thanks, Nicholas, for bringing this forward and making it digestible, even tasty! I hope your projections prove accurate, and you are revealed as the guy who broke the story.
OTOH, Hannibal’s front-page, full-on attack was completely unnecessary and only served to reduce Hannibal’s own credibility. Those of us who actually read the article are aware that
a) the actual product specs have not been released, never mind tested, and the article therefore contained a large measure of speculation.
b) NB is neither an engineer nor a professional journalist, but is bringing this info to us as a well-rounded, well-grounded enthusiast. I, for one, can accept that at face value, and can appreciate the perspective.
I don’t just swallow it whole, Hannibal; I consider the sources. So you’re not leaping to my defence, just being a flaming ass****! In the future, if you disagree with the conclusions of an article, or suspect its sources, try explaining why in a clear and civil manner. Otherwise you’re just being a gigantic troll. Back to Overreactors Anonymous for you!
>>It WON’T help out your standard applications on your PC,
>>but it is certainly suited to a video game console or a
>>media player/encoder. Not understanding this, and not
>>realizing why it won’t be able to “emulate” an x86, is why
>>people wonder about your claims and your understanding of
>>basic computing. Vectorization won’t help Office run any
>>faster, speed up AI in games, or help you process XML any
>>faster.
Use your imagination: if one of these Cells contains an improved PPC core and 8 powerful GPUs, the PPC can do the x86 emulation, and the GPUs can help emulate a virtual VGA plus a virtual sound card…
Which I think would improve performance on an emulator/virtual machine a little bit, don’t you think so?
It can be argued that MS Office doesn’t scale well, but how much of the current CPU is it actually using?
The primary sources of CPU utilization on my box are:
1) Idle
2) Generating electric sheep (the Electric Sheep distributed-computing screensaver)
3) Compiling upgrades
of these… I’d say all of them scale well =)
What does the future hold for us? What future applications could be implemented with the Cell? Speech recognition?
“Use your imagination: if one of these Cells contains an improved PPC core and 8 powerful GPUs, the PPC can do the x86 emulation, and the APUs can help emulate a virtual VGA plus a virtual sound card…
Which I think would improve performance on an emulator/virtual machine a little bit, don’t you think so?”
(…I assume you meant APUs not GPUs…)
If the CELL doesn’t have a top-of-the-line PowerPC, then I’d rather have a Mac with one and use Virtual PC.
If the CELL does have a top-of-the-line PowerPC, and the APUs will be used for 3D graphics, then why did Sony need to get a GPU from nVidia for the PS3?
And if the CELL does have the latest PowerPC, then won’t that be MORE expensive than a standard PowerPC CPU?
And, BTW, multi-processors or multi-cores help out for “compiling” – not vector processors.
The comment about registers and on-chip RAM was interesting. I wonder what the chances are of making stack-based processing units in the future.
If the number of registers is growing and that means that it takes longer to dump them all onto the stack, wouldn’t it be better to develop really fast local RAM and fudge register addressing with stack offsets?
Done correctly it would save all that thrashing around on the stack for each “function”. With a shared pool and the stream based processing described in the original article you could just pass a stack reference when passing control to another vector unit (assuming the other unit was nearby).
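Something like this toy C sketch, which is purely my own speculation (the pool size and frame layout are invented):

    #define POOL_SIZE 4096      /* invented size for the fast local RAM */
    int pool[POOL_SIZE];        /* shared "register" pool */

    /* A routine's "registers" are just offsets from a frame base in the
     * pool, so handing work to another unit means passing the base
     * index instead of spilling and filling a fixed register file. */
    int add_frame(int base)
    {
        int *r = &pool[base];   /* r[0], r[1], r[2] act as registers */
        r[2] = r[0] + r[1];
        return r[2];
    }

    int demo(void)
    {
        int frame = 16;          /* a free slot past the caller's frame */
        pool[frame + 0] = 40;    /* "pass" arguments by writing them    */
        pool[frame + 1] = 2;     /* straight into the callee's frame    */
        return add_frame(frame); /* no spill/fill on the call           */
    }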
Of course I’m just making this up. The whole thing would be a complete shambles to use, because more often than not you don’t actually know how big the stack needs to be. Local RAM is likely to be severely limited in quantity and expensive space-wise. The offset calculations alone would almost certainly destroy any potential gains.
Still, not all new (pronounced old/unused) ideas are necessarily bad. Maybe the Cell will deliver some serious grunt and lead to more people exploring alternative ideas.
Speculate away I say!