Linked by Thom Holwerda on Tue 1st Jun 2010 21:33 UTC
Intel "In an announcement at the International Supercomputing Conference, Intel provided further details on the many-core chip that it hinted at earlier in the month. The first product based on the new design will be codenamed Knight's Corner, and will debut at 22nm with around 50 x86 cores on the same die. Developer kits, which include a prototype of the design called Knight's Ferry, have been shipping to select partners for a while now, and will ship more broadly in the second half of the year. When Intel moves to 22nm in 2011, Knight's Corner will make its official debut."
Comment by cb88
by cb88 on Wed 2nd Jun 2010 00:14 UTC
cb88
Member since:
2009-04-23

Would it have been any better had they named it Knight's Fury? No, of course not, but it really would have been a better name. It would have made more sense with Knight's Ferry being the developer version.

All in all, I'm not impressed with Intel so far; their designs are little more than *take small cores and network them together*, which isn't the best approach for supercomputing IMO. Cloud computing, perhaps, but not supercomputing.

Reply Score: 1

RE: Comment by cb88
by kaiwai on Wed 2nd Jun 2010 00:44 UTC in reply to "Comment by cb88"
kaiwai Member since:
2005-07-06

Supercomputers don't actually require complicated CPU designs given that the instructions are fixed; all the CPU is required to do is suck in the information, crunch it, and spit it out the other side. When you use a supercomputer for number crunching you're pushing in a sequence of equations and pumping out a result at the other end, so you can get away with stripping out branch prediction and so forth, because all you're really interested in is raw power.

Have a simple CPU design, chain together many of those cores, parallelise the code to buggery, add a heck of a load of bandwidth and a clock speed going gangbusters, and you'll be all set to party.
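As a rough illustration of the kind of code being described here, a minimal sketch in C with OpenMP of an embarrassingly parallel crunching loop (a hypothetical example, not anything from Intel's toolchain):

```c
/* Minimal sketch: an embarrassingly parallel number-crunching loop of the
 * kind that maps well onto many simple cores.
 * Build with e.g. gcc -O2 -fopenmp crunch.c -lm */
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

#define N (1 << 24)

int main(void)
{
    double *data = malloc(N * sizeof *data);
    if (!data)
        return 1;

    /* Each core gets a slice of the array; iterations are independent,
     * so the loop scales with core count until bandwidth becomes the limit. */
    #pragma omp parallel for
    for (long i = 0; i < N; i++)
        data[i] = sqrt((double)i) * sin(0.001 * i);

    printf("sample result: %f\n", data[N / 2]);
    free(data);
    return 0;
}
```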

Edited 2010-06-02 00:45 UTC

Reply Score: 2

RE[2]: Comment by cb88
by drahca on Wed 2nd Jun 2010 11:36 UTC in reply to "RE: Comment by cb88"
drahca Member since:
2006-02-23

Supercomputers don't actually require complicated CPU designs given that the instructions are fixed; all the CPU is required to do is suck in the information, crunch it, and spit it out the other side.


Instructions are always fixed; that's called an ISA. And what you are describing seems to be some kind of stream processor. Most supercomputers need to be good at several tasks, with algorithms that can be parallelised successfully to varying degrees. Of course, even if you parallelise it to bits, there is always Amdahl's law.
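For reference, Amdahl's law bounds the speedup from N cores when only a fraction p of the work parallelises; the numbers below are purely illustrative:

```latex
% Amdahl's law: best-case speedup on N cores with parallel fraction p
S(N) = \frac{1}{(1 - p) + \frac{p}{N}}
% Illustrative example: with p = 0.95 and the roughly 50 cores mentioned
% in the article, S(50) = 1 / (0.05 + 0.95/50) \approx 14.5,
% far short of a 50x speedup.
```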

When you use a supercomputer for number crunching you're pushing in a sequence of equations and pumping out a result at the other end, so you can get away with stripping out branch prediction and so forth, because all you're really interested in is raw power.


You can only get away with stripping out branch prediction (and, I presume, other niceties such as out-of-order execution) if you have a well-behaved algorithm, which you almost never have in reality. Of course, some (parts of) algorithms run well on GPUs, which is what you seem to be describing here.
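To make the "well behaved" point concrete, here is a hypothetical C example showing a data-dependent branch and a branchless rewrite of the same reduction; simple in-order cores and GPUs strongly prefer the latter:

```c
/* Illustrative only: neither function is from Intel's design docs. */
#include <stdio.h>

/* branchy: the direction depends on the data, so a core without
 * branch prediction stalls on the unpredictable condition */
double sum_positive_branchy(const double *x, long n)
{
    double s = 0.0;
    for (long i = 0; i < n; i++)
        if (x[i] > 0.0)
            s += x[i];
    return s;
}

/* branchless: the same result as a select, which compilers can usually
 * turn into a conditional move or vector blend on simple wide cores */
double sum_positive_branchless(const double *x, long n)
{
    double s = 0.0;
    for (long i = 0; i < n; i++)
        s += (x[i] > 0.0) ? x[i] : 0.0;
    return s;
}

int main(void)
{
    double x[] = { 1.5, -2.0, 3.0, -0.5, 4.0 };
    long n = sizeof x / sizeof x[0];
    printf("%f %f\n", sum_positive_branchy(x, n),
                      sum_positive_branchless(x, n));
    return 0;
}
```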

Have a simple CPU design, chain together many of those cores, parallelise the code to buggery, add a heck of a load of bandwidth and a clock speed going gangbusters, and you'll be all set to party.


Again, this only works for some algorithms. Communication between processors does not scale that well for most workloads, so you'd rather have fewer high-performance cores than more low-performance cores. Scaling is not very important if your total performance still sucks.

If you don't believe me, check out the supercomputer Top 500: almost all systems use Xeons or Opterons.

What Intel is building here is interesting. Larrabee was supposed to be a many-core x86 processor with massive vector units; its memory system was cache coherent, built around a massive ring bus. There were serious doubts as to whether it would scale well even for embarrassingly parallel workloads. This MIC might look more like Intel's other project, in which there was no cache coherency but all cores were connected by a switched network and you had to use explicit message passing between threads in software, almost like a cluster on a chip.
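A rough sketch of what explicit message passing between cores looks like, using MPI in C as a stand-in; the actual on-chip programming model isn't public, so this is only an analogy:

```c
/* Token-ring message passing: no shared memory, no cache coherency,
 * just messages between ranks. Build/run with e.g.
 *   mpicc ring.c -o ring && mpirun -np 4 ./ring */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size, token;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {
        /* rank 0 starts the token and waits for it to come back around */
        token = 42;
        MPI_Send(&token, 1, MPI_INT, 1 % size, 0, MPI_COMM_WORLD);
        MPI_Recv(&token, 1, MPI_INT, size - 1, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("token made it around %d ranks\n", size);
    } else {
        /* every other rank receives from its left neighbour and
         * forwards to its right neighbour */
        MPI_Recv(&token, 1, MPI_INT, rank - 1, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        MPI_Send(&token, 1, MPI_INT, (rank + 1) % size, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}
```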

Reply Score: 2

RE: Comment by cb88
by ssokolow on Wed 2nd Jun 2010 00:44 UTC in reply to "Comment by cb88"
ssokolow Member since:
2010-01-21

I saw an Intel video about these experimental many-core chip designs, and cloud computing was specifically described as the target. Their whole goal is to explore ways to further improve the space- and power-efficiency of cloud computing datacenters (e.g. by having 50 cores that consume as much power as a single high-end CPU and can be throttled back to a tenth of that at off-peak times).

Reply Score: 1

Supercomputing?
by ShadesFox on Wed 2nd Jun 2010 01:12 UTC
ShadesFox
Member since:
2006-10-01

Doubtful it will be useful in supercomputing. The current 6-core chips tended not to have nearly the performance improvement over 4-core chips that was expected, mostly because the problem isn't CPU count, or CPU speed. The problem is the memory: memory bandwidth is the killer, and no one seems to be offering solutions.
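A small illustration of the bandwidth problem: a STREAM-style triad in C with OpenMP does almost no arithmetic per byte moved, so throwing more cores at it stops helping once DRAM is saturated (hypothetical example, not a benchmark of any particular chip):

```c
/* Memory-bound kernel: one multiply-add per ~24 bytes of traffic, so the
 * cores mostly wait on DRAM. Build with e.g. gcc -O2 -fopenmp triad.c */
#include <stdio.h>
#include <stdlib.h>

#define N (1 << 24)   /* ~16M doubles per array, far larger than any cache */

int main(void)
{
    double *a = malloc(N * sizeof *a);
    double *b = malloc(N * sizeof *b);
    double *c = malloc(N * sizeof *c);
    if (!a || !b || !c)
        return 1;

    for (long i = 0; i < N; i++) { b[i] = 1.0; c[i] = 2.0; }

    /* adding cores here stops helping once the memory bus is saturated */
    #pragma omp parallel for
    for (long i = 0; i < N; i++)
        a[i] = b[i] + 3.0 * c[i];

    printf("%f\n", a[N - 1]);
    free(a); free(b); free(c);
    return 0;
}
```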

Reply Score: 2

RE: Supercomputing?
by cb88 on Wed 2nd Jun 2010 03:32 UTC in reply to "Supercomputing?"
cb88 Member since:
2009-04-23

Well... Intel has eDRAM, which is more compact, which is also why Intel chips have such huge caches these days...

I would be curious to see what would happen if cores were capped at 4 and the extra die space were thrown at cache and a real integrated GPU design, where the GPU would be treated more like an FPU is, instead of just being a device hanging off of PCI-E.

Reply Score: 1

RE[2]: Supercomputing?
by cerbie on Wed 2nd Jun 2010 04:04 UTC in reply to "RE: Supercomputing?"
cerbie Member since:
2006-01-02

Aren't you confusing them with IBM? I'm pretty sure Intel is just really good at making small cheap SRAM.

On top of that, even eDRAM would leave them needing a royal caravan of RAM slots; it would only make a given amount of cache cheaper. eDRAM is still no performance match for SRAM.

However, even with SRAM caches, workloads that crunch on moderate amounts of data that fit into a shared cache might be able to run very fast without jacking up the RAM bandwidth. If Intel needed to, I'm sure they could do 32+ MB SRAM caches on a die and still make their high margins.
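A toy sketch of that cache-resident case in C; WORKING_SET is a made-up knob for illustration:

```c
/* If WORKING_SET fits in a shared on-die cache (the hypothetical 32+ MB
 * above), the repeat passes never touch DRAM; grow it past the cache and
 * the same loop becomes bandwidth-bound. Illustrative only. */
#include <stdio.h>
#include <stdlib.h>

#define WORKING_SET (16L * 1024 * 1024)   /* bytes; tune against cache size */
#define PASSES 100

int main(void)
{
    long n = WORKING_SET / sizeof(double);
    double *buf = calloc(n, sizeof *buf);
    if (!buf)
        return 1;

    double sum = 0.0;
    for (int p = 0; p < PASSES; p++)
        for (long i = 0; i < n; i++) {
            buf[i] += 1.0;   /* stays cache-resident after the first pass */
            sum += buf[i];
        }

    printf("%f\n", sum);
    free(buf);
    return 0;
}
```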

Edited 2010-06-02 04:09 UTC

Reply Score: 2

RE: Supercomputing?
by Vanders on Wed 2nd Jun 2010 12:45 UTC in reply to "Supercomputing?"
Vanders Member since:
2005-07-06

Doubtful it will be useful in supercomputing.


These are more likely to be used as co-processors rather than as a replacement for the node's primary CPU. It's similar to what nVidia and ClearSpeed already do.

Reply Score: 2

Yay!
by reduz on Wed 2nd Jun 2010 04:57 UTC
reduz
Member since:
2006-02-25

At last! Realtime raytracing!

Reply Score: 2

Comment by Neolander
by Neolander on Wed 2nd Jun 2010 07:41 UTC
Neolander
Member since:
2010-03-08

This "multiple low-powered core" technology is not gonna last on the desktop, the day people realize that only few problems scale well accross multiple cores.

For virtualization-oriented servers, on the other hand, putting that together with NUMA could do wonders. But like other people here, I think that bus bandwidth issues will kill this product.

Edited 2010-06-02 07:42 UTC

Reply Score: 2

RE: Comment by Neolander
by rom508 on Wed 2nd Jun 2010 09:49 UTC in reply to "Comment by Neolander"
rom508 Member since:
2007-04-20

This "multiple low-powered core" technology is not gonna last on the desktop, the day people realize that only few problems scale well accross multiple cores.


You mean only a few software programs scale well across multiple cores. There are many problems that can be decomposed into parallel tasks; you just need to build your software from the ground up to take advantage of a large number of parallel execution units.

There are many things people do on desktop machines that benefit from multicore processors: audio/video encoding, digital photography, and rendering data, be it a complex 3D scene or an office/web document. And many new problems can be created to fill the demand for such hardware.
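As a toy example of that "built from the ground up" idea, a per-pixel photo adjustment in C with OpenMP, where rows are independent and split cleanly across however many cores the chip offers (purely illustrative, not from any real application):

```c
/* Brighten an 8-bit grayscale image in place; each row is independent. */
#include <stdlib.h>

#define WIDTH  4000
#define HEIGHT 3000

void brighten(unsigned char *img, int gain)
{
    #pragma omp parallel for
    for (int y = 0; y < HEIGHT; y++)
        for (int x = 0; x < WIDTH; x++) {
            int v = img[y * WIDTH + x] + gain;
            img[y * WIDTH + x] = (unsigned char)(v > 255 ? 255 : v);
        }
}

int main(void)
{
    unsigned char *img = calloc((size_t)WIDTH * HEIGHT, 1);
    if (!img)
        return 1;
    brighten(img, 40);
    free(img);
    return 0;
}
```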

Reply Score: 1

RE: Comment by Neolander
by reduz on Wed 2nd Jun 2010 16:55 UTC in reply to "Comment by Neolander"
reduz Member since:
2006-02-25

No one says that all existing software or all existing types of software should scale to multiple cores. It's more an issue of existing software taking advantage of parallel processing for different kinds of tasks.

Software like Photoshop, 3ds Max, and even web browsers (scaling JavaScript and the rendering processes) can be modified to take advantage. Audio software can benefit greatly too (run multiple virtual effects/synthesizers, each on a separate core), and of course videogames (physics simulation, rendering, etc.).

So the target is to give more power to existing software, not to ask that it be rewritten...
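A minimal sketch of the audio case in C with pthreads, assuming each effect chain can process its own buffer independently (hypothetical code, not any real plugin API):

```c
/* Each virtual effect processes its own channel on its own thread, so the
 * OS is free to put each worker on a separate core.
 * Build with e.g. gcc -pthread audio.c */
#include <pthread.h>
#include <stdio.h>

#define CHANNELS 4
#define FRAMES   4096

static float buffers[CHANNELS][FRAMES];

static void *run_effect(void *arg)
{
    float *buf = arg;
    /* stand-in for a real effect: simple gain reduction */
    for (int i = 0; i < FRAMES; i++)
        buf[i] *= 0.5f;
    return NULL;
}

int main(void)
{
    pthread_t threads[CHANNELS];

    for (int c = 0; c < CHANNELS; c++)
        pthread_create(&threads[c], NULL, run_effect, buffers[c]);
    for (int c = 0; c < CHANNELS; c++)
        pthread_join(threads[c], NULL);

    printf("processed %d channels of %d frames\n", CHANNELS, FRAMES);
    return 0;
}
```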

Reply Score: 2

Really looking forward to this
by iseyler on Wed 2nd Jun 2010 13:10 UTC
iseyler
Member since:
2008-11-15

Why the odd choice of 50, though?

Can't wait to get our hands on this chip to see how it performs! This is something we want to support in BareMetal OS (http://www.returninfinity.com) for HPC.

Edited 2010-06-02 13:11 UTC

Reply Score: 1