Linked by Amjith Ramanujam on Wed 19th Nov 2008 22:07 UTC, submitted by caffeine deprived
Hardware, Embedded Systems

Nvidia and partners are offering new "personal supercomputers" for under $10,000. Nvidia, working with several partners, has developed the Tesla Personal Supercomputer, powered by a graphics processing unit based on Nvidia's CUDA parallel computing architecture. Computers using the Tesla C1060 GPU will have 250 times the processing power of a typical PC workstation, enabling researchers to run complicated simulations, experiments, and number crunching without sharing a supercomputing cluster.
Persoanl?
by tyrione on Wed 19th Nov 2008 22:27 UTC
tyrione
Member since:
2005-11-21

Here I was hoping there was some bizarre explanation for the name.

Reply Score: 1

RE: Persoanl?
by amjith on Wed 19th Nov 2008 22:39 UTC in reply to "Persoanl?"
amjith Member since:
2005-07-08

Here I was hoping there was some bizarre explanation for the name.

Fixed the title. Thanks. ;)

Reply Score: 1

Specifics?
by Vanders on Wed 19th Nov 2008 22:43 UTC
Vanders
Member since:
2005-07-06

"Computers using the Tesla C1060 GPU processor will have 250 times the processing power of a typical PC workstation"


For what operations? How big and how fast is the on-card memory on a C1060? What programming models does the C1060 support?

While I have no doubt it'll do vector math much faster than a general purpose CPU, it won't help much if you're processing a large data set, as the PCIe bus will become the (very small) bottleneck.

Edited 2008-11-19 22:44 UTC

Reply Score: 3

RE: Specifics?
by CodeMonkey on Wed 19th Nov 2008 23:20 UTC in reply to "Specifics?"
CodeMonkey Member since:
2005-09-22

For what operations?

GPUs really shine on huge SIMD problems, where you have a very large dataset and need to perform the same operation on each element. Examples would be simulations, visualization, medical imaging, etc.

it won't help much if you're processing a large data set, as the PCIe bus will become the (very small) bottleneck.

While the PCIe bus is usually the limiting factor, it can be dealt with: transfer a very large chunk of data at once (hundreds of megabytes to several gigabytes), perform the computation on the GPU, transfer the results back, rinse, repeat. Even with the bandwidth limitations, the computational gains are so great that the end result is usually orders of magnitude faster.
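
In CUDA terms, that pattern is just a pair of bulk copies around a kernel launch. A minimal sketch (the kernel name, sizes, and launch configuration here are made up for illustration):

float *d_data;
size_t bytes = n * sizeof(float);
cudaMalloc((void **)&d_data, bytes);                          // allocate on-card memory
cudaMemcpy(d_data, h_data, bytes, cudaMemcpyHostToDevice);    // one big transfer in
process_batch<<<num_blocks, threads_per_block>>>(d_data, n);  // compute on the GPU
cudaMemcpy(h_data, d_data, bytes, cudaMemcpyDeviceToHost);    // one big transfer out
cudaFree(d_data);

The larger each transfer, the better the per-transfer setup cost of the bus is amortized.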

How big and how fast is the on-card memory on a C1060?

4GB, 512-bit GDDR3, 800MHz, 102 GB/sec.
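
For what it's worth, the bandwidth figure follows from the bus width: 512 bits = 64 bytes per transfer, times 800MHz, times 2 transfers per clock for DDR, gives 64 x 800M x 2, roughly 102.4 GB/sec.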

What programming models does the C1060 support?

Since at its heart it's just a GPU, the programming model is shader based; GLSL or HLSL (the OpenGL and DirectX shading languages) could both be used. However, Nvidia's CUDA toolkit is also available (and is the preferred method), which is essentially an extension to C designed with a kernel-style processing model in mind (GPU kernel, not OS kernel).
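
To give a flavor of the C extensions, here's a minimal (purely illustrative) CUDA kernel that scales an array, with one GPU thread per element:

__global__ void scale(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n)                                      // guard the tail
        data[i] *= factor;
}

// Host side: launch enough 256-thread blocks to cover n elements.
scale<<<(n + 255) / 256, 256>>>(d_data, 2.0f, n);

The <<<...>>> launch syntax and the __global__ qualifier are the CUDA additions; the kernel body is plain C.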

Edited 2008-11-19 23:25 UTC

Reply Score: 5

RE[2]: Specifics?
by Vanders on Thu 20th Nov 2008 07:52 UTC in reply to "RE: Specifics?"
Vanders Member since:
2005-07-06

"For what operations?

GPU units really shine in huge SIMD problems where you have a very large dataset and need to perform the same operation on each element. Examples would be simulations, visualization, medical imaging, etc.
" [/q]

I know, but I'd be interested in seeing benchmarks of high-level operations (i.e. how fast can it reduce an n x n matrix compared to a CPU?)

"it won't help much if you're processing a large data set as the PCIe bus will become the (very small) bottle-neck.

While the PCIe bus is usually the limiting factor, it can be dealt with. Usually by transferring very large chunks of data over at once (hundreds of megabytes to several gigabytes), performing the computation on the GPU, and tranfering the results back
" [/q]

Yes, that's why I was interested in how much on-board memory it has.

"What programming models does the C1060 support?

Since at it's heart it's just a GPU, the programming model is shader based. GLSL or HSL could both be used (the OpenGL and DirectX shading languages). However, NVidia's CUDA toolkit is also available
" [/q]

Ah, so you can't take your existing Fortran and recompile chunks of it for the C1060?

Reply Score: 2

RE[3]: Specifics?
by CodeMonkey on Thu 20th Nov 2008 14:25 UTC in reply to "RE[2]: Specifics?"
CodeMonkey Member since:
2005-09-22

I know, but I'd be interested in seeing benchmarks of high-level operations (i.e. how fast can it reduce an n x n matrix compared to a CPU?)

Interestingly, the CUDA SDK comes with a BLAS library (CUBLAS) implemented on the GPU, and an FFT library (CUFFT) as well.
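
As a rough sketch of what a GPU-side BLAS call looks like with the CUBLAS interface of that era (check your SDK version for exact signatures), here's a single-precision matrix multiply C = A * B for n x n matrices:

cublasInit();
float *d_A, *d_B, *d_C;
cublasAlloc(n * n, sizeof(float), (void **)&d_A);
cublasAlloc(n * n, sizeof(float), (void **)&d_B);
cublasAlloc(n * n, sizeof(float), (void **)&d_C);
cublasSetMatrix(n, n, sizeof(float), A, n, d_A, n);   // host -> device
cublasSetMatrix(n, n, sizeof(float), B, n, d_B, n);
cublasSgemm('N', 'N', n, n, n, 1.0f, d_A, n, d_B, n, 0.0f, d_C, n);
cublasGetMatrix(n, n, sizeof(float), d_C, n, C, n);   // device -> host
cublasShutdown();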

Ah, so you can't take your existing Fortran and recompile chunks of it for the C1060?

It's not as simple as a recompile, no. And really, you wouldn't want it to be. When using a GPU to accelerate processing, it's not just another processor: it has a very different memory model and a very different processing model. In order to really take advantage of and best leverage the GPU architecture, the code needs to be structured with that in mind.

Say, for instance, you have 500 matrices of size 500x500 and you need to use them to solve some A*x=b equations. On the CPU, you would loop through all 500, solving one at a time.
While this will work on the GPU, it's not an efficient way to use it. On the GPU, you would copy all 500 to the GPU memory, run a single solver on all 500 simultaneously, and then copy the results back.
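
Schematically (the solve_batch kernel here is hypothetical, just to show the shape of the batched approach):

// One transfer in: all 500 matrices (about 500MB of floats) plus right-hand sides.
cudaMemcpy(d_A, h_A, 500 * 500 * 500 * sizeof(float), cudaMemcpyHostToDevice);
cudaMemcpy(d_b, h_b, 500 * 500 * sizeof(float), cudaMemcpyHostToDevice);

// One launch: each thread block solves one A*x=b system independently.
solve_batch<<<500, 256>>>(d_A, d_b, d_x);

// One transfer out: all 500 solution vectors.
cudaMemcpy(h_x, d_x, 500 * 500 * sizeof(float), cudaMemcpyDeviceToHost);

Note that the whole batch fits comfortably in the C1060's 4GB of on-card memory, which is exactly why the card ships with that much.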

Specialized hardware generally requires specialized programming to fully exploit it.

Reply Score: 2

RE[4]: Specifics?
by Vanders on Thu 20th Nov 2008 16:57 UTC in reply to "RE[3]: Specifics?"
Vanders Member since:
2005-07-06

"Ah, so you can't take your existing Fortran and recompile chunks of it for the C1060?

It's not as simple as a recompile, no.
...
Specialized hardware generally requires specialized programming to fully exploit it.
"

Which, in a roundabout way, brings me to the point: while these cards look very nice and clearly have a role to play in specialised applications such as real-time medical imaging, they are not a "drop-in" replacement for a proper cluster. If you write your code to use one of these cards, you will find yourself tied to Nvidia, with perhaps no opportunity to run your code on a faster machine in the future should the need arise.

If you write your code using, say, MPI and Fortran, you can pretty much expect it to still run five or ten years from now, even if it's running on a totally different cluster.

Reply Score: 2

RE[5]: Specifics?
by javiercero1 on Sat 22nd Nov 2008 20:45 UTC in reply to "RE[4]: Specifics?"
javiercero1 Member since:
2005-11-10

CUDA is a programming model, mostly based on super-threading, data streaming and data parallelism.

It is being ported to the CPU and, later (via Apple's OpenCL, which is mostly CUDA-based), to ATI's GPUs (although the ATI parts have poorer programmability).

Basically, once you map your algorithm to CUDA, you should be able to run it on either the CPU or GPU in the near future.

Then again, if you have already developed your code on OpenMP and it works for you... as they say, if it ain't broke...

However, where the CUDA boards shine is in their price per flop and power per flop, so they are very, very, very attractive.

Reply Score: 2

Crysis
by JonGretar on Wed 19th Nov 2008 23:07 UTC
JonGretar
Member since:
2008-10-30

Finally a computer that might be able to play Crysis on full settings.

Reply Score: 9

Comment by talaf
by talaf on Wed 19th Nov 2008 23:57 UTC
talaf
Member since:
2008-11-19

Cuda is much like C actually, and very easy to handle imo. The truly hard part is designing your algorithms to use the heavily distributed computational power, and memory access/control can be tricky (but that's true of anything, heh ^_^). You also cannot make recursive calls on the device, which effectively rules out recursive programming and a handful of usual algorithms.

But honestly, the benefits are so great on some applications it's almost crazy. Check it out: almost everybody has a Geforce 8+ somewhere, and Cuda is available on both Linux and Windows ;) Matlab has plugins for it too, iirc, and it's so easy to set up that one shouldn't deprive oneself of such resources ;)

Reply Score: 3

Nice, but...
by mmu_man on Thu 20th Nov 2008 00:13 UTC
mmu_man
Member since:
2006-09-30

Does it have a graphics card?

Reply Score: 3

Is the devkit Free (as in speech) ?
by mmu_man on Thu 20th Nov 2008 00:14 UTC
mmu_man
Member since:
2006-09-30

Like if I want to use it with another OS (say, Haiku :p)...

Reply Score: 3

CodeMonkey Member since:
2005-09-22

Free to download, yes. Open source, no. The hardware supports a certain version of the CUDA API, so I'm assuming it would also require the Nvidia drivers. This would in turn limit support to Windows XP / Vista x86/x64, Linux x86/x64, Solaris x86/x64, OSX x86/x64, and FreeBSD x86. However, AFAIK, only the Windows, Linux, and OSX drivers support the CUDA API.

Reply Score: 1

did someone say...
by pixel8r on Thu 20th Nov 2008 03:29 UTC
pixel8r
Member since:
2007-08-11

"Vista Capable"?

Reply Score: 1

The Academy beware?
by orfanum on Thu 20th Nov 2008 07:01 UTC
orfanum
Member since:
2006-06-02

Combine this news with the following:

http://news.bbc.co.uk/1/hi/technology/6425975.stm

and you may just have the beginnings of a much more devolved understanding of the Academy. The increase in open access journals and datasets, and the fact that, in the UK at least, on average 80% of PhDs do not make it into formal university research jobs, could provide the basis for an accelerated evolution of tertiary education, in which the dichotomy between Town and Gown ultimately dissolves.

I hope so - those pundits who bewail the furtherance of knowledge by those beyond a self-maintaining elite (e.g. 'The Cult of the Amateur' by Andrew Keen) in my very humble opinion need to be put firmly in their place. The printing press did not mean literacy for the few only, and neither should that ever-unfolding digital broadsheet, the Internet.

Reply Score: 2

Personal Supercomputer?
by Silent_Seer on Fri 21st Nov 2008 01:34 UTC
Silent_Seer
Member since:
2007-04-06

Well, how about a 72-core MIPS workstation:

http://sicortex.com/products/deskside_development_system

Now that is something close to a supercomputer on your desk.

With that said, the above posts are correct: Nvidia's stream processors do outperform traditional CPUs on certain data-parallel tasks. The key to achieving that is coding specifically for them.

Reply Score: 2

So...
by astroraptor on Sat 22nd Nov 2008 02:55 UTC
astroraptor
Member since:
2005-07-22

Crysis should finally be able to run at 60fps?

Reply Score: 1

Want an open-ended API
by BrendaEM on Sat 22nd Nov 2008 18:24 UTC
BrendaEM
Member since:
2005-11-23

Once again, the consumer is the big loser: in the same way the customer had to choose between SLI and Crossfire when buying a motherboard, the user must choose between programming APIs for these cards.

I believe that Apple may be making an API passthrough, but it would still be sad if the only common GPU API bridging Nvidia and ATI is itself proprietary.

Open source people: motivate and create an open standard for number-crunching on a video card, or be left out in the cold, having to make difficult decisions.

We are seeing a major revolution in computing, what I feel is the biggest change since the CD-ROM.

A suggestion for Nvidia: Get 45nm parts, and pull the trigger on Intel.

A warning to Intel: be very afraid, and get that 6-Banger out.

Reply Score: 2

RE: Want an open-ended API
by javiercero1 on Sat 22nd Nov 2008 20:47 UTC in reply to "Want an open-ended API"
javiercero1 Member since:
2005-11-10

Once again, we have someone posting from their parents' basement giving directions to a whole industry on what it has to do.

Armchair quarterbacking is sooooo much easier than actually doing.

How is the capacity to utilize a few gigaflops currently present in a lot of desktops a "bad thing for customers"?

Reply Score: 1