Linked by Thom Holwerda on Thu 6th Aug 2009 22:38 UTC
AMD "AMD has announced the release of the first OpenCL SDK for x86 CPUs, and it will enable developers to target x86 processors with the kind of OpenCL code that's normally written for GPUs. In a way, this is a reverse of the normal 'GPGPU' trend, in which programs that run on a CPU are modified to run in whole or in part on a GPU."
Who's on first?
by Tuishimi on Fri 7th Aug 2009 03:40 UTC
Tuishimi
Member since:
2005-07-06

This competitive stuff is fun!

Reply Score: 3

Really AMD?
by tylerdurden on Fri 7th Aug 2009 17:06 UTC
tylerdurden
Member since:
2009-03-17

OpenCL for the CPU earlier than for the GPU?

At least NVIDIA already has a working OpenCL developer release for their GPUs, which are, you know, the intended target of the system. Bah...

Reply Score: 1

RE: Really AMD?
by tyrione on Sat 8th Aug 2009 00:08 UTC in reply to "Really AMD?"
tyrione Member since:
2005-11-21

So does AMD. Just go sign up and download it.

Edited 2009-08-08 00:08 UTC

Reply Score: 2

RE[2]: Really AMD?
by javiercero1 on Sat 8th Aug 2009 15:23 UTC in reply to "RE: Really AMD?"
javiercero1 Member since:
2005-11-10

To be fair, NVIDIA has had its OpenCL SDK available to developers for a while, whereas ATI has only just released theirs. It brings down our Linux development system for Stream really hard, so it seems it is not ready for production yet. This x86 OpenCL release looks like a stopgap so that developers are not kept waiting.

Reply Score: 2

But I thought...
by Tuishimi on Sat 8th Aug 2009 16:15 UTC
Tuishimi
Member since:
2005-07-06

...the purpose of the GPU was to offload some work from the CPU? I understand that this would be good for mini/micro systems, where the GPU is limited. Would this help the two work together in some way, acting like a multi-processor GPU? That would be nice: the CPU could maybe preprocess the commands before passing the results off to the GPU for final calculation and rendering.

Reply Score: 2

RE: But I thought...
by grabberslasher on Sat 8th Aug 2009 19:35 UTC in reply to "But I thought..."
grabberslasher Member since:
2006-02-09

OpenCL is a lot more than just a GPU accelerator for code; it's designed to be a system where the code you write can run on CPUs, GPUs and other accelerator cards without you having to do any other work. It also maximizes performance and threading usage for compute tasks running on the CPU.

Reply Score: 2

Comment by malkia
by malkia on Sun 9th Aug 2009 20:56 UTC
malkia
Member since:
2005-07-17

I've "ported" the NVIDIA nbody sample from their OpenCL package to the AMD CPU-based one.

I can't comment on NVIDIA OpenCL benchmarks as I'm under NDA, but compared to the same CUDA sample it was 1.3 GFLOPS for AMD/OpenCL (CPU) vs. 28 GFLOPS for CUDA (can't comment on NVIDIA OpenCL). I need to test again, though.

MacBook Pro (MacBookPro3,1)
Intel Core 2 Duo 2.6 GHz, 1 processor, 2 cores, 4 MB L2 cache, 4 GB memory, bus speed 800 MHz
GeForce 8600M GT (PCI-Express x16 width) 256 MB

Windows XP 32-bit Service Pack 3 under Boot Camp, using the 190.38 NVIDIA drivers (slightly modified to install on the 8600M - thanks to laptop2go)
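
For a sense of scale, the two throughput figures quoted above can be turned into a ratio. The snippet below is only a rough sketch of that arithmetic. The `nbody_gflops` helper and its default of 20 floating-point operations per body-body interaction are assumptions made for illustration (a counting convention often used with n-body samples); nothing in this thread says how the sample actually counts its operations.

```python
def nbody_gflops(num_bodies, steps, seconds, flops_per_interaction=20):
    """Estimate GFLOPS for an all-pairs n-body run.

    flops_per_interaction=20 is an assumed convention, not a measured value.
    """
    interactions = num_bodies * num_bodies * steps
    return interactions * flops_per_interaction / seconds / 1e9


# Figures quoted in the comment above.
cpu_gflops = 1.3   # AMD OpenCL on the Core 2 Duo
gpu_gflops = 28.0  # CUDA on the GeForce 8600M GT

speedup = gpu_gflops / cpu_gflops
print(f"{speedup:.1f}x")  # prints "21.5x"
```

So on this one (admittedly untested-twice) run, the GPU path comes out a bit over twenty times faster than the CPU path.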

That being said, CPU-based OpenCL is exactly what we need. There are lots of servers where no graphics card is present. Also, if you have Remote Desktop'd into such a machine, the graphics driver is replaced and you can't use CUDA (won't comment on OpenCL). VNC is too slow to be a solution (maybe only HP RGS would do). And OpenCL would be there for the PS3 SPEs...

The beauty of it is that it offers a more restricted C-based language in which you can still program normally (not in assembler), yet the code would still run efficiently. From that perspective, you can forget all your worries about using C++ as the dominant performance language (through Boost and other template libraries) and use your favourite high-level language (JavaScript, C#, Java, Lisp, Ruby, Python, Perl, etc.), as long as it has some way of pinning foreign array data (e.g. so that the garbage collector does not move it) and accessing it later.

Put the management decisions about what work needs to be done in the high-level language (it would be easier to organize such tasks there), and then write the low-level workers directly in OpenCL. Most likely all OpenCL implementations would always have the compiler loaded, so you could even change kernels on the fly instead of recompiling, much like GLSL in OpenGL.
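
To make the split concrete, here is a minimal sketch in Python of the "manager in a high-level language, workers as kernels" idea. It does not call any real OpenCL binding: `launch`, `saxpy_work_item` and `run_saxpy` are hypothetical names invented for this example, and the OpenCL C source is held only as a string to show what the worker might look like. The serial loop merely imitates how a runtime hands each work-item its own global id.

```python
from array import array

# What the worker might look like as an OpenCL C kernel. It is kept here only
# as text; nothing in this sketch compiles or runs it.
SAXPY_SOURCE = """
__kernel void saxpy(const float a,
                    __global const float *x,
                    __global const float *y,
                    __global float *out)
{
    size_t gid = get_global_id(0);
    out[gid] = a * x[gid] + y[gid];
}
"""


def saxpy_work_item(gid, a, x, y, out):
    """Python stand-in for one work-item of the kernel above."""
    out[gid] = a * x[gid] + y[gid]


def launch(work_item, global_size, *args):
    """Hypothetical launcher: calls the worker once per global id, serially.

    A real runtime would spread these calls across CPU cores or GPU lanes.
    """
    for gid in range(global_size):
        work_item(gid, *args)


def run_saxpy(a, xs, ys):
    """Host-side management: allocate flat float buffers, launch, collect."""
    if len(xs) != len(ys):
        raise ValueError("input arrays must have the same length")
    x = array("f", xs)                    # contiguous 32-bit float buffers,
    y = array("f", ys)                    # the kind of data a kernel expects
    out = array("f", [0.0] * len(x))
    launch(saxpy_work_item, len(x), a, x, y, out)
    return list(out)


print(run_saxpy(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0]))  # prints [12.0, 24.0, 36.0]
```

Because the worker is just a value passed to `launch`, swapping in a different one at run time needs no rebuild of the host program, which is the same property the runtime-compiled kernel source would give you.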

It would take quite some time to catch on, but I think it might hit the sweet spot.

Reply Score: 1

RE: Comment by malkia
by tyrione on Mon 10th Aug 2009 07:02 UTC in reply to "Comment by malkia"
tyrione Member since:
2005-11-21

You might want to retest that code, because I'd imagine the GFLOPS between your MacBook Pro and the 8600GT will differ wildly as well.

More to the point, you didn't actually mention the ATi card you tested against.

Reply Score: 2

RE[2]: Comment by malkia
by MamiyaOtaru on Mon 10th Aug 2009 07:40 UTC in reply to "RE: Comment by malkia"
MamiyaOtaru Member since:
2005-11-11

More to the point, you didn't actually mention the ATi card you tested against.

He didn't test against an ATI card. "to the AMD cpu based one." and "for AMD/OpenCL (CPU)" should be hints.

Reply Score: 2

RE[2]: Comment by malkia
by malkia on Tue 11th Aug 2009 10:22 UTC in reply to "RE: Comment by malkia"
malkia Member since:
2005-07-17

I said 8600M GT, not 8600GT

Reply Score: 1