AMD Does Reverse GPGPU, Announces OpenCL SDK for x86

Thom Holwerda 2009-08-06 AMD 10 Comments

“AMD has announced the release of the first OpenCL SDK for x86 CPUs, and it will enable developers to target x86 processors with the kind of OpenCL code that’s normally written for GPUs. In a way, this is a reverse of the normal ‘GPGPU’ trend, in which programs that run on a CPU are modified to run in whole or in part on a GPU.”


10 Comments

2009-08-07 3:40 am

Tuishimi
This competitive stuff is fun!
2009-08-07 5:06 pm

tylerdurden
OpenCL for the CPU earlier than for the GPU?

At least NVIDIA already has a working OpenCL developer release for their GPUs, you know, the intended target audience of the system. Bah…

2009-08-08 12:08 am

tyrione
OpenCL for the CPU earlier than for the GPU?

At least NVIDIA already has a working OpenCL developer release for their GPUs, you know, the intended target audience of the system. Bah…

So does AMD. Just go sign up and download it.

Edited 2009-08-08 00:08 UTC

2009-08-08 3:23 pm

Xanady Asem
… to be fair, NVIDIA has had the OpenCL SDK available for developers for a while, whereas ATI just released theirs (which brings down our Linux devel system for Stream really hard, so it seems they are not ready for production yet… so this x86 OpenCL release seems to be a stopgap to not keep developers waiting).

2009-08-08 4:15 pm

Tuishimi
…the purpose of the GPU was to offload some work from the CPU? I understand that this would be good for mini/micro systems, where the GPU is limited. Would this help them work together in some way, acting like a multi-processor GPU? That would be nice: maybe preprocess the commands before popping the results off to the GPU for final calculation and rendering.

2009-08-08 7:35 pm

grabberslasher
OpenCL is a lot more than just a GPU accelerator for code; it’s designed to be a system where the code you write can run on CPUs, GPUs and other accelerator cards without you having to do any other work. It also maximizes performance and threading usage for compute tasks running on the CPU.
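
To make the "write once, run on different devices" idea concrete, here is a rough sketch in plain Python. It is not the OpenCL API: the `saxpy` kernel, the `run_serial` and `run_threaded` launchers, and the sample data are all made up for illustration. The point it shows is only the shape of the model, where one per-work-item function is handed unchanged to different back ends.

```python
from concurrent.futures import ThreadPoolExecutor


def saxpy(gid, a, x, y, out):
    """Hypothetical kernel: computes one output element for work-item `gid`."""
    out[gid] = a * x[gid] + y[gid]


def run_serial(kernel, global_size, *args):
    """Stand-in for a single-core device: run work-items one after another."""
    for gid in range(global_size):
        kernel(gid, *args)


def run_threaded(kernel, global_size, *args, workers=4):
    """Stand-in for a multi-core device: spread work-items over a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() drains the iterator so any worker exception surfaces here.
        list(pool.map(lambda gid: kernel(gid, *args), range(global_size)))


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
out_serial = [0.0] * len(x)
out_threaded = [0.0] * len(x)

# The same kernel runs on both "devices" without modification.
run_serial(saxpy, len(x), 2.0, x, y, out_serial)
run_threaded(saxpy, len(x), 2.0, x, y, out_threaded)
```

Each work-item writes only its own slot of `out`, which is what lets the launcher pick any execution order or degree of parallelism.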

2009-08-09 8:56 pm

malkia
I’ve “ported” the nvidia nbody sample from their OpenCL package, to the AMD cpu based one.

I can’t comment on NVIDIA OpenCL benchmarks as I’m under NDA, but compared to the same CUDA sample it was 1.3 GFLOPS for AMD/OpenCL (CPU) vs. 28 GFLOPS for CUDA (can’t comment on NVIDIA OpenCL). Though I need to test again.
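
For readers wondering how figures like these are typically derived, here is a small sketch of the arithmetic. It assumes the common convention of counting 20 floating-point operations per body-to-body interaction in an all-pairs n-body step. That constant, the function name, and the body count and timing in the example call are assumptions for illustration, not values from this post. Only the 1.3 and 28 GFLOPS figures come from the numbers above, and their ratio works out to roughly 21.5x.

```python
FLOPS_PER_INTERACTION = 20  # assumed counting convention, not stated in the post


def nbody_gflops(num_bodies, iterations, seconds,
                 flops_per_interaction=FLOPS_PER_INTERACTION):
    """Estimate GFLOPS for an all-pairs n-body run (n * n interactions per step)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    interactions = num_bodies * num_bodies * iterations
    return interactions * flops_per_interaction / seconds / 1e9


# Figures quoted in the comment above.
cpu_gflops = 1.3   # AMD OpenCL on the CPU
gpu_gflops = 28.0  # CUDA on the GPU
speedup = gpu_gflops / cpu_gflops

if __name__ == "__main__":
    # Hypothetical run: 8192 bodies, 10 steps, 1 second of wall time.
    print(f"hypothetical run: {nbody_gflops(8192, 10, 1.0):.2f} GFLOPS")
    print(f"CUDA vs. CPU OpenCL: about {speedup:.1f}x")
```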

MacBook Pro (MacBookPro3,1)

Intel Core 2 Duo 2.6 GHz, 1 processor, 2 cores, 4 MB L2 cache, 4 GB memory, bus speed 800 MHz

GeForce 8600M GT (PCI-Express x16 width) 256MB

Boot Camp’d Windows XP 32-bit Service Pack 3 using 190.38 NVIDIA drivers (modified a bit to install on the 8600M – thanks to laptop2go)

That said, CPU-based OpenCL is exactly what we need. There are lots of servers where graphics cards are not present; also, if you have Remote-Desktop’d into such a machine, the graphics driver is replaced and you can’t use CUDA (won’t comment on OpenCL). VNC is too slow to be a solution (maybe only HP RGS). And OpenCL would be there for the PS3 SPEs…

The beauty of it is that it offers a more restricted “C”-based language, where you can still program normally (not in assembler), yet the code would still run efficiently. From that perspective, you can forget all your worries about using C++ as the dominant performance language (through Boost and other template libraries), and use your favourite high-level language (JavaScript, C#, Java, Lisp, Ruby, Python, Perl, etc.) as long as it has some way of freezing foreign array data (e.g. the garbage collector should not move it) and accessing it later.
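
As one illustration of the "freezing foreign array data" requirement, here is a minimal Python sketch using only the standard library. One caveat: CPython’s collector does not move objects, so the hazard shown here is reallocation on resize rather than a moving GC, and the `ctypes` access merely stands in for whatever native code (an OpenCL runtime, say) would read the buffer. The variable names are illustrative.

```python
import ctypes
from array import array

data = array("f", [1.0, 2.0, 3.0, 4.0])  # contiguous C floats

# Exporting a memoryview pins the storage: the array refuses to resize
# (and therefore to reallocate) while the view is alive.
view = memoryview(data)
address, length = data.buffer_info()

try:
    data.append(5.0)  # would need to reallocate the buffer
    resized = True
except BufferError:
    resized = False

# Foreign-style access through the raw address, as native code would do.
raw = (ctypes.c_float * length).from_address(address)
raw[0] = 10.0  # the write is visible through the original array

view.release()    # unpin the storage
data.append(5.0)  # resizing is allowed again
```

Languages with a moving collector offer their own equivalents (pinned handles, direct buffers and so on). The idea is the same: the address handed to native code must stay valid until that code is done with it.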

Put the management decisions about what work needs to be done in the high-level language (it would be easier to organize such tasks there), and then write the low-level workers directly in OpenCL. Most likely all OpenCL implementations would always have the compiler loaded, so you can even change the workers on the fly instead of recompiling, much like GLSL in OpenGL.
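
A rough sketch of that split, again in plain Python rather than the real OpenCL API: the host keeps worker source as a string, compiles it at run time, and can swap in a new version without rebuilding the host program. Here `build_kernel`, `launch`, and both kernel sources are hypothetical names invented for this example. In actual OpenCL the source would be OpenCL C and the compile step would go through the runtime’s program-build calls.

```python
KERNEL_V1 = """
def scale(gid, src, dst):
    dst[gid] = src[gid] * 2.0
"""

KERNEL_V2 = """
def scale(gid, src, dst):
    dst[gid] = src[gid] * 2.0 + 1.0
"""


def build_kernel(source, name):
    """Compile worker source at run time and return the named function."""
    namespace = {}
    exec(compile(source, "<kernel>", "exec"), namespace)
    return namespace[name]


def launch(kernel, global_size, *args):
    """Host-side scheduling: call the worker once per work-item index."""
    for gid in range(global_size):
        kernel(gid, *args)


src = [1.0, 2.0, 3.0]
dst = [0.0] * len(src)

# The host decides what to run; the worker does the arithmetic.
launch(build_kernel(KERNEL_V1, "scale"), len(src), src, dst)
first = list(dst)

# Swap the worker on the fly: only the source string changes.
launch(build_kernel(KERNEL_V2, "scale"), len(src), src, dst)
second = list(dst)
```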

It would take quite some time to catch on, but I think it might hit the sweet spot.

2009-08-10 7:02 am

tyrione
You might want to retest that code because I’d imagine the GFLOPS between your Macbook Pro and the 8600GT will differ wildly as well.

More to the point, you didn’t actually mention the ATi card you tested against.

2009-08-10 7:40 am

MamiyaOtaru
More to the point, you didn’t actually mention the ATi card you tested against.

He didn’t test against an ATI card. “to the AMD cpu based one.” and “for AMD/OpenCL (CPU)” should be hints.
2009-08-11 10:22 am

malkia
I said 8600M GT, not 8600GT