Linked by Thom Holwerda on Wed 8th Feb 2012 23:22 UTC
Hardware, Embedded Systems "Researchers from North Carolina State University have developed a new technique that allows graphics processing units and central processing units on a single chip to collaborate - boosting processor performance by an average of more than 20 percent."
Order by: Score:
20%??
by looncraz on Thu 9th Feb 2012 00:44 UTC
looncraz
Member since:
2005-07-24

For a first step, 20% is a pretty significant jump. AMD has been planning on doing this for some time - I am glad to see that this was AMD's research investment paying off...

However, with the poor performance of Bulldozer, it will take more than a 20% boost to match Intel's current performance... MAYBE some semblance of this tech will be in Piledriver... but I seriously doubt it :-(

Reply Score: 1

RE: 20%??
by galvanash on Thu 9th Feb 2012 06:04 UTC in reply to "20%??"
galvanash Member since:
2006-01-25

MAYBE some semblance of this tech will be in Piledriver... but I seriously doubt it :-(


That would be a pleasant surprise if it happened, but I agree - I wouldn't bet on Piledriver having a shared L3 (or any L3 for that matter).

This makes more sense for Steamroller once they hit 28nm. Hopefully by 2013 they can go back and tweak the layout to make some room for an L3.

Reply Score: 2

RE[2]: 20%??
by phoenix on Thu 9th Feb 2012 16:50 UTC in reply to "RE: 20%??"
phoenix Member since:
2005-07-11

Sounds like AMD's planned HSA (heterogeneous systems architecture) for future APUs, wherein the CPU and GPU both have access to the same caches, RAM, buses, etc.

Anandtech.com has a bunch of slides up from the recent AMD financial analyst day that cover this.

A summary article that lists all the other articles:
http://www.anandtech.com/show/5503/understanding-amds-roadmap-new-d...

Reply Score: 2

RE[3]: 20%??
by zima on Wed 15th Feb 2012 22:12 UTC in reply to "RE[2]: 20%??"
zima Member since:
2005-07-06

HSA (heterogeneous systems architecture)

I can't help but notice it could just as well mean homogeneous systems architecture... ;)

Reply Score: 2

RE: 20%??
by ndrw on Thu 9th Feb 2012 06:52 UTC in reply to "20%??"
ndrw Member since:
2009-06-30

If the performance boost is really only 20%, then in most cases it is simply not worth the optimization effort. Partitioning a computation into two separate programs is not trivial; putting the same effort into other optimization techniques may produce better results.
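A quick back-of-envelope check (plain Python, with illustrative numbers of my own, not from the paper) shows why a 20% overall gain is a high bar: by Amdahl's law, if only part of a program can be offloaded, the offloaded kernel has to be sped up considerably just to reach that figure.

```python
def overall_speedup(offload_fraction, kernel_speedup):
    """Amdahl's law: speedup of the whole program when only a
    fraction of its runtime benefits from the faster (GPU) path."""
    return 1.0 / ((1.0 - offload_fraction) + offload_fraction / kernel_speedup)

# Illustrative assumption: 30% of the runtime is offloadable.
# Even a 2.2x speedup on that portion yields only ~1.2x overall,
# i.e. roughly the 20% figure from the article.
print(round(overall_speedup(0.30, 2.2), 2))
```

So whether 20% justifies the partitioning work depends heavily on how much of the program is offloadable in the first place.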

Reply Score: 3

RE[2]: 20%??
by bnolsen on Thu 9th Feb 2012 17:24 UTC in reply to "RE: 20%??"
bnolsen Member since:
2006-01-06

That's always been the problem with GPGPU programming: is the extra trouble really worth it? Especially with the obsolescence window still being something like 6-12 months, and with the huge difference in scaling between CPU and GPU technology making heterogeneous design a headache.

Edited 2012-02-09 17:29 UTC

Reply Score: 2

RE[2]: 20%??
by looncraz on Fri 10th Feb 2012 02:16 UTC in reply to "RE: 20%??"
looncraz Member since:
2005-07-24

I was taking the article to be hinting at a FREE (from the programmer's perspective) 20% additional performance. I would expect this to be achieved by the CPU intelligently off-loading FPU tasks to the GPU.

If this is just more of the same, then I don't see what the big deal would be... Intel already does it, and I do it already here with AMD APP with my video encoding tasks...
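The kind of "free" offload described above would need the runtime to decide, per task, whether the GPU actually wins once launch overhead is counted. A minimal sketch of such a dispatch heuristic (all names and cost numbers here are made-up assumptions for illustration; no real driver API is implied):

```python
# Assumed, illustrative cost model: the GPU has a fixed kernel-launch
# overhead but much higher floating-point throughput than the CPU.
GPU_LAUNCH_OVERHEAD = 50.0   # fixed cost to start a GPU kernel (arbitrary units)
GPU_THROUGHPUT_RATIO = 8.0   # assumed GPU-vs-CPU throughput on FP work

def cpu_cost(n_ops):
    # Normalize: one unit of time per operation on the CPU.
    return float(n_ops)

def gpu_cost(n_ops):
    # Fixed launch overhead plus the work, done 8x faster.
    return GPU_LAUNCH_OVERHEAD + n_ops / GPU_THROUGHPUT_RATIO

def choose_backend(n_ops):
    """Offload only when the GPU wins despite its launch overhead."""
    return "gpu" if gpu_cost(n_ops) < cpu_cost(n_ops) else "cpu"

print(choose_backend(20))    # tiny task: overhead dominates
print(choose_backend(5000))  # large FP task: offload pays off
```

The point of doing this in hardware/runtime rather than in application code is exactly the "free from the programmer's perspective" part: the small tasks quietly stay on the CPU.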

Reply Score: 2

RE: 20%??
by fithisux on Fri 10th Feb 2012 15:25 UTC in reply to "20%??"
fithisux Member since:
2006-01-22

However, with the poor performance of Bulldozer, it will take more than a 20% boost to match Intel's current performance... MAYBE some semblance of this tech will be in Piledriver... but I seriously doubt it :-(



This research is maybe hinting at a fully open-source GPU that also does OpenCL.

I wish ARM/VIA did the same as AMD.

In any case, AMD is brave enough, and it will be the target of my new desktops. I do not care only about performance. If the model is open and easy to program for Illumos/*BSD/Haiku without blobs, then I am ready to buy.

Edited 2012-02-10 15:26 UTC

Reply Score: 2

RE: 20%??
by zima on Wed 15th Feb 2012 22:05 UTC in reply to "20%??"
zima Member since:
2005-07-06

However, with the poor performance of Bulldozer, it will take more than a 20% boost to match Intel's current performance... MAYBE some semblance of this tech will be in Piledriver... but I seriously doubt it :-(

I wish people (the sentiment is promulgated all over the place) would get some perspective on this one... the performance of Bulldozer, or of AMD's CPU offerings in general, isn't exactly poor - it's just worse than Intel's.

The truth is that, lately, CPUs are quite universally more than powerful enough for the vast majority of people - and AMD products can even be seen as preferable in some segments, for example the one covered by the Fusion series (its more decent GPU addresses some of the nowadays few areas where processing power is still not necessarily enough; and its "fuller" GPGPU support even makes up for some of the CPU power difference).

The areas where there's "never enough power" have become quite rare... (and at least one of them, not really tested by benchmark sites and such, might be curious here - AMD supposedly geared Bulldozer for HPC use, almost seemingly at the cost of general desktop performance; who knows what will yet come of it)

Reply Score: 2

The PR article from that website
by tylerdurden on Thu 9th Feb 2012 01:41 UTC
tylerdurden
Member since:
2009-03-17

and the actual research presented in their paper seem to have played a real bad game of "telephone."

Edited 2012-02-09 01:42 UTC

Reply Score: 4

Bill Shooter of Bul Member since:
2006-07-14

Wow! North Carolina State University is releasing a new phone! Awesome! I hope it will finally introduce a fully open source stack, maybe with a little Haiku inside ;) In any case, this is sure to restore the wolfpack to their glory days. Duke & Carolina's days are over!

Go Pack!

Reply Score: 3

Seems a bit specific?
by bloodline on Thu 9th Feb 2012 10:20 UTC
bloodline
Member since:
2008-07-28

Doesn't this suggest that a more generic solution would be to simply put a small ARM core on the GPU with some cache memory, to handle the "complex" functions that feed and assist the GPU - leaving the CPU free for more important tasks? Maybe the transistor budget really won't allow that, I don't know ;)

-edit- I just realised that is exactly what Broadcom have done with the chip that the Raspberry Pi team are using*! Except they are using the ARM core to run an OS ;)

*If we are to believe the Raspberry Pi team (which I do), then the chip they are using is a gfx chip with an ARM core for support, rather than a CPU with an integrated GPU...

Edited 2012-02-09 10:26 UTC

Reply Score: 1

RE: Seems a bit specific?
by Fergy on Thu 9th Feb 2012 10:23 UTC in reply to "Seems a bit specific?"
Fergy Member since:
2006-04-10

Doesn't this suggest that a more generic solution would be to simply have a small ARM core on the GPU with some cache memory? That can do the "complex" functions to feed and assist the GPU... leaving the CPU free for more important tasks... Maybe the transistor budget really won't allow that, I don't know ;)

I have had the same thought for years. Just put a CPU on the GPU card.

Reply Score: 2