Linked by Thom Holwerda on Mon 23rd Jan 2012 11:29 UTC
Hardware, Embedded Systems "The CPU design firm Venray Technology announced a new product design this week that it claims can deliver enormous performance benefits by combining CPU and DRAM on to a single piece of silicon. We spent some time earlier this fall discussing the new TOMI (Thread Optimized Multiprocessor) with company CTO Russell Fish, but while the idea is interesting; its presentation is marred by crazy conceptualizing and deeply suspect analytics."
Comment by kokara4a
by kokara4a on Mon 23rd Jan 2012 12:25 UTC
kokara4a
Member since:
2005-09-16

From linked article:

The "Power consumption" graphs show Oracle's maximum power consumption for a system with 10x Xeon E7-8870s, 168 dedicated SQL processors, ...


WTF is a "dedicated SQL processor"? Google doesn't show anything.

Reply Score: 2

RE: Comment by kokara4a
by galvanash on Mon 23rd Jan 2012 13:44 UTC in reply to "Comment by kokara4a"
galvanash Member since:
2006-01-25

WTF is a "dedicated SQL processor"? Google doesn't show anything.


My guess would be that it should have been "processes" instead of processors... A 10-CPU E7-8870 system would have 100 cores and would be capable of running 200 simultaneous threads. 168 sounds like it might be a good target for the maximum process count for such a system (just a guess, but it sounds plausible).
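
The arithmetic behind that guess is easy to check. A minimal sketch, assuming 10 cores and 2 hardware threads per core for each E7-8870 (the per-chip figures implied by the totals above); the names are illustrative only.

```python
# Assumed per-chip figures, chosen to match the totals quoted in the comment.
SOCKETS = 10
CORES_PER_SOCKET = 10
THREADS_PER_CORE = 2  # two hardware threads per core


def totals(sockets, cores_per_socket, threads_per_core):
    """Return (total cores, total hardware threads) for the whole system."""
    cores = sockets * cores_per_socket
    threads = cores * threads_per_core
    return cores, threads


cores, threads = totals(SOCKETS, CORES_PER_SOCKET, THREADS_PER_CORE)
print(cores, threads)            # total cores and hardware threads
print(round(168 / threads, 2))   # share of hardware threads that 168 processes would occupy
```

Under those assumptions, 168 processes would leave a little headroom for the OS and everything else on the box.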

Reply Score: 3

RE: Comment by kokara4a
by Bill Shooter of Bul on Mon 23rd Jan 2012 15:54 UTC in reply to "Comment by kokara4a"
Bill Shooter of Bul Member since:
2006-07-14

A company called Kickfire (now owned by Teradata) used FPGAs to do the SQL parsing pseudo-natively. They called those SQL processors. Maybe Oracle does something similar?

Reply Score: 3

RE: Comment by kokara4a
by lucas_maximus on Tue 24th Jan 2012 12:14 UTC in reply to "Comment by kokara4a"
lucas_maximus Member since:
2009-08-18

It is where you assign specific processor(s) to an SQL instance.

Depending on what the SQL box is doing it normally isn't needed. The Server mentioned is probably set up in a very particular way.

This is how to do it on Microsoft SQL

http://msdn.microsoft.com/en-us/library/ms187104.aspx

EDIT: after reading again ... I think it should be processors as well

http://www.osnews.com/permalink?504217

Edited 2012-01-24 12:24 UTC

Reply Score: 2

SoC ? µC ?
by boulabiar on Mon 23rd Jan 2012 12:27 UTC
boulabiar
Member since:
2009-04-18

But System on Chip and microcontroller parts already have DRAM, flash, and other stuff inside the same package as the CPU!

Here you only put a better CPU.

Reply Score: 3

RE: SoC ? µC ?
by kokara4a on Mon 23rd Jan 2012 12:41 UTC in reply to "SoC ? µC ?"
kokara4a Member since:
2005-09-16

Yeah. But this processor is on the same chip as the DRAM. So it has a very wide bus that connects it to the DRAM (4096 bits). However, I guess it's still subject to the same latencies.

It's a bit like the Cell's SPEs and I guess it will be as hard to program for.

Reply Score: 1

RE[2]: SoC ? µC ?
by smashIt on Mon 23rd Jan 2012 14:12 UTC in reply to "RE: SoC ? µC ?"
smashIt Member since:
2005-07-06

Yeah. But this processor is on the same chip as the DRAM. So it has a very wide bus that connects it to the DRAM


sounds like the gpu from 10 years ago that never materialised

Reply Score: 2

Fergy Member since:
2006-04-10

"Yeah. But this processor is on the same chip as the DRAM. So it has a very wide bus that connects it to the DRAM


sounds like the gpu from 10 years ago that never materialised
"
Yeah, whatever happened to those GPUs with 12MB of eDRAM? I read Intel wants to do something similar to get their GPU up to speed.

Reply Score: 2

RE[4]: SoC ?
by Brunis on Mon 23rd Jan 2012 17:00 UTC in reply to "RE[3]: SoC ? µC ?"
Brunis Member since:
2005-11-01

You mean Glaze 3D from BitBoys Oy ?

It's on this top15 Vaporware list:
http://pcworld.about.net/od/technology/The-Top-15-Vaporware-Product...

:)

Reply Score: 1

RE[5]: SoC ?
by smashIt on Mon 23rd Jan 2012 17:05 UTC in reply to "RE[4]: SoC ?"
smashIt Member since:
2005-07-06

You mean Glaze 3D from BitBoys Oy ?


yeah, that's the one!

Reply Score: 2

RE[6]: SoC ?
by n4cer on Mon 23rd Jan 2012 18:46 UTC in reply to "RE[5]: SoC ?"
n4cer Member since:
2005-07-06

"You mean Glaze 3D from BitBoys Oy ?


yeah, that's the one!
"

There's also the Xbox 360's GPU, which has eDRAM on die.

This is the SoC version in the current 360.
http://www.tgdaily.com/hardware-features/51228-microsoft-details-ne...

Reply Score: 3

RE[7]: SoC ?
by n4cer on Mon 23rd Jan 2012 19:29 UTC in reply to "RE[6]: SoC ?"
n4cer Member since:
2005-07-06

"You mean Glaze 3D from BitBoys Oy ?


yeah, that's the one!


There's also the Xbox 360's GPU, which has eDRAM on die.

This is the SoC version in the current 360.
http://www.tgdaily.com/hardware-features/51228-microsoft-details-ne...
"

I need more sleep. Clearly the eDRAM is on a separate die.

Edited 2012-01-23 19:30 UTC

Reply Score: 2

RE[6]: SoC ?
by kamil_chatrnuch on Tue 24th Jan 2012 13:19 UTC in reply to "RE[5]: SoC ?"
kamil_chatrnuch Member since:
2005-07-07

ATI bought them.

Reply Score: 2

interesting design
by bnolsen on Mon 23rd Jan 2012 18:49 UTC
bnolsen
Member since:
2006-01-06

Be interesting to know if this moves the bottlenecks around.

A big part of a modern CPU involves legacy instruction decode and instruction/data fetch/prefetch. Removing the whole issue of board traces and DRAM interfacing/serialization etc., together with a legacy-free instruction set, might change the rules enough to make something like this viable.

On to the benchmarks!

Edited 2012-01-23 18:49 UTC

Reply Score: 2

RE: interesting design
by tylerdurden on Mon 23rd Jan 2012 20:40 UTC in reply to "interesting design"
tylerdurden Member since:
2009-03-17

The 80s happened 30 years ago. Oh, and is it that hard to read the article?

Reply Score: 3

RE[2]: interesting design
by bnolsen on Tue 24th Jan 2012 00:50 UTC in reply to "RE: interesting design"
bnolsen Member since:
2006-01-06

The criticisms are there, but should that really change anything? So today it's just 22k transistors. Maybe next time it's a few hundred thousand. Nothing wrong with promising technology.

Today an SoC has to have external independent RAM. That requires either traces on a PCB or an external PoP package. The next evolution in SoCs is to put it all on one die and have a true single chip solution. These guys are actively pursuing one approach and I applaud them for this.

Reply Score: 1

RE[3]: interesting design
by tylerdurden on Tue 24th Jan 2012 02:54 UTC in reply to "RE[2]: interesting design"
tylerdurden Member since:
2009-03-17

Yeah, except that is not what they are doing. At. all.


Seriously, is it that hard to read the article?


Anyhow, this is by no means a new idea. And it has always failed, because the programming models for this sort of architecture simply are not there, or have never proven practical for generalized algorithms.

The big thing is that they are doing the CPU using DRAM processes. So probably they will end up being a patent factory, just like the previous startup from these guys.

Reply Score: 3

scary, maybe
by transputer_guy on Mon 23rd Jan 2012 21:57 UTC
transputer_guy
Member since:
2005-07-08

Before I even read the article I was thinking about the Forth chips and Chuck Moore.

The last section was pretty scary, though futurologists like Ian Pearson make it sound like pretty lame stuff.

There are DRAMs that are literally 20 or more times faster than regular DRAM, so they can start full, almost random accesses every 2.5ns, not the usual 60ns of today's commodity chips.

With Micron RLDRAM, you can sustain certain types of compute processing at up to 400M I/Os per sec. It is based on 8 concurrent banks of 8-cycle, 20ns latency DRAM blocks sharing a split I/O bus structure in a 1Gbit DRAM process. It has full address and data I/O lines on dedicated pins, like an SRAM, with modern DDR pin speeds. The networking industry uses them; in fact Atiq Raza, the NexGen/AMD architect, used these RLDRAMs in a custom network processor at RMI, now NetLogic.

The question is whether you can build a useful general purpose computer that gets 5 operations or so for each memory cycle, at 2000M ops/sec.

You can only do this on highly threaded designs, and you have to pay for the effect of making the 8 banks look like one address space.
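
Why it takes many threads can be shown with a toy model. This is only a sketch under loud assumptions: 8 banks, each busy for 8 cycles after an access starts (the figures quoted above), one access start per cycle on the shared bus, each thread blocked until its own access completes, and thread t always hitting bank t % 8. Real RLDRAM timing and real address streams are messier; the function name and model are mine, not Micron's.

```python
def accesses_started(num_threads, num_banks=8, busy_cycles=8, total_cycles=800):
    """Count memory accesses started in `total_cycles` under the toy model."""
    thread_ready = [0] * num_threads  # cycle at which each thread may issue again
    bank_ready = [0] * num_banks      # cycle at which each bank is free again
    started = 0
    for cycle in range(total_cycles):
        for t in range(num_threads):
            bank = t % num_banks  # idealized: a thread always uses "its" bank
            if thread_ready[t] <= cycle and bank_ready[bank] <= cycle:
                thread_ready[t] = cycle + busy_cycles
                bank_ready[bank] = cycle + busy_cycles
                started += 1
                break  # shared bus: at most one access starts per cycle
    return started


for n in (1, 4, 8):
    print(n, accesses_started(n))
```

In this model a single thread keeps the memory busy only one cycle in eight, and it takes as many independent threads as banks before an access can start every cycle.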

In practice with an FPGA you have to use the slower version at 300MHz, and the penalty for the single address space is that about 1/3 of memory bandwidth is lost. So you are left with about 1000M ops/sec, and it takes around 20-40 odd threads that will need some communication between them and other nodes. Such a processor can be built in FPGAs like the Virtex series, and you can effectively get 40 simple 25 MIPS cores per node. The 40 threads are actually spread over 10 or so 4-way cores.
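
The numbers in the last few paragraphs hang together, which a few lines of arithmetic confirm. A rough sketch using only the figures quoted above (400M accesses/sec, about 5 operations per memory cycle, 8 banks at 20ns, a 300MHz part losing about 1/3 of its bandwidth, 40 threads); the helper name is made up, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction


def sustained_mops(access_mhz, ops_per_access, bandwidth_kept=Fraction(1)):
    """Millions of operations per second the memory system can feed."""
    return access_mhz * ops_per_access * bandwidth_kept


# Full-speed part: 400M accesses/sec at about 5 operations per memory cycle.
full_speed = sustained_mops(400, 5)

# FPGA case: 300MHz part, about 1/3 of bandwidth lost to the single address space.
fpga = sustained_mops(300, 5, Fraction(2, 3))

# Interval between access starts at 400MHz, and 20ns latency spread over 8 banks.
issue_interval_ns = Fraction(1000, 400)
bank_interleave_ns = Fraction(20, 8)

# Share of the FPGA throughput available to each of 40 threads.
mips_per_thread = fpga / 40

print(float(full_speed), float(fpga))
print(float(issue_interval_ns), float(bank_interleave_ns))
print(float(mips_per_thread))
```

So the 2.5ns access interval, the 2000M and 1000M ops/sec figures, and the 25 MIPS per thread all follow from the same handful of inputs.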

Is that useful to anyone? Probably not to the usual punters, but I wouldn't mind having one. The big advantage is that there is effectively no Memory Wall on any memory cycle; you get a big Thread Wall instead. If you can deal with that, then you can also scale the system up many times, at the cost of more Thread Wall.

If you could implement the RLDRAM and processor on the same chip, then the clock rate could go up a few times, and the whole node could be replicated as DRAM capacity allows. The processor could then get a decent FPU as well.

In my MVC analysis of graphics apps I have written, I know that the more complex Control part needs very few cycles and can happily run with a few MIPS, while the Model part usually needs cycles. The View part can usually be partitioned quite nicely over dozens of small tiles; it is a question of organizing the graphics into parallel pipelined structures.

Since we already have a Thread Wall with typically 4 x86 processors, we might as well go whole hog. I have 2 Intel quad PCs and 99.xx% of the time those spare cores are never used.

Perhaps Venray is thinking along the same lines, dunno.

Reply Score: 2

Sweet
by Soulbender on Tue 24th Jan 2012 02:03 UTC
Soulbender
Member since:
2005-08-18

This is awesome. Really.
* Take someone's venture capital
* Make some prototypes that aren't even close to what you want to sell.
* Create some fictional performance numbers and make some completely insane predictions
= PROFIT!

Or, more likely, go bust and be forgotten in a year or so.

Reply Score: 5

Not new
by torbenm on Tue 24th Jan 2012 10:19 UTC
torbenm
Member since:
2007-04-23

I saw the idea more than a decade ago (1996), where someone proposed a Sparc processor in DRAM technology. It came out as fairly promising, but AFAIK nothing was ever built. See http://dl.acm.org/citation.cfm?id=232984

Whether the case has gotten better or worse in the meantime, I can't say.

Also, ARM2 used only 24K transistors, not the 30K claimed in the article, so the TOMI is about the same size.

Reply Score: 2

CPU+DRAM
by rhfish on Wed 25th Jan 2012 23:49 UTC
rhfish
Member since:
2012-01-25

Nice discussion of the architecture.

A little easier to read explanation can be found in EDN.
http://www.edn.com/article/520059-The_future_of_computers_Part_1_Mu...
http://www.edn.com/article/520499-Future_of_computers_Part_2_The_Po...

Best Regards,
Russell

Reply Score: 1