Linked by Thom Holwerda on Wed 14th Apr 2010 23:51 UTC
This article describes a real-world software port, with examples of how various porting challenges are resolved. If you are a software developer porting software to UNIX, you will find these techniques invaluable in avoiding common pitfalls, resolving bugs, and improving your productivity.

Member since:
2006-02-07

foobar,

I claim that software emulation is 5-10x slower than running native code. As a response, you object:

"[Actually], Hercules emulation is 5-10 times slower".

I didn't get that. Could you explain again? Aren't we stating the same thing?

Let me quote you again. You originally wrote this:

Software emulation is 5-10 times faster than native code.

Regarding your claim that an IBM Mainframe z10 with 64 cpus gives 30.500 MIPS, and not 28.000 MIPS: ok, let us assume that the correct number is 30.500 MIPS. This means each z10 cpu gives 30.500 / 64 = roughly 480 MIPS.

The cofounder of the emulator TurboHercules claims that an 8-socket Nehalem-EX is predicted to give 3.200 MIPS. This means each Nehalem-EX gives 3.200 / 8 = 400 MIPS. So we see that a z10 cpu is slightly faster than a Nehalem-EX running the emulator. But if the Nehalem-EX could run native code (instead of emulating a Mainframe) then it would be 5-10x faster. That is, 400 MIPS x 5 = 2000 MIPS, and 400 MIPS x 10 = 4000 MIPS. Let us average, and say a Nehalem-EX would give 3000 MIPS per cpu (natively). Then, you need 10 Nehalem-EX to reach 30.000 MIPS.

I don't see where this calculation is wrong. But feel free to point out my errors and lecture me.
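For anyone who wants to check the quoted arithmetic, here is a minimal Python sketch. It only replays the figures stated above (30,500 MIPS over 64 z10 processors, 3,200 emulated MIPS over 8 Nehalem-EX sockets, and the assumed 5-10x emulation penalty). The variable names are made up for illustration, and none of the inputs are measurements.

```python
# Figures as quoted in the thread; the 5-10x penalty is the poster's assumption.
Z10_TOTAL_MIPS = 30_500
Z10_PROCESSORS = 64
NEHALEM_EMULATED_MIPS = 3_200
NEHALEM_SOCKETS = 8
TARGET_MIPS = 30_000  # round target used in the post

# Per-processor throughput of the z10 (the post rounds this to 480).
z10_per_cpu = Z10_TOTAL_MIPS / Z10_PROCESSORS

# Emulated throughput per Nehalem-EX socket.
nehalem_emulated_per_socket = NEHALEM_EMULATED_MIPS / NEHALEM_SOCKETS

# Hypothetical native throughput if emulation costs 5x or 10x.
native_low = nehalem_emulated_per_socket * 5
native_high = nehalem_emulated_per_socket * 10
native_avg = (native_low + native_high) / 2

# Sockets needed to reach the target at the averaged native figure.
sockets_needed = TARGET_MIPS / native_avg

print(f"z10 per cpu:            {z10_per_cpu:.1f} MIPS")
print(f"Nehalem-EX emulated:    {nehalem_emulated_per_socket:.0f} MIPS per socket")
print(f"Nehalem-EX native est.: {native_low:.0f}-{native_high:.0f} MIPS per socket")
print(f"Sockets for {TARGET_MIPS} MIPS: {sockets_needed:.0f}")
```

Note that this treats a "cpu" as a z10 processor on one side and a whole 8-core socket on the other, which is exactly the point disputed in the reply that follows.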

You are not accounting for cores/chips/processors correctly. A 64 processor z10 is actually 64 cores. An 8 socket nehalem is 64 cores. If you want per core performance, then it is 3200/64 = 50 emulated mips, and 50 * 5 = 250. So you need 120 nehalem cores by your hoakie numbers.
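The per-core version of the same calculation can be sketched like this. It assumes, as stated above, that the 8-socket Nehalem-EX box has 64 cores and that native code is 5x faster than emulation. The names are illustrative only.

```python
# Per-core restatement of the figures in the reply above.
NEHALEM_EMULATED_MIPS = 3_200
NEHALEM_SOCKETS = 8
CORES_PER_SOCKET = 8
EMULATION_PENALTY = 5      # low end of the claimed 5-10x range
TARGET_MIPS = 30_000       # round z10 figure used in the thread

nehalem_cores = NEHALEM_SOCKETS * CORES_PER_SOCKET

# Emulated and estimated native throughput per Nehalem-EX core.
emulated_per_core = NEHALEM_EMULATED_MIPS / nehalem_cores
native_per_core = emulated_per_core * EMULATION_PENALTY

# Cores (and 8-core sockets) needed to reach the target.
cores_needed = TARGET_MIPS / native_per_core
sockets_needed = cores_needed / CORES_PER_SOCKET

print(f"Emulated per core: {emulated_per_core:.0f} MIPS")
print(f"Native per core:   {native_per_core:.0f} MIPS")
print(f"Cores needed:      {cores_needed:.0f} ({sockets_needed:.0f} sockets)")
```

With the 5x multiplier this gives 120 cores, or 15 sockets, rather than the 10 sockets from the per-socket calculation.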

While we're throwing around hoakie numbers, let's account for the other cores in the z10. Not channel cards, but the z10 cores characterized as SAPs and CFs. Since IBM actually measured that 30,000 MIPS number, we should include the cores doing IO. For the biggest machine, that's another 11 cores. 11 * (30,500/64) = 5200 MIPS. So we can estimate that a z10 is really capable of 35,700 MIPS.
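A quick sketch of that estimate, using only the numbers in the paragraph above. It assumes, as the paragraph does, that each of the 11 extra cores is worth the same MIPS as a customer-visible core, which is a back-of-the-envelope guess and not a measured figure.

```python
# Numbers from the paragraph above; names are illustrative.
Z10_MEASURED_MIPS = 30_500
CUSTOMER_CORES = 64
IO_CORES = 11  # extra cores counted for the biggest machine

# MIPS per customer-visible core.
mips_per_core = Z10_MEASURED_MIPS / CUSTOMER_CORES

# Credit the extra cores at the same per-core rate.
extra_mips = IO_CORES * mips_per_core
estimated_total = Z10_MEASURED_MIPS + extra_mips

print(f"Per core:        {mips_per_core:.1f} MIPS")
print(f"Extra cores:     {extra_mips:.0f} MIPS")
print(f"Estimated total: {estimated_total:.0f} MIPS")
```

Rounded to the nearest hundred, these are the 5,200 and 35,700 MIPS figures quoted above.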

http://www.redbooks.ibm.com/redbooks/pdfs/sg247515.pdf

Let's not forget that the z10 shipped in the beginning of 2008, and you're comparing it to a brand new 2010 Intel product. Then you conclude that, somehow, the mainframe is obsolete because it doesn't keep up with today's hardware.

It's rumored that there will be a new mainframe this year.

"Rubbish, you're twisting the numbers and conveniently leaving facts out."

The other source is this Linux expert. He claimed back in 2003 that, compared to a single-core Pentium 4, one Mainframe MIPS equals 4 MHz: 1 MIPS == x86 4 MHz. Now, a Nehalem-EX has 8 cores, each running at 2.3 GHz. Then a Nehalem-EX has a total of 18.4 GHz per cpu.

Now, one P4 clock tick does very little work compared to one Nehalem-EX clock tick. Maybe one Nehalem-EX clock tick does twice as much work as one P4 clock tick? Then this means that the Nehalem-EX, which gives 18.4 GHz, does more work than if you collected 18.4 GHz of Pentium 4. In fact, you would need twice as many Pentium 4 clock ticks to match 18.4 GHz of Nehalem-EX.

So, one Nehalem-EX actually gives you 36.8 GHz worth of Pentium 4 clock ticks.

If you do the calculations, you will see that you only need a few Nehalem-EX to match 64 Mainframe CPUs.
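The calculation being alluded to can be sketched as follows. Every input is an assumption taken from the quoted post: the 2003 rule of thumb of 1 MIPS per 4 MHz of Pentium 4, the guess that a Nehalem-EX does twice the work per clock tick, and perfect scaling across all 8 cores. The names are hypothetical.

```python
import math

# Assumptions stated in the quoted post, not measurements.
MHZ_PER_MAINFRAME_MIPS = 4   # "1 MIPS == x86 4 MHz" (Pentium 4, 2003)
CORES_PER_SOCKET = 8
CLOCK_MHZ = 2_300            # 2.3 GHz per core
WORK_PER_TICK_VS_P4 = 2      # guessed per-clock advantage over a Pentium 4
Z10_MIPS = 30_500

# Summed clock rate of one socket, then scaled to Pentium 4 clock ticks.
raw_mhz = CORES_PER_SOCKET * CLOCK_MHZ
p4_equivalent_mhz = raw_mhz * WORK_PER_TICK_VS_P4

# Convert Pentium 4 MHz into mainframe MIPS using the rule of thumb.
mips_per_socket = p4_equivalent_mhz / MHZ_PER_MAINFRAME_MIPS

# Whole sockets needed to match the z10 figure.
sockets_needed = math.ceil(Z10_MIPS / mips_per_socket)

print(f"Raw clock per socket:  {raw_mhz / 1000:.1f} GHz")
print(f"P4-equivalent clock:   {p4_equivalent_mhz / 1000:.1f} GHz")
print(f"Implied MIPS / socket: {mips_per_socket:.0f}")
print(f"Sockets to match z10:  {sockets_needed}")
```

Under these assumptions the answer is 4 sockets, which is what "a few" amounts to; the result is only as good as the rule of thumb and the 2x guess behind it.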

If you spot an error, please point it out so I can correct my calculations.

See above.

In either case, we see two independent highly technical sources that claim that Mainframe CPUs are dog slow. Hence, a Mainframe could NEVER consolidate several x86 servers. Actually, it is the other way around: x86 servers could consolidate Mainframes instead. Now I am talking about cpu power. I know Mainframes have lots of help cpus, so they have good I/O. But in terms of cpu speed, they are dog slow. Now imagine if you gave x86 servers some help cpus...

Why don't you get some updated numbers from that list? It would give you a better argument.

Regarding IBM not releasing benches, I wrote wrongly. I meant to say this: "IBM does not release Mainframe benches where they compare to other machines, such as Unix machines or x86". Of course IBM releases numbers where they show Mainframe strengths, such as good I/O. But IBM does not compare, for instance, cpu power to other machines, such as Unix or x86. IBM is afraid, because then it will be evident how slow IBM mainframes actually are, in terms of CPU speed.

This is obviously true (for those who are not fooled by IBM marketing) - how much money is spent on developing x86, and how much money is spent on developing IBM Mainframe cpus? How many companies are involved in each architecture? How many engineers? It would not be very smart of you, if you think that IBM's 50(?) mainframe cpu engineers can match all cpu engineers that Intel and AMD have.

50? You're guessing again. It is many more than that. I know it is hard for people to fathom large numbers like 1.5 or 2 billion transistors. At least make an educated guess.

x86 is only the holy grail for certain applications in the short term. If Intel is the only supplier left of high performance processors, we're all screwed.


Member since:
2007-07-27

foobar,
Oh, of course I meant that "software emulation is 5-10x slower than native code". Thanx for pointing it out.

"You are not accounting for cores/chips/processors correctly. A 64 processor z10 is actually 64 cores. An 8 socket nehalem is 64 cores. If you want per core performance, then it is 3200/64 = 50 emulated mips, and 50 * 5 = 250. So you need 120 nehalem cores by your hoakie numbers."

I didn't get that. Could you explain again? It seems that you claim that 64 Mainframe CPUs have 64 cores in total? Then it must mean that Mainframe CPUs are single-core. I thought they were quad-core?

Who (except IBM) is interested in performance per core? I am comparing one Mainframe cpu vs one Nehalem-EX cpu. Not core vs core. I don't claim that a Nehalem-EX core is faster than a Mainframe core. (This is typical FUD from IBM: shift focus from cpu vs cpu, to something else, such as core vs core.)

Could you please explain again why a Nehalem-EX cpu is slower than a Mainframe CPU? Note that I wrote "CPU", not core, or ALU, or registers, or whatever. Just because one part of the CPU is faster, it doesn't say anything about the entire cpu. "My car has a better ignition mechanism than your car, therefore my car is faster than yours" - plain FUD. You must compare car vs car, not some small part vs another part. No one is interested in that small part.

"While we're throwing around hoakie numbers, let's account for the other cores in the z10. Not channel cards, but the z10 cores characterized as SAPs and CFs. Since IBM actually measured that 30,000 MIPS number, we should include the cores doing IO. For the biggest machine, that's another 11 cores. 11 * (30,500/64) = 5200 MIPS. So we can estimate that a z10 is really capable of 35,700 MIPS."

Come on, this is really silly of you. You don't want to do this comparison. You would be really upset if I claimed (just like you do): "Well, the latest Nvidia card is capable of TeraFlops, therefore the Nehalem-EX server must be faster than the Mainframe server".

Now THAT comparison would be hoakie, don't you agree? But it is ok if IBMers do this comparison, right?

BTW, I talked with another IBMer who claimed that, even though you need four POWER6 cpus to match two (ordinary) Intel Nehalem cpus, the POWER6 is faster. Because it has a higher clocked core, or something weird. I never understood his logic. It was really weird. Then he started to talk about pricing: the POWER6 software licenses would be cheaper, therefore the POWER6 is faster. I don't get it, where do all IBMers find that weird stuff to say? The funny thing is, they BELIEVE it is true! :o) Even today he is convinced that POWER6 is faster than Nehalem. I couldn't talk him out of it. No matter what I said, he refused to listen. :o)


Member since:
2006-02-07

""You are not accounting for cores/chips/processors correctly. A 64 processor z10 is actually 64 cores. An 8 socket nehalem is 64 cores. If you want per core performance, then it is 3200/64 = 50 emulated mips, and 50 * 5 = 250. So you need 120 nehalem cores by your hoakie numbers."

I didn't get that. Could you explain again? It seems that you claim that 64 Mainframe CPUs have 64 cores in total? Then it must mean that Mainframe CPUs are single-core. I thought they were quad-core?
"

Yup, I didn't think that you would actually look at the redbook that I linked. The big z10 machine has 4 books. Books have many chips on the MCM. 5 of them are processor chips. Each processor chip has 4 cores. There are a grand total of 80 cores in a z10. After subtracting 11 for SAPs, 2 for spares, and 3 for cores that don't pass manufacturing tests, there are 64 cores for customers to run their favorite operating systems on.
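The core count in that paragraph can be tallied with a few lines of Python. The numbers are the ones given above for the biggest z10; the names are just for illustration.

```python
# Core accounting for the biggest z10, using the figures in the paragraph above.
BOOKS = 4
PROCESSOR_CHIPS_PER_BOOK = 5
CORES_PER_CHIP = 4

SAP_CORES = 11          # cores reserved as SAPs
SPARE_CORES = 2         # spares
FAILED_TEST_CORES = 3   # cores that don't pass manufacturing tests

# Physical cores in the machine.
total_cores = BOOKS * PROCESSOR_CHIPS_PER_BOOK * CORES_PER_CHIP

# Cores left for customers to run operating systems on.
customer_cores = total_cores - SAP_CORES - SPARE_CORES - FAILED_TEST_CORES

print(f"Total cores:    {total_cores}")
print(f"Customer cores: {customer_cores}")
```

So the "64 processors" figure is what remains of 80 physical cores once the reserved and unusable ones are subtracted.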

In IBM-speak a way or a processor is a core. It's been that way for 10 years with the i/p/z stuff. You are very quick to make claims about how bad IBM is, yet you don't know such details? Are you making judgments based on your own personal biases, or the technical facts?

Why don't you read a little about the hardware that you are bashing? Next time we can have a better argument.

Who (except IBM) is interested in performance per core? I am comparing one Mainframe cpu vs one Nehalem-EX cpu. Not core vs core. I don't claim that a Nehalem-EX core is faster than a Mainframe core. (This is typical FUD from IBM: shift focus from cpu vs cpu, to something else, such as core vs core.)

If you want to argue cpu vs cpu, then you need to state that up front instead of making unsubstantiated claims and then babbling about extrapolated emulation performance.

Like it or not, in the real world, outside of the desktop and hobby realm, there are an awful lot of applications that are still priced per core. This is true for all platforms.

Could you please explain again why a Nehalem-EX cpu is slower than a Mainframe CPU? Note that I wrote "CPU", not core, or ALU, or registers, or whatever. Just because one part of the CPU is faster, it doesn't say anything about the entire cpu. "My car has a better ignition mechanism than your car, therefore my car is faster than yours" - plain FUD. You must compare car vs car, not some small part vs another part. No one is interested in that small part.

Could we please back up and remember what this thread is about? You made an unsubstantiated claim about "dog slow" cpus and "1/10th of the speed":

As I have shown, a modern x86 cpu is roughly 10x faster than a Mainframe CPU. Why port from fast x86 cpus, to dog slow cpus? I don't understand why you want to get 1/10th of the speed?

Then when I challenged you, you built an argument on a Wikipedia post that had extrapolated performance numbers. Somehow you twisted the numbers to conclude that an 8-socket Nehalem machine can replace a fully populated z10.

If you had kept your argument brief and used technical facts, I would have admitted that 4 z10 cores on a chip vs 8 Nehalem cores on a chip is probably 60%. But you chose to talk in circles about emulation and leave out important details about each machine.

""While we're throwing around hoakie numbers, let's account for the other cores in the z10. Not channel cards, but the z10 cores characterized as SAPs and CFs. Since IBM actually measured that 30,000 MIPS number, we should include the cores doing IO. For the biggest machine, that's another 11 cores. 11 * (30,500/64) = 5200 MIPS. So we can estimate that a z10 is really capable of 35,700 MIPS."

Come on, this is really silly of you. You don't want to do this comparison. You would be really upset if I claimed (just like you do): "Well, the latest Nvidia card is capable of TeraFlops, therefore the Nehalem-EX server must be faster than the Mainframe server".

Now THAT comparison would be hoakie, don't you agree? But it is ok if IBMers do this comparison, right?
"

I wouldn't have used the word "hoakie" if I was serious. Remember, you started the silliness with extrapolated emulation numbers when you really wanted to compare native applications.

BTW, I talked with another IBMer who claimed that, even though you need four POWER6 cpus to match two (ordinary) Intel Nehalem cpus, the POWER6 is faster. Because it has a higher clocked core, or something weird. I never understood his logic. It was really weird. Then he started to talk about pricing: the POWER6 software licenses would be cheaper, therefore the POWER6 is faster. I don't get it, where do all IBMers find that weird stuff to say? The funny thing is, they BELIEVE it is true! :o) Even today he is convinced that POWER6 is faster than Nehalem. I couldn't talk him out of it. No matter what I said, he refused to listen. :o)

At this point, I really don't believe you. I think you're making stuff up now. Anecdotal evidence is an oxymoron. See what happens when I change a few words:

BTW, I talked with another Sunshiner who claimed that, even though you need four Niagara cpus to match two (ordinary) Intel Nehalem cpus, the Niagara is faster. Because it has a lower clocked core, or something weird. I never understood his logic. It was really weird. Then he started to talk about pricing: the Niagara software licenses would be cheaper, therefore the Niagara is faster. I don't get it, where do all Sunshiners find that weird stuff to say? The funny thing is, they BELIEVE it is true! :o) Even today he is convinced that Niagara is faster than Nehalem. I couldn't talk him out of it. No matter what I said, he refused to listen. :o)
