Ars’ Hannibal takes a look at the announcements concerning quad-core processors from Intel and AMD. “The past few days have seen a burst of quad-core-related news items from both AMD and Intel. In this post, I’ll take a look at announcements from both companies and try to put them in the larger context of the Intel-AMD rivalry.” He concludes by saying: “This is a tough one to call, but don’t expect a blow-out on either end. Also, the performance of the K8L core is the big wildcard, since it could go either way.”
Quad-Core Race Heats up
2006-08-18 4:21 pm by pcbsdusr
Most software makers didn’t have a reason to optimize their software for more than one core up to now, and from here on you won’t need to buy the latest and greatest to have a great PC*. I am glad this race is heating up.
*Update: (unless you plan on using Microsoft’s systems and software, I mean)
Edited 2006-08-18 16:22
2006-08-18 9:28 pm by BluenoseJake
Windows does SMP quite fine; whatever its faults (and it has many), SMP is not one of them.
2006-08-18 5:27 pm by rajj
There are more ways to achieve parallelism besides multithreading. Just having different processes running on separate processors is benefit enough. It really depends on your workload.
Parallel make can always use more processors :-D.
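A minimal sketch of that kind of process-level parallelism in Python (the task, the worker count, and the `crunch`/`run_pool` names are made up for illustration):

```python
import multiprocessing
import os

def crunch(n):
    """CPU-bound toy task: sum of squares below n."""
    return (os.getpid(), sum(i * i for i in range(n)))

def run_pool(tasks=8, n=100_000):
    # Each task runs in its own process, so the OS scheduler is free
    # to spread them across separate cores -- no threading required.
    with multiprocessing.Pool(processes=os.cpu_count()) as pool:
        return pool.map(crunch, [n] * tasks)

if __name__ == "__main__":
    results = run_pool()
    workers = {pid for pid, _ in results}
    print(f"{len(results)} tasks ran in {len(workers)} worker process(es)")
```

No shared state, no locks: the processes only meet again when their results are collected.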
2006-08-18 5:31 pm by butters
It won’t end. They’ll keep piling on the cores, making the core wider and wider, and adding specialized execution units for everything from graphics to TCP.
The n-core race, like the GHz race before it, is just a glamorous way to package the transition from clock frequency to bandwidth as the key to increasing throughput. Likewise, the core competencies for processor design are shifting from process technology to bus technology.
This couldn’t be better news for AMD. Intel better have some awesome bus technology in the pipeline (no pun intended) if they want to get mileage out of the n-core race. As it stands, AMD’s dominance in bus technology trumps Intel’s one-year lead in process technology, especially in the server and high-end client spaces. In the previous design cycle, AMD bet on HyperTransport and Intel bet on NetBurst. Only one of those bets paid off in the long run.
On the software side, server applications have been supporting various threading models for years, and most of the notable UNIX-like operating systems are scaling quite nicely in this area. There is still room for improvement, especially in virtual memory management, but for the most part, this work can happen independently of the application developers.
On the client side, we won’t be seeing the same kind of single-threaded performance scaling we’re used to, and that’s not really a huge problem. The only code that’s hard to parallelize is control code, which is usually more I/O-bound than anything. That’s why we’re seeing huge caches in recent processors: to help hide the I/O latency of single-threaded control code. Intel definitely has the advantage in this area, so I would expect them to be significantly stronger than AMD in the midrange client space.
Three or four years ago I was bullish on Transmeta’s prospects in the low-to-midrange mobile market. I really didn’t think that Intel and AMD’s architectures were suitable for that kind of application. It turns out that Transmeta was trying to market to a gap in demand. People either want palmtop-type gadgets or full-blown notebooks with rich-client functionality. There’s no market in between. Now that Transmeta is out of the processor business, Intel and AMD can play the performance-per-watt card with impunity.
2006-08-18 11:44 pm by butters
If you’re interested in processor architecture at all, then you’d probably want to read this excellent article from AnandTech:
The executive summary is that the Core represents the logical progression of the P6 architecture toward a wider, fatter, and more out-of-order design, resulting in significantly higher instruction-level parallelism and single-threaded performance. In other words, it’s what would have come after the Pentium III if the company hadn’t been run by the process engineers at the time.
The most impressive feature of the Core architecture is the ability to reorder loads ahead of stores very early in the pipeline, before the addresses of previous stores are calculated. The premise is similar to branch prediction, where you gain a tremendous increase in efficiency >99% of the time and pay a relatively small penalty once in a while. This helps keep the load unit busy and minimizes the effects of a cache miss.
In short, Intel has the superior processor architecture, and AMD has the superior bus architecture. Intel will do better in single-socket systems; AMD will do better in multi-socket systems. Intel will probably have the superior gaming CPU for the foreseeable future, but AMD is a formidable force in the server market.
Now, if Intel really wants to shake things up, they’d bolt on some in-order vector units with independent L1 caches. Then wave slowly as they leave the Cell in the dust (in both performance and programmability) and walk away with the next generation console contracts.
If you have an extra core sitting mostly unused inside your computer, wouldn’t that make it much easier to use compile-from-source distributions like Gentoo? They could compile in the background while you work. And it would be great if there was a precompiled live CD/DVD that would allow you to work while the base system compiled in the background.
Just a thought…
2006-08-18 6:19 pm by Wes Felter
Or you could use the extra core to do nothing; the effect is the same.
2006-08-18 7:17 pm by JonathanBThompson
GCC isn’t exactly the fastest compiler around, and even fast compilers and linkers take real processing time to do their work. If you have large things to compile, SMP/multi-core machines can make builds go much faster if you’ve got the build environment set up correctly. Except for the final link phase, compiling a large application can quite readily benefit from throwing as many cores at it as your I/O subsystem can feed with files, given enough RAM to hold everything the compiler needs without hitting swap. Because a compiler is typically single-threaded, parallel builds use separate processes, so there’s no significant CPU cache thrashing between them.
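As a sketch of the idea (not how make actually works internally; `run_jobs` and the `cc` command line are hypothetical), a toy parallel build driver might look like:

```python
import subprocess
import concurrent.futures as cf

def run_jobs(commands, jobs=4):
    """Run independent build steps concurrently, like `make -jN`.
    Threads are fine as dispatchers here: the real work happens
    in the child processes they spawn."""
    with cf.ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(lambda cmd: subprocess.run(cmd).returncode,
                             commands))

# Hypothetical usage: compile three translation units in parallel.
# run_jobs([["cc", "-c", f] for f in ["a.c", "b.c", "c.c"]], jobs=3)
```

Each compile is independent, so they can be dispatched in any order; only the link step has to wait for all of them.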
For those that don’t have such readily parallelized applications, and don’t have several major processes doing computational work in the background (drive defragmentation, anti-virus drive scanning (a real hog), drive data indexing such as Spotlight, SETI stuff, ray-tracing, etc.), anything beyond a few cores will largely sit idle in most systems. The important thing to remember, though, is that if you don’t have enough RAM or a fast enough drive subsystem to keep these hydra monsters fed, it doesn’t matter how fast the individual cores are, or how many you have: they’ll all be waiting at exactly the same speed for data to arrive.
2006-08-18 6:39 pm by jziegler
I don’t think the 2nd core would be “doing nothing”. Even in the simplest case, you have numerous kernel threads (though I’m not sure whether those can migrate between CPUs) and at least a few running processes.
Currently, I have 122 processes running:
$ ps aux | wc -l
(Note that ps aux lists processes, not individual threads, and the count includes a header line.) And that is a notebook computer, doing nothing special. If it had 2 cores, the kernel would definitely utilize both of them.
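For the curious, processes and kernel-schedulable threads can be counted separately by walking /proc (this sketch assumes Linux’s /proc layout; `ps aux | wc -l` counts processes plus one header line, not threads):

```python
import os

def count_procs_and_threads(proc="/proc"):
    """Count processes (numeric /proc entries) and their threads
    (entries under each /proc/<pid>/task)."""
    procs = 0
    threads = 0
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        procs += 1
        try:
            # Each kernel-schedulable thread gets a directory here.
            threads += len(os.listdir(os.path.join(proc, entry, "task")))
        except OSError:
            pass  # process exited while we were scanning
    return procs, threads

procs, threads = count_procs_and_threads()
print(f"{procs} processes, {threads} threads")
```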
2006-08-18 6:48 pm by sbergman27
And how many of those processes are actually doing anything? I rarely have more than one cpu intensive process active at any given time. In fact, even on my customers’ desktop servers, serving up XDMCP desktops, having more than one cpu intensive process running at a time is the exception rather than the rule. And that’s with 20-30 desktop users logged on.
I must confess to being baffled by all this multicore desktop excitement.
I just got myself a new processor for my own desktop machine. I chose the fastest single core AMD I could find. And looking at my loadavg history, I have absolutely no regrets about that decision.
2006-08-18 7:01 pm by jziegler
That would depend on what you are doing. Right now, as I’m browsing the web, there’s a small load.
Should I be developing something, or working on my photos, I’d rather have more, slower cores than one faster one. That should save some task switches and the cache flushing they involve.
I’d rather have the gcc (or gimp) running on one 1 GHz core and the X server + window manager, etc. on a second 1 GHz core than having it all running on one 2 GHz core.
I’m not claiming that multicore CPUs are necessary for desktops, but I sure can see how they can be utilized on a desktop. Even more so on what one would call a workstation.
2006-08-18 7:18 pm by sbergman27
I find that, typically, even when I have more than one process using the processor intensively, one usually wants all the processor and the other only wants some.
I think this would fit your “gimp” example.
In that case, with 2 processors at half speed, the really intensive one runs half as fast as it would on a single core that was twice as fast. The less intensive one would get, say, 30% of the other processor.
In this case, a single core at twice the speed should run both processes together about 50% faster than the 2 slower cores.
(I’m disregarding cache flushing, etc. here, but the single core processors I use these days have 1-2MB of L2. I doubt the effect is going to be noticeable. Plus with single core, you don’t have the overhead of the SMP kernel paths, which I also don’t think is going to be noticeable to the user.)
Now, compiling is a different case. That task is quite cpu intensive and parallelizes trivially.
The way I look at it, with a single core, you get the benefit of every clock cycle the cpu has. With dual core, and even more so with quad core, it’s very, very iffy that you are going to see any benefit at all.
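A quick back-of-the-envelope check of the arithmetic above, using the same hypothetical 100%/30% demand split (and, as noted, ignoring cache effects):

```python
# Two processes: "heavy" wants all the CPU it can get,
# "light" only needs 0.3 units of work per unit time.
LIGHT_DEMAND = 0.3

# Two cores at speed 1.0 each: heavy owns one core, light sips the other.
dual_heavy = 1.0
dual_total = dual_heavy + LIGHT_DEMAND          # 1.3

# One core at speed 2.0: light takes its 0.3, heavy gets the rest.
single_heavy = 2.0 - LIGHT_DEMAND               # 1.7
single_total = single_heavy + LIGHT_DEMAND      # 2.0

print(f"combined throughput: single fast core is "
      f"{single_total / dual_total - 1:.0%} ahead")   # ~54%, i.e. "about 50%"
print(f"heavy process alone: {single_heavy / dual_heavy - 1:.0%} ahead")
```

So under these assumptions the single fast core wins on both counts; the dual-core case only pulls ahead once the second process wants a whole core of its own.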
Edited 2006-08-18 19:19
2006-08-18 7:49 pm by sbergman27
Just as a point of interest, and rather than talking hypothetically, here is some actual data, for today, from one of my machines running as a desktop server. It currently has about 25 people logged in running Gnome desktops and doing all the usual business desktop tasks: OpenOffice.org, Firefox, email, etc. It is also running several dozen instances of a point of sale and accounting package.
Note how the 1-minute load average rarely goes above 1, even with 25 users.
This is why I am skeptical that multicore is going to help the average single user desktop.
07:00:01 AM 0.32
07:10:01 AM 0.04
07:20:01 AM 0.18
07:30:01 AM 0.18
07:40:01 AM 0.18
07:50:01 AM 0.68
08:00:01 AM 0.31
08:10:01 AM 0.10
08:20:01 AM 0.61
08:30:01 AM 0.12
08:40:01 AM 0.37
08:50:01 AM 0.27
09:00:01 AM 0.42
09:10:01 AM 0.56
09:20:01 AM 0.38
09:30:01 AM 0.28
09:40:01 AM 0.91
09:50:01 AM 0.55
10:00:01 AM 1.03
10:10:01 AM 0.69
10:20:01 AM 1.81
10:30:01 AM 0.58
10:40:01 AM 0.79
10:50:01 AM 0.85
11:00:01 AM 0.36
11:10:01 AM 0.24
11:20:02 AM 0.82
11:30:01 AM 0.58
11:40:01 AM 0.25
11:50:01 AM 1.01
12:00:01 PM 0.36
12:10:02 PM 0.32
12:20:01 PM 1.30
12:30:01 PM 0.35
12:40:01 PM 0.85
12:50:01 PM 0.29
01:00:01 PM 0.94
01:10:01 PM 0.56
01:20:01 PM 0.55
01:30:01 PM 0.52
01:40:01 PM 0.86
01:50:01 PM 0.45
02:00:01 PM 0.37
02:10:02 PM 0.68
02:20:01 PM 0.44
02:30:01 PM 1.33
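To put a number on “rarely goes above 1”: a quick pass over the samples above (values transcribed from the table):

```python
# Ten-minute load-average samples from the table above.
loads = [0.32, 0.04, 0.18, 0.18, 0.18, 0.68, 0.31, 0.10, 0.61, 0.12,
         0.37, 0.27, 0.42, 0.56, 0.38, 0.28, 0.91, 0.55, 1.03, 0.69,
         1.81, 0.58, 0.79, 0.85, 0.36, 0.24, 0.82, 0.58, 0.25, 1.01,
         0.36, 0.32, 1.30, 0.35, 0.85, 0.29, 0.94, 0.56, 0.55, 0.52,
         0.86, 0.45, 0.37, 0.68, 0.44, 1.33]

over_one = [x for x in loads if x > 1.0]
print(f"{len(loads)} samples, {len(over_one)} above 1.0 "
      f"(peak {max(loads):.2f}, mean {sum(loads) / len(loads):.2f})")
# -> 46 samples, 5 above 1.0 (peak 1.81, mean 0.56)
```

Roughly one sample in nine has even one core’s worth of runnable work queued, which is the point being made.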
2006-08-18 8:06 pm by jziegler
First, I said “I’d rather have”. Maybe I should have been more verbose about it: for the same price, I’d rather have two cores at half the speed than one core at full speed.
However, you probably won’t have this choice when buying a new system. Or not in such an extreme way as I might have suggested.
As for GIMP. Yes, it will run 2 times slower in my example. Personally, I can wait for an operation to take 60 seconds instead of 30. However, the second core can run my window manager, browser, music player and they will be as responsive as if they were on a single core, without the GIMP process.
Also, with every task switch, you have to save and restore at least all the CPU registers and flush the TLB (the MMU’s lookup cache). I’m a bit rusty on my OS theory, so I’m not 100% sure about the CPU caches; feel free to (re-)educate me. And throwing out (and almost immediately bringing back) _anything_ for a process that demands 100% of a CPU core is a waste. Might not be a big one, but it is still a waste.
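That switching cost can actually be measured. A classic microbenchmark bounces a byte between two processes over a pair of pipes, forcing context switches on every hop (POSIX-only; the round count is arbitrary, and absolute numbers vary wildly by machine and kernel):

```python
import os
import time

def pingpong(rounds=1000):
    """Time round trips between two processes over pipes.
    Each round forces at least two context switches."""
    r1, w1 = os.pipe()   # parent -> child
    r2, w2 = os.pipe()   # child -> parent
    pid = os.fork()
    if pid == 0:
        # Child: echo every byte straight back.
        for _ in range(rounds):
            os.write(w2, os.read(r1, 1))
        os._exit(0)
    start = time.perf_counter()
    for _ in range(rounds):
        os.write(w1, b"x")
        os.read(r2, 1)
    elapsed = time.perf_counter() - start
    os.waitpid(pid, 0)
    return elapsed / rounds

print(f"~{pingpong() * 1e6:.1f} us per round trip")
```

The number it prints includes pipe overhead as well as the switch itself, so treat it as an upper bound, not a precise cost.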
Again, I’m going to reiterate my conclusion. If a single core floats your boat, good for you. However I disagree with your opinion that there is no use for a multicore on a desktop. I think there is. Even more so on a workstation.
More workstation uses come to mind (CAD, MATLAB): in all of them, you can have one core busy with the “main” work and one doing your desktop duties. Having two cores to handle interrupts should help I/O as well.
2006-08-18 8:18 pm by sbergman27
Well, just to be clear, I’m not saying that there is no use for multicore in a desktop box. If you’re a developer, it’s a big win. If you are running some sort of multi-threaded rendering software, or have multiple cpu intensive apps that run simultaneously, it can be a big win.
My point is that the hype around multicore desktops *far* exceeds the actual benefit for the vast majority of desktop users.
If we narrow that to “workstation users”, I imagine the picture looks quite different.
2006-08-18 7:50 pm by firl
For those who don’t know, Gentoo lets you specify the number of parallel jobs (and thus cores) to use while compiling.
And Gentoo has a live CD installer for the 32-bit environment that allows you to do a lot.
What I usually do is just browse the web while compiling using
What use does a workstation have for a dual quad-core configuration (8 cores)? Intel obviously will croak until they get CSI out. But outside the server space (at least if they can get the memory speeds needed), the software needs to catch up before there is a reason to add more cores each year…
Now give me less *heat* instead; then I would be happy.