Here’s something you probably don’t know, but really should – especially if you’re a programmer, and especially especially if you’re using Intel’s compiler. It’s a fact that’s not widely known, but Intel’s compiler deliberately and knowingly cripples performance for non-Intel (AMD/VIA) processors.
Agner Fog details this particularly nasty example of Intel’s anticompetitive practices quite well. Intel’s compiler can produce different versions of pieces of code, with each version being optimised for a specific processor and/or instruction set (SSE2, SSE3, etc.). The system detects which CPU it’s running on and chooses the optimal code path accordingly; this mechanism is called the CPU dispatcher.
“However, the Intel CPU dispatcher does not only check which instruction set is supported by the CPU, it also checks the vendor ID string,” Fog details, “If the vendor string says ‘GenuineIntel’ then it uses the optimal code path. If the CPU is not from Intel then, in most cases, it will run the slowest possible version of the code, even if the CPU is fully compatible with a better version.”
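To make the difference concrete, here is a minimal sketch of the two dispatching strategies. It is a simulation only, not Intel's code: the path names, the `Cpu` class and both functions are hypothetical, and the vendor-gated version models the simplest reading of Fog's description (he says the slow path is taken "in most cases", not always).

```python
from dataclasses import dataclass

# Code paths ordered from slowest to fastest. The names are illustrative only.
PATHS = ["generic", "sse2", "sse3"]


@dataclass(frozen=True)
class Cpu:
    vendor: str          # CPUID vendor string, e.g. "GenuineIntel"
    features: frozenset  # instruction sets the CPU reports supporting


def best_supported(cpu):
    """Return the fastest path whose instruction set the CPU reports."""
    for path in reversed(PATHS):
        if path == "generic" or path in cpu.features:
            return path
    return "generic"


def vendor_gated_dispatch(cpu):
    """Simplified model of the behaviour Fog describes: vendor checked first."""
    if cpu.vendor != "GenuineIntel":
        return "generic"  # slowest path, regardless of reported features
    return best_supported(cpu)


def feature_dispatch(cpu):
    """The alternative: trust the feature flags, ignore the vendor string."""
    return best_supported(cpu)


intel = Cpu("GenuineIntel", frozenset({"sse2", "sse3"}))
amd = Cpu("AuthenticAMD", frozenset({"sse2", "sse3"}))

for cpu in (intel, amd):
    print(cpu.vendor, vendor_gated_dispatch(cpu), feature_dispatch(cpu))
```

Both simulated CPUs report identical features, so any difference in the chosen path comes from the vendor string alone.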
It turns out that while this is known behaviour, few users of the Intel compiler actually seem to know about it. Intel does not advertise the compiler as being Intel-specific, so the company has no excuse for deliberately crippling performance on non-Intel machines.
“Many software developers think that the compiler is compatible with AMD processors, and in fact it is, but unbeknownst to the programmer it puts in a biased CPU dispatcher that chooses an inferior code path whenever it is running on a non-Intel processor,” Fog writes, “If programmers knew this fact they would probably use another compiler. Who wants to sell a piece of software that doesn’t work well on AMD processors?”
In fact, Fog points out that even benchmarking programs are affected by this, up to a point where benchmark results can differ greatly depending on how a processor identifies itself. Ars found out that by changing the CPUID of a VIA Nano processor to AuthenticAMD you could increase performance in PCMark 2005’s memory subsystem test by 10% – changing it to GenuineIntel yields a 47.4% performance improvement! There’s more on that here [print version – the regular one won’t load for me].
In other words, this is a very serious problem. Luckily, though, it appears that the recent antitrust settlement between AMD and Intel will solve this problem for at least AMD users, as the agreement specifically states that Intel must fix its compiler, meaning they’ll have to fix their CPU dispatcher.
The Federal Trade Commission is investigating Intel too, and it is also seeking a resolution of the compiler issue, but the FTC takes it all a step further than the Intel-AMD settlement. Since the latter only covers AMD, VIA could still be in trouble. Consequently, the FTC asks that Intel do a lot more than what’s described in the AMD settlement:
Requiring that, with respect to those Intel customers that purchased from Intel a software compiler that had or has the design or effect of impairing the actual or apparent performance of microprocessors not manufactured by Intel (“Defective Compiler”), as described in the Complaint:
- Intel provide them, at no additional charge, a substitute compiler that is not a Defective Compiler;
- Intel compensate them for the cost of recompiling the software they had compiled on the Defective Compiler and of substituting, and distributing to their own customers, the recompiled software for software compiled on a Defective Compiler; and
- Intel give public notice and warning, in a manner likely to be communicated to persons that have purchased software compiled on Defective Compilers purchased from Intel, of the possible need to replace that software.
Fog also offers up a number of workarounds, such as using GCC, whose optimisations are similar to those of Intel’s compiler, “but the Gnu function library (glibc) is inferior”. You can also patch Intel’s CPU dispatcher – Fog even provides a patch to do so in “Optimizing software in C++: An optimization guide for Windows, Linux and Mac platforms”.
This is a particularly nasty kind of anticompetitive practice, as it really requires deep knowledge of matters in order to find it out. God knows how many benchmarks have been skewed in favour of Intel simply because people unknowingly used Intel’s compiler in good faith. Intel’s compiler is seen as the cream of the crop and delivers superior performance, but apparently only if you stick to GenuineIntel.


Wow, I remember reading about this topic more than 5 years ago. I even used a (perl?) script that removed that check from a compiled library or the compiler itself. I still use Intel’s compiler, but each time I use it I’m aware that the compiler is a “Defective Compiler” on anything other than Intel. Luckily for me (or for Intel…) most development was on Intel machines.
I kind of gave up hope on a real fix for that. I am wonderfully surprised to see Intel might be forced to fix it! It is good news.
On the other hand, since the time I read about this, GCC has come back from crappy to almost on par with icc. And it’s always a good thing to compile/run on many compilers; you’d be surprised to see how many errors slip through which a certain compiler accepts!
Would it be illegal for AMD and VIA to just put “GenuineIntel” in the CPUID, and use another field for AuthenticAMD and CentaurHauls?
I look at it as comparable to Internet Explorer putting “Mozilla/x.x (compatible; MSIE x.x)” as the User-Agent. (Or Opera’s “Mozilla/x.x (compatible; MSIE 6.0; Opera x.x)” user-agent from a few years back.)
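For readers wondering where these strings live: the vendor ID is returned by CPUID leaf 0 as twelve ASCII bytes spread over the EBX, EDX and ECX registers, in that order. The sketch below only decodes register values passed in as integers; it does not execute CPUID itself, and the function names are made up for illustration.

```python
import struct


def vendor_string(ebx, edx, ecx):
    """Decode the 12-byte vendor ID from CPUID leaf 0 register values.

    The bytes are stored little-endian in the order EBX, EDX, ECX.
    """
    return struct.pack("<III", ebx, edx, ecx).decode("ascii")


def registers_for(vendor):
    """Inverse helper: the (EBX, EDX, ECX) values for a 12-character vendor ID."""
    if len(vendor) != 12:
        raise ValueError("vendor ID must be exactly 12 ASCII characters")
    return struct.unpack("<III", vendor.encode("ascii"))


# "Genu", "ineI", "ntel" as little-endian 32-bit values.
print(vendor_string(0x756E6547, 0x49656E69, 0x6C65746E))
for name in ("AuthenticAMD", "CentaurHauls"):
    print(name, [hex(r) for r in registers_for(name)])
```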
I highly doubt it’d be legal as “GenuineIntel” is most likely a registered trademark and all. Besides, doing that wouldn’t help for any of the processors already out there, only for new ones.
This compiler “defect” has been a really shitty move from Intel and it gives me yet another reason to stay away from their hardware. Just for the sake of lining their own pockets they intentionally cripple the performance of millions of end-users all around the world..
How exactly is intel “crippling” millions of users around the world?
I don’t particularly agree with this approach. But honestly, nobody is stopping AMD from developing their own compiler suite if they really care so much about all those “millions of users around the world”.
Microsoft’s compiler supports them well.
GCC supports them well.
LLVM supports them well.
Binaries made with the Intel compiler are AMD’s only issue, and making their own compiler would not fix that problem.
Try getting those developers to use it.
They never should have put it in place to begin with. Personally, I think they ought to have the shit slapped out of them.
It shouldn’t be though I suspect Intel would claim it was trademark or copyright infringement or perhaps fraud. So long as AMD, Via or whoever was clear they did so to the customer and/or allowed them to toggle the ‘feature’ it should be allowed.
It would undoubtedly be a violation of the x86 licensing agreement AMD and Via have with Intel.
I think AMD spoofing Intel’s hardware identifier is much closer to Palm spoofing Apple’s hardware identifier. In both cases, the reason for needing the spoof is pretty scummy.
The real solution would be to fix icc so that it’s no longer leveraged to impose intel chip lockin. Apple at least has some grounds for bundling iTunes/iPod though it really should be music player separate from media manager if it was really about the end user. Intel modifying a generic code compiler to cripple non-Intel; I’m not seeing any grounds for that.
Actually, I think the developers out there who do have to work with icc should all call Intel once a week, if not once a day, until it’s fixed. Overwhelming the call center should eventually get the point across.
That’s a misunderstanding and/or misrepresentation of the facts.
Intel’s compilers are Intel’s own work, and never were ‘generic code compilers’, they were always aimed strictly at Intel’s own products.
Intel didn’t ‘modify’ anything, nor did they ‘cripple’ anything.
Optimizing for anything other than Intel’s own CPUs was just never part of the goals of the Intel compiler suite.
You can’t ‘modify’ or ‘cripple’ something that has never been different anyway.
Intentionally using misinformation is for kids.
I’m open to my misunderstanding the situation though. Still reading along through the various discussions with an open mind.
(now, a thread between the very first poster, the one with a shovel and yourself could be interesting given the poster’s history of modifying icc.)
Short version:
Intel Compilers are a closed-source product, not based on any other products.
Hence they are not other products modified by Intel, and other people cannot modify them either.
What is described above, with perl scripts, is modifying some of the code that the Intel compiler generates for checking CPUs.
So it’s not the Intel compiler itself that is modified, but rather some of the code that it generates.
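As a rough illustration of what such a script has to locate: a vendor check compares the CPUID registers against the three 4-byte pieces of "GenuineIntel", so those pieces tend to show up as constants in the generated code. The sketch below is a read-only scan under that assumption. It is not the perl script mentioned above and not Fog's patch; the function name is hypothetical, and a hit is only a candidate, not proof of a dispatcher.

```python
# The three 4-byte pieces of "GenuineIntel" as they would appear in memory.
VENDOR_CHUNKS = (b"Genu", b"ineI", b"ntel")


def find_vendor_constants(blob):
    """Return {chunk: [offsets]} for every occurrence of each chunk in blob."""
    hits = {}
    for chunk in VENDOR_CHUNKS:
        offsets = []
        start = blob.find(chunk)
        while start != -1:
            offsets.append(start)
            start = blob.find(chunk, start + 1)
        hits[chunk] = offsets
    return hits


# Made-up sample bytes shaped like three compare instructions, for demonstration.
sample = b"\x81\xfbGenu\x90\x81\xfaineI\x81\xf9ntel"
print(find_vendor_constants(sample))
```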
I understand your point now, after having read all of your other posts in this topic. You could have saved us all a lot of confusion by clearly explaining in this comment ( your first one) why simply just checking feature specific flags might not be the best way to optimize processors.
You’re actually two steps ahead of most posters, but because of that you seem like you’re a step behind.
Story of my life
People don’t seem to be listening anyway. Agner Fog is actually saying the same thing as what I’m saying, but it seems that people skip over the details of family/microarchitecture selection, and cry foul.
They just want to hate Intel, they don’t want to understand the problem.
Basically I’m saying just three things here:
1) I agree that Intel’s CPU dispatching isn’t the best solution for non-Intel CPUs.
2) Since Intel’s compiler doesn’t have a significant marketshare, I think it’s Intel’s right to do what they do, and no government or other organization should have the legal pull to change that. It would basically mean that any company can sue any competitor over anything they don’t like, and I don’t think anyone should want that.
3) I think it’s a big problem that tech/news sites such as this have editors who don’t check their facts properly, and just post false accusations and lies, and even try to defend them when someone points them out. The editor of this article has clearly not understood Agner Fog’s article entirely, and has drawn false conclusions, and has already convicted Intel for it.
To all editors of all tech/news sites everywhere:
Guys, lose the ego. You can’t be an expert in every field, so nobody expects you to fully understand everything tech-related. It’s okay to consult people who are experts in a specific field.
In fact, I think it is your RESPONSIBILITY to have your stuff checked by others who can verify the technical details in your article BEFORE you post it online.
Now if there need to be any lawsuits, I think it’s this. Far too many websites just throw rumours and lies around, and sling mud towards various large companies (the companies people love to hate, such as Microsoft, nVidia, Intel).
Edited 2010-01-05 09:08 UTC
If you don’t like the code generated by the Intel compiler… don’t use it. Why should they be forced to pay attention to competitor’s products and make *their* compiler compatible with them unless ICC customers demand it? Using the most generic path is the only practical option when not knowing the specifics of the architecture.
The only thing anti-competitive here is people advocating Intel be made to do what they want them to by force.
You’re totally off base here. First of all, Intel themselves market ICC as being compatible with AMD and VIA processors. Secondly, even if the compiler didn’t do any kind of architecture-specific optimizations, it could still choose the most appropriate path based on the CPU’s reported capabilities, e.g. if it supports SSE3, choose the path which uses SSE3. ICC intentionally chooses the slowest path for any processor other than GenuineIntel, even when the CPU reports its capabilities correctly.
Marketing it as completely compatible compiler and then pulling off such tricks actually IS anti-competitive.
Edited 2010-01-03 21:15 UTC
No such thing. Intel markets their compiler as being compatible with the x86 and EM64T (and IA-64) instruction sets from Intel processors. Have you even used icc?
Technically they still produce compatible code. Nowhere in intel marketing/literature they claim to produce optimized code produced for AMD microarchitectures.
The continual moving of the goal posts in order to fit some narratives can be fascinating.
Intel develops compilers for intel processors. Is that such a hard thing to comprehend? Or is there any sort of entitlement from your part that would bind intel to spend money and effort to schedule instructions for a competitor’s part?
Such attitudes are even the more ridiculous, if you consider that there is a perfectly viable (and in most cases quite competitive) alternative like gcc which is completely free. Good grief….
Edited 2010-01-03 22:51 UTC
If they developed a compiler that produced code optimized for intel cpus, but which would execute exactly the same code on compatible non intel cpus…
Such an example, would be a version of gcc where the cpu type options for non intel cpus have been removed.
What the article talks about, and what people have a problem with, is the fact that the intel compiler intentionally chooses a less optimal approach when dealing with non intel cpus.
What are you talking about?
Intel only guarantees to produce optimized code for intel microarchitectures. For starters, AMD will most likely refuse to share a lot of internal microarchitectural info with Intel. Heck, part of the reason why GCC will always lag in certain performance scheduling is because they don’t have access to the same privileged internal information that intel themselves have about their microarchitectures.
All I see is a bunch of comments by people who most likely have never used ICC in a production environment. Is it a douche move, probably, but nowhere does intel claim to produce optimized instruction schedulings for non-intel microarchitectures. Note that I used the term microarchitecture, not ISA, I assume a lot of posters in this site don’t fully get the difference between the two.
As I said, nobody is stopping AMD from producing their own compiler optimization (although I am aware they provided a lot of support to the GCC folks) suite, or paying intel to support their architectures in ICC.
Edited 2010-01-03 23:18 UTC
Where is this mentioned explicitly in the compiler documentation? I am not saying it is not there, just that I have not found it.
In their product brief they have this quote “The Intel® compiler generated faster code than other compilers for most of our tests on both IA-32 and x86_64 platforms, which helps us deliver the performance our customers demand.”
See how it mentions x86_64 platform and not Intel EM64T processors only? Granted this quote is not by an Intel employee but it is in their marketing material!
Why is this relevant? They do not need to! Just using the right ISA extensions is all developers are asking for! You are deliberately twisting the facts here. Intel does not have to invest anything because code generated for Intel processors which is just using the right ISA extensions would do. Nobody is asking for micro-architectural optimizations, just that they use the right ISA extensions.
GCC has very different problems than this.
Yeah right, the “everyone is a moron except me” mentality. I have used the ICC compiler in commercial products (care to find me another decent FORTRAN compiler?) and I really do not get why customers are standing for this. It is not as if the compiler is free or anything. Intel is locking their compiler customers into their architecture. Recompiling older software is mostly something to be avoided, so when system upgrades get discussed, guess which platform scores best?
I understand the difference quite well and I am not asking for optimized instruction scheduling on AMD processors, just that they use the right friggin ISA extensions.
That would be great. An AMD compiler for AMD processors and an Intel compiler for Intel processors. Guess x86 is not such a great “standard” after all. The lesson to be learned here is do not use the Intel compiler and support GCC and LLVM. I have a very high regard for Intel’s technological achievements but I despise their business tactics.
In the product brief, see my first post in this thread.
They specifically mention Intel platforms.
So a client not mentioning Intel nor AMD, but just IA-32/x86-64, is your argument?
Not very convincing. It could still be in the context of Intel CPUs only (which it most probably is).
Not in the way you think it is.
x86 standardizes the ISA, but there are tons of completely different x86-compatible microarchitectures.
You can’t optimize code ‘for x86’, you can only optimize for a specific microarchitecture, as the x86 ISA doesn’t say anything about how instructions are supposed to be implemented, let alone how they should perform.
There are plenty of examples of instructions/operations that are very fast on one x86 march, but very slow on another.
I am very much aware of this problem, and this is exactly one of the problems of x86. There are so many ways to do exactly the same thing, all with different performance characteristics, that in order to get good performance out of x86 you have to optimize for a specific micro-architecture much more than you have to on some other ISAs. For example, Core2 is much better at read-modify-write instructions than K8/9/10, which are faster if you use a load/store approach.
But again, this is not what I was referring to. If a CPU reports to have SSE3, then use it. I do not think Intel should optimize their code generator for AMD CPUs but their CPU dispatcher should not look at vendor string but use the available ISA extensions as reported by the CPU.
Perhaps, but strictly speaking it doesn’t change anything, does it?
Eg, say Intel would compile a certain SSE3-routine with a lot of read-modify-write operations, for Core2. Then people will still complain that it’s unfair to AMD, and they should use load/store code instead.
And who’s to say what’s going to happen with future architectures? (Just like Pentium-optimized code was often a drama on Pentium Pro/PII, and Pentium 4 performed poorly on anything that wasn’t specifically optimized for it). With every major march-revolution, the rules of optimization change completely, with potentially disastrous results.
The real problem is not with the Intel compiler, but rather with people using it on non-Intel systems.
The solution is also not with Intel and modifying their compiler, but rather with these developers switching to a more neutral compiler.
Edited 2010-01-04 12:46 UTC
In my opinion it does since I think that Intel is deliberately crippling AMD’s performance by using for instance x87 when SSE could be used. You are talking about micro-architecture optimizations, while I am merely referring to using the extension mechanism correctly. These are two separate issues.
The ISA extension mechanism is there so compilers can query the CPU’s extensions, and it should be used as such. Changing the CPU dispatcher would cost Intel nothing and everybody would be better off.
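As a sketch of what "querying the extensions" looks like: CPUID leaf 1 reports SSE and SSE2 in bits 25 and 26 of EDX, and SSE3, SSSE3, SSE4.1 and SSE4.2 in bits 0, 9, 19 and 20 of ECX. The code below decodes register values supplied as plain integers and picks the most capable path without looking at the vendor. The function names and the path ordering are assumptions for illustration, and a real dispatcher would also have to confirm that the operating system saves the SSE register state.

```python
# (register, bit) positions of the SIMD feature flags in CPUID leaf 1.
FEATURE_BITS = {
    "sse": ("edx", 25),
    "sse2": ("edx", 26),
    "sse3": ("ecx", 0),
    "ssse3": ("ecx", 9),
    "sse4.1": ("ecx", 19),
    "sse4.2": ("ecx", 20),
}

# Most capable path first; "x87" is the fallback when no flag is set.
PREFERENCE = ["sse4.2", "sse4.1", "ssse3", "sse3", "sse2", "sse"]


def reported_features(ecx, edx):
    """Return the set of feature names whose bit is set in the given registers."""
    regs = {"ecx": ecx, "edx": edx}
    return {name for name, (reg, bit) in FEATURE_BITS.items()
            if (regs[reg] >> bit) & 1}


def choose_path(ecx, edx):
    """Pick a code path purely from the reported feature flags."""
    features = reported_features(ecx, edx)
    for name in PREFERENCE:
        if name in features:
            return name
    return "x87"


# A hypothetical CPU reporting SSE, SSE2 and SSE3.
print(choose_path(ecx=1 << 0, edx=(1 << 25) | (1 << 26)))
```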
The problem with micro-architecture optimizations are here to stay and I see this as a separate issue. I do not expect Intel to have profiles in their compiler to optimize for AMD processors.
No, they aren’t separate issues.
I’ve already given real-world examples where optimized code on one micro-architecture would cripple another micro-architecture.
Would everybody be better off? I doubt it.
In fact, there are examples of AMD CPUs being faster with x87 code than with SSE2, as AMD’s SSE2-implementation was never very good anyway.
Please do us all a favour and read the linked article. The article clearly states that Intel deliberately disables the fast code paths on non-GenuineIntel CPUs. So they disable SSE2, SSE3… although these CPUs are perfectly capable of them and have the respective flags set. Also, with respect to Intel only making a compiler for Intel processors:
from http://software.intel.com/en-us/intel-compilers/ :
(my emphasis).
So yes Intel is claiming that the compiler is compatible with AMD and VIA processors. And that does not mean deliberately crippled.
It is compatible, the code generated runs on compatible processors. What they do not claim however is that the compiler will optimise for those processors.
The following snippet is taken from http://software.intel.com/en-us/articles/intel-c-compiler-111-relea…
Click “English” under Intel C++ Compiler Professional for Windows to get the PDF
1.3 System Requirements
For an explanation of architecture names, see http://software.intel.com/en-us/articles/intel-architecture-platform-terminology/
A PC based on an IA-32 or Intel® 64 architecture processor supporting the Intel® Streaming SIMD Extensions 2 (Intel® SSE2) instructions (Intel® Pentium® 4 processor or later, or compatible non-Intel processor), or based on an IA-64 architecture (Intel® Itanium®) processor
o For the best experience, a multi-core or multi-processor system is recommended
Notice where it says “compatible, non-Intel processor”?
Compatible means supporting the instruction set so the only check the compiler should make is for the existence of (sets of) instructions.
An intentional check for the CPU manufacturer is discriminatory unless it’s a workaround for a known bug, hey like that famous Pentium bug.
But why stop there – how about not enabling advanced features for non-Intel NICs and WLAN adapters. And what about SSDs? No need for the non-Intel ones to operate optimally, right?
Notice that these are the system requirements? That’s what it takes to RUN the Intel Compiler, not the architectures it TARGETS.
Intel is perfectly right in stating that their compiler works on non-Intel CPUs.
The problem is if a company, say Oracle for example, uses Intel’s compiler to create one set of binaries, these binaries will run slower on AMD machines rather than on Intel’s. It is in companies’ interest to produce one set of binaries that run on all platform, and Intel is now using their monopoly position to ensure that those binaries are crippled when run on AMD machines by providing a faulty compiler.
This is like Microsoft releasing a version a Windows on which a previous product won’t run. Wait, did anyone say Lotus?
One could argue Microsoft doesn’t have to make sure that Lotus can run on its operating system, but once Microsoft is a monopoly, then the rules change. And so do they for Intel.
Nope, because Intel does not have a monopoly on the compiler market.
I don’t think you can ever really have a monopoly on a compiler anyway.
With an OS or CPU it’s different. An application for Windows/x86 simply won’t run on other systems.
With a compiler… it doesn’t matter. Whether I use the Intel compiler, Microsoft compiler, gcc, or some other option, as long as it generates code for the intended OS and CPU architecture, it will work.
So you can’t have any kind of lock-in with compilers. Even if the rest of the world uses the Intel compiler, nothing would stop me from using gcc.
That’s different from an OS or a CPU, where you will be locked out of software that isn’t compatible.
Most importantly, AMD and VIA are LICENSED to use the SSE2/3 etc. technology from Intel and design their processors specifically to perform well with the ICC-generated code that the largest software vendors use. Intel is basically reneging on its own cross-license agreements for the technology by sabotaging the performance of other companies’ products.
Well, there are two major issues that make this wrong. First, Intel isn’t telling its customers that their compiler deliberately nerfs non-Intel performance; this is most certainly ethically wrong and may violate some consumer rights, as you do have a right not to be lied to by corporations. Second, Intel is a monopoly under anti-trust investigation. Monopolies are held to higher ethical standards so that they do not use their power to squash competition by means other than legitimate competition. Doing a vendor ID check in their compiler would fall under that kind of violation. An x86 compiler should be an x86 compiler; if their chips are better, they shouldn’t have to cripple their competitors to have an advantage.
If Intel had been honest from the beginning this may not have been an issue, but since they lied and they are currently under anti-trust charges, forcing them to stop vendor string checking in their compiler would be a reasonable part of their settlement.
Where, please tell us, where does Intel claim (or have they ever claimed) to produce optimized code for AMD processors with icc?
If the same code happened to run slower on an AMD CPU, that would not be a problem.
What is happening is that they are giving the non-Intel chip different code to run. This is very different from just not optimizing for non-Intel platforms.
That is not entirely true.
*Certain* Intel CPUs will also run that same code.
It’s not code that is *specifically* there to spite AMD.
It’s just code that is there as a fallback for when the CPU is not recognized (which could just as well be an older/newer/unsupported Intel CPU).
A properly commented disassembly of Intel’s CPU dispatcher should show you how it works.
…certain Intel CPUs 10 years old, generally, that do not support the features.
That’s a good excuse for having fallback code paths, but not for checking for GenuineIntel to see whether a faster one should be used.
Or vice versa… new Intel CPUs running code compiled with an Intel Compiler that didn’t support it.
It’s not about the features, but about whether the compiler recognizes the particular CPU or not.
If it doesn’t recognize the CPU, it can make no decision about which codepath would be most optimal. It’s that simple.
Obviously it doesn’t recognize non-Intel CPUs by default.
And obviously Intel tries its best to:
1) Recognize as many Intel CPUs as possible with compiled code
2) Make sure that new CPUs remain recognizable.
It’s very simple, really.
No matter how badly some of you WANT the Intel compiler to check features, this is not what it DOES. Never has, never will.
Checking features and checking microarchitectures are two different concepts.
Eg, PPro, PII and PIII share the same basic microarchitecture, but NOT the same features.
Conversely, Pentium D and Core2 Duo share (nearly) the same features, but NOT the same microarchitecture.
Code optimized for PPro will be optimal for PII and PIII as well, although in some cases you may be able to use newer extensions.
Code optimized for Pentium D will NOT be optimal for Core2 Duo, and vice versa. In fact, Pentium D has quite a few performance hazards that a Core2 Duo doesn’t. So not avoiding these hazards (in optimized Core2 Duo code) will cripple a Pentium D. In fact, most of its life, the Pentium IV/D was crippled by having to run code that was compiled for PPro/II/III.
Simple, isn’t it? Repeat after me: it’s not about features.
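The distinction drawn above can be sketched in code. Besides feature flags, CPUID leaf 1 returns a signature in EAX that encodes family, model and stepping, and a dispatcher that selects by microarchitecture would key on those fields and fall back to a generic path for anything it does not recognize. The decoding rules below follow the documented CPUID layout; the lookup table and path names are illustrative assumptions, not Intel's actual dispatch table.

```python
def decode_signature(eax):
    """Decode (family, model, stepping) from the CPUID leaf 1 EAX value."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended fields only apply for certain base family values.
    family = base_family + ext_family if base_family == 0xF else base_family
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model
    return family, model, stepping


# Illustrative entries only: (family, model) -> hypothetical tuned code path.
TUNED_PATHS = {
    (6, 15): "core2-tuned",
    (15, 6): "netburst-tuned",
}


def path_for(eax):
    """Select by microarchitecture; unrecognized CPUs get the generic path."""
    family, model, _stepping = decode_signature(eax)
    return TUNED_PATHS.get((family, model), "generic")


print(decode_signature(0x000106A5))  # a family 6 part using the extended model field
print(path_for(0x000006F6), path_for(0x000106A5))
```

Note that the second signature is a family 6 CPU too, yet it falls through to the generic path simply because the table has no entry for its model, which is the "not recognized" case described in the comment above.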
Edited 2010-01-05 10:41 UTC
Stop using the ‘optimisation’ straw man.
People have repeatedly said here that the issue is the ISA extensions being intentionally disabled. Either you can’t read or you’re deliberately misrepresenting people’s arguments.
Your analysis seems to assume that an extreme form of selfishness is good for society and good for Intel. Fortunately for the rest of us, very few people make this assumption.
And your reply constitutes a massive red herring, and it is equally as invalid.
Since when has there been such a level of entitlement regarding compilers? Do you go around trashing The Portland Group because their compiler does not produce the code you feel entitled to for your processor (even though you haven’t paid for their products)? Should we trash IBM because their XL compiler suite does not produce optimal code for the latest embedded core by Freescale?
It’s only a “massive red herring” if you happen to agree with the underlying philosophical assumptions and the social implications of behaving so selfishly. The rest of us are happy to see constraints on such behavior in the form of laws, social pressures, etc.
Using the analogy of an ecosystem, Intel acted as if they were dumping their pollution into another country’s river system.
This kind of practice in the software and hardware industry should be both illegal and socially unacceptable no matter who does it.
disclaimer: I am not a programmer.
but in the article there is a link to a benchmark (pcmark2005), apparently compiled with icc. now as far as I got it they tested via’s nano against intel’s atom, and the funny thing is that in various tests, after changing the CPUID on the nano, the performance gain was up to about 50%, with no change of the compiler or compiled software whatsoever, simply by telling the compiled code to use a path that was optimized for SSE3 and whatnot.
as far as I know those things are standards, and if AMD didn’t implement them correctly, they wouldn’t be able to report the capability of using SSE3. the argument was that in spite of the CPU reporting those capabilities correctly, and with no additional work needed from the compiler, it would choose based on the CPUID rather than the actual capabilities, resulting in Intel having to tweak their CPUID to get the best performance with software compiled on earlier ICC versions. again, no recompiling is necessary, the code is all there, but which path gets executed is chosen based on the CPUID, not the actual capabilities of the CPU (there are flags for those too, as you should know).
The issue is not some oddity within the non-intel processors which Intel should be forced to recognize.
The issue is Intel intentionally writing icc so that it would introduce incompatibility in the resulting binary so that it runs worse on other processors.
Think of a company that sells coffee makers. They also produce coffee and recommend its use with their own makers of course (nothing wrong so far). They add a chemical identifier into their own coffee which can be recognized by the various models of maker the company sells. One can run any brand of coffee grounds through the maker but it’s designed so that it introduces a health risk into competing brands. You want a refill on that cup a joe?
(edit): my analogy was a little off. It would be like the coffee maker intentionally taking twenty minutes longer to brew through grounds based on them being competitive brands.
Intel should just fix icc and make the issue AMD/VIA not supporting the optimized code rather than this “we get fast path, they get slow path” crap.
Edited 2010-01-03 23:37 UTC
Still wrong.
I think what you’re looking for is something like this:
Last year’s models of coffee makers took 20 minutes to brew.
The new models have a special ‘turbo’ mode, which cuts the brewing time down to 10 minutes.
Competing brands have also added a similar ‘turbo’ mode. However, the coffee of this particular brand will only recognize the ‘turbo’ mode of their own brand of coffee makers, and other brands, like older models of their own brand, will only use the standard 20 minute mode.
There is a difference between ‘not enabling’ and ‘disabling’.
The former is NOT taking action, and the latter is explicitly TAKING action.
Intel only checks to see if the CPUs are their own brand, and then selects an optimized path. That’s different from checking to see if they are any other brand, and specifically selecting a crippled path.
Edited 2010-01-03 23:49 UTC
A fair enough analogy adjustment. I’m learning as I go here and open to correction.
What you are missing is that SSE etc. are standards and the processor has to advertise its abilities, but Intel is deliberately ignoring them unless the processor is an Intel one. So in your analogy it would be:
All coffee maker companies had agreed on a TURBO (actually TURBO1, TURBO2 …) standard and on how the coffee tells the machine which of the TURBO standards it supports. Now the Intel coffee maker checks which TURBO standard the coffee supports (note it also needs to do that for its own coffees, because some of them only support TURBO1 but not TURBO2). However, it also checks the vendor string, and only if the coffee is Intel coffee does it actually enable the TURBO standards.
Now if I bought that machine and it told me it supports coffees and the TURBO standard, but it would only brew coffee in under 20 minutes with its own coffee, I’d be seriously pissed off because they clearly have been lying.
I’m not missing anything. I think most people here are missing what the goal of the Intel compiler is. The goal is not to deliver ‘standard’ code that runs ‘reasonably well’ on all x86-compatible architectures.
The goal is to generate the most optimum code for Intel processors.
“The goal is to generate the most optimum code for Intel processors.”
Which would be fine if Intel existed in a vacuum, but Intel has licensed its ISA and extensions to others.
It would be nice if Intel would/could look beyond their own lawn and see that good performing code on non-Intel processors (who support the right instruction set) is beneficial to the x86 ecosystem (and in extension good for Intel itself).
Right now, they let their compiler collection create binaries which only run at maximal potential on Intel Processors and as such they burden end-users of non-Intel procs with extremely generic code. Only a problem with proprietary non-touch code, but still.
The other way around, I wonder how fast Intel would cry foul if GCC and LLVM started checking for AuthenticAMD and CentaurHauls and, based on that, used the extensions on AMD and VIA CPUs while serving only lowest common denominator code to Intel…
Intel was FORCED to license its ISA. It wasn’t and isn’t an action they support.
Yea, it would be nice if Microsoft also made linux ports of all their applications and libraries.
Obviously that is not going to happen.
What is the problem with that? The Intel Compiler and optimized libraries are mainly aimed at scientific research, where hopefully people are smart enough to only use these products on Intel systems.
If end-users are burdened because someone uses the Intel Compiler for a commercial product that is to be sold to AMD users as well, then it’s the fault of whoever made the choice to use the Intel Compiler rather than a neutral compiler.
It sounds like most people here try to convict the storekeeper for selling the rope that someone used to hang himself.
Edited 2010-01-05 19:43 UTC
Wow. Just, wow. Normally I wouldn’t even bother with commenting on an article like this as I’m not a programmer. However, this affects everyone who has ever used a program compiled with ICC on a non-Intel platform. As the majority of my non-Mac systems (past and present) have been AMD, this does indeed affect me.
This is really no different than a jockey drugging the food supply of his rivals’ horses. I think Intel should be punished much more severely than it seems they will be.
Honestly Intel, if you have to resort to actively hobbling your competitors, what does that really say about your confidence in your own products?
The only problem here is that it’s hard to be too outraged. Everybody’s known about this flaw for years. Yeah, it’s dumb, and I can’t think of a good reason for it (besides sloppiness or laziness). And yeah they should want to fix it, it’s just too stupid to leave in. But if you think nobody knew about this forever, you’re mistaken. Some bugs (even the silly but annoying ones) take forever to get fixed, and that’s IF they ever get fixed!
I didn’t know about it until now. I’m not a programmer, so I don’t always read about things like this unless they show up in my news feed (as this did today). And I realize that this is a known issue for a lot of people.
But I’m still going to be outraged, whether I have your permission or not. It’s a shitty thing for Intel to do, and a slap in the face to people like me.
Tell me the truth: If your spouse had been cheating on you for years, all your co-workers knew about it but somehow you never found out until today, would you honestly say “oh well, I can’t be mad since I didn’t know about it from day one”? That’s just silly.
I am for what the subject says.
This is the last straw however, and for many years I rationalized that the performance increases on a year to year basis from iNTEL offset its monopoly position.
This is just another example of GREED in a time when many people are already sick and tired of greedy bosses, greedy companies like iNTEL ripping them off and greedy bankers stealing little old ladies pensions and a government that not only looks the other way, but encourages the practices with rich cash rewards larger than the GDP of some countries.
What we need is a complete revolution in the areas of government, businesses and scientific research that adopts a GPL mantra of some sort. Open practices, peer reviewed government, technology and business.
That being said, iNTEL is a large company. Whatever manager was enforcing this despicable practice should be blackballed, and never permitted again to work in the computing industry.
-Hack
Sadly, the guy who kept this bug in place for so many years would be snapped up by the next employer rather than blackballed after being fired from Intel. Think of it from the business side and tell me Corp2 isn’t interested in a staffer from Corp1 who managed to keep a known flaw in place for so long. I’d be hard pressed not to pat that guy on the back if it was my company, and I pretty much see everything from the point of view of what benefits the end user rather than the shareholder.
Nope, the more likely outcome of such a person existing and being fired would be “What, they let that guy go? Someone get me a conference call with him, HR and me.”
It’s right there in the product brief:
http://software.intel.com/sites/products/collateral/hpc/compilers/c…
“Each compiler delivers advanced capabilities for development of application parallelism and winning performance for the full range of Intel® processor-based platforms.”
Not a word in the product brief about non-Intel CPUs, or other x86-compatibles or anything.
“and support for Intel® processors and compatible processors.”
See the “compatible processors”? Taken straight from their own website, in the freaking front page: http://software.intel.com/en-us/intel-compilers/
The fact is, they do advertise it and sell it as a compiler for Intel and compatible processors, and of course software developers want a compiler which produces an executable which works equally well on all processors so they don’t have to distribute several copies, one for each CPU manufacturer.
Intel has brought this all on themselves; if they’d clearly said that ICC has performance issues on anything other than Intel processors, then developers would’ve known about that and would have either chosen another compiler which produces acceptable performance across all CPUs, or could have opted to use two compilers and distribute two binaries. Now those developers who have bought ICC and used it to compile their software will have to recompile it all and somehow distribute those new, fixed binaries to their customers. That’s a lot of unneeded hassle, and the bigger the product the costlier it is to have to recompile it.
Well, the code does work on non-Intel CPUs. But Intel makes no claims about how optimized the code is for these CPUs.
Very true. Are people expecting Intel to know or work out the optimised code paths for processors they don’t manufacture?
I don’t know the details of this because I don’t use the product, but reading the methodology employed it seems this is nothing more than the lowest common denominator method, which is an accepted method for making sure something works if you don’t know the optimal method. Have you ever followed a “take apart” guide to remove a specific component that in fact had you pulling the whole thing apart when you really only wanted to get that one component? You got to the end and thought “damn, I didn’t need to take half that crap out!” But it still delivered you the component you wanted, didn’t it?
Unless Intel used the words “Optimised for…” or something similar when referring to processors from other manufacturers to describe their compiler they really shouldn’t have a case to answer. On the other hand if they did make such claims then the outcome has been appropriate.
Agreed. I have used both ICC and GCC.
It is not like Intel is inserting one billion no-ops if they detect an AMD processor. They simply disable some of the optimizations because they have no clue about the micro details (or they don’t want to bother) of a processor they do not manufacture.
That is why we use math kernels from AMD in AMD processors. I don’t see anyone freaking out because AMD does not optimize their math libraries for Intel processors. What a concept, eh?
I’ll go even further than that…
In theory, enabling a certain optimized codepath without any regard to the underlying microarchitecture could actually result in worse performance rather than better performance.
I’ve seen it many times myself. Not just Intel vs AMD, but especially in the days of 486 -> Pentium, Pentium -> P6 and P6 -> P4… code that was optimal for one microarchitecture could be disastrous for another one.
Which is why many compilers support the concept of ‘blended’ code. Obviously that is not a goal for Intel’s compiler. It has a very specific goal: generate the most optimal code for specific Intel microarchitectures.
There is just no way for Intel to win this. People will just claim that it’s crippling AMD when it selects a different codepath, and it turns out to be suboptimal anyway.
The only way for Intel to win is to support AMD’s microarchitectures specifically, but obviously that is not going to happen.
Edited 2010-01-03 23:35 UTC
Don’t drag “microarchitecture” into this – it’s about INSTRUCTION SETS.
Need I point out that IT WAS AMD who first extended the x86 arch to 64 bits when Intel was still drinking the Itanic KoolAid? Let’s see them change the compiler to not optimize for any AMD-compatible instructions – won’t that be a laugh.
Let me spell it out for you – all modern x86 CPUs are performing sleight of hand as a single x86 instruction gets broken into 1 or more smaller RISC-like micro-operations.
So, since Intel has no insight into the “microarchitecture” of AMD’s CPUs, then how is it that ICC can create code for non-Intel procs at all?
I (and many, many others) have said it before and it bears repeating – the only checking should be for instruction sets. If the CPU says “I do SSE3” then the compiler shouldn’t need to know if it’s GenuineIntel or AuthenticAMD – unless it’s a workaround for a KNOWN FLAW, and I’m comfortable with Intel not changing their compiler to accommodate AMD bugs.
The whole argument is useless as Intel doesn’t support its OWN CPUs any more than non-Intel ones, unless it recognizes the family info.
Non-Intel CPUs are NOT a special case.
The issue here is that you and all other people crying ‘unfair’ WANT the Intel compiler to be about instruction sets and extensions, but it ISN’T. That’s a simple fact.
I’m not saying that’s how I think it SHOULD be, but it is how it is. Even the original Agner Fog article explains that.
Edited 2010-01-05 08:45 UTC
This is different. This is Intel SPECIFICALLY writing ADDITIONAL code to REDUCE performance for AMD processors. This is not a bug or a case of “we don’t understand AMD processors” – this is something discussed by management, passed on to the team lead, and executed by actual, code-producing developers.
And that’s not something Intel can get away with – in the same way we would scream bloody murder if Microsoft intentionally crippled Firefox’ performance on Windows.
No it isn’t. You make it sound like Intel went out of their way to add an extra “cripple AMD” codepath, and specifically select that path ONLY if non-Intel CPUs are present.
AMD CPUs just run the same non-SSE codepath as older Intel CPUs without SSE extensions would.
Now let’s discuss whether people who are editors on reasonably large tech sites such as ‘osnews.com’ should be able to get away with posting such false accusations…
Really, no offense… but as an editor of this site, I think you have a responsibility to check your facts a bit better, and not be so quick to throw accusations around. Before you know it, it is copied by thousands of other websites.
You study journalism… you are familiar with the concept of ‘hoor en wederhoor’ (hearing both sides)?
Edited 2010-01-03 23:56 UTC
Uhm…
Got it?
In other words, it checks from whom the CPU is, and then deliberately chooses the slowest code path, even though more optimal ones can be used. You HAVE to code this INTO the program, and as such, this has been A DECISION.
Look, read the original article before accusing me of lying.
Edited 2010-01-03 23:57 UTC
I got it, but I don’t think you did.
The CPU dispatcher can only dispatch codepaths that are compiled into the binary (obviously there is always a CPU dispatcher, as it needs to prevent CPUs from running code they don’t support… E.g., not all Intel CPUs support SSE4 yet, so they need to have a fallback to SSE3 etc., and a Pentium 4 will need different optimizations than a Core2 Duo, because of massively different microarchitecture).
You claim that there is an additional ‘cripple AMD’ codepath compiled into the binary.
This is not true.
So my previous post remains. It will just pick one of the different Intel-optimized paths.
Thing is ‘slowest possible’ and ‘better version’… those claims are up for grabs. The compiler doesn’t know anything about non-Intel microarchitectures in the first place. How slow the chosen path is, and how much better other paths would be, that is strictly up to chance.
Sure, in most cases it will probably pan out that way… but it’s not a deliberate action.
You need to step lightly in this sort of subject.
Edited 2010-01-04 00:05 UTC
No, I did not make such a claim.
No, it doesn’t just pick a different one. It specifically picks THE SLOWEST one – not “just” another one – the SLOWEST one.
You did, perhaps you didn’t realize it.
You said:
“This is Intel SPECIFICALLY writing ADDITIONAL code to REDUCE performance for AMD processors.”
But as I tried to explain to you, Intel didn’t add anything. The CPU dispatcher was always there, as were all the codepaths.
Wrong again. It doesn’t ‘specifically’ pick the ‘slowest’ one.
It just picks the ‘vanilla x86’ codepath. This may or may not be the slowest code. The decision isn’t based on performance, but simply on the fact that it doesn’t know any details about the microarchitecture and as such cannot make any decision about performance at all, neither the best nor the worst performance (as I said elsewhere, I still have fond memories of software that I had once optimized for Pentium, which would cripple performance when run on a newfangled Pentium II. Did I know it would cripple Pentium II? No, I didn’t have any knowledge of the microarchitecture when I optimized the code).
I’ll just repeat what I said a few posts above:
There is a difference between ‘not enabling’ and ‘disabling’.
The former is NOT taking action, and the latter is explicitly TAKING action.
Intel only checks to see if the CPUs are their own brand, and then selects an optimized path. That’s different from checking to see if they are any other brand, and specifically selecting a crippled path.
This is getting to be more of a discussion of semantics than of technology… and I expect a student of journalism to be able to interpret the logical difference between these formulations properly.
So the discussion ends here.
Edited 2010-01-04 00:22 UTC
But that’s the crucial point: it does know about the processors supporting SSE…, because all of them (Intel, AMD, VIA …) have specific CPU flags which say “I support SSE…”. Intel is deliberately ignoring them for non-Intel CPUs.
Wrong!!! Intel still checks if the CPU supports SSE, SSE2, SSE3 … but only if the CPU is also a GenuineIntel CPU does it also enable the relevant instructions.
No, I’m right, and I’m tired of people who don’t know what a microarchitecture is, who try to tell me I’m wrong.
Instruction set capabilities aren’t the ONLY thing that matters.
Writing optimized code for a Pentium 4 SSE2 architecture is vastly different from a Core i7 architecture for example.
So you’ll want to check for SSE2 capabilities AND for the exact microarchitecture, and run the proper SSE2-path for the proper microarchitecture.
You sure? Seems it’s Thom that got it, not you.
Fine, you got this. So if the processor says it supports, let’s say, SSE3, it should run the codepath optimized for SSE3, right?
So why then does it also check whether it’s an Intel CPU, and choose a generic codepath if not, rather than the one based on the actual capabilities of the CPU, like support for SSE3?
The brand of the microarchitecture is not important, the capabilities are. As you already described, this is how different generations of Intel CPUs are handled.
Not really. When extended instruction sets exist, it’s highly unlikely a codepath making use of those will be slower than a codepath using generic instructions.
Yes I’m quite sure.
By the way, I’m the guy who wrote the CPUInfo library: http://cpuinfo.sf.net
I like to think I know a thing or two about determining different microarchitectures and selecting the most optimal codepath.
Not necessarily. You seem to skip over the microarchitecture-aspect too quickly.
I have a nice story from a friend of mine, who wrote an MMX-optimized MPEG2 video decoder back in the day.
He wrote it on a Pentium MMX. When the Pentium II came out, the MMX code turned out to be very suboptimal, since the instruction latency was twice as high as on the original Pentium. As a result, the tightly timed code would now stall on virtually every instruction, resulting in a PII 233 performing worse than a Pentium 166 MMX. A regular non-MMX routine would be faster.
So it’s not always as simple as just checking “Does it have MMX/SSE/etc?”
As I said elsewhere, there’s no way Intel can win, unless they specifically add support for AMD microarchitectures.
It’s the Intel compiler, built to optimize for Intel microarchitectures.
That’s where you’re wrong.
If the capabilities were the only important thing, then why would there be separate compiler options for Pentium, P6, PIV, Core2 architectures etc?
Unlikely, but not impossible.
Bottom line remains that Intel has no obligation to support any non-Intel CPUs in their own compiler AT ALL, whether you like it or not. So why would they even bother to take a chance?
Ah yes argument from authority very convincing. I can do the same: (from the linked article)
Now the guy who wrote that is Van Smith, President of Cossatot Analytics Labs, who created an open source x86 benchmark. I guess he knows a thing or two about choosing the most optimal codepath.
This is a strawman argument. As the article clearly stated Via processors were 50% faster when their vendor id was set to GenuineIntel.
Also from the article:
So although they don’t know about future processor microarchitecture they still enable the optimization on the new Intel compilers, but not on AMD/VIA machines??
No you can’t, because I’m not arguing against Agner Fog, I’m arguing against people who twist and/or misinterpret his words.
I’ve known Agner Fog for a long time, and I respect him as an authority on microarchitecture optimizations.
He certainly is not the one I have issues with.
No, it’s not a strawman.
Your argument is a strawman. You have one example for one non-Intel CPU, and want to generalize this into a statement that Intel Compiler-generated code will ALWAYS be faster on ANY CPU when they would just ignore any microarchitecture details.
I’m pointing out that this generalization is false by offering a counter-example.
Please read more clearly. Agner clearly states that Intel manipulates the CPUID family numbers so that the code still THINKS it’s running on an older CPU.
Firstly, this means that Agner supports my point that the Intel compiler doesn’t just check if it says “GenuineIntel”, but the family data is also used in the check.
Secondly, it explains that Intel solves it ‘backwards’. As long as new CPUs are similar enough in terms of microarchitecture, they keep the CPUID info the same, so the optimized code keeps working (it just thinks it’s running on an older CPU that it already knows).
Intel can control this because they make both the CPU and the compiler.
If a new CPU has a microarchitecture that is vastly different, they’ll just give it a new family name, which becomes a new optimization target.
Exercise for the reader: track down the family info from Pentium Pro -> PII -> PIII -> PIV -> Core2.
Do you see any peculiarities?
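For anyone attempting that exercise, here is a minimal sketch of how family, model and stepping are unpacked from the value CPUID function 1 returns in EAX. The function name and the sample values are made up for illustration; the combining rules follow Intel's documented convention (AMD treats the extended model slightly differently).

```python
def decode_signature(eax: int) -> dict:
    """Unpack family/model/stepping from a CPUID function 1 EAX value."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended family is only added when the base family is 15.
    family = base_family + ext_family if base_family == 0xF else base_family

    # The extended model is only used for base families 6 and 15.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model

    return {"family": family, "model": model, "stepping": stepping}


if __name__ == "__main__":
    # Sample signature values, chosen only to exercise the decoder.
    for value in (0x000006FB, 0x00000F29, 0x000106A5):
        print(hex(value), decode_signature(value))
```

Running it over the signatures of each generation makes it easy to see which ones share a family number and which ones do not.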
By the way, it wasn’t supposed to be an argument from authority in the first place, but rather a response to people claiming I “don’t get it”.
I think I’m perfectly in my right when I say I do get how CPUID works, and how one can extract information about architecture, cache and extensions… CPUInfo is proof of that.
No offense, but it seems like the majority of people responding here just don’t have a clue of what they’re talking about, and telling someone like me that I “don’t get it” is a pathetic display of ignorance, arrogance and general lack of respect.
Why do especially the clueless always think they know it better than others, and why do they need to have an opinion on things they know nothing about, and force that opinion on others via insults?
This is again a matter of interpretation (not compilation). Thom, what you see as a deliberate act of sabotage on Intel’s behalf I see as nothing more than the lowest common denominator method.
You will notice it says “most cases” (my bold in the quote). To me this would mean it checks to see if it’s dealing with an Intel processor and uses a more optimised path because the Intel developers who wrote the compiler specifically know it will work, otherwise it resorts to the lowest common denominator code, which is the code set that they know will work on all compatible devices. Nothing more, nothing less. There may be certain situations where they knew particular optimisations work for other processors and therefore use them, hence the most cases. If it was to use an optimised path that another manufacturer claims works with their processor and it doesn’t work who would people blame? I guarantee it would be “this stupid {expletive deleted} compiler!!!”.
If this is truly what this case was about I think Intel have been unfairly treated. If there are routines in the compiler to add superfluous code to purposely slow execution on other processors it’s a different matter, but I see no evidence of this here, and there’s nothing stopping other manufacturers from writing their own compilers optimised only for their processors, just like there’s nothing stopping media player and phone manufacturers from writing their own apps to fully support their devices…
I think I can sum up the problem with a simple piece of code. We want the CPU dispatcher to find the best path possible for the actual processor. What we would expect is something like this:
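A minimal sketch of such a capability-only dispatcher, written in Python for readability. The function name, the path names and the choice of SSE3/SSE2 as the two levels are assumptions for illustration, not anything taken from ICC:

```python
def pick_path(features: set) -> str:
    """Choose a code path purely from the instruction sets the CPU reports."""
    if "sse3" in features:
        return "best_path"          # e.g. a version using SSE3
    if "sse2" in features:
        return "second_best_path"   # e.g. a version using SSE2
    return "generic_path"           # plain x86 fallback


if __name__ == "__main__":
    # The vendor never enters the decision; only the reported features do.
    print(pick_path({"sse2", "sse3"}))
    print(pick_path({"sse2"}))
    print(pick_path(set()))
```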
Now what Intel did is to encapsulate this processor check with a check for GenuineIntel. The choice of path is not even done if the processor is not intel. Something like this:
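Again only a sketch with made-up names, and a simplification of whatever the real dispatcher does: the same capability check, but wrapped in a test on the vendor string so that it is never reached on a non-Intel CPU.

```python
def pick_path_by_features(features: set) -> str:
    """Capability-only choice (hypothetical path names)."""
    if "sse3" in features:
        return "best_path"
    if "sse2" in features:
        return "second_best_path"
    return "generic_path"


def pick_path_vendor_gated(vendor: str, features: set) -> str:
    """Only consult the feature flags when the vendor string is Intel's."""
    if vendor == "GenuineIntel":
        return pick_path_by_features(features)
    # Any other vendor goes straight to the fallback,
    # whatever its feature flags say.
    return "generic_path"


if __name__ == "__main__":
    caps = {"sse2", "sse3"}
    print(pick_path_vendor_gated("GenuineIntel", caps))
    print(pick_path_vendor_gated("AuthenticAMD", caps))
```

With identical feature flags, the two calls end up on different paths, which is the whole complaint.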
Now this addition is minimal. Not much extra is done. BUT, as Thom said (though maybe not in the best formulation), somebody at Intel decided, for marketing reasons, that the check should not be executed for non-GenuineIntel processors. The tricky thing is that ICC does not do _extra_ stuff just to cripple AMD. They just added a CPUID vendor check _before_ detecting CPU capabilities. This is clever because it does what Intel wants (cripple competitors) while not doing something explicit against the competition. That way they can do what they want while protecting themselves.
So they _did_ add something to cripple competition. That something is there to _prevent_ enabling an optimized path.
Intel clearly did this to hurt competition. ICC is cheap (compared to PGI for example), but if it only works well on Intel CPUs, then it _will_ be a strong incentive to buy Intel: for a similar processor price, if you choose Intel you get a highly optimized compiler that won’t cost much. If you go with AMD, then you will need to buy another compiler that will cost you 10 times more. This is what Intel is aiming at.
Personally I think it is an abuse of position by Intel, since it is aimed at hurting competitors, not improving their product. By hiding this fact and by being quite vague on what the compiler does on non-Intel processors, they are hurting sane competition between companies and abusing their position.
What I hope for, as many others, is that Intel would be forced to remove the CPUID check. It’s a single line to delete, not even to add.
Intel does not check for caps, they check for microarchitectures.
If the CPU is not “GenuineIntel”, they simply have no data for microarchitectures at all, hence they’ll just default to the most basic x86 path.
If the CPU is “GenuineIntel”, then they can tell from the family, model, stepping and extension caps bits what microarchitecture it is, and what codepath it should take.
Also, your representation of best/second-best etc. optimization levels is wrong.
All paths are equally optimized, just optimized for different microarchitectures.
What is optimal for a Pentium 4 with SSE3 is not optimal for a Core2 Duo and vice versa…
So you may find multiple SSE3-paths in a single binary, for example.
If you don’t understand this, I’m going to cry.
Edited 2010-01-04 16:30 UTC
Are you sure?
According to Agner Fog’s “Optimizing C++”[1, page 126]
Note that this is only based on his observations which could be wrong.
He also provides a replacement for the CPU dispatcher, based on the function InstructionSet()[2], which detects instruction sets, not microarchitectures.
[1] http://www.agner.org/optimize/optimizing_cpp.pdf
[2] asmlib-instructions.pdf, Section 3.2 – InstructionSet, page 7. Included in “asmlib.zip”, available here: http://www.agner.org/optimize/#asmlib
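To make the idea of detecting instruction sets concrete, here is a rough sketch that reads the feature bits CPUID function 1 reports in ECX and EDX. This is not Agner Fog's InstructionSet() code; the function name and the table layout are made up, and a real detector would also have to check that the OS saves the relevant registers.

```python
# Feature flags from CPUID function 1: name -> (register, bit position).
FEATURE_BITS = {
    "mmx":    ("edx", 23),
    "sse":    ("edx", 25),
    "sse2":   ("edx", 26),
    "sse3":   ("ecx", 0),
    "ssse3":  ("ecx", 9),
    "sse4.1": ("ecx", 19),
    "sse4.2": ("ecx", 20),
}


def instruction_sets(ecx: int, edx: int) -> set:
    """Return the names of the instruction sets whose feature bit is set."""
    regs = {"ecx": ecx, "edx": edx}
    return {
        name
        for name, (reg, bit) in FEATURE_BITS.items()
        if (regs[reg] >> bit) & 1
    }


if __name__ == "__main__":
    # Register values built by hand to stand in for a CPU with MMX..SSE3.
    edx = (1 << 23) | (1 << 25) | (1 << 26)
    ecx = 1 << 0
    print(sorted(instruction_sets(ecx, edx)))
```

Note that nothing in this lookup depends on who made the CPU.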
You guys all don’t get it! SSE et al. are standards, and there are specific ways for the compiler’s code to ask the CPU whether it supports them. For each of these, Intel checks if the CPU supports it (they do need that because their own CPUs don’t all support the same instructions either). They then perform an additional check on the CPUID vendor and disable the SSE etc. instructions. So they do not choose the lowest common denominator, they deliberately choose a suboptimal path.
They don’t say “optimised for …” but they do say that they support SSE etc., and they do not say “supporting SSE only on Intel”
Yes, they do and Yes, Intel knows how to do it.
Even Intel processors do not all share the same feature set, so the code has to check for CPU capabilities anyways. From the technical point of view, acting on the vendor string is just a waste.
If it was JUST the vendor string…
But it’s not.
They check for the CPU extensions (MMX, SSE1/2/3/4/etc), for the family, model, stepping, AND for the vendor.
If you don’t check all three, you don’t know exactly what CPU you’re dealing with.
AMD uses the same family, model and stepping ranges as Intel, but obviously the CPUs are quite different.
So yes, in light of the other information, the vendor string is quite important.
If you look at the source code of my CPUInfo library (http://cpuinfo.sf.net), you’ll see that I even have to check the vendor string for certain CPUID functions. The same index gives completely different results on Intel and AMD CPUs (and possibly VIA).
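For reference, the vendor string itself comes from CPUID function 0, which returns its 12 characters in EBX, EDX and ECX, in that order. A small sketch of the decoding follows; the helper names are made up and this is not code from CPUInfo.

```python
import struct


def vendor_string(ebx: int, edx: int, ecx: int) -> str:
    """Assemble the 12-character vendor ID from CPUID function 0 registers."""
    # Each register holds four ASCII characters, least significant byte first.
    return struct.pack("<III", ebx, edx, ecx).decode("ascii")


def registers_for(vendor: str) -> tuple:
    """Inverse helper for experimenting: a 12-character string -> (ebx, edx, ecx)."""
    return struct.unpack("<III", vendor.encode("ascii"))


if __name__ == "__main__":
    print(vendor_string(0x756E6547, 0x49656E69, 0x6C65746E))
    print([hex(r) for r in registers_for("AuthenticAMD")])
```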
No, they don’t. They check for vendor, and if vendor != Intel they don’t check the other stuff at all. You are in a small minority in thinking this is acceptable. Astroturf is a crappy playing surface.
Yes they do.
Notice my use of the word “AND”.
A certain codepath is only chosen when ALL the conditions for that codepath are met.
One of the conditions just happens to be “GenuineIntel”, but it is not the ONLY condition. If this condition is met, other conditions are checked as well, and if they are not met, you still get to the same default codepath that non-Intel CPUs also run. It’s not a “cripple AMD” path. It’s the “I don’t know this particular microarchitecture”-path. Since it doesn’t know all Intel microarchitectures either, certain CPUs that DO report “GenuineIntel” will still run that path.
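A sketch of that logic, to show what "ALL the conditions" means in practice. The table contents, path names and function name are invented for illustration; this is a simplified model, not ICC's actual dispatcher.

```python
# Hypothetical table of known targets:
# (family, model) -> (required instruction set, codepath name).
KNOWN_TARGETS = {
    (15, 2): ("sse2", "uarch_a_sse2_path"),
    (6, 15): ("sse3", "uarch_b_sse3_path"),
}


def select_path(vendor: str, family: int, model: int, features: set) -> str:
    """Pick a tuned path only if vendor, family/model and features all match."""
    if vendor == "GenuineIntel":
        target = KNOWN_TARGETS.get((family, model))
        if target is not None:
            required, path = target
            if required in features:
                return path
    # Unknown vendor, unknown microarchitecture or missing extension:
    # everything ends up on the same default path.
    return "default_path"


if __name__ == "__main__":
    print(select_path("GenuineIntel", 6, 15, {"sse2", "sse3"}))
    print(select_path("GenuineIntel", 6, 99, {"sse2", "sse3"}))
    print(select_path("AuthenticAMD", 6, 15, {"sse2", "sse3"}))
```

The second and third calls land on the same default path: one because the Intel model is not in the table, the other because the vendor does not match.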
And I resent the Astroturf implications.
So basically, you are blaming Intel for your poor reading and comprehension skills? You have to play a very creative game of twist in order to fit your narrative in this case, methinks.
BTW, if you had ever used icc you’d realize that none of the optimization flags even remotely claim to be targeted for amd microarchitectures.
There is a big difference between “supporting” an architecture and “optimizing” for such an architecture. Oh, and if we’re going to label Intel as the evil ones, guess what: I am sure AMD themselves would not provide too many low-level details of their microarchitecture to Intel (as that would provide a lot of privileged information which I am sure AMD does not feel like giving Intel for free).
Clang now builds LLVM.
In a year, Clang will be complete and you won’t see me using either Intel or GCC for compiling.
The debate here is getting side-tracked from the real context of this topic.
The context is not that Intel is fraudulently selling a compiler which doesn’t do what they claim. They could well advertise ICC as “The best x86 compiler that executes crap on AMD” – that isn’t the issue.
The issue is that their ICC compiler product is being connected to their broader anti-competitive actions to protect their CPU monopoly. Opening up ICC is akin to the EU opening Windows to multiple browsers or opening Windows’ server protocols to all, or pre-Bush US Anti-Trust Court proposing to simply break up Microsoft into separate companies as a remedy to an anti-competitive monopoly.
If they didn’t have a monopoly, all these actions may not be illegal in and of themselves. But they DO have a pretty effective monopoly, and they have done many low-handed things to perpetuate it. Making ICC CPU vendor-neutral is just ONE potential remedy for Intel’s predatory monopolist behavior.
This is correct: they are marketing something and making false claims about it.
But inside the Microsoft/Linux world of PCs, how many people are actually making production-released products that are built with ICC?
That’s all I want to know. While this is shady of Intel to do, how big has the side effect of this deliberate bug been? I just think many people are blowing this way out of proportion. I thought this compiler was mainly used by institutions and research-type outfits.
But Intel does NOT have a monopoly in the compiler market. In fact, icc is pretty obscure. The problem is not with icc either, as it is perfectly valid to have an optimizing compiler for Intel-only systems. I mean, if you have a server park with Intel CPUs, and you compile your (open source) software to get the best possible performance, what’s wrong with that?
Before x86 became the de facto standard, it was very common to have an optimizing compiler for your CPU brand, especially since most brands had their unique ISA in the first place.
The problem is that benchmark suites have been using a compiler that is not vendor-neutral, and doesn’t have any intention to be vendor-neutral.
But how is that Intel’s fault?
This discussion is moot anyway.
Intel already admitted its wrongdoing in the compiler case in the AMD settlement. In other words, if even Intel agrees its actions with the compiler are questionable – who are you to argue with them?
Read my posts more clearly. I said that I don’t approve of Intel’s actions. In other words, I already said that I found them questionable. So I’m not arguing that. In fact, I’d want for them to change it.
However, that does not mean that all accusations that people throw around are valid, nor that Intel’s actions are illegal (again, semantics? Questionable and illegal are NOT the same thing).
Edited 2010-01-04 10:12 UTC
just because developers want something, doesn’t mean they are “right” in wanting it – and i speak as a sw developer
ICC is a proprietary compiler developed by a cpu maker for their own cpu’s…
one shall call luck if it retains compatibility (as a side effect of supporting intel’s own legacy cpu’s) with competitors’ cpu, at all (but more on this later) – demanding it optimizes for Athlons and Phenoms as well as for Core i7’s is a bit too much…
you make it sound like something absurd, but it was actually how things used to go, at least with smart people where i studied and then worked
ICC for the intel target, MSCC for the “anything else” target was the norm – people who took for granted that a compiler developed by a chip maker should work equally well (if not better) for competitors’ cpu’s were rightfully mocked or scolded for not adhering to the norm
or at least reminded of some factual and historical details they had overlooked, like cpu specific optimizations (in that every single cpu family, eg Core rather than Netburst or P6, and sometimes revisions inside processor families, see the P4 Prescott VS Northwood, has peculiar features and idiosyncrasies and requires a specific optimization strategy) and errata (in that individual revisions of individual families may or may not need workarounds/fixes – then again, separate code paths) – now, intel is already busy with implementing their own optimizations and workarounds, expecting them to go beyond the minimum required for compatibility and fully implement workarounds for competitors’ cpu’s is unrealistic
historically, every major platform (/CPU architecture from a single vendor) came with its own custom C compiler (or more recently, a custom port of the GCC), to which it was practically tied
besides, intel has always finely tuned its compiler for their own processors to make up for the sometimes pesky architectural features of the latter (like the first pentium’s asymmetrical pipelines – two instructions could be executed at once, but with several restrictions, s.a. a simple and a complex one, or an integer and a FP one – or the P6)
since, not being able to leverage proprietary compilers and having to cope with a vast installed base of existing code, they were forced to design their chip in more elegant ways in order to achieve better performance with generic i386 code, the above was actually one of the very reasons of AMD’s (and others’) competitiveness in the performance field, starting with the mid 90’s
Edited 2010-01-04 16:35 UTC
It should be mentioned that the crippling is not even an inherent or very old feature. At least on Linux, it was introduced with icc 8. I know because this was the version where I stopped wasting my time on icc support in KDE. Until then I had improved support for the Intel compiler in several OSS projects, but that work was to support a superior compiler, not the crippling one that version 8 and later became.
Small thoughts…
On the argument that the compiler is Intel specific:
– Just an excuse… Actually we are talking of a compiler from which code will be used in OTHER compilers. Not a CPU, and the compiler will NOT work as expected nor advertised.
– Stating that it is JUST for Intel CPUs is to accept the accusation of MONOPOLY practice and CARTEL-like practices where the compiler is used in association with CPU to achieve unfair advantage. It is called sabotage in any dictionary.
On the argument that “don’t like it, don’t use it”:
– Just looking the other way, as code spread like a virus, without anyone except the first users to know what was used in the first place. Not even so… as the majority doesn’t have a clue of what is going on.
– Again, the terminology to be used is sabotage… and that is not excused, nor justified by a little note posted “just-in-case” the “trick” is found. No crime is justifiable by “just following orders” nor “I warned you”. Specially in a virus like spread of such code in libraries, tools, objects.
– The total ignorance of what is hidden under the statement “optimized just for Intel” has now a different meaning. And the different meaning is a revelation of what is under the sheets. Not an excuse, but an evidence of motive, intention and action.
Concluding…
1 – Let nobody state that a warning is fair when no REAL information is given for the statement to be understood. It is just a legal precaution to commit a crime with impunity.
2 – Let nobody state that you have a choice when no choice is given, as reality is intentionally hidden, away from a compiler’s expectations, and misinformation is the game. Results have a viral spread.
3 – We have victims, motivation, and actions. These cannot be erased, whatever the statements to disguise facts. And facts are the fees for courts, not sweet-talk (or at least they should be).
4 – Legal actions are much away from the justice concept as delaying is a battle tool… as is misinformation… pseudo-statements to the future…
This is my honest interpretation. Is there another?
—
“it was in front of them but they could not see it”
Problem with accusations related to monopoly is that the compiler is a piece of software (and sold separately from CPUs). Intel may have a huge marketshare in the hardware market, but the Intel compiler doesn’t have a big marketshare. Microsoft is the biggest player there, and gcc is second.
The Intel Compiler is just a commercial product. Since anti-trust laws won’t apply, it is indeed down to the old ‘voting with your wallet’. If you don’t like it, don’t use it.
While I don’t approve of Intel’s approach here, I think it would be a huge loss for the free market in general if Intel were forced by any organization to modify their product. It would create case-law with repercussions that I don’t even dare to think about.
Hi,
Sorry for some glitches in the text, but it was very late and the edit window is very small. Being unable to edit after posting, glitches become permanent.
Naturally your point is clear, from the legal point.
However (and this was not clear) when using the word “crime”, a strong word with a broad range, the meaning was the original. Not the interpretation that is becoming current. I refer to confusing crime with something that is illegal.
Everyone knows that crimes existed BEFORE laws. Maybe do not notice that crime is an ethical evaluation. Legalities only come very late. That is why the connection is very loose and some “laws” allow crimes and some forbid (or “criminalize” what is no crime at all).
The perspectives stated also depend on another concept: that of passive crime, as a complement to the usual active kind that anyone understands. I believe it’s called “crime by omission”… or something like that. An example: seeing somebody dying and just walking away when one could have saved a life.
This was an extreme example, but it shows the obligations any citizen has towards society. The same applies to economics, as the only justification for a corporation to exist (and profit from society) is to be useful to society. Nothing less, from the ethical point of view that should be the basis for laws and their enforcement.
A corporation is a virtual citizen, not a king.
And only while it is useful to accept it as such. Not by right… but by allowance. Even if that is forgotten, it is still true… and to be reminded.
Anyway, law enforcement has a double problem, and the bigger one is that law is quite linear in a world that is not linear, but complex. Where perspective is more important than justifications.
Take ecology for example. And that applies to every situation in society. Laws cannot, EVER, command every possible situation. And that is the genesis of trouble when the wordings become more important than the reasons for laws to exist.
You were very clear and precise in your comment, unlike mine. I thank you for that.
It was also missing a mention of the common practice of ACTIVELY forbidding developers to do what they are doing, or could do very naturally. And even worse. We have seen many “internal memos” proving that this is a common practice.
That gives a long range of opportunities when a compiler chooses what routines to use depending on what CPU is present. Whether it is a PASSIVE way to get a “result” or an ACTIVE one… The common practice gives us a clue of what is more likely.
Anyway, laws become a maze where justice is lost.
So fairness is more important than ever.
And I guess that you feel the same.
What really shocks me is the respect a compiler gets for its knowledge of hidden internals… and then that respect and confidence being (actively or passively) used to get unfair results.
To me, it does not matter if it passes below the legal sieve. The reasons for laws to exist are more important. And that is where fairness lies. Especially when a tool is respected, used… and present in libraries used by third parties… with confidence… and unexpected implications for others.
It’s all a question of perspective. Not words, justifications or laws. Laws should follow their goals, not the reverse. That is what they are for… or should.
And thanks, again, for pinpointing the more practical side. Regards.
P.S. – Sorry to have been a bit long. And hope not to have any glitches this time.
As, again, it is late, and need to rest.
More than an “interpretation” I’d consider your post an ode to logical dissonance?
Do you even know what a compiler does? The difference between ISA and microarchitecture?
Furthermore, have any of you even used ICC… and if so, a real purchased and supported copy of ICC (not just one of those copies which fell off the internet)? Did you realize that nowhere in the contract does Intel claim to produce optimized code for non-Intel microarchitectures?
Basically intel just promises that in exchange for buying ICC, they guarantee ICC will produce highly optimized code for their processors. If you want to build an application, and are using ICC, you are aware of that. Maybe it is a douche move by intel, I don’t know… but that is not the point.
Next thing, we’ll hear how evil Apple is because they did not optimize OSX for your AMD hackintosh.
ICC is a tool, and it does exactly what it claims to do: produce very optimized scheduling for intel microarchitectures. Trying to blame intel for not supporting competitor’s products simply because you didn’t bother to learn about the tool, or feel a level of entitlement… is a tad disingenuous.
Edited 2010-01-03 23:23 UTC
Really, when it comes with habits, or less information than needed, small information is worse than having none. And we can miss what is in front of us. So…
To better understand the compiler status let’s examine it from different scenario points. This give us a more clear picture, even in absence of other indications, as it allow us to compare and thus to scale the importance of a factor…
Imagine that the compiler:
A – Refuses to work on some CPUs.
B – Makes lousy code, in contrast to be the better one.
C – Uses the best routines available elsewhere as the programmers accept to be compromised or non-competent.
Now Check:
A – Fair… but too disturbing, and everybody complains
B – Unfair… no disturbance, and no complaints
C – Fair… no disturbance, better product
I suppose that clears the picture, even for someone unable to evaluate the significance of some facts. Just a try (and a hope)…
The main people this bug bothers and has hindered:
People with AMD systems using the ICC compiler. (which would make me ask why)
Intel hasn’t hindered AMD in any way when it comes to the normal XP/Vista/7 user running an AMD system: different C compilers were used to generate their programs, and optimizations have been put in place by other developers.
Sure, this makes Intel look like bastards for their tactics with their own compiler. But if this was some huge sabotage move like people are making it out to be, Intel would’ve been sued into oblivion by now.
Uhm, they HAVE been sued into oblivion, and the compiler was part of said lawsuit between AMD and Intel (and now it’s part of the settlement). It’s also part of the FTC probe.
It’s all in the articles. Didn’t you read them?
“People with AMD systems using the ICC compiler. (which would make me ask why)”
…because nobody uses both Intel and AMD processors.
…because nobody uses binaries they did not personally compile.
Just a little perspective, even though I’m aware it’s a bit of an extreme data point:
About three years ago, small scale clusters (esp. capacity machines) employing AMD Opteron processors and Gigabit Ethernet were in full swing in the HPC community, because they offered a very nice bang for the 10k€ – 100k€ buck. Nothing earth-shattering and certainly not Top500 material, but – especially due to the integrated memory controller – very nice machines to do test and development runs before blocking the production machines with unoptimised code. As a consequence, AMD was able to capture a lot of ground on what was traditionally Intel’s home turf.
Despite AMD’s inroads in the hardware department, a lot of scientists (i.e. the “end users” of the HPC facilities) still defaulted to the Intel compiler suite, for various reasons: for Linux and non-commercial use, the compiler is gratis (this does not cover academic use, but as far as I’m aware virtually nobody gives a damn) and is not limited compared to the commercial option, which makes it a lot more attractive than, for example, the PGI suite or the NAG compilers. Additionally, or even because of that, even if you have to deploy your code on a different x86-based cluster system (for example for a “real” production run), odds are good that the Intel compilers are available, widely used and therefore well maintained.
Combine this with the fact that the bulk of scientific code in many disciplines (like for example lattice QCD) is still written using a wild mixture of Fortran 77/90 + proprietary compiler extensions, and that neither gcc nor g95 had competitive Fortran compilers until recently, and you have quite a big hurdle to move away from the Intel compilers (some might even go as far as calling this a potential vendor lock-in, those obnoxious brats, tz tz tz).
The majority of scientists in “my” field (high energy physics) are more interested in actually “getting the job done” (oh how I hate this phrase, but once in a while it’s really appropriate) than in fiddling with optimisation options and processor-specific behaviour.
Consequently, they and the folks that have to actually maintain these cluster systems tend to be a conservative bunch, so that the un-sexy work of fine-tuning and optimising the code for a given compiler has to be done ideally only once.
If turning on optimisation options for SSEx/vectorisation does not yield the performance gains people expect or are even used to, the “new” component (aka the non-Intel hardware) tends to receive the blame and not the compiler (“scales as expected on my Xeon desktop workstation, sucks at your shitty SUN / Opteron cluster. Fix that”). And although this behaviour was discovered some time ago, you would be surprised how many “end users” of HPC facilities are not aware of the implications of these performance regressions.
As a side note: if you are a programmer and interested in the ins and outs of optimising code, make sure to check out the software optimisation resources on Agner Fog’s site; they are imho a classic read:
http://www.agner.org/optimize/
It’s not Intel’s fault that AMD doesn’t offer an alternative product.
By the way, the runtime selection of codepaths is just a compiler option.
It’s perfectly possible to compile only a single codepath and effectively ‘force’ your CPU to run SSE/whatever code.
I’ve been using it on my Athlon back in the day, and got pretty good results.
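To make the difference concrete, here is a minimal toy model of the three behaviours being discussed: a dispatcher that only looks at reported instruction sets, one that also gates on the vendor string, and a build with a single forced code path. This is only a sketch, not Intel’s actual dispatcher: the path names, the `Cpu` type and the CPU entries are all made up for illustration, and the vendor-gated version is simplified to always fall back to the generic path on non-Intel CPUs (Fog says this happens “in most cases”).

```python
# Toy model of a CPU dispatcher; all names and CPU entries here are hypothetical.
from dataclasses import dataclass

# Code paths ordered from best to worst, with the instruction set each one needs.
PATHS = [("sse3", "SSE3"), ("sse2", "SSE2"), ("generic", None)]


@dataclass(frozen=True)
class Cpu:
    vendor: str          # CPUID vendor string, e.g. "GenuineIntel"
    features: frozenset  # instruction sets the CPU reports


def dispatch_by_features(cpu):
    """Pick the best path whose required instruction set the CPU reports."""
    for path, needed in PATHS:
        if needed is None or needed in cpu.features:
            return path


def dispatch_vendor_gated(cpu):
    """Same selection, but only when the vendor string is GenuineIntel."""
    if cpu.vendor != "GenuineIntel":
        return "generic"
    return dispatch_by_features(cpu)


def forced_path(cpu, path):
    """No dispatcher: a single path is compiled in, supported or not."""
    needed = dict(PATHS)[path]
    if needed is not None and needed not in cpu.features:
        # Stands in for the crash you get on a CPU lacking the instructions.
        raise RuntimeError(f"illegal instruction: {cpu.vendor} lacks {needed}")
    return path


intel = Cpu("GenuineIntel", frozenset({"SSE2", "SSE3"}))
amd = Cpu("AuthenticAMD", frozenset({"SSE2", "SSE3"}))
old = Cpu("AuthenticAMD", frozenset())  # reports no SSE support at all

for cpu in (intel, amd):
    print(cpu.vendor, dispatch_by_features(cpu), dispatch_vendor_gated(cpu))
```

With identical feature sets, the two CPUs get the same path from the feature-only dispatcher, while the vendor-gated one sends the non-Intel CPU to the generic path. Forcing a path skips the check entirely, which is fine as long as every CPU you deploy on actually supports it.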
Of course it isn’t Intel’s fault. But in my opinion it’s not AMD’s fault either if people who are allergic to compiler flags blame AMD for poor performance when a compiler silently drops back to a fallback scenario with – again, imho – unreasonable performance regressions.
EDIT: I accidentally deleted the second sentence, I suck at typing at a computer with a touchpad.
Which is precisely the reason why I’m not really buying into the “you have to know the internals of the CPU to provide stuff like SSEx” argument. It may not be optimal, but it sure as hell is faster than vanilla, non-superscalar, non-vectorised code.
My comment was just intended to highlight that there are people who use AMD-based systems extensively and to the limit of their capabilities, yet have difficulties switching to a different compiler platform for their production runs.
Edited 2010-01-04 10:11 UTC
That’s not the same thing though.
I didn’t say you have to know the internals to USE SSE. Obviously you can run SSE code on any CPU that reports that it supports it (in the case of forcing a certain architecture without runtime CPU-dispatching, you just get a crashing executable on CPUs that don’t support it).
However, you DO have to know the internals in order to select the most OPTIMAL path, be it SSE or something else.
So that is the difference…
Intel never claimed that they would run SSE-code or whatever on any CPU that reports support for it, nor do they make any claims about the level of optimization for CPUs that aren’t directly supported (which is a recent subset of Intel CPUs only).
It’s frustrating to see that so many people don’t seem to understand the difference between instructionsets and microarchitecture.
Here’s a simple question for those people to contemplate:
Considering that Core2, Core i7 and Phenom all support the same basic x86-64 and SSE instructionsets (up to SSSE3 at least), how is it possible that they do not perform the same, even if clockspeed, cache, and other factors are kept equal? And that the performance difference is not constant, but varies from application to application?
The answer is: microarchitecture.
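A toy cost model shows why the gap is not constant. Every number below is invented purely for illustration (these are not measurements of any real CPU), and `uarch_a` / `uarch_b` are hypothetical microarchitectures that implement the same instruction set but spend a different number of cycles on each class of operation.

```python
# Toy cost model: every number below is invented for illustration only.
# Cycles per operation class on two hypothetical microarchitectures that
# implement the same instruction set.
COSTS = {
    "uarch_a": {"int_add": 1, "simd_mul": 2, "load": 3},
    "uarch_b": {"int_add": 1, "simd_mul": 4, "load": 2},
}

# Two hypothetical applications, described by how many operations of each
# class they execute.
APPS = {
    "simd_heavy": {"int_add": 100, "simd_mul": 400, "load": 100},
    "load_heavy": {"int_add": 100, "simd_mul": 50, "load": 500},
}


def cycles(uarch, app):
    """Total cycles for an application's operation mix on a microarchitecture."""
    return sum(COSTS[uarch][op] * count for op, count in APPS[app].items())


for app in APPS:
    a, b = cycles("uarch_a", app), cycles("uarch_b", app)
    print(f"{app}: uarch_a={a} uarch_b={b} ratio={b / a:.2f}")
```

In this made-up setup `uarch_a` wins on the SIMD-heavy mix and `uarch_b` wins on the load-heavy one, even though both run exactly the same instructions. That is the sense in which picking the optimal path needs more than a list of supported instruction sets.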
Edited 2010-01-04 10:39 UTC
At the risk of repeating myself:
It goes without saying that better knowledge about the microarchitecture usually translates into better optimisation strategies / options(duh!), e.g. I’m not arguing against the fact that it is reasonable to expect the intel compiler to perform best on intels own processors.
But there is imho a vast difference between gracefully degrading the aggressiveness and sophistication of optimisation strategies / code paths (e.g. providing generic SSEx implementations that may not be optimal performance-wise but allow for a better utilisation of the hardware features / registers / etc. at hand, especially given that the generic code paths in question are already part of the compiler) and falling back to the behaviour of a glorified 386.
Intel is the king of the hill in the x86 processor business, and so every move they make (or in this case: don’t make) that treats their competitors/licensees significantly worse compared to their own platform is bound to raise questions of abusing their dominant market position.
As I said before: Intel is hardly a large player in the compiler market, so any ‘dominant market position’ rhetoric is just nonsense.
Developers need to specifically BUY the Intel compiler, while gcc comes free with most OSes, and on Windows, people generally use Microsoft Visual Studio, which comes with its own compiler as well.
Both gcc and Microsoft are quite capable of generating well-optimized code, so most developers don’t see any reason to spend money on the Intel compiler. It’s a nice product, mostly interesting for scientific computing (Fortran and/or getting the most out of your high-end hardware).
Actually, they do, in cooperation with Sun: see
http://developers.sun.com/sunstudio/index.jsp
Sun Studio Express compilers for Linux are available free of charge for commercial and noncommercial use, and are probably the best AMD-targeting compilers out there since the demise of PathScale. They do a very good job for Intel processors too, for that matter (typically within 1% of Intel, according to benchmarks I’ve run).
It’s sad that they haven’t advertised this better…
Edited 2010-01-04 15:36 UTC
And they don’t seem to support the biggest platform: Windows…
Because that’s what it’s all about… PCMark05, a Windows benchmark, written in C++.
AMD, VIA, and anyone else interested in making an x86 compatible chip should start throwing code at the LLVM project.
See above. In the weeds are at least two mentions of LLVM.
INVENTORS – DO NOT TRUST INTEL
I invented a CPU cooler – 3 times better than best – better than water. Intel have major CPU cooling problems – “Intel’s microprocessors were generating so much heat that they were melting” (iht.com) – try to talk to them – they send my communications to my competitor & will not talk to me.
Winners of major ‘Corporate Social Responsibility’ award.
Huh!!!!
When did RICO get repealed?
Be advised
1) I am prepared to kill to protect my IP (Intel HAVE NOT stolen it AFAIK – so you can’t Sean Dix me) and
2) I am prepared to die to get TRUE patent reform.
IPROAG – The Intellectual Property Rightful Owners Action Group.
The One Dollar Patent.
People actually use the Intel reference compiler for … production code?
Since when?!?
That said, oh noes, a company’s software that works best on their hardware still FUNCTIONS on others (though not as well)…
Who do they think they are, Apple?
Edited 2010-01-04 02:38 UTC
HPC users, because ICC will usually generate the tightest and fastest code on their Intel based clusters. Of course those sorts of users tend to benchmark things like compilers before they use them, so the woeful performance on non-Intel CPUs has been known for some time (although perhaps not the mechanism).
So I’m no expert on compilers or code in general, but here’s a thought I just had. When new cpu’s come out, various websites benchmark them using certain benchmarking software. People then often buy cpu’s based on results they read about. If the benchmarking software got compiled using Intel’s compiler, would it be optimized specifically for Intel cpu’s, and could the results therefore be skewed in favour of Intel’s cpu’s, to the disadvantage of AMD?
Just curious what kind of effect this could have.
Yes this is exactly what was reported in the article. If they spoofed the cpu so it was reporting to be intel instead of amd/via, then the same benchmarking program would give up to 50% increase in performance! This is a serious problem as people will base their decision on this when shopping.
Yes, it’s a problem, but the blame should be with the benchmark developers, not with the Intel compiler.
I agree the benchmark developers should be held responsible for a part of the blame. I still think intel is taking advantage of it.
The conclusion to this is probably: do your own benchmark, with your own code. You cannot trust a program when you don’t know what it is doing or how it was compiled. Benchmarking is an extremely hard art.
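For what it’s worth, a bare-bones harness for that kind of do-it-yourself comparison could look like the sketch below. It uses Python’s standard `timeit` module; the two workloads (`sum_loop`, `sum_builtin`) and the `benchmark` helper are just placeholders I made up, to be replaced with your own code, and taking the minimum over several repeats is one common convention rather than the only valid one.

```python
import timeit


def sum_loop(n):
    """Placeholder workload: sum of squares with an explicit loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def sum_builtin(n):
    """Placeholder workload: the same sum via the built-in sum()."""
    return sum(i * i for i in range(n))


def benchmark(candidates, arg, repeat=5, number=20):
    """Return the best (minimum) time per call for each candidate, in seconds."""
    results = {}
    for func in candidates:
        # Each repeat times `number` back-to-back calls; keep the fastest run
        # to reduce noise from whatever else the machine is doing.
        times = timeit.repeat(lambda: func(arg), repeat=repeat, number=number)
        results[func.__name__] = min(times) / number
    return results


for name, seconds in benchmark([sum_loop, sum_builtin], 100_000).items():
    print(f"{name}: {seconds * 1e3:.3f} ms per call")
```

Run it on each machine you care about, with the exact build you intend to ship, and compare the numbers yourself instead of relying on a third-party binary.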
heh, oops. Next time I should read the article before commenting.
thanks for the answer though.
Anyone recall ScienceMark?
Historically it was one of the few benchmarks that Athlons performed well in…
Look at these results for example:
http://www.extremetech.com/article2/0,2845,2014652,00.asp
An Athlon64 FX-62 about as fast as a Core2 Duo X6800?
Amazing, no other benchmark shows results even remotely similar…
The plot thickens when you realize that some of ScienceMark’s developers are/were AMD employees.
(eg ‘redpriest’, as he himself says here:
http://www.hardforum.com/showpost.php?p=1034771780&postcount=142
“Full disclosure: I am an engineer that works for AMD (in CPUs and not in graphics)”)
Funny that in all those years there never was any news site that picked up on this. I guess Intel just generates far more hits than AMD.
Edited 2010-01-05 13:18 UTC
I’ve summarized most of what is said on my blog:
http://scalibq.spaces.live.com/blog/cns!663AD9A4F9CB0661!238.entry
I’ve also linked to an optimization challenge that I participated in on an assembly forum a while ago.
It clearly shows that the fastest code on one CPU is not necessarily the fastest on another, and could actually severely cripple performance.
Best example is an MMX routine, that was the fastest on an Athlon XP, but among the slowest on both Core2 Duo and Pentium 4. They were better off with solutions not using MMX, because of microarchitectural differences with the Athlon XP in the MMX implementation (to be exact: the penalty on the EMMS instruction).