One of the features appearing in the upcoming GCC-3.4 release is so-called precompiled headers, which potentially enable you to build applications at a faster pace. Read the article at DevChannel.
How useful are precompiled headers?
2003-12-02 General Development 39 Comments
is this what Apple XCode does?
Also – how is this different from Distcc – see here for more info:
Although I never had a problem with compiling things in the background or overnight, I am sure some Gentoo people would be happy about this. Hopefully it will bring a few more people over to Gentoo, since some people have resisted Gentoo because of the long compile times…
And yes, the other source based distros should like this as well…of course the potential support headaches are there…
A while ago I was an intern for a company that writes software for the Windows platform. I did quite a lot of MFC programming there, and grew fairly attached to MSVC++ (compiler and IDE).
The part I like in particular about MSVC++ is the speed of the compiler. Maybe not generated-code speed, but compiler speed. When I’m developing a program that uses a huge library (like Qt or MFC), compilation times increase a lot. When developing, however, I couldn’t care less about runtime speed.
Compilation speed of GCC 2.95.3 was slower than MSVC++, but I could still live with it. However, GCC 3.x became the standard on most distributions, so I had to make the move as well. I switched to 3.x, but after a while I abandoned development of C++ Linux applications. I just can’t live with the fact that it (Qt application development) is a LOT slower than developing MFC applications. MFC seems to be a lot bigger, but because of the PCH support it’s a lot faster.
Things like ccache and distcc help a bit, and I hope GCC 3.4 will bring compilation speeds on par with MSVC++.
Personally I prefer two compilation modes… Development (Debug) and Release. Debug builds should do no speed optimization at all, while Release should take its time for optimization.
I hope that one day I will feel good about picking up Qt development on Linux again.
Don’t get me wrong, GCC is a great compiler, for what it’s worth (being cross-platform, etc). But doing development with it, is not great in my opinion.
until they come up with some nice User tools for system configuration and services set up, both for server and client side.
“Development (Debug) and Release. Debug builds should do no speed optimization at all, while Release should take its time for optimization. ”
You can already have this. Just take a look at your compiler’s manual and modify your makefiles accordingly.
Then what causes the immense slowness?
Even a simple application like
cout << "hello world!" << endl;
takes a *LOT* longer to compile on GCC 3.2 than 2.95.3.
Yes, even when compiled with -O0.
Maybe this will speed up the install process of Gentoo
> I just can’t live with the fact that it (Qt application development) is a LOT slower than developing MFC applications. MFC seems to be a lot bigger, but because of the PCH support it’s a lot faster.
Well, it’s because GCC didn’t even have PCH support until GCC 3.4 (or Apple’s version).
If you have a commercial Qt license on Windows, and use Visual C++ on it, it compiles as fast as MFC apps since it supports PCH.
PCH support in GCC 3.4 should DRAMATICALLY help the compiling speed of the following apps:
Any Qt apps
Any KDE apps (even more than pure-Qt apps)
These apps are all C++ based, and use a great deal of header files. C++ header files tend to get VERY large.
I’m a gentoo user as well, but I won’t complain about development on such a system. In this case, it’s my own choice to compile the software, so I’ll just have to deal with the fact that it’s going to take a while to compile a well-optimized system.
It would be nice to have a Gentoo system up and running faster, but a lot of people think that a faster GCC *only* benefits the adventurous users. Or they don’t (seem to) care about the developers’ hell, anyway.
Yeah, GCC compiles code more slowly than commercial compilers, but that’s not what takes the most time. Especially in the case of using Gentoo (where everything is compiled), we have to sit through the ./configure stage even for a tiny program. I read discussions in the Gentoo forums about ways to speed up Portage; they argue over converting Portage into a binary app instead of just scripts. That’s when someone pointed out that Portage itself isn’t slow: it’s running each package’s config script that makes it feel slow.
Now with precompiled headers, it makes sense for large programs with many source files. Programmers will appreciate this more than the average user because we can make little changes to source files and not have to recompile headers every time.
RE: I am resisting Gentoo
Funny you mention that because I’ve stuck with Gentoo because I’ve found it the easiest Linux distro to configure. I resist trying something else because I’d have to relearn certain things. Anyone who has tried both RedHat-based and non RedHat-based distros understands.
If you’re asking for GUI admin tools, I have to say I’d rather do without. Example: I tried YellowDogLinux 3.0 on my iBook. It’s RedHat based. Any GUI applet I fired up to configure my system would crash. Send a bug report? Hah! I can’t even set up a network connection to send the bug report!
Or just code in C instead of C++. 😉
The author stated that maintaining PCH for every app on disk would require an immense amount of space to do so. That would be true … if you are compiling everything at once!
If you are a developer, chances are you are only going to be actively working on a small handful of projects anyway, so your disk-space penalty for PCH isn’t that large relative to the space that the individual object files are consuming.
If you are building a software distribution (whether as the end-user or the distributor), you are only going to use them as needed anyway. Let ‘make’ build the PCH once and ‘make clean’ get rid of it. It isn’t that hard.
There is one glaring weakness in GCC’s PCH implementation: you can only have one PCH involved in the compilation of any given object file. So you cannot, for example, make one PCH that includes the Gtkmm headers, another for your private headers, and another for libstdc++ (which would minimize the reprocessing done when you change your own headers: only the local set would need updating). You must make a single PCH from a header file that includes all of the above, which forces you to rebuild it every time one of your headers changes.
So, the logical choice is not to make a PCH of your private headers during development (as the author suggests), but rather a PCH of all of your dependent libs (Boost, libstdc++, Gtkmm/QT, whatever).
I usually compile WxWindows with gcc (MinGW + Dev-C++) and it takes forever compared to VC++. Now, with this and Dev-C++’s ability to import VC++ makefiles, I have a great build system.
Can someone explain to me some things?
It’s always been my impression that “pre-compiled” headers aren’t compiled at all, but rather “pre-parsed” so that the system need simply load in a symbol table and syntax tree. Being that the cpp is supposed to simply include these files, and that they can be included ANYWHERE, how does one “compile” that?
Most include files have little code at all anyway (granted C++ can differ [a lot] from normal C headers), so there’s rarely any code to compile at all!
Maybe this is a C++ thing, where they treat the class definitions as whole items to compile and then link with later.
Do GCC precompiled headers compress well? If so, it might be better if the compiler zipped them before writing. The space/IO savings at read time would be worth the unzip time. Of course, something like this would need to be measured to make sure it is a win.
Err, made a typo here:
[quote]I’m a gentoo user as well, but I won’t complain about development on such a system.[/quote]
I’m a gentoo user as well, but I won’t complain about installation times on such a system.
I think the difference is that C++ libs and programs tend to place a lot more information in the headers than C. Many libraries are also template libraries in which case the entire declaration *and* definition must be in the header. #including a Boost (mostly a template library) header file can easily add megabytes of code to preprocess every time.
Finally, you are right in that there really isn’t any compilation going on (I think; they could be partially compiling inline functions), it is mostly just preprocessing. But consider that the majority of the lag that people complain about in the compiler *is* parsing.
cout << "hello world!" << endl;
That will take quite a long time to compile because you’re making use of C++ iostreams. Take a look at how those are defined, and you’ll see a lot of templates being used here and there. Because templates are expanded at compile time, expanding those templates contributes a lot towards the compile time.
PCH support in GCC will definitely help speed up the compile times of large scale C++ applications, particularly if you’re heavily into template meta programming. However, it would be really cool if they could make GCC compile as fast as Borland C++, which is arguably the fastest C++ compiler around (measured in compile time, not resulting application speed).
I fail to see the problem of having a couple of GB of precompiled headers on a recent computer, which I suppose is what people who develop in C++ have. On an 80 GB hard disk, who cares about a little space used this way? There are people with much more than that in MP3 files on their HD.
Yes, in C++ you get include files with class definitions AND member functions. This comes from C++’s ability to reuse code very efficiently; though the code reuse should really only be in the form of binary class libraries linked to your source, and in OSS you don’t get binary class libraries.
Guys, why don’t you try Free Pascal? The entire language is perfectly suited for separate compilation, which the compiler exploits perfectly, and the basic compiler speed is already awesome.
(On my system, a P4 1.8 GHz, the compiler recompiles itself in 48 seconds; it’s 10 MB of source code.)
But that’s just me, please read what our users write
If I am not mistaken, GCC 3.4 is supposed to include a new C++ parser, written directly in C or C++ instead of being generated by lex/yacc. From what I hear, the new parser will be faster and will fix some of the bugs in the current parser.
In 1998, some people wasted a lot of time migrating the Free Pascal parser to lex and yacc for better maintainability. The attempt failed: compilation speeds weren’t even near the performance of the native parser, and the savings in source code weren’t as large as hoped.
Guys, why don’t you try Free Pascal?
I myself was wondering why C & C++ people were only now contemplating something Modula-2 had accomplished decades ago; it’s nice to see another implementation of Wirth’s elegant languages has brought that in 🙂
What’s performance like in FP compared to C, C++? Do you have a webpage for that?
What’s performance like in FP compared to C, C++? Do you have a webpage for that?
No, not really. It would also be hard to do a fair comparison; you would need a very large program available in both languages to do such a comparison.
But, I just did a quick benchmark. I did our “make cycle” test, which compiles the runtime library and the compiler. The resulting compiler is used to start the process again and the process is repeated 3 times. After the cycle, the generated compilers are checked for differences, to make sure the compiler is bug free.
I did the cycle 2 times to make sure the caches were warm. The last few lines of output of the second cycle:
make: Leaving directory `/home/daniel/fpc2/fpc/compiler/utils’
make: Leaving directory `/home/daniel/fpc2/fpc/compiler’
make: Entering directory `/home/daniel/fpc2/fpc/compiler’
Start 19:47:36 now 19:48:28
make: Leaving directory `/home/daniel/fpc2/fpc/compiler’
So, the 48 seconds were too conservative: I can compile the compiler 3 times in 52 seconds.
How many lines of code did I compile? That is three times 215000 lines, totalling 645000 lines. Thus the compiler can do about 645000/52 ≈ 12400 lines/second.
Now GCC. I’ll do a test with the Alsa sound driver. Configure already executed, warm caches.
It took 113 seconds.
The Alsa sound driver consists of almost exactly 200000 lines of source code. 200000/113 ≈ 1769 lines/second.
So, in this primitive benchmark, Free Pascal compiles about 7 times faster than GCC.
I don’t think precompiled headers are that usable, because of GCC flags and preprocessor directives. Here is an alternative: ccache. It stores the preprocessor output, and the next time, if the command-line arguments are the same, the cache is used instead of calling cpp again and again. It needs additional disk space, but it is really impressive when compiling a project several times. It works with C and C++, because it just caches cpp output.
It works even better when combined with great tools like scons.
>until they come up with some nice User tools for system
>configuration and services set up, both for server and client
It’s called a configuration file. I find editing config files MUCH easier than dealing with GUIs.
Isn’t it the template code in headers that is the real cause of slow compile times for GCC? In other words, if you were just using <stdio.h> and non-template C++ code, would your compile speeds be comparable to straight C? We didn’t use templates much in our code because we were targeting an embedded system, so I’m no expert on templates, but I do know the gtkmm library (which uses templates heavily) compiled very slowly.
« We didn’t use templates much in our code because we were targetting an embedded system »
C++ templates are compile-time, so there is no overhead at runtime. You should really try templates, and especially metaprogramming, because with inlining you’ll be able to produce better code, which is more important than compile time, especially on small systems.
I wasn’t talking about the speed of the templated code at runtime. I was referring to code-bloat and the compiler’s ability to actually do templates properly. This was 5 years ago or so, when almost no compiler did templates right. And I’m sure the compiler we were using was even worse than the average. The embedded product has since moved on to linux, but since I’m no longer there I have no idea how much they’re using templates.
because with inlining you’ll be able to produce better code, which is more important than compile time, especially on small systems
Inlining, aside from possibly improving execution speed by removing the function call, increases the size of your executable. I don’t know much about embedded systems, but I’m under the impression that storage space is particularly expensive on such systems. Which is probably why C and assembler are still king.
A good policy on template instantiation can reduce code bloat a lot. In a real project, if you create a class Foo<T>, you’ll use a finite number of Foo<MyType> instantiations; therefore you are able to separate definitions from declarations. Some compilers provide facilities for this.
I made a small scientific app where the most-used type was My::Array<long double>. At first I #included it everywhere it was needed; separating declaration and definition reduced compile time by a factor of 3. I guess a feature that lets the programmer use another instantiation model (not the code-bloating inclusion one), like import, would give a boost.
What’s the deal with export? I thought that was one of the last hurdles compiler vendors were having with being fully compliant, and that it had to do with separating declaration and definition. Recent compilers might be different, but it was my understanding that the compiler would produce identical code for each .cpp file that #include’d a template definition. So in effect, if you use vector<int> in multiple .cpp files, there would be identical generated template code across multiple compilation units, which produced the bloat. Of course, I might be totally wrong about this.
Not everyone has mastered Modern C++ Design, you know 🙂
@TLy: That GCC is slower than commercial compilers is a myth. It used to be true, but isn’t any longer. Go check out the Coyote Gulch benchmarks of GCC vs Intel C++. In cases where ICC’s auto-SSE-vectorizer isn’t invoked, GCC and Intel C++ are very competitive.
@Roy Batty: For modern compilers, it’s not a bloat issue but a compile-time issue. You’re right that the compiler will generate multiple versions of, say, the vector<int> template, but an optimizing linker (like GNU ld) will prune out identical code sections like that. The compiler emits each template instantiation to an ELF (or PE) section named “.gnu.linkonce.<something>”. The linker then looks through all the sections that begin with “.gnu.linkonce” and gets rid of any duplicates. What export saves is potentially compile time, so that the vector<int> template is only *compiled* once.
However, most people regard “export” as dead on arrival. Only a few implementers really support it, and it has turned out to have some major limitations that make it less beneficial. IMHO, C++’s entire include mechanism needs to be scrapped and replaced with a proper module system. A good module system would make for better optimization (note the hoops ICC jumps through to do IPO) and shorter compile times.
My take on it is that the real solution would be multi-level precompiled headers. A single PCH is not very useful, because if any of the headers in the PCH changes, the entire PCH has to be recreated; therefore you can only put very rarely changing files into it. I believe at least 3 levels of PCH would be necessary. Level 1 would be the library level, containing files such as Boost, Qt, wx, windows.h, and whatever system libraries you use. They hardly ever change. You can also put some of your most stable libraries in that PCH. The second and third levels would contain application-level headers, grouped by frequency of change. In a big application there are very rarely changing portions, rarely changing portions, and frequently changing portions. Even during the development phase you could reorganize the levels (priorities): if you work on a certain part of the application, you remove it from the level-3 PCH. I could use even more than 3 levels, but I think 3 would be enough in most cases. Even a single PCH is better than nothing, but if I were a compiler writer, I would think about more than 1 PCH. Nobody has it, and it would be a killer feature.
Precompiled headers are a very good thing for developers. The author seems to miss one thing about them: Borland C++ 5.5.1 creates a PCH file *.tds in the directory where you compile your programs. Because it isn’t global to the system, there can be a customised PCH for every app, which greatly speeds up compilation.
A “make clean” when you’re done … where is the problem? Better compilation speeds with all your (local) switches and no waste of diskspace.
@Roy Batty – Just to confirm your point about slower compile times when using templates, I compiled an iostream ‘Hello World’ and an stdio version with g++:
Weatherleys-Computer:~ james$ time g++ -O0 -o hello hello.cpp
Weatherleys-Computer:~ james$ time g++ -O0 -o hello hello.c
Weatherleys-Computer:~ james$ g++ -v
Reading specs from /usr/libexec/gcc/darwin/ppc/3.3/specs
Thread model: posix
gcc version 3.3 20030304 (Apple Computer, Inc. build 1495)
@james: focus on the right things. When you use templates you have to pay for it, but you get better code and type checking. Yes, when you compile templates there is a compile-time overhead, of course. But what we are talking about here is that current C++ doesn’t offer a good way to use templates in modular programming: most people use the inclusion model, and the overhead is tremendous. We are looking for ways to get rid of that burden. For templates, PCH is not a solution, and neither is caching cpp output.
export may be a solution: you separate declaration and definition, and at link time your compiler generates all the template instantiations needed and does the inlining. But implementing export is very expensive.
People use the inclusion model because it is easy to use and portable. You may have a look at g++ -frepo: it is a non-standard way, but it works.