Jussi Pakkanen, creator of the Meson build system, has some words about modules in C++.
If C++ modules can not show a 5× compilation time speedup (preferably 10×) on multiple existing open source code bases, modules should be killed and taken out of the standard. Without this speedup, pouring any more resources into modules is just feeding the sunk cost fallacy.
That seems like a harsh thing to say for such a massive undertaking that promises to make things so much better. It is not something that you can just belt out and then mic drop yourself out. So let’s examine the whole thing in unnecessarily deep detail. You might want to grab a cup of $beverage before continuing, this is going to take a while.
↫ Jussi Pakkanen
I’m not a programmer so I’m leaving this for the smarter people among us to debate.
The problem with C++ modules is…
They are quite fragile, and in practice you end up recompiling everything anyway. In other words, they do not offer significant benefits over just linking with dynamic libraries.
Now people will say “it only triggers if the interface is changed”. Yes… if the interface changes you need to recompile the module. But in practice you also need to recompile the module when your toolkit or compiler is updated, and since everything is based on the standard libraries, you need to recompile those too. And even if the toolkit stays the same but an external library like OpenSSL changes, you still need to recompile a major portion of your codebase.
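To make the mechanism concrete, here is a minimal sketch of how a C++20 module splits interface from implementation (the file names and the .ixx extension are only illustrative, and compilers differ in how they build module interfaces):

    // math.ixx -- hypothetical module interface unit
    export module math;
    export int square(int x);            // only this declaration is the interface

    // math_impl.cpp -- implementation unit of the same module
    module math;
    int square(int x) { return x * x; }  // editing only this body should not force
                                         // importers to be recompiled

    // main.cpp -- an importer
    import math;                         // must be recompiled whenever the compiled
                                         // interface is regenerated
    int main() { return square(4); }

In theory only interface changes ripple outward to importers; the complaint above is that in practice a compiler or standard-library update invalidates the compiled interface files anyway, so everything gets rebuilt regardless.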
Add in the “build system hell” and the many different dependency platforms, and you can see how much of an undertaking this is.
So, while in theory this sounds like a good idea, there is a reason it has not materialized in practice.
I assume you’re talking about linking, not compilation. If your codebase doesn’t change but a dependency gets updated, you only need to re-link, which is much faster than rebuilding everything from scratch. This behavior applies to virtually all statically compiled languages, not just C++.
The author of the article complains about C++’s long compilation times, but today’s trendy language, Rust, is arguably one of the slowest, both in terms of compilation and linking. It’s really, really bad. Here’s a comparison I made recently:
Xfce 4.20 (mostly written in C):
Time to build: 7m29.130s
Total size: 59.3 MB
GNOME 49 rc1 (mostly written in C, except some projects in Rust, such as glycin, gnome-user-share, loupe, papers):
Time to build: 17m16.287s
Total size: 145.5 MB
COSMIC (entirely in Rust):
Time to build: 46m04s
Total size: 588.9 MB
Linux kernel 6.16.4 (entirely in C):
Time to build: 8m47s
Total size: 831.5 MB (note: includes downloaded firmware, so not directly comparable)
Versions:
GCC: 15.2.0
Rust: 1.89.0
LLVM (used for linking in Rust): 21.1.0
Build flags:
GCC: -O3 -march=x86-64-v2 -mtune=generic -fno-semantic-interposition -fno-trapping-math -ftree-vectorize -fno-unwind-tables -fno-asynchronous-unwind-tables -ffunction-sections -fdata-sections -Wl,--gc-sections -Wl,--as-needed -Wl,--build-id=none -flto=auto -Wl,-O1 -fno-ident -s -fmodulo-sched -floop-parallelize-all -fuse-linker-plugin -Wl,-sort-common
Rust: -Copt-level=3 -Ctarget-cpu=x86-64-v2 -Ztune-cpu=generic -Clink-arg=-ffunction-sections -Clink-arg=-fdata-sections -Cforce-unwind-tables=no -Clink-arg=-Wl,--gc-sections -Clink-arg=-Wl,--as-needed -Clink-arg=-Wl,--build-id=none -Clto=fat -Cpanic=abort -Cdebuginfo=0 -Cembed-bitcode=yes -Cincremental=yes -Clink-arg=-fuse-ld=lld -Zdylib-lto -Zlocation-detail=none
fulalas,
I think it’s worth talking about the caveats with your data.
1) Comparing arbitrary code bases that don’t do the same thing may not be statistically valid without a larger sample size.
2) Projects may include conditional code that doesn’t get compiled. This is definitely the case with linux, the vast majority of which is never compiled by default because it’s not configured, which obviously needs to be accounted for.
3) You didn’t include any C++ examples, which is kind of important here since complexity-wise rust is more similar to C++ than C and the topic was about C++ anyway.
4) It’s important to count only the source code being compiled instead of just taking the sum of all files, since many project files include resources that have no relevance to compilation. To be fair you did mention this, but it still seems like it wasn’t accounted for, and therefore the numbers being compared aren’t meaningful.
5) Comments will change the density of actual code versus fluff that doesn’t contribute to compilation complexity.
The “benchmarksgame” has benchmarks for tons of languages, including rust, and it shows make time.
https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/rust-gpp.html
A cursory glimpse does seem to favor C++ for compile speed by a factor of 2-3, but when you look more carefully, most of the entries that beat rust’s compile time by a factor of 3 performed the worst at run time by a large margin (look at mandelbrot, for example). My take on the data is that rust’s runtime performance is excellent, but compiler performance lags. This makes me wonder if the rust compiler is slower because it’s not as well optimized, or if the compile-time safety checks fundamentally require more work.
The total size I mentioned refers to the generated binaries, not the source code.
COSMIC is in its alpha stage and is still very limited in many ways, but it already takes more than twice as long to build as GNOME. I’ve seen the code; it’s not their fault. The Rust ecosystem might improve over time (like using the lld linker by default in the next release), and that could reflect positively on COSMIC.
But aside from that, build times in COSMIC are likely to keep getting worse. You can see that in GNOME actually: since they started migrating some projects to Rust (evince → Papers, and Eye of GNOME → Loupe), build times have been taking longer and longer.
And I haven’t mentioned the RAM usage… The project cosmic-applets, for instance, can take more than 20 GB of RAM during the build process when using 16 threads! It’s really, really crazy!
> rust is more similar to C++ than C
Rust is not object-oriented, so I would compare it to C, not C++. It’s no accident that Linus accepted Rust into the Linux kernel codebase but never accepted C++.
> This makes me wonder if the Rust compiler is slower because it’s not as well-optimized, or if the compile-time safety checks fundamentally require more work.
They’ll always try to convince you that it takes longer because it’s better. 😀
fulalas,
Thank you for clarifying.
I’m still not that keen to compare the compile times of completely unrelated projects because that’s very inconclusive. But given the data from benchmarksgame I am going to agree with your conclusion anyway.
I’d like to contest this. I’m strongly in favor of rust getting rid of inheritance while keeping all the important parts of OOP in ways that C can’t do. Outside of university, I don’t think I’ve ever used inheritance in my C++ OOP projects. It’s debatable how important inheritance really is, given that interfaces in modern languages have largely superseded it and inheritance itself has gone out of favor. The rust book goes into more detail about how rust being OOP is in the eye of the beholder:
https://doc.rust-lang.org/book/ch18-01-what-is-oo.html
I suspect one of the reasons for rust being slower to compile is because it uses LLVM, which is itself slower than the GNU compiler. At the end of the day there’s no denying that devs pay the compilation costs regardless of what’s responsible, but as a matter of curiosity I’d still like to have a better understanding of what’s responsible for slower performance and whether it’s intrinsic or something that can be solved.
I came across this link where the author compares rust and C++ compilation in great detail.
https://quick-lint-js.com/blog/cpp-vs-rust-build-times/
Similar conclusion to the benchmarksgame in terms of compilation speed.
The author also tried to use C++20 modules, but he couldn’t get them working.
There’s also another rust backend called cranelift that’s supposed to offer faster compile times than rust’s LLVM backend at the expense of runtime optimization. I don’t have any experience with it, but it seems very relevant to the discussion:
https://lwn.net/Articles/964735/
Alfman,
While these kinds of exploratory projects are nice, and they bring new ideas to established domains… whether they become relevant is still to be seen.
We have discussed this before, but GCC’s arrogance left the entire IDE integration slate to LLVM and derived tools. Cranelift not only needs to overcome basic deficiencies (like lacking the entire set of optimizations built into LLVM, the very thing that makes LLVM slower in the first place), but it also needs to integrate well with language services in IDEs.
“You can use cranelift for faster compilation in vscode, but you still need llvm for refactoring and debugger” would not be a good sales pitch.
Otherwise, it would only serve as an inspiration source for LLVM to take ideas from.
sukru,
Yeah, that’s a different issue.
Mind you, I haven’t used it, but going by what I’ve read the goal was to use cranelift for faster developer/debug sessions while keeping the original LLVM stack for optimized production builds. Even when dealing with GNU C/C++ it can be better to disable optimizations for debugging: 1) RIP correlates much more closely to the source code, and 2) variables aren’t optimized away. To the extent that compile times are a bottleneck for development/debugging, using a faster but less optimized compile path seems fine to me.
C can be subject to more undefined behavior/side effects that can change a program’s behavior under optimization. Rust should be more robust against this because they’ve done lots of work to cull undefined behavior from the language, even in cases that are notoriously difficult for humans as with multithreading. So I think there’s a good case to be made for using a fast compiler path for rust development. Obviously I can think of exceptions to this, but in general this doesn’t seem like a bad approach.
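As a concrete illustration of that point, here is the classic signed-overflow case where a C/C++ program’s observable behavior can legitimately change between -O0 and -O2 (a hedged sketch; the exact outcome depends on compiler and version):

    #include <cstdio>

    int main() {
        // Signed integer overflow is undefined behaviour in C and C++.
        // At -O0 this typically prints two values and exits once i wraps negative;
        // at -O2 the optimizer may assume "i > 0" can never become false
        // (overflow "cannot happen") and emit an infinite loop instead.
        for (int i = 2147483646; i > 0; ++i) {
            std::printf("%d\n", i);
        }
        return 0;
    }

Rust instead defines integer overflow (a panic in debug builds, two’s-complement wrapping in release builds), so the optimizer is not allowed to make assumptions like this.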
I don’t think this is on anybody’s radar, but in principle even a very slow rust interpreter/JIT compiler might be useful for debugging (sort of like testing ad hoc javascript inside of the developer console).
Alfman,
While I can see the point for faster debugging cycles, it comes at the cost of accuracy.
If you have a customer bug report, the last thing you’d want is using a different toolchain when the error was caused by a compiler bug.
(Don’t say compiler bugs are rare. They eventually hit all of us).
fulalas,
I was indeed talking about compile times.
Again, the proposed benefits are theoretical, but the realities of actual code bases make them non-existent in practice.
The problem is that the “C++ experiment” has gone on for too long.
Been coding C++ at academic/consulting level for ~30 years.
Two years ago I started evaluating the Modula-3 language as my primary systems-programming language.
After about a year of evaluation, I was hooked… I made the switch to Modula-3 coding for new projects/research.
As mentioned/implied back in the 1990s, Modula-3:
–> is more elegant
–> has lower cognitive-load
than C++.
The PASCAL-language creators were smart (“prophetic”) in embracing the “module” concept back in 1976 with the “Modula” programming language, very early in the evolution of the PASCAL-based languages.
Modula-3 is preceded by Pascal, Modula, Modula-2, and Modula-2+, in that order.
Also, Modula-3 was influenced by the Mesa, Cedar, Object Pascal, Oberon, and Euclid languages.
Compare this with C++ essentially making the direct jump from C.
C++ first appeared in 1985, whilst Modula-3 first appeared in 1988.
As a person well versed in the Modula-3 language, I often get the feeling that the Modula-3 programming-language architects realised the “mess” that C++ would develop into.
The “module” system in Modula-3 is very {mature, powerful, easy to use} and modules compile fast.
Generics in Modula-3 (analogous to “templates” in C++) are also modular and relatively simple; unlike C++ templates.
C++ has too much “baggage” to deal with before “modules” can be considered a first-class, practical, safe citizen.
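One concrete example of that baggage: even with C++20 modules, an exported template’s definition generally has to stay visible in (or reachable from) the module interface so that importers can instantiate it with their own types, which keeps C++ module interfaces from being as thin as a Modula-3 generic interface. A minimal hedged sketch (file name and extension illustrative):

    // util.ixx -- hypothetical module interface unit
    export module util;

    // The body has to be visible here: importers instantiate square<T>
    // with their own types, so it cannot be tucked away in an
    // implementation unit the way an ordinary function body can.
    export template <typename T>
    T square(T x) {
        return x * x;
    }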
My enthusiasm with Modula-3 has led me to develop a comment-preserving {C,C++}-to-{Modula-3} transpiler using Griswold’s Icon programming language. This transpiler is a component in my M3-IDE (Modula-3 integrated development environment), developed using the Icon programming language.
References:
https://github.com/modula3
https://github.com/modula3/cm3/discussions/1177
https://github.com/modula3/cm3/discussions/1199
cade117,
+1
PASCAL would be so much worse if, instead of modules, it only had include files like C/C++. It’s thanks to modules that languages like pascal could scale well and we didn’t have to recompile units over and over again. I’ve used it with huge code bases and recompiling was fast. The absence of modules in C and C++ has left both these languages deficient for decades, impeding more straightforward compiler optimizations and frustrating devs the world over. I agree with you about the merits of pascal & modula, but for better or worse unix standardized on C, which became not only the de facto system programming language, but also the industry’s greatest common divisor.
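For contrast, this is the status quo the comment refers to: a C/C++ header is textually pasted into every translation unit that includes it, so the same declarations get reparsed over and over, whereas a Pascal unit (or a C++20 module) is compiled once and then imported. A sketch with illustrative names:

    // geometry.h -- a traditional header with an include guard
    #ifndef GEOMETRY_H
    #define GEOMETRY_H

    double area(double w, double h);   // reparsed by every .cpp that includes it

    #endif

    // a.cpp, b.cpp, c.cpp each contain:
    //   #include "geometry.h"         // textual inclusion, parsed again per file
    //
    // With a unit/module the compiled interface is built once and reused:
    //   import geometry;              // consumes the precompiled interface instead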
While the lack of modules has been detrimental to software & kernels, it seems very difficult to fix things at this point in time. The ideal time for C++ to get modules would have been around 1985 when the language was invented. Now it just seems way too little, way too late; the vast majority of code bases will never benefit from C++20 modules, and the feature just serves to fragment libraries and projects over language feature sets.
Pascal is indeed a perfectly valid and competent programming language, without the added complexity of “newer” languages.
https://www.youtube.com/watch?v=dwnaR0687iI
When re-evaluating C++, after decades of use, I chose Modula-3 over Pascal due to:
–> Modula-3 represents a progression from its Pascal/Modula roots, suitable for “programming in the large”
–> the extra Modula-3 detail, above its Pascal layer, has a relatively small cognitive-load footprint
–> Modula-3 removed unsafe/unnecessary detail from its Pascal-language ancestors, etc., to ensure a less bloated programming language standard; unlike the “1,000,000-pound-gorilla” that is the C++ programming language standard.
–> as noted in the 1990s, Modula-3 is “elegant” and has the power of C++/Ada whilst having a much smaller mental footprint than those languages (from my experience, serious M3-coding has a much lower cognitive-load than serious C++-coding).
Kochise and Alfman,
I would argue the fall of Pascal was precisely because of its modules (or rather Units, TPUs, later BPUs and finally DCU/DFM).
Why?
Because they combined implementation and interface into the same module, and there was no proper backwards-compatibility system, you’d need to purchase the same libraries again and again every time you upgraded Turbo/Borland Pascal or, later, Delphi.
The “Interface” portion of the Unit was also compiled (or rather preprocessed into something very similar to what we are discussing here). When this is C with PCH (precompiled headers), a simple clean rebuild gets you up to date again. For Pascal? You are out of luck.
This would have been avoided if they had good interface separation, and being able to recompile that.
(Maybe not as far as C headers, but at least something along the lines of Java or even DLLs)
I agree.
Standardisation on C is not a real problem, as Modula-3’s access to C-language code is relatively intuitive; i.e. a simple foreign-{function,variable}-access model. Direct access to C-language code feels very seamless, and Modula-3 procedures can also be called from C code.
Ultimately, with Modula-3 we have a {Modula-3, C, assembly}-languages development environment.
I like that access to potentially dangerous C/pointer code can be isolated in an UNSAFE module.
I tried using modules in C++23 last year but had problems compiling code that is portable between Visual Studio, Msys GCC, and Linux/Mac GCC. So I gave up on them for the time being.