The C++ standardization committee has set 2009 as the target date for the next version of the language standard. And a lot will change. C++09 will include at least two major core features – rvalue references and type concepts – and plenty of relatively minor ones.
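For readers who haven't followed the proposals: rvalue references let a class steal the guts of a temporary instead of copying them. A minimal sketch in the proposed syntax (details could still change before the standard ships):

#include <cstddef>
#include <cstring>

class Buffer {
    char* data;
    std::size_t size;
public:
    explicit Buffer(std::size_t n) : data(new char[n]), size(n) {}
    Buffer(const Buffer& b)              // copy constructor: duplicates the storage
        : data(new char[b.size]), size(b.size) { std::memcpy(data, b.data, size); }
    Buffer(Buffer&& b)                   // move constructor: steals the temporary's storage
        : data(b.data), size(b.size) { b.data = 0; b.size = 0; }
    ~Buffer() { delete[] data; }
};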
Mostly this amounts to the inclusion of some of the Boost libraries in the standard library. Saying that “a lot will change” is a serious overstatement. Even with the new concepts type checking, C++ templates are still paleolithic.
If you are interested in a substantial reengineering of C++, in something that actually fixes its major shortcomings, try the D programming language.
http://www.digitalmars.com/d/comparison.html
Incidentally, D is scheduled for its gold release on January 1, 2007.
If you are interested in a substantial reengineering of C++, in something that actually fixes its major shortcomings, try the D programming language.
I second that. D is a very nice language with a clear focus. My first impression was that it has the best of Java, the best of C++ and none of their major weaknesses.
I second that. D is a very nice language with a clear focus. My first impression was that it has the best of Java, the best of C++ and none of their major weaknesses.
It adds one major weakness – its memory model is based on a conservative GC, which makes it unpredictable and in practice unusable for some important applications (like cryptography or any other software that deals with noise-like data).
its memory model is based on a conservative GC, which makes it unpredictable and in practice unusable for some important applications
Not really, automatic memory management is not mandatory in D. You can handle memory manually just like in C++. So if C++’s memory model serves your purposes well, so does D’s.
Not really, automatic memory management is not mandatory in D. You can handle memory manually just like in C++. So if C++’s memory model serves your purposes well, so does D’s.
Well, in theory, yes. In practice, all D libraries will use the GC, so not using the conservative GC means you are excluded from using any library.
My first impression was that it has the best of Java, the best of C++ and none of their major weaknesses.
Well, except for the fact that D includes unnecessary stuff in the core language instead of delegating it to a library. Where is the need to bloat the core language with stuff which is perfectly well achieved with the basic constructs?
If you are interested in a substantial reengineering of C++, in something that actually fixes its major shortcomings, try the D programming language.
http://www.digitalmars.com/d/comparison.html
That comparison chart is at least dubious, not to mention dishonest. The chart’s author opted to ignore simple facts about the C++ language, claiming that it doesn’t have certain elements (garbage collection, resizeable arrays, built-in strings, etc.) that it certainly has. If the D programming language is that great, why do the people behind it spread incorrect information and ignore simple facts about basic C++ features just to add a couple of plus signs to the comparison? Moreover, how come that comparison chart is still up even after the author himself, while trying to spam C++ newsgroups about his wonderful new language, was shown time and again that he had incorporated false information about C++?
Last time I checked, C++ did NOT have a Garbage Collector.
C++09 will have GC.
Do you have a link to the proposal for that? I don’t remember having seen it.
There are quite a few C++ garbage collector libraries from which anyone can choose, which is the right way of doing things. After all, it enables those who really want GC in their C++ projects to use it, and those who don’t, to skip it. To me at least that is the right approach, instead of mandating what you can and cannot do.
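For example, the Boehm-Demers-Weiser collector drops in as a plain library; a minimal sketch (the header is <gc.h>, or <gc/gc.h> on some installs):

#include <gc.h>

int main() {
    GC_INIT();                                          // set up the collector
    for (int i = 0; i < 1000000; ++i) {
        int* p = static_cast<int*>(GC_MALLOC(sizeof(int)));
        *p = i;                                         // never freed by hand; the GC reclaims it
    }
    return 0;
}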
Claiming that there is no GC in C++ is similar to the claim that C++ doesn’t have resizeable vectors or even strings. It is a claim which is only made by those who are ignorant of C++ or wish to ignore the facts in order to make some other product look better than it is. Where is the need to lie about such trivial things? Isn’t D capable of standing on its own without having to resort to misleading jabs and dishonest allegations?
Quote -> “Claiming that there is no GC in C++ is similar to the claim that C++ doesn’t have resizeable vectors or even strings. It is a claim which is only made by those who are ignorant of C++ or wish to ignore the facts in order to make some other product look better than it is. Where is the need to lie about such trivial things? Isn’t D capable of standing on its own without having to resort to misleading jabs and dishonest allegations?”
If you read the http://www.digitalmars.com/d/comparison.html page, you’ll realize we are talking about /core/ language features, not features implemented as library add-ons.
Quote from the comparison page…
”
The Hans-Boehm garbage collector can be successfully used with C and C++, but it is not a standard part of the language.
”
Another quote from the comparison page…
”
Part of the standard library for C++ implements resizeable arrays, however, they are not part of the core language. A conforming freestanding implementation of C++ (C++98 17.4.1.3) does not need to provide these libraries.
”
In addition, there is a lot to be gained by implementing these features in the /core/ language rather than as a library add-on. For the reasoning behind this, I point you to this page: http://www.digitalmars.com/d/cppstrings.html
Not having a standard GC forced on me is an advantage as far as I’m concerned.
As for the second part, it seems irrelevant. Whether or not the standard specifies that a conforming C++ implementation should include std::vector, all C++ compilers include it.
Why do the D people have to try so hard to argue against C++ that they have to resort to dissecting obscure details of the spec to try to get their point across?
Please argue purely on technical merit. If your language is really that good, it should be enough.
Here are some facts for you D apologists:
– Standard containers are part of all relevant C++ compilers.
– The standard library ships with all relevant C++ compilers.
– Exception handling in C++ doesn’t involve setting up special stack frames (at least on good compilers, which excludes Visual C++)
It is used as part of an argument as to why D, using GC, is faster than C++ – which is, as always when it comes to discussing the GC issue, not supported by any kind of benchmark anyway.
Oh, and the built-in string in D helps with concatenating strings. Nice.
This bit of syntactic sugar doesn’t justify switching to a different language.
While this could be improved in C++, it is not an issue one often runs into, and when it is, you can live with constructing a temporary string object.
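For example, this (a trivial sketch) is about as bad as it gets in C++:

#include <iostream>
#include <string>

int main() {
    std::string who = "world";
    std::string greeting = std::string("Hello, ") + who + "!";  // the temporary does the concatenation
    std::cout << greeting << '\n';
    return 0;
}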
1) D doesn’t force you to use the GC. If you are a competent enough programmer, you can disable the GC and use the C runtime malloc/free to do your memory management. Walter Bright (the language author) wrote his entire Empire game in D without a garbage collector.
2) It is not dishonest to say that C++ does not have a re-sizable array implementation built into its core language spec, because it doesn’t. It is implemented as part of the C++ standard library. This is mentioned on the comparison page. It is much nicer to have resizable arrays in the /core/ language rather than as a library add-on. If the D language author _was_ trying to be dishonest, he wouldn’t put a link on the comparison chart that goes on to explain that vectors are implemented as part of the C++ standard library.
3) If you looked at the link, there _is_ a benchmark to support the claim that garbage collection gives you better performance.
http://www.digitalmars.com/d/cppstrings.html
Look at the word count example.
Resistance is futile… you will be assimilated.
~ Clay
The fact that GC performance can beat manual memory management performance is something not enough people realize. The same people who in 2006 bitch about the performance cost of GC have probably never written their own memory allocator. They think malloc() and free() are magic instructions implemented in the CPU. They don’t realize that both usually involve the processor traipsing merrily around non-localized list structures looking for blocks and doing coalescing.
Garbage collection involves scanning objects to check which ones are reachable. That is not exactly a cheap operation either, and it is dependent on the complexity of the data. Do you think that garbage collection is magic?
I’ll stick with the solution that doesn’t involve costly, unbounded collection at random points in time, which doesn’t result in heap usage bloat, and which works with resources of all types including files, sockets etc., thank you very much.
2) It is not dishonest to say that C++ does not have a re-sizable array implementation built into its core language spec, because it doesn’t.
It’s dishonest because it’s irrelevant. Whether it’s in the language or in the library is an implementation detail.
“It’s dishonest because it’s irrelevant. Whether it’s in the language or in the library is an implementation detail.”
It is not irrelevant, because it is much nicer to have these features built into the core language vs. as a library add-on, and this advantage should be (and is) noted on the comparison page.
~ Clay
By the way, with the code in that benchmark, you are essentially measuring the difference in cleanup time after the program has executed.
The D version will never actually run through a GC cycle (because nothing is ever discarded in the example during execution) and just discards the heap, whereas the C++ version has to delete the map.
Once again, it’s a situation chosen carefully to advantage D.
Give me a benchmark for a real world application manipulating a lot of objects, including both creating and discarding them, and whose execution does involve garbage collection cycles to actually be performed.
“Once again, it’s a situation chosen carefully to advantage D.
Give me a benchmark for a real world application manipulating a lot of objects, including both creating and discarding them, and whose execution does involve garbage collection cycles to actually be performed.”
In those cases you can do exactly the same thing you would do in C/C++: use malloc or create a pool of reusable objects. So in that case it isn’t a win for D or C/C++; they are similar in this regard.
On the other hand, using D gives you lots of other features not currently in C++, as well as managed memory when it makes sense to let a GC handle it for you (probably at least 75% of cases).
In the end if you have set your mind against D no one will convince you. You are stuck on C++.
Well, it still doesn’t tell me whether a pure garbage collection based solution would be more efficient than a reference counting one.
Reference counting buys me most of the automatism of a garbage collector except for cycle detection (I think a hybrid solution, considering for garbage collection only classes that can possibly be involved in circular constructs, would be a good compromise).
Yes, reference counting comes with the overhead of individually calling free on every object that you delete, but on the other hand it doesn’t involve going through every live object to find which ones are reachable. So the cost of reference counting doesn’t increase with the number of live objects currently in existence. You pay only the cost of objects actually getting deleted.
And it naturally spreads the cost of deleting stuff over time, instead of deferring it and doing it all at once later. I find that a more desirable behavior for interactive applications. I’d rather have my performance hit as homogeneously as possible than risk having my application freeze, even for a tenth of a second, on a regular basis.
I think that reference counting is a dead end. And it is very likely slower than GC in multithreaded code (atomic increments and decrements are expensive).
It is surprising that people are still so deeply buried in pointers; maybe it is the C legacy?
IME, a clean deterministic scope-based design frees you from all this low-level stuff.
You can avoid the need for atomic incrementing by using weighted reference counting.
The cost of it versus garbage collection essentially depends on your usage patterns anyway, and we all tend to be biased toward the usage pattern of the stuff we work on.
In a game engine (which is my subject of interest), there is not a lot of data movement. You update stuff often, but you don’t create and delete objects that often. And you usually don’t have a lot of multithreading going on, beyond the background loading of resources.
As for pointers, it’s not a legacy problem. It’s just that the pattern of an object referring to another object is necessary to represent graphs (regardless of the exact mechanism behind it and what you want to call it: pointer, reference, index, whatever).
Deterministic scoping is useful, but you can’t do everything with it.
How would you go about representing a 3d scene graph only with deterministic scoping, for instance?
You can avoid the need for atomic incrementing by using weighted reference counting.
AFAIK only for the copy. Destruction of a reference still has to lock. But I admit that my knowledge of weighted reference counting is not deep. However, it looks like, in the end, all “smart” schemes were abandoned in favor of simple atomic inc/dec…
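For the record, the basic trick looks roughly like this (a toy sketch; a real implementation must also handle the weight running out on long copy chains):

#include <atomic>
#include <cstdint>

template <typename T>
struct WRef {                                    // toy weighted reference: copies split the
    struct Shared {                              // local weight, so only destruction touches
        std::atomic<std::uint64_t> total;        // the shared atomic counter
        T* obj;
    };
    Shared* s;
    std::uint64_t w;                             // this reference's share of s->total

    explicit WRef(T* obj) : s(new Shared), w(std::uint64_t(1) << 32) {
        s->total = w;
        s->obj = obj;
    }
    WRef(WRef& other) : s(other.s) {             // copy mutates the source: no shared access
        w = other.w - other.w / 2;               // (breaks down when other.w == 1; a real
        other.w /= 2;                            //  version "tops up" the total at that point)
    }
    ~WRef() {                                    // destruction still hits the shared counter
        if (s->total.fetch_sub(w) == w) { delete s->obj; delete s; }
    }
};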
It’s just that the pattern of an object referring to another object is necessary to represent graphs
Using pointers to point to things is of course OK – you need such tool. But using them to manage lifetime of heap objects is simply way too low-level.
How would you go about representing a 3d scene graph only with deterministic scoping, for instance?
I do not have enough knowledge about game engines to give a qualified reply. However, I bet that the whole scene is bound to a certain scope.
If you want a trivial answer, you can always have something like:
#include <vector>

struct Node {
    std::vector<int> pointsto;   // indices of the nodes this node points to
    // ...
};

struct Scene {
    std::vector<Node> node;      // the scene owns every node; node lifetime is the scene's
    // ...
};
Obviously, things are not always that simple… But again, I do not have enough experience with 3D scenes. Anyway, I know that I have been able to find a similar solution for every single problem I have encountered in the past 8 years.
Well, it still doesn’t tell me whether a pure garbage collection based solution would be more efficient than a reference counting one.
Almost all studies of the subject have shown that reference counting is slower than a GC. Think about it this way: for both reference counting and GC, you need a table of reference counts for the objects. For GC, you build this in a simple scan over the heap. In reference counting, you collect exactly the same data, but you maintain it incrementally. Common sense suggests that the incremental method will cost more, as doing things incrementally almost always costs you throughput.
There is an interesting PowerPoint here that explores the duality of ref-counting/GC: http://www.sct.ethz.ch/teaching/ws2005/semspecver/slides/takano.pdf
There are numerous implementation issues that make reference counting a bad idea (at least, pure ref-counting). Boost’s shared pointers are pretty good, but they still turn every pointer copy into a rather expensive operation, involving a non-local memory access, some arithmetic, and usually an expensive, pipeline-stalling locked compare/swap operation. The latter is a bitch, because the synchronization method has to be robust and fast, and is potentially invoked on every pointer *copy*. Combine this with the copy-happy STL data structures, and the resulting performance is not that great.
Reference counting buys me most of the automatism of a garbage collector except for cycle detection (I think a hybrid solution, considering for garbage collection only classes that can possibly be involved in circular constructs, would be a good compromise).
Hybrid solutions are desirable for another reason. They preserve the incrementality of reference counting while offsetting some of its performance losses (and collecting cycles to boot). If you read the above PPT, you can see that such hybrids and clever GCs can be seen as the same sort of beast.
Yes, reference counting comes with the overhead of individually calling free on every object that you delete, but on the other hand it doesn’t involve going through every live object to find which ones are reachable.
It’s not just that overhead. Ref-counting cannot be faster than malloc()/free(), because it needs an underlying malloc()/free(). As I explained in my response to luzr, a GC needs something resembling malloc()/free() as well (actually, something more resembling a page allocator in a kernel), but the latter can take a lot of implementation shortcuts. Reference counting schemes cannot take advantage of that.
So the cost of reference counting doesn’t increase with the number of live objects currently in existence.
No, but it increases with the mutation rate. Reference counting costs you more as you manipulate a larger number of objects.
I find that a more desirable behavior for interactive applications.
A good GC is not going to result in pauses a user is going to notice. 100ms pause times are something you’re not likely to find outside of server apps with multi-gigabyte heaps. State of the art GCs have max pause times of sub-10ms, and average pause times of sub-1ms. You’re going to have that kind of latency just running on Windows, and an order of magnitude longer than that just doing disk I/O.
Give me a benchmark for a real world application manipulating a lot of objects, including both creating and discarding them, and whose execution does involve garbage collection cycles to actually be performed.
Not only that, but real applications have object inter-dependencies, with small changes as well as large bursts. A good benchmark has to simulate the following real-world scenarios:
– small and large containers, insertions and deletions in the middle
– alternating small and large object allocations / deletions, with a mixture of various object lifetimes (some objects deallocated shortly after allocation, other living long)
– heavy interlinking between objects residing in 2-3 containers. For example, a map pointing to objects in a vector. Insertions and deletions being performed periodically
– simulating periodic small changes, with occasional bursts
Just think about a CAD application or a drawing package, or a word processor, where you have an object tree, with some inter-linking, slow changes with occasional bursts.
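A minimal harness for the mixed-lifetime scenario above might look like this (a sketch; the block sizes and the 10% survival ratio are made-up numbers):

#include <cstddef>
#include <cstdlib>
#include <vector>

int main() {
    std::vector<void*> survivors;                        // ~10% of blocks live long
    for (int i = 0; i < 1000000; ++i) {
        std::size_t n = (std::rand() % 2) ? 16 : 4096;   // alternate small and large blocks
        void* p = std::malloc(n);
        if (i % 10 == 0) survivors.push_back(p);
        else std::free(p);                               // most blocks die immediately
    }
    for (std::size_t i = 0; i < survivors.size(); ++i)
        std::free(survivors[i]);                         // the long-lived tail
    return 0;
}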
Although that string benchmark is not entirely useless (it demonstrates some real-world situations), a large application behaves completely differently. It makes sense to benchmark string concatenation, but you can’t draw a conclusion regarding the GC based on that alone. Before I decide to use a GC, I have to benchmark a system like the one I outlined above, in order to learn its strengths and weaknesses. I may very well decide to use a GC in 90% of the code, and use reference counting techniques for the rest. One can’t predict how a sophisticated, highly interlinked data structure with occasional bursts will behave without actually trying it.
It’s too easy to write a micro benchmark that illustrates a certain advantage in one product, but has absolutely nothing to do with real-world applications.
So on one issue (the GC) the D apologist claimed that C++ was listed as not having the feature because it isn’t included in the standard, but on another issue (string, vector) he claimed that C++ was listed as not having the feature because, although it is very much part of the standard, it isn’t part of the core language.
Don’t you find it a bit dishonest that the author of the comparison arbitrarily claims that C++ doesn’t have something that it indeed has, just because the author feels like it? And what does it say about the D programming language itself if the D authors are only capable of pushing it by lying about C++’s capabilities?
Why do you all start to glorify your favourite/own language by saying C++ is bad? In fact, Java and C# and whatever else just copied most features of C++ and added some marketing… If you cannot write good portable code, that is not a problem of C++…
Why do you all start to glorify your favourite/own language by saying C++ is bad?
Well, I think this is because C++ is still the most relevant language for its application domain.
I’m not really sure about D.
C++ has the very useful property that you don’t pay for what you don’t use. The language is entirely designed around that idea.
This is why it doesn’t, for instance, include a garbage collector.
D obviously isn’t designed around that concept, so whenever I use one of the D features, how do I know whether it’s expensive or not? Anything that has a runtime cost associated with it in C++, like dynamic_cast, virtual functions, exceptions and so on, is clearly indicated as such.
Also, in C++ there is a clear rationale as to why a feature is present in the language, i.e. what real-world, common problem it solves.
D seems like a random mishmash of “cool” features without a clear explanation as to why they have been implemented.
For instance, the static_if thing: in which situation would you want to use this?
Also, why would I want unit testing to be part of the language? What does it achieve? C++ gives you the freedom of choosing whatever unit test framework implementation you want.
I also find their D/other languages comparison to be a bit disingenuous.
First column: name of feature XXX as found in D. Other columns: No, even though different but good solutions may exist to achieve the same thing in those languages.
Finally, I bothered reading their rationale for including a garbage collector; in addition to repeating the same argument twice, some of the arguments are either plain wrong or dubious.
If I were to nitpick, I’d also mention that the poor grammar in some places in their documentation is a bit of a turn-off too. They should at least fix that.
By the way, that article doesn’t mention every change in C++09. It omits some other very interesting things like variadic templates, static assertions and automatic type inference in declarations.
They might not look like very useful features, but along with concept checking they will make templates much more useful. Currently, one of the big problems with templates is not what they can do; it’s that doing complicated stuff can very quickly result in nightmarish compilation errors.
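For instance, two of the small additions look like this (a sketch in the proposed syntax):

#include <map>
#include <string>

template <typename T>
T twice(T x) {
    static_assert(sizeof(T) <= 8, "T is too large for twice()");  // readable error at the call site
    return x + x;
}

int main() {
    std::map<std::string, int> counts;
    counts["word"] = 1;
    auto it = counts.begin();   // inferred: no more spelling out std::map<std::string, int>::iterator
    return twice(it->second) - 2;
}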
Finally something that makes sense.
On the other hand, I wish C++09 contained a specification whereby the compiler can generate more runtime information about the classes themselves (something like what Objective-C does). Of course not required, and switchable by an option, but better than going through OpenC++ to build a reflection/introspection system.
IIRC there are currently tools that generate class info based on the sources – this is not necessarily the compiler’s task…
Yes, this is the compiler’s task, at least if you want to do it right. The compiler is the one that’s actually parsing the code, building the class hierarchy, etc., so why should you depend on some half-assed external tool to do it?
The Open Dylan compiler (www.opendylan.org) is absolutely phenomenal in this regard. The compiler is structured as a library, which the IDE or CLI links to. The library exposes methods for getting at everything from the parse tree of the sources, to notes from the optimizer.
This is the only sensible way to structure things, especially in a language like C++, for which writing a proper parser is a multi-man-year project.
It bothers me that you need to use pointers in your STL containers instead of references. It means you need to cast and check for null every time. It’d be nice if they could do some magic for you.
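Boost already fakes most of this with reference_wrapper (also in TR1); a sketch:

#include <vector>
#include <boost/ref.hpp>

int main() {
    int a = 1, b = 2;
    std::vector<boost::reference_wrapper<int> > v;  // holds references, never null
    v.push_back(boost::ref(a));
    v.push_back(boost::ref(b));
    v[0].get() = 10;                                // writes through to a; no cast, no null check
    return a == 10 ? 0 : 1;
}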
Will thread support be built into the language?
I very much hope not.
I very much hope not.
Actually, there are many levels of thread support, and C++ would need at least the minimal one. At the moment, any multithreaded code has undefined behavior by the C++ standard.
We are not speaking here about language constructs to start threads, but about very basic stuff like library code for locking, which (at least in theory) cannot be correctly implemented in C++, because the compiler is allowed to perform optimizations that break many multi-threaded algorithms.
Isn’t threading platform-dependent? Why would a language which aims to be platform-independent adopt platform-dependent features?
Moreover, there are already plenty of thread libraries for C++. As far as I know, even Boost has a threading library. As I see it, there is no need for it in the standard library, and even less in the core language.
Isn’t threading platform-dependent? Why would a language which aims to be platform-independent adopt platform-dependent features?
Once again, the problem is that many thread-safe algorithms depend on a specific order of operations being performed, which is not guaranteed by the C++ standard.
Therefore, according to the standard, any multithreaded code in C++ is broken.
Of course, there are existing libraries that work – but they are not _guaranteed_ to keep working when compiled with another C++ compiler, or even a newer version of the same compiler (with, e.g., a different optimizer).
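The classic illustration (a sketch; use() stands in for any real consumer):

int data = 0;
bool ready = false;

void use(int);            // defined elsewhere; stands in for real work

void producer() {
    data = 42;
    ready = true;         // the 2003 standard lets the optimizer reorder these two stores
}

void consumer() {
    while (!ready) {}     // may even be "optimized" into an infinite loop
    use(data);            // and may observe data == 0 on a reordering CPU
}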
D libraries do indeed use the GC; D is capable of using any kind of GC that can be “plugged in”. I think most developers know that, in most cases, using a GC is sufficient for a very wide range of applications — the majority of them in fact.
Software that absolutely cannot use a GC still has the option of implementing whatever memory model seems necessary – it’s not just theoretical: practical examples are available in a couple of D kernel projects. The fact that most D libraries will go the GC route is a matter of choice and convenience. For those that need custom fits, D libraries may be developed for those rare occasions.
True, such projects may not have the advantage of using /all/ of the standard D library, but these applications are likely specialized enough that they will warrant custom coding anyway. And it’s an advantage to know that it’s absolutely possible for someone to develop a manual memory management library in D, if they feel the need for it. Also remember that D interfaces just fine with C, so at the very least, the whole range of C libraries are available to D.
C++, along with every other language, has trade-offs. Some people consider the C++ trade-offs too significant these days, including the very issue of memory management. (D has some trade-offs too, but, thankfully, without sacrificing flexibility; most D trade-offs are there to fix glaring C++ problems.) People might notice that /many/ C++ programs don’t really need to do all the memory management that they do – most of the task could be managed automatically. Memory management in C++ has become a finicky and repetitious task, and likely an expensive one from a business perspective, for programmers developing software: that’s a lot more lines of code, a lot more complexity, and a lot more possibility for bugs and memory leaks! GCs certainly have an advantage in this area. To pick out a narrow class of applications that cannot use a GC and then use that as an argument (an invalid one) against D is not an objective approach, I think.
Furthermore the D template system will likely allow for many features and tools that work in a memory-model agnostic way — that’s just one of the many advantages of compile time features. By the way, if anyone has a look at D, you’ll see that the template system is far surpassing C++’s in numerous ways now too.
🙂
D libraries do indeed use the GC; D is capable of using any kind of GC that can be “plugged in”.
I may be missing something, but from the examples given by Walter either on the D website or in comp.lang.c++.moderated, it is obvious that D can work with a conservative GC only (otherwise the examples given would be broken).
Memory management in C++ has become a finicky and repetitious task, and likely an expensive one from a business perspective, for programmers developing software: that’s a lot more lines of code, a lot more complexity, and a lot more possibility for bugs and memory leaks!
Yes, but that is just a matter of wrong library design. In my C++ world (http://www.ultimatepp.org/www$uppweb$overview$en-us.html) any kind of _resource_ leak (not only memory) is non-existent. GC would only make things worse.
By the way, if anyone has a look at D, you’ll see that the template system is far surpassing C++’s in numerous ways now too.
The problem with templates in C++ is not the templates themselves, but the STL…
I suspect that C++ may have missed the boat. Just my opinion.
These days, it makes a lot of sense to use a higher level language (like Perl, Python, or Ruby, for example), and then rewrite the performance-critical parts in C, if necessary.
C++ is fine, but if you make use of too many of its features, it becomes very difficult to deal with. Again — that’s just my opinion. C++ is complex, and it’s easy to write incomprehensible stuff with it.
D would be interesting, except (my impression from looking at it a while ago) the author is basically saying, “Here’s a great language, here’s my free (as in beer) implementation of it — which by the way rocks [as I’m sure it does]; maybe the free software community will want to re-implement themselves a copy? Whaddya say, eh??”. Thank you kindly, but we’ve already got a free (as in libre) C++ implementation, and as of just a few days ago, will now have a free (as in libre) Java implementation.
If I were transitioning some company’s future projects from C++, it would only be to another platform with a mature and free (libre) implementation. Now, if Digital Mars wants to GPL their implementation (which would likely require a change in their business plans), I think they might stand a chance of getting some real adoption. Currently though, I think you’re going to see a lot of free software C++ devs moving to Java thanks to Sun’s doing the right thing with Java’s licensing.
It’s easy to write incomprehensible stuff with any language.
Perl? I don’t think it needs any example of this.
Python? It can, and will, get horrible in the wrong hands. I had to maintain a 3000+ line Python script. And despite the fact that Python goes out of its way to discourage you from using global variables (one of the rare things I appreciate in that language), the guy who wrote that script used hundreds of them.
Regardless of the language, there are dumbasses out there that WILL manage to write hopelessly awful code with it.
I don’t know about Ruby, but I don’t see how it would be different. And I’m suspicious of languages throwing all their eggs in the OO basket as if it were the one and only good programming paradigm in existence.
I think that your impression of D is quite outdated :-); however, this is understandable: some people who checked D out in its early days were intrigued but figured it was just a toy and never gave it much time. Unfortunately, mention of D sometimes brings that old recollection to mind in those people, a memory that is now quite inaccurate.
Thankfully, D has matured a lot over the last couple of years and its future is stabilizing. It deserves another serious look from those of you who didn’t give it much thought the first time around.
That said, D DOES have a GPL implementation – and has had for some time now – called GDC (based on gcc). It works on many platforms including, interestingly enough, SkyOS.
Have a look here:
http://dgcc.sourceforge.net/
There are binary packages for multiple OSes; there’s even a MinGW and a Mac OS X version available. Quite a few people support and depend on this compiler for their day-to-day use of the D language.
I myself am extremely interested in D. However, unless the D frontend is included in official GCC, developing and distributing apps made with it will be a big hassle, since you have to install a specially patched version of the compiler to build stuff with it…
Ah, more changes to make C++ even more difficult to implement. It’s been 8 years since C++ ’98, and today there is still no correct and complete implementation. When the next version of C++ comes out in 2009, it will be 2015 before you can expect high-quality implementations.
And the whole concept of the template mechanism as an embedded functional language is really getting out of hand. First they made it Turing-complete, now it has a type system? What’s next, object orientation? It is, for all intents and purposes, a glorified macro system, so why not just bite the bullet and let you use regular C++ code to do code generation? IMHO, the metafunction proposal was a much better, saner, and more usable thing to add into C++ than extending the pointy-bracket sub-language from hell.
Ah, more changes to make C++ even more difficult to implement. It’s been 8 years since C++ ’98, and today there is still no correct and complete implementation. When the next version of C++ comes out in 2009, it will be 2015 before you can expect high-quality implementations.
And the whole concept of the template mechanism as an embedded functional language is really getting out of hand. First they made it Turing-complete, now it has a type system? What’s next, object orientation? It is, for all intents and purposes, a glorified macro system, so why not just bite the bullet and let you use regular C++ code to do code generation? IMHO, the metafunction proposal was a much better, saner, and more usable thing to add into C++ than extending the pointy-bracket sub-language from hell.
You could not have said it in a better way. I couldn’t agree more!
And with all these changes, C++ will not have garbage collection!
And the fact that they are introducing yet another reference type makes C++ even more difficult to use…
Although I realize that C++ is actually a very small, simple language at its core, it somehow ends up being overcomplicated and bloated in actual usage. While C++09 is nice, I don’t see the added features reducing that tendency too much.
For hobby coding and mental workouts, I’ve moved on to Haskell and Ruby. (Haskell is an incredibly elegant language. Wow.)
Although I realize that C++ is actually a very small, simple language at its core, it somehow ends up being overcomplicated and bloated in actual usage. While C++09 is nice, I don’t see the added features reducing that tendency too much.
Your code only starts to get bloated and overcomplicated if you let it. If you don’t want to use features that you consider out of your reach and difficult to understand and keep track of, then don’t use them.
And of course that isn’t exclusive to C++. It holds true for any other language, and it is proportional to your experience and knowledge of it.
I see comments here advocating D. I looked into D, but its licensing seemed vague. Is it open? I can’t really say.
The subject says it all. The official D spec is copyrighted by the author. Also, nothing prevents him from filing a patent application (maybe he already has). Large-scale adoption of D is risky until these issues are resolved.
One of the things that constantly annoys me about C++ is pointers and references. I admit I’m not a great (nay, good) C++ programmer. I wish there were one or the other. Actually I would prefer references as I’m tired of dealing with pointers.
I would also like more runtime typing – reflection? I currently define macros that can be used to generate code for class typing. I could expand it to include more queries, but I don’t need that right now.
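For the curious, such macros usually boil down to something like this (a sketch with hypothetical names; DECLARE_CLASS and className() are illustrative, not a real library):

#include <cstring>

#define DECLARE_CLASS(C) \
    virtual const char* className() const { return #C; }

struct Shape  { DECLARE_CLASS(Shape) virtual ~Shape() {} };
struct Circle : Shape { DECLARE_CLASS(Circle) };

int main() {
    Shape* s = new Circle;
    bool isCircle = std::strcmp(s->className(), "Circle") == 0;  // poor man's runtime typing
    delete s;
    return isCircle ? 0 : 1;
}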
When I think about it I guess I keep wanting something more akin to Java. That may be my next logical step, though Java isn’t a solution for all my problems.
Great Bunzinni: You’re missing the very major benefit of including features in the language standard, instead of depending on third-party libraries: standardization. Tolerable memory management in C++ is possible, but the problem is that everybody does it differently! There are probably dozens of smart pointer classes in use, for example. And good luck using GC in a library, because if another library decides to use a different GC, you’re boned. These things make code integration very difficult, and people are often forced to program at a lower level than desired just for the sake of code portability and reusability.
Tolerable memory management in C++ is possible, but the problem is that everybody does it differently!
And tell me, if there isn’t a consensus on what the best GC or most appropriate method is, isn’t it better to let the user choose whatever GC is best suited to the task at hand, instead of imposing one generalized component that may not suit anyone’s needs? If everyone does things differently, isn’t the best thing anyone can do to avoid imposing a single one-and-only solution?
People don’t all do different things in C++ because it’s necessary for the task at hand; they do so because they have no other choice. Hundreds of different ad-hoc memory management schemes don’t exist because they’re really necessary (most are variations on a few basic schemes), but because the lack of standardization means that there is no mechanism for picking common methods.
For almost every task for which “modern C++” is useful, a GC is the right memory management scheme to use. The hit of a GC is a lot less than the hit of smart pointers or the large amount of copying involved in STL collections, for example. Entertainingly, these fancy “move semantics” are largely necessary to make up for the over-use of copying semantics in the STL, which are necessitated by C++’s lack of GC.
There are problem domains for which GC is not appropriate, though they are getting smaller as collectors become more advanced. In any case, the majority of those problem domains aren’t appropriate for C++ anyway, at least not in its modern incarnations. If you’re on an embedded platform, for example, and can’t tolerate the extra memory usage of a GC, you’re not going to do very well with the massive memory increase imposed by template code either. If you’re doing hard-real time computation, you can’t use things like smart pointers, because they’re very non-deterministic as well with common data structures (lists, graphs).
For almost every task for which “modern C++” is useful, a GC is the right memory management scheme to use. The hit of a GC is a lot less than the hit of smart pointers or the large amount of copying involved in STL collections, for example. Entertainingly, these fancy “move semantics” are largely necessary to make up for the over-use of copying semantics in the STL, which are necessitated by C++’s lack of GC.
I disagree (and I have a one-million-line codebase to prove this point ;).
I agree that STL’s copy semantics IS THE PROBLEM, but GC is not the solution. The solution is to simply design a container library WITHOUT the copy-semantics requirement, which is completely possible.
In such a design, move semantics is still useful, but plays a role similar to the “break” statement in structured programming (and you do not need a language extension to achieve this).
BTW, the main architectural trouble with GC and C++ is that it is impossible to match GC and destructors. You can have GC or destructors, but never both. I vote for destructors (GC would not close my files).
As for the performance claims, many GC advocates do not realize that old malloc/free implementations are often compared to cutting-edge GCs. In the end, the famous Boehm GC, which is used in most open-source projects, has its own tiny (and very smart) “malloc/free” in its sweep algorithm, so it is not cost-free either.
You can implement a manual allocator based on an idea similar to the one used in Boehm’s GC – what you get MUST be faster than the GC, because it is basically the same code without the sweep phase. Indeed, both malloc and free translate to about 15 machine code instructions executed in the “fast” path. (Note: this means I DID implement a memory allocator – I spent years optimizing… ;)
To be more concrete, you can check the code at http://www.ultimatepp.org (both for memory “non-management” via owning non-copying containers and fast heap allocator code).
BTW, the main architectural trouble with GC and C++ is that it is impossible to match GC and destructors. You can have GC or destructors, but never both. I vote for destructors (GC would not close my files).
No, it is not impossible.
First of all, if C++ ever had GC, it would not be obligatory to allocate objects on the heap. Objects could happily live on the stack, as they do right now; therefore RAII would work as it does right now.
But let’s say you have heap objects that you want to clean up after a specific operation… The RAII pattern works with garbage-collected objects too, using RAII<T> wrapper classes on the stack, where T is the type of the garbage-collected object. For example:
RAII<File> r1(new File());
// … process the file …
// r1’s destructor closes the file; the GC reclaims the memory later.
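A minimal sketch of such a wrapper (assuming a File type with a close() method):

template <typename T>
class RAII {                          // sketch: scope-bound release of a GC-managed object
    T* p;
    RAII(const RAII&);                // non-copyable in this sketch
    RAII& operator=(const RAII&);
public:
    explicit RAII(T* q) : p(q) {}
    ~RAII() { p->close(); }           // releases the resource; the GC reclaims the memory later
    T* operator->() const { return p; }
};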
First of all, if C++ ever had GC, it would not be obligatory to allocate objects on the heap. Objects could happily live on the stack, as they do right now; therefore RAII would work as it does right now.
The real question is: Should GC call destructors?
You will find a lot of reason why it should and a lot of reasons why it shouldn’t – but in both cases something gets broken.
The real question is: Should GC call destructors?
Why not? I do not see a reason why it shouldn’t.
RAII is programmable and not hardcoded in the compiler, and therefore RAII classes like ‘Lock’ can use a function other than the destructor to clear resources.
I think it is a mistake that C++ does not have optional garbage collection. There are many programmers driven away from it due to its complexity and sheer number of ‘gotchas’… in the end, C++ will become a niche language.
I agree that STL’s copy semantics IS THE PROBLEM, but GC is not the solution. The solution is to simply design a container library WITHOUT the copy-semantics requirement, which is completely possible.
There is a reason so much of the C++ world has embraced the STL style. It makes memory management in C++ an order of magnitude less painful. The cost of this is copying objects around everywhere, because sharing semantics would immediately negate that convenience.
BTW, the main architectural trouble with GC and C++ is that it is impossible to match GC and destructors. You can have GC or destructors, but never both. I vote for destructors (GC would not close my files).
It’s not impossible to match GC and destructors, you just have to approach “destructors” in a different way. Leave memory management to the GC, and handle releasing resources when you leave lexical scopes using with-* macros (or something equivalent — if the metafunction proposal had been accepted, C++ could do it too). The solution is largely equivalent, and you get the flexibility of GC.
As for the performance claims, many GC advocates do not realize that old malloc/free implementations are often compared to cutting-edge GCs. In the end, the famous Boehm GC, which is used in most open-source projects, has its own tiny (and very smart) “malloc/free” in its sweep algorithm, so it is not cost-free either.
The Boehm GC is hardly the best you can do, and it is hampered by having to be conservative. Moreover, while it’s true that there is something like a malloc()/free() in the heap manager of every garbage collector, you’re ignoring that it is used in a very different way. First, it is invoked far less often. In a generational collector, the normal path is a simple sequential allocation. The heap manager is invoked to handle allocations only when survivors need to be copied out of the young generation. Second, in a moving GC, it can be far more callous about memory fragmentation than a good malloc()/free(). It can handle allocations/frees at a page-level granularity, because the copying/compacting process ensures you don’t end up with lots of partially empty pages.
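That sequential fast path is essentially a pointer bump (a toy sketch, alignment ignored):

#include <cstddef>

char nursery[1 << 20];                  // the young generation
char* bump = nursery;

void* gcAlloc(std::size_t n) {
    if (bump + n > nursery + sizeof nursery)
        return 0;                       // nursery full: a real GC would copy survivors out here
    void* p = bump;
    bump += n;                          // the whole fast path: one compare and one add
    return p;
}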
You can implement a manual allocator based on an idea similar to the one used in Boehm’s GC – what you get MUST be faster than the GC, because it is basically the same code without the sweep phase.
For the aforementioned reasons, that’s not true. Even if you implemented an allocator with shitty fragmentation characteristics that handled things at page granularity, you’d still have to call it on each malloc()/free(), instead of for some (often small) fraction of them.
Indeed, both malloc and free translate to about 15 machine code instructions executed in the “fast” path. (Note: this means I DID implement a memory allocator – I spent years optimizing… ;)
How many clock cycles is that? When I wrote mine, I was pretty impressed at the brevity of the fast path (though yours is quite a bit smaller) but the cache behavior of accessing list structures is such that one instruction translates to several CPU clock cycles. The fast path of a generational GC, in comparison, can be less than a dozen highly cache-friendly machine instructions. Moreover, the code is so small, it can usually be inlined into the allocation site, saving another dozen cycles of function call overhead.
When I wrote mine, I was pretty impressed at the brevity of the fast path (though yours is quite a bit smaller) but the cache behavior of accessing list structures is such that one instruction translates to several CPU clock cycles.
Obviously, you have to consider this while optimizing and keep cache lines “hot”. And the fragmentation is (for typical patterns) pretty low too (in fact, there is less than 1 byte of book-keeping info per block, and fragmentation for a typical usage pattern is a constant number of bytes).
You are correct about GC using memory allocation in a different way, and free being a much less frequent operation. Anyway, this holds true only as long as the total number of reachable blocks is reasonable (e.g. you are creating a lot of temporary objects – this is the best scenario for GC). I believe that as the number of active blocks increases, the GC advantage (if there was any) diminishes – mark/sweep becomes too complex, a lot of memory has to be moved, cache lines are not kept “hot” anymore, etc.
Therefore IMO GC is acceptable for small datasets and small applications.
To present an opposite argument, GC can sometimes have an advantage in a multithreaded environment, as it requires just a single lock (stop-the-world) for the whole sweep phase. It all depends on how expensive locking or TLS (for the per-thread cache of blocks in a non-GC scheme) are on the platform.
BTW, I mention Boehm’s GC because it is what the D language, discussed here as a better alternative to C++, actually uses.
P.S.: Who speaks about list structures?
I believe that as the number of active blocks increases, the GC advantage (if there was any) diminishes – mark/sweep becomes too complex, a lot of memory has to be moved, cache lines are not kept “hot” anymore, etc.
Well, the same is true to a great extent for malloc(), as well. As the number of small allocations increases, you spend a lot more time traversing increasingly fragmented allocation lists. And GC has an edge in one respect, because it doesn’t have to manage individual objects in its heap structure, and can instead can deallocate whole pages of dead objects.
I should also point out that for both GC and malloc(), this penalty comes with increasing number of objects, not increasing amounts of memory. If your program is manipulating a few hundred large matrices, even if your heap is gigabytes, a GC (at least one that has object formats and knows a matrix is a leaf object) isn’t going to do a lot of scanning.
Well, the same is true to a great extent for malloc(), as well. As the number of small allocations increases, you spend a lot more time traversing increasingly fragmented allocation lists.
Well, only as long as you are using allocation lists and have to traverse them….
In the heap allocator used in U++, no scanning of blocks (well, to be exact, we are speaking about “small blocks” up to 1 KB here) is ever performed. Either you take the fast path, using a hot cache line; then there is a “medium” path – up to about 500 machine opcodes (no scanning of lists all over the heap); and if that fails, a new memory chunk has to be obtained from the OS.
See, this is exactly what I was pointing at when I claimed that, usually, an outdated malloc implementation is compared to a cutting-edge GC. Of course, if you are going to use a malloc based on the algorithm described in K&R, the GC must win.
P.S.: Note that I agree with your opinions and info about reference counting, and even about the advantages of GC. It’s just that the truth is not complete…
I’ll have to look into the design of U++’s memory allocator, in that case. However, the free-list algorithms are hardly outdated (they’re the basis of the memory allocators in most current OSs), and the GC I’m talking about are hardly cutting-edge (good generational/incremental GCs have been around for a decade).
I’ll have to look into the design of U++’s memory allocator, in that case.
You are welcome to download the stuff (www.ultimatepp.org); however, if you do not want to get your hands dirty, just read the description of the implementation details of Boehm’s GC
http://www.hpl.hp.com/personal/Hans_Boehm/gc/gcdescr.html
and try to imagine how the allocation/deallocation part can easily be adapted for manual alloc/free.
Very short description:
You allocate everything in 4 KB pages, each page dedicated to a single block size. Instead of lists, you use a memory map to find the block’s page when doing free. Then you simply maintain lists of blocks of the same size – the last deallocated block on top, to keep the cache hot.
(There are of course many implementation details and optimizations; you would have to study the actual sources – Core/heap.cpp.)
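Stripped of every optimization, the core of that scheme looks something like this (a toy sketch, not the U++ code; assumes 0 < size <= 2048 and never returns pages to the OS):

#include <cstddef>
#include <cstdlib>

// Toy sketch: 4 KB pages, each dedicated to one size class, with a LIFO
// free list per class so the most recently freed (cache-hot) block is
// reused first. A real implementation adds a page map so free() can
// discover the block size by address, per-thread caches, etc.
struct FreeBlock { FreeBlock* next; };
static FreeBlock* freeList[129];                  // size classes of 16 bytes each

void* smallAlloc(std::size_t size) {              // assumes 0 < size <= 2048
    std::size_t k = (size + 15) / 16;             // size class index
    if (FreeBlock* b = freeList[k]) {             // fast path: pop the hottest block
        freeList[k] = b->next;
        return b;
    }
    std::size_t bs = k * 16;
    char* page = static_cast<char*>(std::malloc(4096));   // carve a fresh page
    for (std::size_t off = bs; off + bs <= 4096; off += bs) {
        FreeBlock* b = reinterpret_cast<FreeBlock*>(page + off);
        b->next = freeList[k];                    // thread the rest of the page
        freeList[k] = b;
    }
    return page;                                  // first block of the new page
}

void smallFree(void* p, std::size_t size) {       // toy version: caller supplies the size
    std::size_t k = (size + 15) / 16;
    FreeBlock* b = static_cast<FreeBlock*>(p);
    b->next = freeList[k];
    freeList[k] = b;
}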
BTW, if you decide to download, you might also be interested in the fact that there are two “delete” statements and no shared smart pointers in the whole widgets library.
“BTW, I mention Boehm’s GC because it is what the D language, discussed here as a better alternative to C++, actually uses”
Incorrect. The GC is custom designed by Walter.
Remember that GC only matters when collecting memory. Allocating memory has little to do with reclaiming it.
This stuff about slowdowns every 10 seconds is pure speculation. If you are at 99% memory usage, the slowdown will be OS paging, not memory management. The collector only runs when it is out of memory.
With all these what-ifs about the GC in D, go and actually try it. I have written memory-intensive OpenGL stuff and haven’t seen any of the issues proclaimed here.
You guys do realize that Walter Bright, who designed D, markets a C++ compiler, and is the only person to my knowledge to have written one by himself?
I am quite sure he is aware of the tradeoffs versus C++.
Incorrect. The GC is custom designed by Walter.
There is always an opportunity to update my knowledge.
You guys do realize that Walter Bright, who designed D, markets a C++ compiler, and is the only person to my knowledge to have written one by himself?
Of course. Actually, my main complaint about D is that Walter wastes his time on it instead of improving his C++ compiler to be standard-compliant and available on more platforms.
Unfortunately, right now DMC++ is not good enough to compile my code…
Anyway, to my knowledge, DMC is based on Zortech C++, which Walter co-developed – I think there were more developers than just him (but I am not sure about this point).
Incorrect. The GC is custom designed by Walter.
Once again, sorry for my misinformation; in my defense, D’s GC IS conservative, as can be deduced from this discussion:
http://www.digitalmars.com/d/archives/digitalmars/D/35364.html
(although Walter does not seem to publicly advertise this info too much – I guess he knows why)
I think there are not many ways to implement an effective conservative GC – you will most likely end up with a system similar to Boehm’s.
Also, Boehm’s GC is advertised all over the DMC website; that is what made me think it is D’s GC.
You are correct, it is a conservative collector.
Walter had said in the newsgroups it was not Boehm’s:
http://www.digitalmars.com/d/archives/digitalmars/D/26569.html
DMC++ comes with Boehm’s collector bundled in as an optional library. Just include <gc.h> and it replaces new and delete.
A generational/copying collector would be better in the long run and Walter has said we will address pluggable or alternate GC post 1.0.
Walter is a great guy and responds on the newsgroups quite regularly. This is definitely a plus in the D column.
Again, I dare people to try it. Not all things are done as they are in C++, but I haven’t run into anything that was harder than in C++.
The anonymous delegates are very nice.
Look at this thread:
http://www.digitalmars.com/d/archives/digitalmars/D/39313.html
Cheers.
A generational/copying collector would be better in the long run and Walter has said we will address pluggable or alternate GC post 1.0.
Well, I am afraid that once you introduce a conservative GC into the language, you cannot really change it without breaking the existing code base.
Even many of the examples posted by Walter in the newsgroups would be broken with another kind of GC.
Walter is a great guy and responds on the newsgroups quite regularly. This is definitely a plus in the D column.
Has he any other choice? Pushing a new language is not an easy thing, especially if you are a single person… I guess he is highly successful at it… Responding on newsgroups and being a great guy is the absolute minimum he must do.
“Well, I am afraid that once you introduce a conservative GC into the language, you cannot really change it without breaking the existing code base.”
Sorry luzr, but you have lost me here.
What code would break by changing collectors? Unless you are storing pointers inside integers I can’t really see a problem.
What code would break by changing collectors? Unless you are storing pointers inside integers I can’t really see a problem.
Yes, that is one possibility. While storing pointers in integers is of course bad practice, it is not that unrealistic to expect pointers to be stored in an uninitialized block of memory.
People don’t all do different things in C++ because it’s necessary for the task at hand; they do so because they have no other choice. (…) the lack of standardization means that there is no mechanism for picking common methods.
In my experience, there are other reasons as well. I can’t count the number of times a junior programmer implemented his own smart pointers or even list structures because he figured he could do better than Boost/STL. I concede that this is partly due to lack of standardization, or perhaps due to programmers thinking they’re “expert C++ programmers” after finishing chapter 1 of their C++ 101 course.
you’re not going to do very well with the massive memory increase imposed by template code either.
This argument I don’t really understand. Templates impose no more code overhead than doing things manually – or do you mean something else?
This argument I don’t really understand. Templates impose no more code overhead than doing things manually – or do you mean something else?
Well, this is debatable and relative. E.g. a vector<T> instance tends to cost about 500 bytes of code per T.
Now if you had an “Object-based” container system design (like in Java), you would have a single container class for use with any T (say, 1000 bytes once, for any number of element types). It would be slower, but smaller.
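The trade-off in sketch form: the type-erased container’s code is generated exactly once, at the price of a pointer indirection per element (class and method names are illustrative):

#include <cstddef>
#include <vector>

class ObjectVector {                 // one shared code body for every element type
    std::vector<void*> items;        // each element lives behind a pointer (the Java-style cost)
public:
    void push(void* p)            { items.push_back(p); }
    void* at(std::size_t i) const { return items[i]; }
    std::size_t size() const      { return items.size(); }
};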
E.g. a vector<T> instance tends to cost about 500 bytes of code per T.
Can you back this up? Which compiler/STL? This seems like an awful lot.
E.g. a vector<T> instance tends to cost about 500 bytes of code per T.
Can you back this up? Which compiler/STL? This seems like an awful lot.
Sorry, it has been some time since I checked. But it is not at all unreasonable – such an implementation will have to contain at least the code for expanding the vector (which is a loop to copy-construct and destruct elements, in many cases with inlined constructors) – say 150 bytes. Then there must be destructor code, another loop calling destructors on the elements, say 100 bytes. Throw in 2-3 “advanced” operations (like insert) and you are close to 500 bytes.
Anyway, I guess my original claim holds true even if it were 50 bytes per vector.
If the size per vector does not bother you, think about map…
Mirek
Yay finally, concepts are in the language!
No need to use/write enable_if<> checks. That was needed.
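For comparison, the enable_if dance next to the proposed form (the concept syntax in the comment follows the pre-standard proposals and is not compilable anywhere yet):

#include <boost/type_traits/is_arithmetic.hpp>
#include <boost/utility/enable_if.hpp>

// Today: the constraint hides in the return type.
template <typename T>
typename boost::enable_if<boost::is_arithmetic<T>, T>::type
half(T x) { return x / 2; }

// Proposed: the constraint moves into the signature, roughly:
//
//   template <typename T> requires Arithmetic<T>
//   T half(T x) { return x / 2; }
//
// and a violation reports that T does not satisfy Arithmetic, instead of
// pages of instantiation backtrace.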