Don’t be jealous of programmers using new languages, such as Python or C#! C++ gives you many ways to iterate through data you’ve stored in containers, supports for-each, and has helpful devices called lambda functions. Jeff Cogswell shows you how you can fly through your containers with ease.
I’ve noticed a lot of your article posts are C++ related.
Templates/metaprogramming/Boost are probably the saving grace for C++, considering it’s such a complex language. I might pick up the book. Unfortunately, the C++ I work on with others doesn’t use templates because our previous embedded compiler didn’t support them, and now people just don’t seem to care.
It’s kind of funny, because Stepanov (I believe that’s his name), who designed the STL, believes that OO is a crock of shit. And in a way, he’s right. Too much inheritance is a bad thing. Fortunately, most texts these days tell you to favor composition over inheritance.
InformIT sends out an email with new articles every Sunday, and I am a subscriber so when I find an article of theirs interesting enough, I post it.
Templates, metaprogramming, and all the weird and wonderful stuff that has recently come out of the C++ community show that C++ still holds relevance and really is the language with the best mix of power and flexibility.
When the “VM” part of .NET and Java lives within the CPU and runs very fast, then C++ and other compile-time languages will have to offer more to the developer to make up for their shortcomings.
We will see CPU-based acceleration of .NET and Java in the next generation Intel chips.
“When the “VM” part of .NET and Java lives within the CPU and runs very fast, then C++ and other compile-time languages will have to offer more to the developer to make up for their shortcomings.
We will see CPU-based acceleration of .NET and Java in the next generation Intel chips.”
Why was this modded down? Pravda is right, except that AMD will also offer a similar technology.
I’m not a CPU expert but Vanderpool and Pacifica will provide hardware support in the CPU for virtual environments and that includes Java/.NET VMs.
People don’t like new ideas. Moving the VM to the CPU will make C++’s value proposition drop tremendously. The runtime speed pay-off for “compile time” becomes marginalized.
With the ability of hardware VM assistance, all sorts of interesting languages will become more suited to system programming.
Non-dynamic languages will likely go the way of the dodo.
Hardware assisted VM? Isn’t that an oxymoron? Doesn’t it defeat the whole point of a VM in the first place? Of course I’m not talking about generic hardware virtualization here, but CPU virtualization.
What I refer to is hardware assist of a particular computer language virtual machine — i.e. Java virtual machine, .NET CLR, Perl virtual machine, etc.
It is not an oxymoron: the machine is still a virtual machine from the software standpoint. Of course all virtual machines run somehow on a physical machine. With the new chips, there is help on the physical CPU to make the virtual machine run faster.
This hardware acceleration of the virtual machine should — and I emphasize this — remove the speed penalty of dynamic language vs. compile-time language.
With proper implementation, there will be a revolution in how software programs work.
What I refer to is hardware assist of a particular computer language virtual machine — i.e. Java virtual machine, .NET CLR, Perl virtual machine, etc.
Exactly, and I find that defeats the whole point of a VM. You’re basically implementing a new ISA, totally equivalent, from the functional point of view, to any other ISA out there.
The point of VMs is that you write the program once and run it anywhere. If you start producing CPUs that implement an ISA able to natively run the code of a VM of your choice, then how is that different from any other CPU? Just take the x86 ISA as the VM ISA and you’re done.
Nowadays, Java is not slow because of the JVM, but because of the Java language itself.
A modern JVM uses technologies such as JIT and code caching that make it, in theory, just as fast as a precompiled executable. The only difference will come from the first time the JIT actually compiles the code.
Also, the server JVM precompiles everything before launching the application (roughly like running GCC right before starting the application).
Because the Java compiler already took care of parsing the source code, the JIT / server JVM does not have much to do except generate the actual CPU instructions, which is why it’s a lot faster than running GCC on a C++ app.
But the reason why Java feels slow, and *is* slow in a lot of cases, is the language design itself. For example:
– arrays are bound checked at every access;
– every class is virtual;
– containers involve a lot of casting, casting that requires runtime type checking (even with generics);
– etc…
All of these exist in C++ too, but you’re not forced to use them. And because most of the time you actually don’t need them, a well-written C++ program is bound to be faster than a well-written Java program.
In the above, I didn’t talk about the Garbage Collector, because it’s a different issue. But it is not a small issue.
The way Java is designed, with everything being a reference except for basic types such as int, means that it is impossible to control the lifetime of an object. For an object-oriented language, this is a serious flaw IMO. Not being able to control the scope and lifetime of an object basically means you cannot control the resources of your computer (and not just memory, but also file handles, sockets, external library handles, or whatever resources you may need).
Because of this, the garbage collector is not really a feature; it’s a necessity, and really just a workaround. And for that matter, as a workaround it does not work very well, because:
– you don’t know when the memory will be released;
– you cannot guarantee that memory will ever be released;
– you cannot guarantee that other resources will ever be released.
Worse, the garbage collector only handles low-level cases (similar to malloc/free). It cannot handle:
– new/delete (which involves a destructor and thus resource control),
– container memory leaks (a serious issue most programmers ignore, thinking that the garbage collector handles everything).
Modern C++ has shown repeatedly that the need for a garbage collector is minimal (even nonexistent) in most programs, as most C++ programs should not contain a single new or delete call.
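To make that concrete, here is a minimal sketch (hypothetical function; std::vector owns the memory, std::ifstream owns the file handle, and both are released deterministically at scope exit):
#include <fstream>
#include <string>
#include <vector>

// Hypothetical example: no new/delete anywhere, yet the memory and the file
// handle are both released deterministically when the function returns.
std::vector<std::string> read_lines(const std::string& path)
{
    std::vector<std::string> lines;      // owns its memory
    std::ifstream in(path.c_str());      // owns the file handle (RAII)
    std::string line;
    while (std::getline(in, line))
        lines.push_back(line);
    return lines;
}                                        // 'in' closes the file here, always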
But in Java you just don’t have the choice. This design flaw means two things for performance:
– you get non-realtime and non-deterministic behavior because of the garbage collector;
– you cannot control computer resources.
Java has its place. But I certainly do not think that it can replace C++ on grounds of superiority. Especially not in user applications and games, where determinism and real-time behavior are key elements.
The Vanderpool technology (and similar virtualisation technologies) cannot fix some basic performance flaws in Java design.
And I haven’t even talked about everything C++ can do that Java cannot.
It’s not for nothing that Microsoft .NET is language agnostic and can be used with VB, C#, C++ or even Java-like J#.
[Treza]
Your detailed comment is very informative, because I’m actually opposed to all of your points.
– The main performance issue with Java is not the language (which is not dynamic enough IMHO), it is the VM. The bytecode is too low-level for an intelligent runtime, which should be able to automatically transform virtual classes into static ones based on their actual use, among other things.
– The GC is not a gadget for lazy programmers. It is necessary for advanced programming techniques (functional, logical, …). You must realise that malloc() and free() are complex functions. Hand-crafted memory allocation can be worse than a GC.
– You can have finalisers in Java. Your program should not rely on the precise scheduling of the release of memory and resources. And, actually, in today’s multitasking OSes, there is no way for a user app to actually know what resources are available.
– If you want or need to do low level programming, use plain old C.
– C++ is useful for some complex high-speed tasks (like maths), but there is not much overlap with Java.
Sun should have separated the VM stuff (bytecode, …) from the Java programming language, which are, to a large extent, independent topics.
[Sorry for my poor English]
– I would agree with the bytecode being too low-level, if it weren’t possible to completely decompile compiled Java into the original source code (minus the comments). But it is possible, leading me to think that all the necessary high-level info is still available to the JVM.
– I did not say that the GC is a gadget. Don’t get me wrong. What I was trying to point out is that the GC is a consequence of the fact that every object is accessed via a reference. And while a GC can be useful in particular cases, I still have doubts about its general usefulness.
In C++ I rarely use malloc/free or new/delete. Most of the time the standard library containers are enough. Sometimes I also use Boost memory pools when I have a lot of small objects to dynamically allocate/deallocate.
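For what it’s worth, the pool idea looks roughly like this (a sketch assuming boost::object_pool from <boost/pool/object_pool.hpp>; the Particle type is just an example):
#include <boost/pool/object_pool.hpp>

// Rough sketch of the Boost memory-pool idea mentioned above: many small
// objects allocated from a pool instead of individual new/delete calls.
struct Particle { double x, y, z; };

int main()
{
    boost::object_pool<Particle> pool;
    for (int i = 0; i < 100000; ++i) {
        Particle* p = pool.construct();  // cheap allocation from the pool
        p->x = p->y = p->z = 0.0;
        // ... use p; freeing individual objects is optional ...
    }
    return 0;                            // pool destructor releases everything
}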
I know that in theory a GC should be faster, as it has more optimization opportunities (especially considering the fact that it can profile the code at runtime). But in practice this has yet to be proven, as the cost of the GC itself far outweighs the optimisations it brings.
As you said, “Hand-crafted memory allocation can be worse than GC”. “Can” is the key word here. And GC automatic allocation can be worse than hand-crafted memory allocation. It all comes down to how well the program (and thus the programmer) uses its tools. But then again, most implementations of the C++ standard containers manage memory better, most of the time, than the Java Collections and the GC.
– The Java finalizer (as well as the C# destructor/finalizer) is a bit of a joke. One cannot even know whether it will be called at all. How can one rely on such a function?!
The OS is another matter. The problem at hand here is that an application should control when resources are released, from within the application itself.
For example, a Java application that relies on the finalizer to close its files may encounter a serious problem where it has consumed all the available file handles (because the finalizers were never called, and thus the file handles never closed).
This is an application problem. No matter the OS.
Plus, I could talk about programming in Java on a real-time OS…
– I don’t want low-level programming. I want high-level programming that allows me to control low-level details when I want to.
– I have less experience in Java than in C++. But for most Java applications I have had to work on, C++ often offered a superior and more elegant solution (and not necessarily a more time-consuming one).
I admit that C++ is more complex and harder to grasp. But IMO, this complexity comes from the computer itself (since C++ does not hide most low-level details). Programming in Java while still having a good understanding of what’s happening behind your back at the low level is probably as hard as (if not harder than) programming in C++.
[Most people here are not native English speakers, don’t worry ;-)]
This myth has to stop. The performance of C++ new/delete is wildly overblown. First, it’s harder to write a fast malloc()/free() than a fast garbage collector. In the average case, a generational GC’s allocation is just a couple of instructions and a few clock cycles. A really fast malloc() is a few dozen instructions and about 100 clock cycles. Freeing is also relatively slow. Whereas a GC can collect megabytes of memory in one pass, a C++ program has to free each object one by one (unless it uses a pool, but that’s really a kind of GC, isn’t it?). Second, neither new/delete nor smart pointers nor RAII is deterministic. All of them can lead to indeterminate latency as a result of chains of references dying simultaneously. The underlying malloc()/free() is also (almost always) non-deterministic, since it may have to traverse very long chains of pointers in the case of fragmented freelists.
C++ memory allocation is not deterministic, just as garbage collection is not deterministic. You cannot be “fairly” deterministic. Either you are, or you aren’t. You can say it’s more predictable, but then you get into the details of latency versus throughput versus the number of programmers who use expensive features like reference counting, etc., and it becomes a very complex question to answer definitively.
I have a feeling that most of the C++ programmers talking about new/delete have never actually written a memory allocator. Once you write one and find the number of O(n) operations you end up having to do, you realize that malloc()/free() is pretty much the slowest operation you can do in any language.
@Rayiner
But why would you want to call new/delete on every object when you have standard containers and Boost memory pools?
It doesn’t matter if you call new/delete or not. It gets called eventually. What do you think the standard containers use? magic-deterministic-malloc()?
It doesn’t matter if you call new/delete or not. It gets called eventually. What do you think the standard containers use? magic-deterministic-malloc()?
No, but standard containers have their complexity specified by the standard. The specification also includes ‘when’ memory allocation occurs. And most implementations are far cleverer than calling new/delete for every object.
The standard containers only specify the complexity of the algorithms they implement. They assume that new/delete (and other OS services) are O(1). The problem is, new/delete is not O(1). Consider a malloc() that uses first-fit. Now, consider what happens when your freelist is heavily fragmented. If the block you’re requesting isn’t in the list, you’ll end up traversing the entire list, then end up having to call mmap() to allocate more storage from the OS. On Linux, this triggers, among other things, a walk of the process’s address space, which is O(n) in the number of memory segments the process has mapped. It also triggers an O(n) iteration over the page-table entries in the newly allocated memory area, another O(n) walk to set some bits in the page table database, and, depending on the VM, another O(n) walk to set up reverse-mapping pointers.
That doesn’t even count constant factors. If you free a list, given the above procedure, that’s a worst-case O(n*m), where ‘m’ is the sum of the ‘n’s in “delete” and everything afterwards. However, the constant factors can be huge. The above procedure will result in a system call, which is O(1) but takes 100-1000 clock cycles. It’ll likely result in a TLB miss, which takes thousands of clock cycles. It’ll probably result in several cache misses, which take hundreds of clock cycles apiece. At the end, the constant factor for ‘m’ becomes so large that it dwarfs ‘n’!
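To make the O(n) part concrete, a deliberately naive first-fit walk looks something like this (a toy sketch, not any real malloc: no splitting, coalescing, alignment, or locking):
#include <cstddef>

// Deliberately simplified sketch of a first-fit allocator's free-list walk,
// just to show where the O(n) cost described above comes from.
struct Block {
    std::size_t size;
    Block*      next;
};

static Block* free_list = 0;

void* toy_malloc(std::size_t size)
{
    Block** prev = &free_list;
    for (Block* b = free_list; b != 0; b = b->next) {   // O(n) in list length
        if (b->size >= size) {                           // first fit
            *prev = b->next;                             // unlink and return it
            return b + 1;                                // payload after header
        }
        prev = &b->next;
    }
    return 0;   // a real allocator would fall back to sbrk()/mmap() here,
}               // which is the expensive path discussed above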
I think you’re still missing one important concept.
Knowing that memory allocation only occurs when you call new/delete means that you know when it does *not* occur. This allows you to write fully deterministic parts of a program. Otherwise the whole idea of a real-time application on a real-time operating system would be impossible.
This is the big difference with Java, where you don’t know when memory (de)allocation occurs.
It is not rare to see the Java GC kicking in for no reason (when no memory allocation/deallocation function is called by the application) and freezing a big application for 15 seconds.
This is unacceptable, even for a non-real-time application.
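The usual C++ idiom implied here is to pay for all allocation up front, something like this (hypothetical example using std::vector::reserve):
#include <cstddef>
#include <vector>

// Sketch of the idiom: do all allocation before the critical section, so the
// time-sensitive loop never touches the allocator at all.
void process_samples(const float* in, std::size_t n)
{
    std::vector<float> scratch;
    scratch.reserve(n);                    // the only allocation happens here

    for (std::size_t i = 0; i < n; ++i)    // deterministic: no allocation,
        scratch.push_back(in[i] * 0.5f);   // capacity was reserved above

    // ... consume scratch ...
}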
You’re using a very sketchy definition of “deterministic” here. Yes, C++ is deterministic if you don’t allocate any memory (and you’re not running on an OS with VM, or a CPU with an MMU!). However, the same is true for a good GC as well. Most GC’s don’t kick in just for the hell of it. They kick in when an allocation request is made and there isn’t enough memory to handle it. If you don’t allocate any storage (like you would have to in C), then the GC doesn’t run.
In compiled languages with GC, this trick works pretty well. The only catch is that in languages like Lisp, it’s fairly easy to allocate storage by accident. Usually, this isn’t a problem, because GCs are optimized for these sorts of temporary allocations, and the pause time will be negligible.
new and delete can both be O(1) (on average): when using a slab allocator, for instance.
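A bare-bones sketch of that kind of fixed-size allocator (illustrative only; fixed 64-byte blocks, no thread safety):
#include <cstddef>

// Fixed-size ("slab"-style) allocator sketch: all blocks have the same size,
// so alloc and free are each O(1): a pop or push on a free list.
class FixedPool {
    union Node { Node* next; char payload[64]; };  // 64-byte blocks
    Node storage[1024];                            // preallocated slab
    Node* head;
public:
    FixedPool() : head(&storage[0]) {
        for (std::size_t i = 0; i + 1 < 1024; ++i) // thread the free list
            storage[i].next = &storage[i + 1];
        storage[1023].next = 0;
    }
    void* alloc()       { if (!head) return 0; Node* n = head; head = n->next; return n; }
    void  release(void* p) { Node* n = static_cast<Node*>(p); n->next = head; head = n; }
};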
But determinism isn’t about average-case complexity. It’s about worst-case complexity. Even using a slab allocator, your worst-case complexity is O(n) or worse. Additionally, general-purpose mallocs don’t have the luxury of the user defining a specific set of block sizes for them to allocate. You can always roll your own, but who actually does that? For those 1% of cases where it’s necessary, fine, use C++. But it’s pointless to cripple an entire language to cater to that 1%.
Rayiner, you are right about the non-deterministic nature of dynamic allocation. The typical example is what happens if the malloc touches a page that was swapped out to disk. It is not 100 or 1000 cycles, but millions of cycles wasted.
If you want a deterministic process, at least under Linux, you need the help of the OS (scheduler), and to avoid all non-deterministic function calls. All system calls are non-deterministic in a general-purpose OS, so you cannot use any, which means no dynamic memory allocation, but also no file opening, and not even the locking facilities from the OS.
In audio, under Linux, the standard scheme is to have the processing done in one thread, with special scheduling and no memory allocation, and the synchronisation of data is done through special structures which do NOT require locking, through the use of atomic instructions (a lock-free ring buffer).
See http://music.columbia.edu/pipermail/linux-audio-dev/2003-April/0036…
for example
I wonder if the use of STL containers with a memory pool can work in this context, though. It should.
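For the curious, the lock-free single-producer/single-consumer ring buffer mentioned above looks roughly like this (a sketch using C++11 std::atomic, which is anachronistic for this thread but shows the shape of the idea; capacity must be a power of two):
#include <atomic>
#include <cstddef>

// Single-producer/single-consumer ring buffer: only atomic loads and stores,
// no locks, and no allocation after construction.
template <typename T, std::size_t N>   // N must be a power of two
class SpscRing {
    T buf[N];
    std::atomic<std::size_t> head;      // written only by the producer
    std::atomic<std::size_t> tail;      // written only by the consumer
public:
    SpscRing() : head(0), tail(0) {}
    bool push(const T& v) {             // called from the non-realtime thread
        std::size_t h = head.load(std::memory_order_relaxed);
        if (h - tail.load(std::memory_order_acquire) == N) return false; // full
        buf[h & (N - 1)] = v;
        head.store(h + 1, std::memory_order_release);
        return true;
    }
    bool pop(T& v) {                    // called from the realtime audio thread
        std::size_t t = tail.load(std::memory_order_relaxed);
        if (head.load(std::memory_order_acquire) == t) return false;     // empty
        v = buf[t & (N - 1)];
        tail.store(t + 1, std::memory_order_release);
        return true;
    }
};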
I forgot to add that Linux has the ability to lock some pages in memory: the OS guarantees that these locked pages will never be swapped out, and so one avoids the extremely costly swap-in of a page from disk to main memory. These facilities are actually provided by POSIX.1b:
http://www.delorie.com/gnu/docs/glibc/libc_63.html
For the scheduler, you have the SCHED_FIFO facility available.
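Putting those two together, the setup looks roughly like this (POSIX.1b calls; it needs suitable privileges, and error handling is kept trivial):
#include <sched.h>
#include <sys/mman.h>
#include <cstdio>

// Sketch of the setup described above: lock all current and future pages in
// RAM, then request the SCHED_FIFO real-time scheduling policy.
int main()
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        std::perror("mlockall");

    sched_param sp;
    sp.sched_priority = 50;  // arbitrary mid-range RT priority
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        std::perror("sched_setscheduler");

    // ... real-time processing loop goes here: no malloc, no blocking syscalls ...
    return 0;
}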
Uhm, STL containers like to sidestep the whole multiple-reference issue by using value copies of the contained objects. But that is just exchanging one set of headaches for another.
— Lars
It still takes like 3 times the code to do the same thing as in a real language – and Qt’s iterators are better yet but still nowhere near Perl.
It still takes like 3 times the code to do the same thing as in a real language – and Qt’s iterators are better yet but still nowhere near Perl.
“a real language”, and you then go on to mention Perl? Bahah. Sorry, but there are whole classes of programming where you can’t be monkeying around with slow interpreted languages and/or a virtual machine.
Ditto. “a real language”! That cracked me up. Perl was written in C, which is the sibling of C++. Yes, C. OK, so it’s not “a real language” like Perl, but nobody’s perfect.
The STL is a nice thing, and for-each is pretty cool, but C++ lambdas suck massively. Lambdas themselves are awesome, and work great with for-each. That’s no surprise, since both features originated in functional languages (for_each is the same as map()). However, lambdas in C++ aren’t lambdas. They are a hack that uses operator overloading and the template mechanism to emulate lambdas. Basically, the _1, _2, etc. lambda parameters overload their operators, so every time you use an object on the left- or right-hand side of one, instead of evaluating the expression immediately, it is stored off in a function object to be evaluated later. The problem, as the article shows, is that the resulting semantics are unexpectedly different from the semantics of regular C++ code. Worse, if you make an error in a lambda expression, the error message is several screens long, and the line number points somewhere in the Boost.Lambda header (because that’s where the template is), not in your code (where the error is).
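For reference, the canonical Boost.Lambda usage under discussion looks like this (a sketch assuming Boost.Lambda is available):
#include <algorithm>
#include <iostream>
#include <vector>
#include <boost/lambda/lambda.hpp>

// The expression `std::cout << _1 << ' '` is not evaluated immediately;
// operator overloading builds a function object that for_each then applies
// to each element.
int main()
{
    using namespace boost::lambda;
    std::vector<int> v;
    v.push_back(1); v.push_back(2); v.push_back(3);
    std::for_each(v.begin(), v.end(), std::cout << _1 << ' ');
    return 0;
}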
C++ lambdas are also missing many features (namely closures) that make lambdas more useful. You can’t store a lambda in a variable (well, you can, but the template expansion of the type of that variable is not human-readable), and you can’t capture the lexical context of the function in which the lambda is defined (the closure).* In Lisp you can do:
(let ((my-boat (make-instance 'boat))
      (cargo-list (get-list-from-somewhere)))
  (mapcar #'(lambda (item) (load item my-boat)) cargo-list))
For Lisp newbies, “let” is basically a way to define local variables (in this case, my-boat and cargo-list) along with their initial values. “make-instance” is like “new”, while “mapcar” is like for-each, and takes a lambda as the first parameter and a list to iterate over as the second.
Notice how the lambda can refer to the local variable my-boat. That’s because when the lambda is constructed, it’s not just a function; it also has a pointer to an “environment” containing the lexical context (i.e., the local variables) of its parent. This is why functions in Lisp that take callbacks rarely take a “callback argument” parameter like you see in C. It’d be pretty redundant for them to do so, since any parameters you want you can capture from the lexical context.
Basically, lambdas are there to make your life easier. They’re there so you can write quick, one-time callbacks without defining a full function, and they are there so you can package up a function and some data** and pass it around. C++ lambdas don’t make your life easier. They just make your code harder to read, and more painful to compile.
* Well, the Phoenix lambda library can, but it’s an enormous hack. Most importantly, it’s not transparent to the programmer: you have to write code for the capture yourself. At that point, it’s just easier to stick the parameters in a struct and be done with it!
** Sounds an awful lot like an object, don’t it? Closures, plus some macros to make the syntax nicer, can be used to construct full OOP systems.
PS> The Lisp code is much easier to read if OSNews would keep proper indentation!
It really seems that for_each is conceptually more similar to the loop macro (eh…) or especially to dolist than to map in general (though not so different from map with nil as the return-type parameter), since map conceptually maps every value of a sequence to some value returned by a function, rather than simply performing iteration.
I don’t wanna sound like I’m bashing C++. C++ is a fine language… if you’re writing an OS kernel. If it had a proper macro system, and proper lambdas, it’d be an even better language. Indeed, the idea of not putting in new language features, but constructing new features out of existing features, is quite admirable. However, the existing C++ features aren’t sufficient to define new features such as lambdas. They aren’t powerful enough, and lead to inferior results. If you find yourself really wanting lambdas, maybe it’s time to use a language that does justice to the concept…
C++ has tremendous market and mindshare, and it’s probably the best general purpose language ever made. Unfortunately, it’s not ideal for much of anything, but it’s powerful enough that the temptation arises to try to make it ideal. It’s not perfect, and neither is any other language that currently exists.
For example:
– Template metaprogramming: yeah, it’s fun to play with, but if you really need that capability, do it right with Lisp instead of a hacked on kludge to a language that wasn’t designed for it.
– Many languages do lambda expressions and higher-order functions natively and well; if it’s too much trouble to use a language that actually fits your needs, or to use its FFI from your C++ program, you probably don’t really need to.
– The boost library (and most other expression template “wonders”) are great if they do exactly what you need; otherwise, you can usually reinvent the wheel in ANSI C faster than you can even figure out if a problem is yours or theirs. And if your compiler isn’t specifically supported, forget it. Life is too short.
– Garbage collection is pretty much a necessity for a functional language, but trying to hack it in to C++ is like trying to put a waterbed in an airplane: it just doesn’t belong there, it’s not what C++ is about. If you average more than an hour or two a week extra for new/delete, you don’t know C++ well enough to be using it professionally or judging its design.
My test for a C++ “feature”: is the resulting code undebuggable, unreadable, unmaintainable, and unportable? Most of the examples above are all four, particularly “undebuggable.” (Step into the STL in your debugger sometime.)
In fact, I use templates and basic STL fairly often, but only when they make sense, and never because the latest cool writer or Pattern Potentate says I should.
My test for a C++ “feature”: is the resulting code undebuggable, unreadable, unmaintainable, and unportable? Most of the examples above are all four, particularly “undebuggable.” (Step into the STL in your debugger sometime.)
Absolutely! Code is hard enough to debug already, and template tricks only make life harder. This is especially true because C++ has no mechanism for showing you the expansion of a template. So you end up debugging without being able to see the source code you are debugging!
In fact, I use templates and basic STL fairly often, but only when they make sense, and never because the latest cool writer or Pattern Potentate says I should.
Templates are great: for generic containers and algorithms (which is what they were designed for). Use map, vector, sort, etc., and be happy with them. The templated containers and algorithms are easy to use and well-optimized. There is no point in abusing the template mechanism to do something it wasn’t designed for. This is especially true when you consider that the tools (namely, the C++ compiler) just aren’t set up for that kind of template abuse. Template hackery is fairly unportable, results in long compile times, is error-prone (when you hit template recursive-expansion limits), and results in unreadable error messages.
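For the record, the kind of bread-and-butter use meant here is nothing more exotic than this (a trivial sketch):
#include <algorithm>
#include <iostream>
#include <map>
#include <string>
#include <vector>

// Containers and algorithms, nothing clever: the sweet spot for templates.
int main()
{
    std::vector<int> v;
    v.push_back(3); v.push_back(1); v.push_back(2);
    std::sort(v.begin(), v.end());               // generic algorithm

    std::map<std::string, int> ages;
    ages["alice"] = 30;                          // generic container
    ages["bob"] = 25;

    for (std::vector<int>::iterator it = v.begin(); it != v.end(); ++it)
        std::cout << *it << ' ';
    std::cout << std::endl;
    return 0;
}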
C++ is the worst thing to happen to computer science.
Too much complexity for marginal benefit.
Bjarne took a beautiful, *small* programming language and turned it into a bloated nightmare that makes any sufficiently large project unmanageable.
C++ certainly has no business in application programming, and its case for systems programming is weak.
Bad C++ programmers get the weird idea that they must use all the latest buzzwords and create their own form of spaghetti using every possible bit of the language, just because it is there. While C++ isn’t the prettiest of languages, and isn’t “ideal” for everything (no language is “ideal” for everything), it can actually help make incredibly large projects very readable and manageable, *in the right hands*, but it can also be abused and used to create the mess you’re accusing it of always creating.
The reality is that any language in which you can write anything beyond a few lines and more than one source code file can be used and abused to create the mess you are blaming C++ for creating: an unmaintainable, messy large project. C++ is very well suited to things of very large scale, and is great for general applications as well as for systems programming, because it allows you to choose what level of performance you get out of it, or, if you wish, you can use some of the features C doesn’t have to create reference counted objects that remove the need for the garbage collection (and all the weird and often random performance side-effects it has on the application) and actually make things much easier to manage. Really, it matters how well you know the libraries as well as the language, and your discipline, just like any other language.
Thus, your arguments are quite weak, and you have no basis making such judgments about something you are clearly incapable of using properly yourself.
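To illustrate the reference-counting approach mentioned above, here is a sketch using boost::shared_ptr (one common way to do it; hand-rolled intrusive counts are another):
#include <boost/shared_ptr.hpp>
#include <cstdio>

// The Connection is destroyed as soon as the last shared_ptr to it goes out
// of scope, so cleanup is tied to scope, not to a collector.
struct Connection {
    Connection()  { std::printf("open\n"); }
    ~Connection() { std::printf("close\n"); }
};

int main()
{
    boost::shared_ptr<Connection> a(new Connection);
    {
        boost::shared_ptr<Connection> b = a;   // count == 2
    }                                          // b dies, count == 1
    return 0;                                  // a dies, count == 0 -> "close"
}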
“…if you wish, you can use some of the features C doesn’t have to create reference counted objects that remove the need for the garbage collection (and all the weird and often random performance side-effects it has on the application) and actually make things much easier to manage.”
reference count implies garbage collection?
I guess that all those weird side-effects are because you can’t know when the garbage collector will run: garbage collectors keep gathering memory until suddenly they start the collection.
But with reference counting you can have deterministic memory deallocation: it happens every time the count goes to zero. Is this called “garbage collection” too?
But with reference counting you can have deterministic memory deallocation: it happens every time the count goes to zero. Is this called “garbage collection” too?
Yes, because you as a programmer no longer determine when the object will be deallocated. You know the condition (“reference count falls to 0”), but you no longer control when that condition will be met.
And even with reference counting you can have surprising performance issues, for example when removing the last reference-counted pointer to the head of a datastructure with a bazillion objects underneath.
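A small sketch of that kind of surprise (hypothetical; a long singly linked list of reference-counted nodes):
#include <boost/shared_ptr.hpp>

// Dropping the last reference to the head of a long reference-counted list
// triggers a cascade of destructions (and, with naive recursion through the
// destructor chain, can even overflow the stack for very long lists).
struct Node {
    boost::shared_ptr<Node> next;
};

int main()
{
    boost::shared_ptr<Node> head(new Node);
    Node* cur = head.get();
    for (int i = 0; i < 1000000; ++i) {          // build a very long chain
        cur->next.reset(new Node);
        cur = cur->next.get();
    }
    head.reset();   // one "cheap" statement frees a million nodes, one by one
    return 0;
}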
> And even with reference counting you can have
> surprising performance issues, for example when
> removing the last reference-counted pointer to the head
> of a datastructure with a bazillion objects underneath.
In the traditional implementation (i.e. keep a reference counter in each object) there are other performance issues too. Passing lots of references around means modifying that reference count all the time, which means that the object cannot be removed from the cache, even if it’s not actually used.
“And even with reference counting you can have surprising performance issues, for example when removing the last reference-counted pointer to the head of a datastructure with a bazillion objects underneath.”
But that is a problem in C++ too, and the solution is delaying the deallocation. Reference counting is aimed only at minimizing copying, not at forcing strict release of resources.
In Objective-C there are autorelease pools, so you can create any objects and then choose the best moment to release them all; it’s like a manual garbage collector.
Reference counting isn’t deterministic. If the reference ‘a’ becomes dead, it can cause a whole chain of references to also die. Think of what happens when you drop the reference to the root pointer of a tree. The entire tree is traversed and deallocated. In the worst case, you hit every object in memory.
Reference counting is just a form of garbage collection. Actually, there is a higher-order theory that encapsulates both concepts: http://portal.acm.org/citation.cfm?id=1035292.1028982
Reference counting tends to have lower average pause times than most garbage collectors, but tends to have lower overall performance. Specifically, reference counting is fairly expensive, and reference-counting collectors can’t take advantage of certain optimizations (namely really cheap malloc()) that tracing garbage collectors can.
Naive reference counting provides deterministic finalization in acyclic graphs of reference-counted objects. If all of the references to a counted object in such a graph are released, then it too will release the references it holds on other objects in a predictable, repeatable manner. Given the same input, the same course of events will occur in the same order. The degree to which naive reference counting can be used for more deterministic behavior is largely in the hands of the programmer (namely in the selection of data structures and the means of populating them).
Naive reference counting is typically expensive in terms of space, overall time due to reference operations, and interactions with processor cache.
The trivial implementation isn’t the state of the art in reference-counted garbage collection, though. The paper by Bacon et al that you cite for instance refers not only to the duality of tracing and reference counting, but goes on to discuss the manner in which they see optimized collectors from either perspective as being hybrid collectors. More interesting is Bacon et al’s reference-counted collector Recycler. http://www.research.ibm.com/people/d/dfb/recycler-publications.html
And Blackburn’s hybrid comparisons: http://cs.anu.edu.au/~Steve.Blackburn/pubs/papers/urc-oopsla-2003.p…
and of course MMTk which has spun off from Jalapeno/Jikes RVM into an interesting framework: http://jikesrvm.sourceforge.net/info/talks/ISMM04-MMTk.pdf
Garbage collection strategies are all about tradeoffs in complexity, throughput, and responsiveness.
I don’t buy this weaker definition of “deterministic”. Newtonian physics is deterministic. You throw a ball in the air, and you know not only that it will come down, but exactly when it will come down. To use a physics analogy, if you just know the values of the state variables, but not the timing, then you don’t know the derivatives of the state variables, and thus your system isn’t completely determined.
This isn’t just a theoretical nit-pick. In the context of programming, the weaker definition of deterministic isn’t a very useful one. Especially in the domain of real time processing, “deterministic” without the time component is pointless.
With regards to the higher-order theory relating garbage collection and reference counting, I agree with you: that’s why I linked to the paper. However, C++ programmers don’t do hybrid reference counting. They use the naive implementation.
On the contrary, I think attempting to deny the determinism of naive reference counting is what isn’t useful. Can you know when an object will be reclaimed within some given quantity of error? Yes. Given the same input and a suitable means of execution you can predict when things will be reclaimed and in what order they will be reclaimed. Given any set of known arbitrary input, you can determine the same if you really desire to. As precisely as you feel like calculating, if you really hate yourself. Relying on naive reference counting for hard real-time guarantees is going to make your life difficult.
…for those that know how to use it properly. Yes, it’s complex and I’ve never considered it a good team language, but it doesn’t have runtime baggage and supports tons of libraries. C#, Java, Lisp, Ruby, Python… just don’t have the raw performance that something like C++ affords you. Those languages aren’t systems languages, which C++ is.
No runtime support? Ever try writing a kernel in C++? Exception handling, virtual tables, global objects, certain casts, and a number of other things require runtime support. The C++ runtime in GCC 4.x is 232KB on my machine.
I think, to be pedantic, he said runtime baggage, though what exactly that is supposed to mean isn’t immediately obvious. RTTI, exceptions, global constructors, and new/delete require code that is typically provided by a support library. It isn’t immediately obvious to me that library support is necessary for virtual functions. Not all of libstdc++ is this support code, and not all of it is necessary to add support for the runtime aspects of C++.
This patch http://netlab.ru.is/exception/LinuxCXX.shtml for example doesn’t exactly increase your kernel image by 232KB.
http://www.microsoft.com/whdc/driver/kernel/KMcode.mspx
This is a mildly interesting look at which C++ features are usable in drivers for NT, and the caveats.
And there’s always Movitz http://common-lisp.net/project/movitz to cause one to scratch one’s head and wonder what Lumbergh is really talking about.
> It isn’t immediately obvious to me that library support
> is necessary for virtual functions.
There’s nothing necessary at runtime to support virtual functions. As you said already, runtime support needs to be there for global constructors/destructors, exceptions, RTTI and new/delete. The code necessary to support that amounts to less than ten KB on most platforms.
Virtual function calls can require runtime support. With Intel’s C++ compiler (as of 6.x, anyway), if certain functions weren’t run to set up the vtables, the code would crash when executing a vcall.
As for libstdc++, I’m trying to compare apples to apples. ISO 98 C++ is defined to contain certain functionality (including the template containers, the generic algorithms, iostreams, etc). If you don’t have those functions available, you’re not using C++. When you see a Lisp runtime that is 5MB, you have to realize that it too contains a large standard library that’s required for conformance to the language spec. When C++’s much smaller standard library takes up 2.8MB in a DLL, suddenly a 5MB runtime doesn’t seem so huge.
Why are we including the C++ standard library in a discussion about the ‘runtime’ of C++? See, from the perspective of a kernel which was mostly what I thought we were referring to, you’re simply not going to make use of iostreams and all of that. You don’t have to use the classes provided by the standard in your code, and not doing so doesn’t make your code diverge from the language standard at all. Yes, you need an implementation for making use of something from the standard library. That’s not all that interesting. Your point about the practical need of support to make use of actual language features (basically the contents of libsupc++) was much more insightful than that a standard library requires a library. What’s next, you can’t write a kernel in C unless you include all of the C standard library in the kernel image?
> Virtual function calls can require runtime support.
> Intel Intel’s C++ compiler (as of 6.x, anyway), if
> certain functions weren’t run to setup the vtables,
> the code would crash when executing a vcall.
“A compiler can” is distinctly different from “a language must”. Which functions are called by code compiled by Intel’s compilers to perform this manipulation of the class vtables? I really don’t feel like downloading their compiler to verify what they’re doing, but it’s still interesting.
> If you don’t have those functions available, you’re
> not using C++.
I’m using a magical language that resembles C++. Well, I hate to burst your bubble, but I’m not going to include the entirety of the Common Lisp standard library in a kernel image, either.
Thus, your arguments are quite weak, and you have no basis making such judgments about something you are clearly incapable of using properly yourself.
So true. It’s amazing how many would-be C++ programmers dismiss the language over its apparent complexity in favor of something simpler, and then flame it simply because they failed to invest the time to understand it in the first place. It certainly isn’t a ‘beautiful’ language, but claiming it’s ‘bloated’, ‘a mess’ or the ‘worst thing to happen to computer science’ just proves that you don’t actually use it.
Take your preference – choose your tools, but don’t make up FUD about the ones you don’t understand.
for_each is nice, but there’s something even nicer that is possible to implement with compilers that support the typeof() keyword, and with all the compilers that will support the forthcoming “auto” keyword usage.
#define foreach(iter, container) \
    for (typeof((container).begin()) iter = (container).begin(); iter != (container).end(); ++iter)
#include <vector>
#include <iostream>

int main()
{
    std::vector<int> foo;
    /* fill in foo */
    foreach(iter, foo)
    {
        std::cout << *iter << std::endl;
    }
    return 0;
}
However, this does have the shortcoming that it doesn’t work with pointers as iterators, but for most usages I think it’s actually a better approach.
It’s actually possible to make that macro work also with arrays (notice, not pointers), by means of a wrapper around the iterator, and template specialization. So that would cover all common usages of for_each.
If you wanted, though, you could use a macro with 3 arguments, one for the iterator variable name, and two for the begin and end iterators, which would then be able to also use standard pointers. This would be totally equivalent to for_each, except you could use it as a standard for(), in a totally polymorphic way.
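Such a three-argument version might look roughly like this (a sketch; it still relies on GCC’s typeof extension, and the name foreach_range is made up):
#include <iostream>

// Sketch of the three-argument variant: iterate over any half-open range,
// including plain pointers. Note &arr[0] rather than arr, so typeof sees a
// pointer type and not an array type.
#define foreach_range(iter, begin, end) \
    for (typeof(begin) iter = (begin); iter != (end); ++iter)

int main()
{
    int arr[] = {1, 2, 3, 4};
    foreach_range(p, &arr[0], &arr[4])
        std::cout << *p << std::endl;
    return 0;
}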
here’s an implementation of foreach, via macro, which also works on arrays: http://phpfi.com/72243
[Treza]
There is a difference between making a CPU which looks like a VM (like the tentative Java CPUs) and adding features to CPUs for better handling of modern languages (Lisp :-). For example:
– Detection of overflow in integer arithmetic for arbitrary-precision integers.
– Detection of phantom pointers for the garbage collector.
A foreach looping macro was recently accepted into Boost ( http://boost.org ). It works with STL containers, arrays, and iterator ranges, and it doesn’t require the compiler to support typeof or auto, so it works on older compilers. It even allows you to mutate the sequence in place:
int array[] = {1,2,3,4};
BOOST_FOREACH(int & i, array)
{
++i;
}
// array is now {2,3,4,5}
The docs are at http://boost-sandbox.sf.net/libs/foreach .
You can download the code (foreach.zip) at http://boost-sandbox.sf.net/vault/index.php?directory=eric_niebler .
The code requires Boost 1.32.
—
Eric Niebler
Boost Consulting
http://www.boost-consulting.com
> and it doesn’t require the compiler to support typeof
> or auto, so it works on older compilers. It even allows
> you to mutate the sequence in place:
Well, looking at the code you pasted, it avoids using the typeof or auto keywords because it requires you to write the type of the iterator explicitly. This means that Boost’s foreach isn’t polymorphic per se. You must admit my implementation is nicer.
Looking at the code of Boost’s foreach, I see it has a lot of code before the real for loop, which makes it impossible to use BOOST_FOREACH the same way you would use for(), forcing you to wrap it in braces in some circumstances.
It’s also true that my implementation has the pitfall that it may evaluate its arguments more than once, though. Well, to solve these issues C++ should let the user define her own keywords… perhaps one day.
That’s not true. BOOST_FOREACH expands to a single statement, so you can use it wherever you could use a regular for loop. No extra braces required.
I stand corrected, I had missed the if…else statements there, sorry.
As for the iterator type, I stand corrected there too, although that was more akin to a typo than a real mistake
You are mistaken. BOOST_FOREACH does not make you write the type of the iterator. You only have to specify the element type. This is how the foreach keyword works in C#, and how it will most likely work in C++0x.
libstdc++ is basically a required part of the runtime. Without it, you’re not using ISO 98 C++. That’s nearly 3MB on my machine. A good Lisp runtime isn’t much bigger than that…
Also, I was talking about the determinism of resource releasing. Again, from an application point of view.
And by resource I meant something a lot more generic than just memory. And there, the Java GC is completely useless.
Define deterministic? In C++, you can know when a resource will be released, but you really have no way of knowing how long it’ll take. That’s not deterministic. It’s more predictable than other methods, but that doesn’t make it deterministic.
In any case, the problem with Java is that it doesn’t have RAII. GC is pretty much a prerequisite for advanced language features. You can’t, for example, do closures (well, sanely) without a GC. However, for other resources, RAII is a nice tool. In Lisp, there is (with-open-file …), which opens a file, lets you use it in ‘…’, then closes it afterwards. You can define your own such constructs using macros, e.g. (with-open-socket …), etc.
And by resource I meant something a lot more generic than just memory. And there, the Java GC is completely useless.
Except that Java’s GC is meant to recover memory, and memory alone. Anybody relying on Java’s GC to recover anything else deserves what he gets.
@Rayiner
C++ memory allocation is not deterministic, just as garbage collection is not deterministic.
C++ memory allocation is deterministic in the sense that it happens exactly when you call malloc/free or new/delete.
You know when it will happen. What you are referring to is “How long will it take”.
In Java it is completely different. You don’t know when it will happen. Again, big difference.
In a lot of applications (no need to even go into hard real-time ones), you don’t want the GC to suddenly wake up for no reason with the idea that “Hey? Isn’t it a good time to bother the user and start a 15-second-long garbage collection pass?”.
In C++, if you use memory allocation correctly, you can create an application whose behavior is completely deterministic (as in ‘when’ and ‘how long’).
Except you don’t know, really, when something will happen. All you know is that it will happen, eventually. Take some code as simple as:
delete some_object;
delete some_other_object;
When will some_other_object be deleted? After some_object, of course, but that’s not really useful knowledge. Will it be deleted within X milliseconds? Perhaps, perhaps not. some_object might have a destructor, which might delete something with a destructor, etc. The worst case is no different from a GC: some_object refers to every object in memory except some_other_object, and it takes 15 seconds before you get around to deleting some_other_object. That’s unlikely, but determinism means “this will happen within X amount of time”, not “it will probably happen within X amount of time”.
There is no way to write a completely deterministic program on a modern computer. malloc()/free() can take a few dozen clock cycles in the best case, and hundreds in the worst case. A cache miss can cost you a couple of hundred clock cycles. A TLB miss can cost you a couple of thousand clock cycles. An unfortunate paging incident can put your app to sleep for millions of clock cycles. And an unfortunately arranged chain of destructors can cause you to wander around memory freeing things for hundreds of millions of clock cycles.
Neither GC nor new/delete are deterministic. You don’t know when something will happen, and you don’t know how long it will take. All you can do is speak in terms of averages. “The average pause time is X” or “the average mutator runtime is Y%”.
See, the problem you’re making here is that you’re saying that because your execution environment is non-deterministic, the deterministic algorithm itself is inherently non-deterministic. What’s especially strange about this is that you later (I think) went on to refer to classical Newtonian mechanics as a good example of determinism, when it’s basically an incorrect model that is useful to certain error tolerances for making predictions about what will occur. http://en.wikipedia.org/wiki/Newton%27s_laws_of_motion
I am curious as to what you call DFAs?
http://en.wikipedia.org/wiki/Deterministic_algorithm
Newtonian mechanics is deterministic. It’s just an approximation, but it’s a deterministic one.
My point in bringing the underlying execution environment into this argument is that it seems silly to talk about the advantages of a deterministic memory allocator when most real platforms are not deterministic. Again, you can’t be “more deterministic”: either you are or you aren’t.
In any case, your link doesn’t prove your point. It doesn’t matter if the reference counting algorithm itself is deterministic; reference counting implementations are not. In a real scenario, you don’t know the input function, so you can’t predict the output. By your definition of deterministic, a GC is deterministic too! For almost all GC algorithms, if you know the input function (i.e., the allocation sequence), you can predict exactly what it’ll do. But nobody considers GCs deterministic in practice!
>> You throw a ball in the air, and you know not only
>> that it will come down, but exactly when it will come
>> down
No, you don’t. You can calculate the same answer for the same inputs, but it is merely an approximation that is, well, wrong. You, too, can ignore reality and calculate an approximation of the runtime of an operation using naive reference counting on a suitable execution environment, and it will perform imperceptibly differently from that prediction for the same input.
> My point in bringing the underlying execution
> environment into this argument is that it seems silly
> to talk about the advantages of a deterministic
> memory allocator when most real platforms are not
> deterministic.
What is silly is using your execution environment to rationalize denying the presence of deterministic use of an algorithm, then referring to using classical Newtonian mechanics to determine physical events. Yep, you can use the same input in a formula and obtain the same output. It won’t represent actual reality, just approximate it. That sounds suspiciously like something else.
> In any case, your link doesn’t prove your point.
What it was, really, was a vain attempt at educating you about the nature of the word’s use.
> In a real scenario, you don’t know the input
> function, so you can’t predict the output
You can. Do you always? No. You don’t tend to know the mass of things, velocity of things, weather conditions, presence of low flying birds, or whereabouts of alien spacecraft either.
> By your definition of deterministic, a GC is
> deterministic too!
Reference counting is a garbage collection strategy. Contrary to the manner in which you repeatedly seem to classify it, it isn’t orthogonal. And for whatever reason you seem to be having your argument with someone else through me, because I haven’t once referred to “garbage collectors” on the whole as anything.
I think you misunderstand the meaning of deterministic. C++ delete is deterministic. Once it’s called, you can be pretty damn sure that the object is deallocated right at that moment. If it represents a resource, file, database connection, whatever, you can be sure that it’s closed right at that moment, and that’s what matters. Of course you can’t predict how many clock cycles it will take; that’s impossible in desktop systems anyway. The OS itself is not deterministic in that sense, as the scheduler may take the CPU away from the process at any moment, even in the middle of a variable assignment. You would need a microcontroller running only your process if you want to achieve that level of determinism.
It’s a huge advantage in C++ that deallocation is deterministic: you know exactly when it happens. I don’t argue with the claim that allocation is generally faster in a GC system than using malloc. It’s faster when GC is not necessary. The only problem is that GC may happen when you can’t afford it, such as in the middle of a performance-critical operation. If we had a way of adding deterministic behavior to a GC system, such as being able to tell when GC is to be performed, it would be a bit better. It’s not good that every thread is blocked during GC. It’s not good that my file or connection is kept open indefinitely, until the GC thinks it’s time to call a finalizer. It’s not acceptable.
From dictionary.com
1. <probability> Describes a system whose time evolution can be predicted exactly.
If you can tell that it will happen, but can’t tell when it will happen, you’re not predicting its evolution exactly, are you?