Static analysis has become the rule as opposed to the exception. It started with large commercial projects, but the development of world-class static analysis tools has been helped enormously by open source projects. The extent of their use is so significant that the rest of the commercial software development world is following suit to remain competitive.
Microsoft was mentioned in the article, but my job right now is integrating static analysis into IBM's AIX development process. The tool we use is called BEAM (Bugs, Errors, And Mistakes), and yes, I will do my best to convince management to consider open sourcing it. We do use it for developing Linux on POWER as well as for many other C/C++ systems programming applications. My project is to make sure that all source code that gets checked into AIX is "beamed" beforehand, and that all problems are properly resolved.
Open source apps definitely have fewer statically identifiable problems than does proprietary software. Most of the problems we find are edge cases where an uninitialized variable, null dereference, or memory leak can result. Static analysis tools also report lots of false positives. Most commonly these involve passing null pointers to functions that check their parameters properly, malloc-like functions, or functions that exit.
Lint or Splint is available as open source, and Coverity offers a free trial of their Prevent software.
<SNIP> "Open source apps definitely have fewer statically identifiable problems than does proprietary software." <SNIP>
I'm sure I'm not the only one who would LOVE to see you back this statement up with facts, but I'm quite confident you will never be able to do so, because you can't get a proper set of proprietary software samples, no matter how hard you try, to prove or disprove this statement/theory. Until you can actually analyze a statistically meaningful amount of proprietary code, this statement is purely ideology-driven: there's no proof that either proprietary (not open for public analysis) or open (available for public review) code has a better overall error rate.
For all the publicly known, well-designed and well-implemented chunks of Open Source, there's a huge number of Open Source applications (far more than the good-quality ones) that would tilt the numbers in a negative way. Hopefully, though, those poorly written applications rightfully earn their Darwin Awards before they become known outside of a very select few victims and their creators. It'd be best if that happened with really bad proprietary software too, but at least it's easier to trace the comings and goings of publicly released proprietary software (and there's a lot that isn't released to the public! A lot of that is mission critical and specialized to that user) because there are usually press releases and marketing, while most OSS stuff is word-of-mouth until some distributor like Red Hat decides to throw it in with their wares.
So, in summary, Proprietary code cannot be assessed on the whole as being inferior in quality to Open Source Software, or the other way around, because it is practically impossible to get enough data to prove or disprove the debate one way or the other. Any claims to the contrary are pure wishful BS, along with 77.5% of statistics that are made up on the spot.
Well, as the story title says, "...possible bugs...". So I'm not really certain how useful this story is. If we had tools that could prove bugs over and above what we normally use, then I would think we would all be using them, and BSD and GPL alike would benefit. So no, "thousand eyes...all bugs shallow" must still remain in the land of "feel-good" slogans.
BSD is not copylefted, therefore freedom zero is not guaranteed. Don't use it!
From the GNU website, freedom 0 is defined as:
"The freedom to run the program, for any purpose (freedom 0)."
Are you sure you know what you're talking about? Further, I would argue the BSD licence offers a different set of freedoms, which it seems you disagree with. Sorry, but please don't spread FUD!
In April 2004 Coverity analysed the Linux kernel:
http://linuxbugs.coverity.com/linuxbugs.htm
and found 935 bugs (vs 360 FreeBSD).
Anyhow, the point is that open source software has a verifiably low number of bugs. This is great!
"Many eyes" theory seems to be right.
According to Coverity, there is about "0.17 bugs per thousand lines of code" in Linux (http://lwn.net/Articles/115530/) vs. 0.25 bugs per thousand lines of code in FreeBSD...
"The recent 2.6 Linux production kernel now shipping in operating system products from Novell and other major Linux software companies contains 985 bugs in 5.7 million lines of code, well below the industry average for commercial enterprise software."
FreeBSD seems to have about 1.2 million lines of code (306 potential flaws * 4000 lines/flaw). An example of code bloat in Linux (which is just a kernel, compared to the full operating system that is FreeBSD)?
"Coverity found 306 software defects in FreeBSD's 1.2 million lines of code, or an average of 0.25 defects per 1,000 lines of code. In a December 2004 study of the Linux kernel, Coverity found 985 software defects in 5.7 million lines of code, or an average of 0.17 defects per 1,000 lines of code."
"We want to emphasize that the Linux code base is larger and has more driver support than FreeBSD."
http://www.coverity.com/news/nf_news_06_27_05_story_9.html
Enough said.
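The per-thousand-lines figures in the quote above can be re-derived from the raw counts. A minimal Python sketch, assuming only the defect and line counts quoted in this thread (the function name `defects_per_kloc` is my own, not something Coverity publishes):

```python
# Re-derive defect density from the counts quoted in the thread.
# The inputs are the thread's quoted figures, not independently verified.

def defects_per_kloc(defects: int, lines_of_code: int) -> float:
    """Return defects per 1,000 lines of code."""
    return defects / (lines_of_code / 1000)

freebsd = defects_per_kloc(306, 1_200_000)   # FreeBSD figures as quoted
linux = defects_per_kloc(985, 5_700_000)     # Linux 2.6 figures as quoted

print(f"FreeBSD: {freebsd:.3f} defects/KLOC")
print(f"Linux:   {linux:.3f} defects/KLOC")
print(f"FreeBSD/Linux density ratio: {freebsd / linux:.2f}")
```

Both results land close to the quoted 0.25 and 0.17. The density only compares like with like if both line counts were produced the same way, which the quote does not say.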
LOL LOL LOL!
Ok, so AIX has a higher number of statically checked possible bugs than the reported number for BSD and the Linux kernel. How in the hell can you state that AIX or anything else that IBM does is representative of all proprietary software?
AIX, what IBM produces, and the very few places you've worked STILL aren't enough of a dataset to be meaningful, except for comparing AIX and IBM's work against the BSD and Linux kernel results cited here. As hard as it is to believe, there are proprietary software products at a higher level of perfection than what you've measured, even though your measuring stick comes from IBM. And I mention once again: there's a hell of a lot of open source stuff that has simply not been measured, because it is so limited and/or crappy that nobody gives a crap that it exists. So the statistics mean nothing beyond comparing AIX to the BSD and Linux kernels. Your attempt at proving which is higher quality, OSS or proprietary code, still fails the test of logic, because you're working with an incredibly limited set of data compared to what exists in the wild.
The basis of my comparison is that both codebases (FreeBSD and AIX) are UNIX-like kernel/userland systems. I could not possibly provide evidence to suggest that proprietary software is in general buggier than open source software if you are holding me to this standard of proof. What I can say is that there are only so many major proprietary UNIX systems in active development today. I would go so far as to say that the only remaining ones are AIX and HP-UX, since Solaris has already been extensively prepared for open sourcing. Therefore, comparing FreeBSD to AIX is a fair and representative comparison of open source and proprietary UNIX-like operating systems. I would imagine that HP-UX would be on par with AIX at best, especially given HP's commitment to their enterprise UNIX business.
The impact of static analysis on open source software is huge, and the simple reason is: these projects cannot afford to execute large-scale runtime integration, functional verification, and stress testing. Static analysis is extremely cheap in comparison. For proprietary purposes, static analysis is just one more item in the QA toolbox. I'm aware of two customer-reported failures in the past 5 years that could have been avoided if IBM had used static analysis on those releases of AIX (both resulting from an uninitialized variable). IBM finds nearly every conceivable problem in runtime testing regardless of static analysis. Without access to powerful parallel testing labs, open source projects must embrace static analysis, which is why they eliminate so many of these kinds of errors (and why proprietary development teams can often afford not to care).
Remember that it was you who set up the standard of proof with this precise but (based on your backing down) "buggy" sentence, directly quoted, and you didn't provide a qualifier that it was for AIX/BSD or operating systems code:
"Open source apps definitely have fewer statically identifiable problems than does proprietary software."
It is important to remember that even though a lot of people who read OS News can't program their way out of a virtual wet paper bag, many of the readers of OS News (they must read it for the same reason that people slow down and gawk at fatal car wrecks along the highway!) use language precisely for a living, whether it is human or computer language. Perhaps next time you will do a static check of your prose for semantics before you press the "Submit comment" button on the page.
I'm sorry, I'm not allowed to provide statistics, but you must have read that my job is to analyze proprietary (read: AIX) source code with static analysis tools and manage a system that provides tools for complaint mitigation and statistics collection. AIX and FreeBSD are fairly similar with regard to the nature of the codebase, although AIX is significantly larger. I can say with absolute certainty that the number of valid complaints found in AIX using static analysis is higher than 306, and that the number of complaints per thousand lines of code is higher.
The reason is that companies like IBM are servicing different kinds of customers. Some customers demand that we only ship them fixes for field-reported software defects, because they fear that internally discovered defect fixes might destabilize their mission-critical systems. Customers demand that we test our fixes, and that we test them on their hardware configuration running their OS level. They demand one-week regression runs for all fixes, and they want them to be tested for versions of AIX that are several years old.
In open source projects, the situation is usually more like: I make a code mod, it works for me, create a diff, send the patch upstream, works for maintainer, earmarked for next week's release. There is normally no fix backlog for simple code mods. There is also a sense of pride in fixing problems in open source software, even if it is low-hanging fruit. No one wants to touch the low-hanging fruit in proprietary software development unless a manager imposes a deadline for closing those defects.
I'm not aware of any proprietary software project aggregator that makes people aware of new proprietary software releases, whereas with open source there's freshmeat. Half the people in my building don't even know that my static analysis infrastructure exists, because the communication sucks. Developers get angry when some SMTP daemon sends them an email about various problems with their code. For proprietary software developers, static analysis is a necessary evil used to satisfy certain code drop requirements, but for open source developers it is an excellent way to quickly find bugs and an even better way to involve new contributors.
I've worked in both open source communities and proprietary software development, and I've dealt specifically with static analysis tools, so I wouldn't be so quick to dismiss my comments as BS.
Well, C/C++ are the primary target for most "real" static analysis because it's so easy to write incorrect code. Here's one for Java that seems to check mostly for inefficient code:
http://pmd.sourceforge.net/
This one is a simple C/C++ tool that checks for secure programming, buffer range checking, etc:
http://www.dwheeler.com/flawfinder/
NIST has a list of static source and bytecode checkers for various languages, but not all are open source:
http://samate.nist.gov/
There's PyChecker, JavaChecker, FindBugs and others on SourceForge.
There are slews of Lint-based checkers, both free and nonfree.
And if you maintain an open source project, chances are you can get your source analyzed for free by Coverity or other proprietary static analysis tools if you register on their websites.
You know, I just checked, and the whole FreeBSD source tree (yes, whole OS, not just kernel) is only about twice the size of the current Linux Kernel:
212725760 Jul 20 02:21 linux-2.6.12.3.tar
408421888 Jul 20 02:06 FreeBSD-5.4-sources-all.tar
Who was saying something about the Linux kernel not having a lot of bloat?
Anyway, I have a question for you all. What significance does "lines of code" have as a measurement? Shouldn't it be "errors per character" or some such?
I mean, even if this was only comparing kernels (which it doesn't specify), with 1.2 million lines of code in FreeBSD and 5.7 million in Linux, doesn't that mean that Linux is nearly 5 times the size of the FreeBSD kernel?
It isn't. Going by tarball size, the FreeBSD kernel is roughly 1/2 the size of the Linux kernel:
94176256 Jul 20 02:50 FreeBSD-kernel.tar
212725760 Jul 20 02:21 linux-2.6.12.3.tar
Shouldn't this mean that there would be either 2.8 million lines in the FreeBSD kernel or 2.4 million lines in Linux?
In either case, to answer some earlier question, yes, it does mean FreeBSD has fewer errors of this type than Linux does, as we can see from the errors to data ratio:
306/90MB (3.4 per MB) vs 950/202MB (4.7 per MB), or, to put it plainly, it means the FreeBSD kernel has roughly 28% fewer of these types of errors per MB than Linux does?
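The ratio above can be recomputed from the exact tarball sizes listed earlier in this comment. A rough Python sketch, assuming "MB" here means 2^20 bytes (which is what the rounded 90 and 202 correspond to) and using the poster's defect counts as given. Note that 950 is the figure used in this comment, while the Coverity quote earlier in the thread says 985. The helper name is hypothetical:

```python
# Defects per MiB of uncompressed source tarball, using the sizes
# listed in the comment above. Tarball size is only a crude proxy
# for the amount of code that was actually analyzed.

MIB = 1024 * 1024

def defects_per_mib(defects: int, tar_bytes: int) -> float:
    """Return defects per 2**20 bytes of tarball."""
    return defects / (tar_bytes / MIB)

freebsd_kernel = defects_per_mib(306, 94_176_256)    # FreeBSD-kernel.tar
linux_kernel = defects_per_mib(950, 212_725_760)     # linux-2.6.12.3.tar; 950 as used above

# Fraction by which the FreeBSD figure is lower than the Linux figure.
relative_reduction = 1 - freebsd_kernel / linux_kernel

print(f"FreeBSD kernel: {freebsd_kernel:.2f} per MiB")
print(f"Linux kernel:   {linux_kernel:.2f} per MiB")
print(f"FreeBSD is lower by {relative_reduction:.1%}")
```

With unrounded inputs the difference comes out slightly under the 28% obtained from the rounded 3.4 and 4.7.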
Lines of code seems a nonsensical measurement, as

    return 0;
    }

may be counted as two lines, or even three, who knows?
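To make the ambiguity concrete, here is a small Python sketch that counts the same C fragment three ways. The fragment and the counting rules are illustrative assumptions of mine, not how Coverity or any particular tool counts lines, and the semicolon count is a deliberately crude stand-in for "statements" (it would miscount `for` headers and string literals):

```python
# The same C fragment measured with three different "lines of code" rules.

snippet = "int main(void)\n{\n\n    return 0;\n}\n"

def physical_lines(src: str) -> int:
    """Every line, including blank ones (what `wc -l` reports here)."""
    return len(src.splitlines())

def nonblank_lines(src: str) -> int:
    """Lines containing at least one non-whitespace character."""
    return sum(1 for line in src.splitlines() if line.strip())

def semicolon_count(src: str) -> int:
    """Crude statement proxy: number of semicolons."""
    return src.count(";")

print(physical_lines(snippet), nonblank_lines(snippet), semicolon_count(snippet))
```

Three rules, three different sizes for the same code, so a defects-per-line figure only means something when the counting rule is stated.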
Hrm...
RE: This is pretty funny
I've just checked the whole FreeBSD source tree and the Minix source tree.
408421888 Jul 20 02:06 FreeBSD-5.4-sources-all.tar
7117312 1996-09-22 14:04 CMD.tar
5633536 1996-09-22 14:04 SYS.tar
Who was saying something about FreeBSD not having a lot of bloat?
Sorry, your type of arguments...
Here's the "bloat"
Whole linux kernel 2.6.12.1 - lines of code:
$ find . -type f -name '*.[ch]' -print0 |xargs -r -0 cat |wc -l
6106685
drivers: 3127815
archs: 846808
header files for drivers, subsystems, filesystems, archs, ...: 614056
filesystems: 545200
sound (oss + alsa + drivers): 449875
rest of the kernel (mm, crypto, ipc, security (SELinux, LSM, ...), net, ...): 522931
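The breakdown above can be checked and turned into shares of the total. A minimal Python sketch using only the numbers listed in this comment (the dictionary keys are shortened labels of mine):

```python
# Share of each subtree in the 2.6.12.1 line count listed above.

total = 6_106_685
breakdown = {
    "drivers": 3_127_815,
    "archs": 846_808,
    "headers": 614_056,
    "filesystems": 545_200,
    "sound": 449_875,
    "rest of the kernel": 522_931,
}

# The listed parts should add up to the find | wc -l total.
assert sum(breakdown.values()) == total

shares = {name: 100 * lines / total for name, lines in breakdown.items()}
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:20s} {share:5.1f}%")
```

The parts do sum to the total, and drivers alone account for a little over half of the counted lines.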
I use FreeBSD and Linux and yes.. I do fully understand both licenses
and I say to you all..... who really cares which has fewer errors... you're not fighting over errors, you're fighting your religious war (both sides) and if Linux or FreeBSD were as error-ridden as some of you would like everyone to believe...... nobody would use it.
lol, that really wasn't my point... people that use Windows generally use it because they don't know anything different or they don't want to know anything different..... people that use Linux or BSD use them because they, for one reason or another, believe OSS is better for their particular application.....
and yes.. those people do care about buggy software/kernels.... if Linux and/or BSD were so bad.... they would stick with Windows



