Instead of that, they should write their own implementation with its own set of bugs... right? Instead of fixing the bugs in the original implementation and bringing improvements...
Yes. It's another way to check that the theories in the original code are correctly implemented.
Say you want to write some code to verify the Hockey Stick Graph is correct. If you use the original source code, chances are, you're not going to spot all the bugs in the implementation and you'll likely end up with the same graph, which does not fulfil the goal of independent verification.
We're talking scientific formulae, not a Linux desktop environment here. The most important thing is the data.
The only useful improvements for scientific research are corrections to formulae and theories. That can be done outside of code, and probably better served by being outside of code.
Do you seriously think it is a good idea for logic bugs to propagate through hundreds of research projects derived from the same code?
During peer review, the code would be checked to verify any unexpected results.
With open code, any meaningful problem would be found and solved, and old studies could be easily re-run and verified or discarded.
With closed code, the bugs are never found, and the authors have no reason to repair it if they get what they think are sound results.
--The loon
It doesn't matter, because it's the published results that matter, and if the results are wrong, someone can verify that independently once they're published. If you use the original source code to verify, it's no longer independent.
In a research organization where there are hundreds of people pulling in open source code, you cannot guarantee someone did not pull code from the original base, leading to a compromised verification of the data.
Well, it can just as easily work in the opposite direction: the source code is taken, with little or no review, and new data are run through it, confirming the original result.
I am sure that this happens. Not too long ago I ran into this issue while looking at studies done in the field of psychology. They run most of their studies through SPSS to do a factor analysis, to get something out of the data. With everybody using the same software and the same way of conducting the study, of course they confirm the results of others. Most of the conclusions drawn are simply wrong, because less than half of the data actually supports the result.
Now, since most psychologists aren't statisticians, they just take the work of others as a template for their own. And so a wrong method or a wrong piece of software gets propagated.
The same is going to happen with opening the source code for all research. If the code is critical to the research, then it should be implemented independently to confirm the results, based on the same data. If the code is auxiliary to the problem, then who cares anyway?
Also, I know of professors who stopped publishing altogether because of that requirement. Now what do you gain?
The good thing from all the published work is, we KNOW that certain things work/exist, so they can be re-discovered and independently verified.
I'm not so sure: there was a time when a popular idea for producing safe code (for avionics or things like that) was to have several independent teams code the same software, so that each version would have different bugs.
A study then discovered that independent teams had quite a few identical bugs, so the idea became much less popular!
Yes, but with scientific research spread all over the world, we can afford to have more teams than any single organization can afford.
And again, I refer people to the Climategate non-scandal. What if it turns out everyone who verified the data was using the same code, or at least derived versions of the same code? Think about the fallout from that. Even if the bugs were mostly identical, do we want to risk being wrong?





There is a danger that, by releasing the source code, other scientists would use it with all its bugs, causing errors to propagate undetected into derived research.
Ostensibly, scientists can check the source code to find bugs, but that check is never going to be complete.
There is something to be said for scientists having to recreate the source code in a clean room environment, because errors in either the code or the hypothesis are easier to expose that way.