A new position paper published in the New England Journal of Medicine (NEJM), Toward Fairness in Data Sharing, has generated a lot of controversy among some scientists. It's not hard to see why: the piece criticizes the concept of data sharing in the context of clinical trials. Data sharing is the much-discussed idea that researchers should make their raw data available to anyone who wants to access it. While the NEJM piece is specifically framed as a rebuttal to a recent pro-data-sharing NEJM article, the arguments advanced apply to science more generally.
Here's my take. There is a strong prima facie case that raw scientific data should be made freely available. It is widely recognized that nullius in verba - "on the word of no-one" or "take no-one's word for it" - is one of the fundamental principles of the scientific endeavor. Scientists do not believe something just because someone (or even everyone) claims that it is so. Evidence, not opinion, is what science is about.
Without open data, a scientific paper is little more than a statement that, in the author's opinion, some evidence supports a certain set of claims. Without access to the raw data, a reader has no way of checking whether the results really do support the conclusions, and so is asked to take the results essentially on faith.
It might be said that nullius in verba is an impossible standard. After all, even with open data, readers will still need to take the authors at their word that the data were collected in the way described in the paper, and that the results were not manipulated, cherry-picked or otherwise compromised. I agree that we will never be able to achieve perfect transparency in scientific communication - there will always be an element of trust. But if we're serious about nullius in verba, we should strive to minimize the degree to which readers are expected to just trust the authors - and this means data sharing.
As a result, in my view, we should hold any attempts to limit the scope or effectiveness of data sharing to a very high standard, because open data is (or should be) a fundamental principle of science.
"Toward Fairness in Data Sharing" doesn't discuss such fundamentals, but focuses on practical objections to data sharing, such as the concern that it will incur financial costs for the producers of raw data, or will put them at risk of being "scooped" by other researchers who analyze their data before they have a chance to. In short, the problem with data sharing, according to the NEJM piece, is that it risks being unfair to scientists.
These may be real concerns. But if we allow them to determine our policy, we are effectively saying that fairness to scientists is more important than science itself.
International Consortium of Investigators for Fairness in Trial Data Sharing (2016). Toward Fairness in Data Sharing. The New England Journal of Medicine, 375 (5), 405-407. PMID: 27518658