Why not release data for phylogenetic papers?

Gene Expression | By Razib Khan | March 2, 2013 1:26 AM



Last month I noted that a paper making speculative inferences about the phylogenetic origins of Australian Aborigines was weakened in the force of its conclusions because the authors didn't release the data to the public (more accurately, to their peers). There are likely political reasons for this where Australian Aborigine data sets are concerned, so I don't begrudge them this. (Well, at least not too much. I'd probably accept the result more readily if I could test drive the data set, but I doubt they could control the fact that the data had to stay private.) This is why, when a new paper with a novel phylogenetic inference comes out, I immediately control-f to see if the authors released their data.

For genome-wide association studies on medical population panels I can somewhat understand the need for closed data (even though anonymization obviates much of this), but I don't see this rationale as relevant at all for phylogenetic data (if concerned, one can remove particular functional SNPs).

Yesterday I noticed that PLoS Genetics published a paper on the genomics of Middle Eastern populations, Genome-Wide Diversity in the Levant Reveals Recent Structuring by Culture. The results were moderately interesting (I'll review the paper in detail later), but bravo to the authors for putting their new data set online. The reason is simple: reading the paper, I wanted to see an explicit phylogenetic tree/graph to go along with their figures (e.g., one generated with TreeMix). Now that I have their data I can do that tonight, time permitting.

One major aspect of science is reproducibility. Because of capital outlays, reproducing a result is not always viable, and when it happens it often occurs in a haphazard fashion. But with phylogenetics done on a computer this is less of an issue. I have a desktop at home devoted 99% to running data sets, in part for my own interest, and in part because I want to check the robustness of some of the inferences I see in papers like the ones above.
