
Do you want your genotype in a public data set?

Gene Expression
By Razib Khan
Jan 16, 2013 12:54 PM | Nov 20, 2019 1:47 AM



In the near future one of my projects is revising and expanding the "PHYLO" pedigree file which I put up a week ago. Basically I want there to be a public data set which has a modest number of SNPs useful for phylogenetic analysis (100,000 to 200,000) with wide population coverage. Additionally, I am going to do a few things like rename the family IDs to populations, and also release it with scripts to help in running Admixture (for example, shell scripts which will automate replication and later analysis of replicates).

Finally, I'm planning on running ~50 replicates of K = 2 to K = 20 with 10-fold cross-validation (yes, this will take a while) to get a good sense of the "best" K's. The reality is that most people are probably only interested in the "most informative" K, +/- 1, so there's no need for everyone to run K = 2 to K = 20. The time saved should be spent on running replicates, and then on using CLUMPP to merge the results.

I would say that this is for 'amateurs' only, but I don't think it's betraying a confidence to observe that several academic researchers at prominent institutions have ended up asking me how to get good public data sets. This sort of information still hasn't percolated to the general public, including scientists who don't work on population genomics. After a few trial runs with public data sets, people with academic access could move on to things like the POPRES data set.

But the ultimate point of this post is to ask: do you want to be in this data set? If so, I need the file (23andMe format is fine; otherwise, pedigree files only), your name, and some minimal ethnic information. I'm not going to add everyone. I just want to diversify the public data set a little. But I am going to put names in the sample sheet, so you won't have anonymity. As you know, I don't particularly care about this personally, but your mileage may vary. Researchers might need to contact people or check that they are who they say they are. Email: contactgnxp -at- gmail -dot- com
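As a rough illustration of the replicate plan described above (K = 2 to K = 20, about 50 replicates each, 10-fold cross-validation), here is a minimal Python sketch. It is not the script that will ship with the data set. The directory layout, the function names, and the use of the replicate number as the random seed are all assumptions made for the example. It also assumes that ADMIXTURE accepts `--cv=10` and `--seed=N`, and that it reports cross-validation error on a line of the form `CV error (K=3): 0.52305`; check both against the manual for your version.

```python
import re
from collections import defaultdict
from statistics import mean

# Assumed log format: ADMIXTURE run with --cv prints e.g. "CV error (K=3): 0.52305".
CV_LINE = re.compile(r"CV error \(K=(\d+)\): ([0-9.]+)")


def build_commands(bed_file, k_values, n_replicates, folds=10):
    """Return (run_dir, argv) pairs, one per K and replicate.

    Each run gets its own directory (a hypothetical "K<k>/rep<n>" scheme)
    because ADMIXTURE names its .Q and .P output after the input file and K
    only, so replicates run in one directory would overwrite each other.
    bed_file should be an absolute path, since every run starts elsewhere.
    """
    jobs = []
    for k in k_values:
        for rep in range(1, n_replicates + 1):
            run_dir = f"K{k}/rep{rep}"
            # Assumption: the replicate number doubles as the random seed.
            argv = ["admixture", f"--cv={folds}", f"--seed={rep}", bed_file, str(k)]
            jobs.append((run_dir, argv))
    return jobs


def mean_cv_by_k(log_texts):
    """Average the cross-validation error over all replicates of each K."""
    errors = defaultdict(list)
    for text in log_texts:
        for k, err in CV_LINE.findall(text):
            errors[int(k)].append(float(err))
    return {k: mean(values) for k, values in sorted(errors.items())}


def best_k(cv_means):
    """Return the K with the lowest mean cross-validation error."""
    return min(cv_means, key=cv_means.get)
```

Each `argv` list could be handed to `subprocess.run(argv, cwd=run_dir)` after creating the directory and saving the run's standard output to a log file. The saved logs then go through `mean_cv_by_k`, and only the K's around `best_k` need their replicates merged with CLUMPP.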

Copyright © 2024 Kalmbach Media Co.