Just pushing buttons

Human geneticists use PCA to visualize genetic relationships, which raises a question about how zero values affect the correlation matrix.

Mike the Mad Biologist, whose bailiwick is the domain of the small, asks in the comments:

I don't mean to bring up a tangential point to the post, but why does the field of human genetics use PCA to visualize relationships? When I see plots like those shown here that have a 'geometric pattern' to them (the sharp right angles; another common pattern is a Y-shape), that tells me that there are lots of samples with zeros for many of the Y-variables (i.e., alleles that are unique to certain populations). Thus, the spatial arrangement of the points is largely an artifact of an inappropriate method: how does one calculate a correlation matrix when many of the things one is correlating have values of zero? If one really was keen on using PCA, one could calculate a pairwise distance matrix and then use that instead of the correlation matrix (Principal Coordinates Analysis).

Since I know some human geneticists do read this weblog, I thought it was worth throwing the question out there.
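
For readers who want to see what Mike's suggestion looks like in practice, here is a minimal sketch of Principal Coordinates Analysis (classical multidimensional scaling) in Python with NumPy. Everything specific in it is an assumption for illustration: the toy genotype matrix is made up, the function name `pcoa` is hypothetical, and Manhattan distance is just one of many distances one could pick. It is not the method used in any particular paper.

```python
import numpy as np

def pcoa(dist, n_axes=2):
    """Principal Coordinates Analysis from a square distance matrix (hypothetical helper)."""
    d2 = np.asarray(dist, dtype=float) ** 2
    n = d2.shape[0]
    # Gower double-centering: B = -1/2 * J * D^2 * J, where J = I - (1/n) * ones
    j = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * j @ d2 @ j
    # B is symmetric, so eigh applies; it returns eigenvalues in ascending order
    evals, evecs = np.linalg.eigh(b)
    order = np.argsort(evals)[::-1][:n_axes]
    evals, evecs = evals[order], evecs[:, order]
    # Scale each axis by sqrt(eigenvalue); clip small negative eigenvalues,
    # which can come from round-off or from non-Euclidean distances
    return evecs * np.sqrt(np.clip(evals, 0.0, None))

# Made-up toy data: 6 individuals x 5 loci, coded as allele counts (0, 1, 2).
# Many entries are zero, mimicking alleles private to one population.
genotypes = np.array([
    [2, 2, 0, 0, 0],
    [2, 1, 0, 0, 0],
    [0, 0, 2, 2, 0],
    [0, 0, 2, 1, 0],
    [0, 0, 0, 0, 2],
    [0, 0, 0, 1, 2],
])

# Pairwise Manhattan distance between individuals (an assumed choice of distance).
diff = genotypes[:, None, :] - genotypes[None, :, :]
dist = np.abs(diff).sum(axis=2)

coords = pcoa(dist, n_axes=2)
print(coords.round(3))
```

The choice of distance is where the two approaches really differ. With plain Euclidean distance, the coordinates from this procedure match those from PCA on the centered data (up to the sign of each axis), so any change in the picture comes from choosing a different distance measure.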
