- Title
Accurate Inference of Subtle Population Structure (and Other Genetic Discontinuities) Using Principal Coordinates.
- Authors
Reeves, Patrick A.; Richards, Christopher M.
- Abstract
Background: Accurate inference of genetic discontinuities between populations is an essential component of intraspecific biodiversity and evolution studies, as well as associative genetics. The most widely used methods to infer population structure are model-based, Bayesian MCMC procedures that minimize Hardy-Weinberg and linkage disequilibrium within subpopulations. These methods are useful, but suffer from large computational requirements and a dependence on modeling assumptions that may not be met in real data sets. Here we describe the development of a new approach, PCO-MC, which couples principal coordinate analysis to a clustering procedure for the inference of population structure from multilocus genotype data. Methodology/Principal Findings: PCO-MC uses data from all principal coordinate axes simultaneously to calculate a multidimensional "density landscape", from which the number of subpopulations, and the membership within subpopulations, is determined using a valley-seeking algorithm. Using extensive simulations, we show that this approach outperforms a Bayesian MCMC procedure when many loci (e.g. 100) are sampled, but that the Bayesian procedure is marginally superior with few loci (e.g. 10). When presented with sufficient data, PCO-MC accurately delineated subpopulations with population Fst values as low as 0.03 (G′st>0.2), whereas the limit of resolution of the Bayesian approach was Fst = 0.05 (G′st>0.35). Conclusions/Significance: We draw a distinction between population structure inference for describing biodiversity as opposed to Type I error control in associative genetics. We suggest that discrete assignments, like those produced by PCO-MC, are appropriate for circumscribing units of biodiversity whereas expression of population structure as a continuous variable is more useful for case-control correction in structured association studies.
- Subjects
GENETICS; GENETIC disorders; BAYESIAN analysis; HARDY-Weinberg formula; EVOLUTIONARY theories; BIODIVERSITY; GENOTYPE-environment interaction; ALGORITHMS; CASE-control method
- Publication
PLoS ONE, 2009, Vol 4, Issue 1, p1
- ISSN
1932-6203
- Publication type
Article
- DOI
10.1371/journal.pone.0004269
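
The abstract names principal coordinate analysis as the first stage of PCO-MC but does not spell out the computation. The sketch below illustrates only the textbook form of that stage (classical principal coordinate analysis: double-centre the squared distances, then take the leading eigenvectors). It is not the authors' implementation and does not cover the density landscape or the valley-seeking clustering. The function names `double_center` and `principal_coordinates`, the power-iteration eigen-solver, and the assumption that the input distances are Euclidean (so the centred matrix has no negative eigenvalues) are all illustrative choices, not details taken from the record.

```python
import math


def double_center(dist):
    """Return B = -1/2 * J * D^2 * J for a symmetric distance matrix D."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]  # equals column means by symmetry
    grand_mean = sum(row_mean) / n
    return [
        [-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
         for j in range(n)]
        for i in range(n)
    ]


def principal_coordinates(dist, k, iters=500):
    """Coordinates of each sample on up to k principal coordinate axes.

    Assumption: distances are Euclidean, so B is positive semidefinite and
    power iteration with deflation finds its largest eigenvalues in order.
    """
    n = len(dist)
    b = double_center(dist)
    axes = []
    for a in range(k):
        # Deterministic, non-constant start vector (hypothetical choice).
        v = [math.sin(i + 1 + a) for i in range(n)]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # nothing left to extract
                break
            v = [x / norm for x in w]
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(v[i] * bv[i] for i in range(n))  # Rayleigh quotient
        if lam <= 1e-9:  # remaining axes carry no variance
            break
        axes.append((lam, v))
        # Deflate: remove the axis just found before searching for the next.
        for i in range(n):
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]
    # Scale each eigenvector by the square root of its eigenvalue.
    return [[math.sqrt(lam) * v[i] for lam, v in axes] for i in range(n)]
```

Calling `principal_coordinates(dist, 2)` on a matrix of pairwise distances between samples returns one coordinate pair per sample. For real genotype data, a library eigen-solver such as `numpy.linalg.eigh` would be the more practical choice than the hand-written power iteration used here to keep the sketch dependency-free.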