- Title
Marker effect p-values for single-step GWAS with the algorithm for proven and young in large genotyped populations.
- Authors
Leite, Natália Galoro; Bermann, Matias; Tsuruta, Shogo; Misztal, Ignacy; Lourenco, Daniela
- Abstract
Background: Single-nucleotide polymorphism (SNP) effects can be backsolved from ssGBLUP genomic estimated breeding values (GEBV) and used for genome-wide association studies (ssGWAS). However, obtaining p-values for those SNP effects relies on the inversion of dense matrices, which poses computational limitations in large genotyped populations. In this study, we present a method to approximate SNP p-values for ssGWAS with many genotyped animals. This method relies on the combination of a sparse approximation of the inverse of the genomic relationship matrix ($\mathbf{G}_{APY}^{-1}$) built with the algorithm for proven and young (APY) and an approximation of the prediction error variance of SNP effects which does not require the inversion of the left-hand side (LHS) of the mixed model equations. To test the proposed p-value computing method, we used a reduced genotyped population of 50K genotyped animals and compared the approximated SNP p-values with benchmark p-values obtained with the direct inverse of LHS built with an exact genomic relationship matrix ($\mathbf{G}^{-1}$). Then, we applied the proposed approximation method to obtain SNP p-values for a larger genotyped population composed of 450K genotyped animals. Results: The same genomic regions on chromosomes 7 and 20 were identified across all p-value computing methods when using 50K genotyped animals. In terms of computational requirements, obtaining p-values with the proposed approximation reduced the wall-clock time by 38 times and the memory requirement by ten times compared to using the exact inversion of the LHS. When the approximation was applied to a population of 450K genotyped animals, two new significant regions on chromosomes 6 and 14 were uncovered, indicating an increase in GWAS detection power when including more genotypes in the analyses.
The process of obtaining p-values with the approximation and 450K genotyped individuals took 24.5 wall-clock hours and 87.66 GB of memory, which is expected to increase linearly with the addition of noncore genotyped individuals. Conclusions: With the proposed method, obtaining p-values for SNP effects in ssGWAS is computationally feasible in large genotyped populations. The computational cost of obtaining p-values in ssGWAS may no longer be a limitation in extensive populations with many genotyped animals.
- Subjects
SINGLE nucleotide polymorphisms; GENOME-wide association studies; SPARSE approximations; MATRIX inversion; APPROXIMATION error
- Publication
Genetics Selection Evolution, 2024, Vol 56, Issue 1, p1
- ISSN
0999-193X
- Publication type
Article
- DOI
10.1186/s12711-024-00925-3
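
The abstract describes turning SNP effect estimates and their prediction error variances into p-values, but does not give the formula. The following is a minimal sketch of that last step only, assuming each effect divided by its standard error is treated as a standard normal statistic with a two-sided test. The function name `snp_p_values`, the input values, and the normality assumption are illustrative and are not taken from the article; the APY-based approximation of the variances themselves is not shown.

```python
import math


def snp_p_values(snp_effects, snp_effect_variances):
    """Two-sided p-values for SNP effects.

    Assumption (not from the abstract): effect / sqrt(variance) is
    treated as a standard normal test statistic.
    """
    p_values = []
    for effect, variance in zip(snp_effects, snp_effect_variances):
        z = effect / math.sqrt(variance)
        # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
        p_values.append(math.erfc(abs(z) / math.sqrt(2.0)))
    return p_values


# Hypothetical effect estimates and variances, for illustration only.
effects = [0.00, 0.02, -0.05]
variances = [1e-4, 1e-4, 1e-4]
p_values = snp_p_values(effects, variances)
```

In a real analysis the variances would come from the approximation the abstract describes, and the resulting p-values would typically be compared against a multiple-testing threshold.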