- Title
Penalized regression and risk prediction in genome-wide association studies.
- Authors
Austin, Erin; Pan, Wei; Shen, Xiaotong
- Abstract
An important task in personalized medicine is to predict disease risk based on a person's genome, e.g. on a large number of single-nucleotide polymorphisms (SNPs). Genome-wide association studies (GWAS) make SNP and phenotype data available to researchers. A critical question for researchers is how to best predict disease risk. Penalized regression equipped with variable selection, such as least absolute shrinkage and selection operator (LASSO) and smoothly clipped absolute deviation (SCAD), is deemed to be promising in this setting. However, the sparsity assumption taken by the LASSO, SCAD, and many other penalized regression techniques may not be applicable here: it is now hypothesized that many common diseases are associated with many SNPs with small to moderate effects. In this article, we use the GWAS data from the Wellcome Trust Case Control Consortium (WTCCC) to investigate the performance of various unpenalized and penalized regression approaches under true sparse or non-sparse models. We find that in general penalized regression outperformed unpenalized regression; SCAD, truncated L1-penalty (TLP), and LASSO performed best for sparse models, while elastic net regression was the winner, followed by ridge, TLP, and LASSO, for non-sparse models. © 2013 Wiley Periodicals, Inc. Statistical Analysis and Data Mining, 2013
- Subjects
GENOMES; REGRESSION analysis; SINGLE nucleotide polymorphisms; DISEASE risk factors; INDIVIDUALIZED medicine; SMOOTHLY clipped absolute deviation; EDUCATION
- Publication
Statistical Analysis & Data Mining, 2013, Vol 6, Issue 4, p315
- ISSN
1932-1864
- Publication type
Article
- DOI
10.1002/sam.11183