- Title
Genotype harmonizer: automatic strand alignment and format conversion for genotype data integration.
- Authors
Deelen, Patrick; Bonder, Marc Jan; van der Velde, K. Joeri; Westra, Harm-Jan; Winder, Erwin; Hendriksen, Dennis; Franke, Lude; Swertz, Morris A.
- Abstract
Background: To gain statistical power or to allow fine mapping, researchers typically want to pool data before meta-analyses or genotype imputation. However, the necessary harmonization of genetic datasets is currently error-prone because of many different file formats and lack of clarity about which genomic strand is used as reference.
Findings: Genotype Harmonizer (GH) is a command-line tool to harmonize genetic datasets by automatically solving issues concerning genomic strand and file format. GH solves the unknown strand issue by aligning ambiguous A/T and G/C SNPs to a specified reference, using linkage disequilibrium patterns without prior knowledge of the used strands. GH supports many common GWAS/NGS genotype formats including PLINK, binary PLINK, VCF, SHAPEIT2 & Oxford GEN. GH is implemented in Java and a large part of the functionality can also be used as Java 'Genotype-IO' API. All software is open source under license LGPLv3 and available from www.molgenis.org/systemsgenetics.
Conclusions: GH can be used to harmonize genetic datasets across different file formats and can be easily integrated as a step in routine meta-analysis and imputation pipelines.
- Subjects
GENOTYPES; HUMAN genetic variation; LINKAGE disequilibrium; POPULATION genetics; META-analysis
- Publication
BMC Research Notes, 2014, Vol 7, Issue 1, p477
- ISSN
1756-0500
- Publication type
Article
- DOI
10.1186/1756-0500-7-901
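- Illustrative sketch
The abstract says that ambiguous A/T and G/C SNPs are aligned to a reference using linkage disequilibrium patterns. The Python sketch below shows one simplified way such a check could work. It is not the Genotype Harmonizer implementation and does not use the Genotype-IO API. The function names, the dosage-vector inputs, and the sign-voting rule are all assumptions made for illustration: an ambiguous SNP is flagged for swapping when its correlation with neighbouring, already aligned SNPs has the opposite sign in the study data compared with the reference.

```python
from statistics import mean

# Watson-Crick complements; an A/T or G/C SNP reads the same on both strands.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_ambiguous(allele1, allele2):
    """Return True for A/T and G/C SNPs, whose strand cannot be told from the alleles."""
    return COMPLEMENT[allele1] == allele2


def correlation(xs, ys):
    """Pearson correlation of two equal-length dosage vectors (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def should_swap(study_snp, study_neighbours, ref_snp, ref_neighbours):
    """Hypothetical voting rule: swap when LD signs mostly disagree with the reference.

    study_snp / ref_snp: allele dosages (0, 1, 2) of the ambiguous SNP per sample.
    study_neighbours / ref_neighbours: dosage vectors of nearby SNPs that are
    assumed to be already aligned, listed in the same order in both datasets.
    """
    agree = disagree = 0
    for study_nb, ref_nb in zip(study_neighbours, ref_neighbours):
        product = correlation(study_snp, study_nb) * correlation(ref_snp, ref_nb)
        if product > 0:
            agree += 1
        elif product < 0:
            disagree += 1
    return disagree > agree
```

The study and reference datasets may contain different individuals, since each correlation is computed within a single dataset and only the signs are compared. A practical tool would also need rules for weak or missing LD, which this sketch leaves out.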