Ensemble-based imputation for genomic selection: an application to Angus cattle
Keywords:
AdaBoost, cattle, ensemble-based system, genomic selection, imputation, single nucleotide polymorphisms (SNP)
Abstract
Imputation of moderate-density genotypes from low-density panels is of increasing interest in genomic selection because it can markedly reduce genotyping costs. Several imputation software packages have been developed; however, they vary in imputation accuracy, and imputed genotypes may be inconsistent across methods. An AdaBoost-like approach was developed to combine imputation results from several independent software packages, i.e., Beagle (v3.3), IMPUTE (v2.0), fastPHASE (v1.4), AlphaImpute, findhap (v2), and Fimpute (v2), with each package serving as a basic classifier in an ensemble-based system. The ensemble method computes weights sequentially for all classifiers and combines results from the component methods via weighted majority “voting” to determine unknown genotypes. The data included 3,078 registered Angus cattle, each genotyped with the Illumina BovineSNP50 BeadChip. SNP genotypes on three chromosomes (BTA1, BTA16, and BTA28) were used to compare imputation accuracy among methods, and our application involved imputation of 50K genotypes covering 29 chromosomes based on a set of 5K genotypes. Beagle and Fimpute had the greatest accuracy, which ranged from 0.8677 to 0.9858. The proposed ensemble method was more accurate than any of these packages, but the sequence of independent classifiers in the voting scheme affected imputation accuracy. The ensemble systems yielding the best imputation accuracies were those that had Beagle as the first classifier, followed by one or two methods that utilized pedigree information. A salient feature of our ensemble method is that it can resolve imputation inconsistencies among different imputation methods, hence leading to a more reliable system for imputing genotypes relative to the use of independent methods.
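The sequential weighting and weighted majority voting described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact weight formula, so this sketch substitutes the standard multi-class AdaBoost (SAMME) update as an assumption; it is not the authors' implementation. The function names, the 0/1/2 genotype coding, and the idea of computing weights on a set of animals with known genotypes are likewise illustrative assumptions.

```python
import math
from collections import defaultdict


def sequential_weights(calls_by_method, truth, order, n_classes=3):
    """Compute one weight per imputation method, visiting methods in `order`.

    Assumption: SAMME-style update (not taken from the abstract).
    calls_by_method maps a method name to its genotype calls (0, 1, or 2)
    at sites where the true genotype in `truth` is known.
    """
    n = len(truth)
    sample_w = [1.0 / n] * n  # start with uniform weights over sites
    alphas = {}
    for name in order:
        calls = calls_by_method[name]
        # Weighted error of this method under the current site weights.
        err = sum(w for w, c, t in zip(sample_w, calls, truth) if c != t)
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # keep the logarithm finite
        alpha = math.log((1.0 - err) / err) + math.log(n_classes - 1)
        alphas[name] = alpha
        # Upweight the sites this method got wrong, then renormalize, so
        # later methods are judged mainly on the remaining hard sites.
        sample_w = [w * math.exp(alpha) if c != t else w
                    for w, c, t in zip(sample_w, calls, truth)]
        total = sum(sample_w)
        sample_w = [w / total for w in sample_w]
    return alphas


def weighted_vote(calls_by_method, alphas):
    """Pick, per site, the genotype with the largest summed method weight."""
    n_sites = len(next(iter(calls_by_method.values())))
    consensus = []
    for i in range(n_sites):
        score = defaultdict(float)
        for name, alpha in alphas.items():
            score[calls_by_method[name][i]] += alpha
        consensus.append(max(score, key=score.get))
    return consensus


# Toy data (invented for illustration): each method errs at a different site.
truth = [0, 1, 2, 1, 0, 2]
calls = {
    "method_a": [0, 1, 2, 1, 0, 1],
    "method_b": [0, 1, 2, 0, 0, 2],
    "method_c": [1, 1, 2, 1, 0, 2],
}
alphas = sequential_weights(calls, truth, order=["method_a", "method_b", "method_c"])
consensus = weighted_vote(calls, alphas)
```

Because the site weights are updated after each method, the resulting method weights depend on the order in which methods are visited, which is one way the classifier sequence could influence the ensemble result.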
License
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).