Reliability of genomic prediction using imputed genotypes for German Holsteins: Illumina 3K to 54K bovine chip

  • J Chen
  • Z Liu
  • F Reinhardt
  • R Reents

Abstract

Low-cost, low-density SNP chips have been developed to enable large-scale genotyping of animals while retaining a reasonable accuracy of genomic prediction. To assess the accuracy of one such low-density chip, the Illumina Bovine3K, 3K genotypes were simulated from the current Illumina Bovine50K genotypes. Three imputation software packages, Beagle, DAGPHASE and Findhap, were applied to three genotype data sets: German Holstein bulls, EuroGenomics Holstein bulls, and all genotyped animals of the German Holstein breed. The imputed 54K genotypes were used to calculate DGV and combined GEBV following the routine procedures of genomic evaluation for German Holsteins (Liu et al., 2011). To evaluate the imputation accuracy of the three packages, the 1369 youngest German Holstein bulls, born between September 2003 and December 2004, were chosen as validation animals. The three packages differed markedly in computing time, with Findhap being much faster than Beagle and DAGPHASE. The allele error rate for the EuroGenomics bull data set was 3.3% for Findhap, 2.7% for DAGPHASE, and 1.6% for Beagle. Phenotypic data from the April 2010 Interbull evaluation were used to assess the loss in accuracy of genomic prediction when using the imputed 54K genotypes of the EuroGenomics data set. Regressing deregressed EBV of the validation bulls on their genomic predictions gave the same regression coefficients with the imputed 54K genotypes as with the real ones, indicating that GEBV based on the imputed 54K genotypes were as unbiased as those based on the real genotypes. However, the R² value of GEBV based on the imputed genotypes decreased by 5.0% for Findhap and 2.1% for Beagle across all evaluated traits. On average, reliability of GEBV dropped by 6.5% for Findhap and 2.6% for Beagle. Given these differences in computational requirements and imputation accuracy, different imputation software may be chosen for large-scale routine genotype imputation and genomic evaluation than for small-scale imputation without time constraints.
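
The two validation statistics named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that genotypes are coded as allele counts (0, 1 or 2), that the allele error rate is the number of wrongly imputed alleles divided by the total number of alleles compared, and that the regression is a simple unweighted least-squares fit. The function names and the toy data are hypothetical.

```python
def allele_error_rate(true_genotypes, imputed_genotypes):
    """Share of alleles imputed incorrectly (assumed definition).

    Both arguments are lists of animals, each a list of allele counts
    (0, 1 or 2) per SNP. A true genotype of None (missing) is skipped.
    """
    wrong_alleles = 0
    total_alleles = 0
    for true_row, imputed_row in zip(true_genotypes, imputed_genotypes):
        for true_count, imputed_count in zip(true_row, imputed_row):
            if true_count is None:
                continue
            # A 0-vs-2 mismatch means both alleles are wrong.
            wrong_alleles += abs(true_count - imputed_count)
            total_alleles += 2
    return wrong_alleles / total_alleles


def regression_slope_and_r2(predictions, deregressed_ebv):
    """Slope and R^2 of an unweighted regression of deregressed EBV on predictions."""
    n = len(predictions)
    mean_x = sum(predictions) / n
    mean_y = sum(deregressed_ebv) / n
    sxx = sum((x - mean_x) ** 2 for x in predictions)
    syy = sum((y - mean_y) ** 2 for y in deregressed_ebv)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(predictions, deregressed_ebv))
    slope = sxy / sxx
    r2 = sxy ** 2 / (sxx * syy)
    return slope, r2


# Toy data (hypothetical): two animals, four SNPs each.
true_54k = [[0, 1, 2, 1], [2, 2, 0, 1]]
imputed_54k = [[0, 1, 1, 1], [2, 0, 0, 1]]
error_rate = allele_error_rate(true_54k, imputed_54k)  # 3 wrong alleles of 16

slope, r2 = regression_slope_and_r2([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

Under these assumptions, a slope that stays the same for real and imputed genotypes corresponds to the unbiasedness reported above, while a lower R² for the imputed genotypes reflects the loss in accuracy.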