Selection of sequence variants to improve genomic predictions

Authors

  • Jeffrey O'Connell, University of Maryland School of Medicine
  • Melvin Tooker
  • Derek Bickhart
  • Paul VanRaden

Keywords:

genomic prediction, reliability, sequence variant, whole genome sequencing

Abstract

Methods of selecting sequence variants were compared using candidate variants within or near genes for 444 Holsteins from run 5 (July 2015) of the 1000 Bull Genomes Project. Test 1 included 481,904 candidate single nucleotide polymorphisms (SNPs) within or near genes. Test 2 also included 249,966 insertions and deletions (indels). After merging the sequence variants with 312,614 high-density (HD) SNPs and editing, Test 1 included 762,588 variants and Test 2 included 1,003,453. Imputation quality from findhap was assessed by keeping 404 of the sequenced animals in the reference population and randomly choosing the other 40 animals as a test set. The sequence genotypes of the test animals were reduced to the subset in common with HD genotypes and then imputed back to sequence. Predictions were tested using HD imputed genotypes for 26,970 progeny-tested bulls and 2015 data for 3,983 validation bulls whose daughters were first phenotyped after August 2011. The percentage of correctly imputed variants averaged 97.2% across all chromosomes in Test 1 and 97.0% in Test 2. Prediction reliability improved by only 0.6 percentage points in Test 1 when sequence SNPs were added to HD SNPs, and in Test 2, when sequence SNPs and indels were included, it was only 0.4 percentage points higher than with HD SNPs alone. However, selecting the 16,648 candidate SNPs with the largest estimated effects and adding them to the 60,671 SNPs used in routine evaluations improved reliability by 2.7 percentage points (67.4% vs. 64.7%) on average across traits; parent average reliability was 35.2%. Thus, genomic prediction reliabilities can improve when selected sequence variants are added.
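
As a rough illustration of two steps described in the abstract, the Python sketch below computes the percentage of correctly imputed genotypes for a masked test set and picks the variants with the largest estimated effects. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the 0/1/2 genotype coding, the use of absolute effect size for ranking, and the toy data are all hypothetical, and the abstract does not say whether selection was done per trait or across traits.

```python
import numpy as np


def percent_correct(true_genotypes, imputed_genotypes):
    """Percentage of genotype calls that match exactly.

    Assumes genotypes are coded 0/1/2 (animals x variants); this coding
    is an illustrative assumption, not taken from the abstract.
    """
    true_genotypes = np.asarray(true_genotypes)
    imputed_genotypes = np.asarray(imputed_genotypes)
    if true_genotypes.shape != imputed_genotypes.shape:
        raise ValueError("genotype arrays must have the same shape")
    return 100.0 * np.mean(true_genotypes == imputed_genotypes)


def select_top_variants(effects, n_keep):
    """Sorted indices of the n_keep variants with the largest |effect|.

    Ranking by absolute value is an assumption made for this sketch.
    """
    effects = np.asarray(effects, dtype=float)
    order = np.argsort(-np.abs(effects), kind="stable")
    return np.sort(order[:n_keep])


# Toy example: 2 test animals x 4 variants, one call imputed incorrectly.
true = np.array([[0, 1, 2, 1],
                 [2, 2, 0, 1]])
imputed = np.array([[0, 1, 2, 0],
                    [2, 2, 0, 1]])
print(percent_correct(true, imputed))    # 87.5

# Toy estimated effects for 5 candidate variants; keep the 2 largest.
effects = [0.02, -0.50, 0.10, 0.31, -0.05]
print(select_top_variants(effects, 2))   # [1 3]
```

In a setting like the one described, the selected indices would then be merged with the marker set already used in routine evaluations before re-estimating effects.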

Published

2016-12-19