Efficient inversion of genomic relationship matrix by the algorithm for proven and young (APY)


  • Ignacy Misztal University of Georgia
  • Breno O Fragomeni University of Georgia
  • Daniela A L Lourenco University of Georgia
  • Shogo Tsuruta University of Georgia
  • Yutaka Masuda University of Georgia
  • Ignacio Aguilar Instituto Nacional de Investigacion Agropecuaria
  • Andres Legarra INRA
  • Tom Lawlor Holstein Association USA Inc


genomic selection, genomic recursion, inversion, single-step method


The purpose of this study was to evaluate properties of the inverse of the genomic relationship matrix derived with the algorithm for proven and young (APY) and the accuracy of genomic selection in single-step genomic best linear unbiased prediction (ssGBLUP). The APY implements genomic recursions on a subset of genotyped animals. When that subset is small, the cost of APY is approximately linear in memory and computations, effectively removing restrictions on the number of genotypes. Tests involved 10 102 702 final scores from 6 930 618 Holstein cows. A total of 100 000 genotyped animals were used in the analyses: 23 174 sires, 27 215 cows and 49 611 young animals. Genomic estimated breeding values (GEBVs) were calculated using ssGBLUP with a regular inverse of the genomic relationship matrix (G) and with the inverse of G from APY. Many subsets were tested, including only sires, only cows, and random samples of 2 000 to 20 000 animals. When the number of animals in the subset was 15 000, the correlations between GEBVs with APY and GEBVs with the regular inverse were 0.99. The best convergence rate was achieved with random samples. A theory for APY was derived, based on the fact that additive effects of animals in the subset are linear functions of the effects of independent chromosome segments (ICSs); the number of segments is a function of the effective population size. Accuracy of GEBVs with APY can be slightly superior to that with a regular inverse. The inverse with APY is computed from G, which in turn is derived from single nucleotide polymorphism (SNP) BLUP and indirectly from BayesB or other SNP-based prediction methods. Strategies such as SNP selection, SNP weighting, and use of causative SNPs from sequence analysis can be incorporated in APY without additional cost. The APY removes size limitations from ssGBLUP and facilitates a model with a complex genetic architecture.
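The abstract does not spell out the recursion, so the following is only a minimal sketch. It assumes the block form of the APY inverse as it is commonly written in the APY literature: the subset ("core") block of G is inverted directly, and every remaining ("non-core") animal contributes a single diagonal element m_i = g_ii - g_ic G_cc^{-1} g_ci. The function name `apy_inverse`, the simulated genotypes, and the small ridge added to G are illustrative assumptions, not the authors' software or data.

```python
import numpy as np


def apy_inverse(G, core):
    """Sketch of an APY-style inverse of G (assumed block form, see lead-in).

    G    : (n, n) symmetric positive-definite genomic relationship matrix
    core : integer indices of the animals placed in the core subset
    """
    n = G.shape[0]
    core = np.asarray(core)
    noncore = np.setdiff1d(np.arange(n), core)

    Gcc = G[np.ix_(core, core)]          # core x core block
    Gcn = G[np.ix_(core, noncore)]       # core x non-core block

    # Only the core block is inverted densely.
    Gcc_inv = np.linalg.inv(Gcc)
    P = Gcc_inv @ Gcn                    # recursion coefficients, one column per non-core animal

    # Diagonal of M: g_ii - g_ic Gcc^{-1} g_ci for each non-core animal i.
    m = G[noncore, noncore] - np.einsum("ij,ij->j", Gcn, P)

    # Assemble the inverse block by block (dense here only for readability).
    PM = P / m                           # P M^{-1}, scales column j by 1 / m[j]
    Ginv = np.zeros((n, n))
    Ginv[np.ix_(core, core)] = Gcc_inv + PM @ P.T
    Ginv[np.ix_(core, noncore)] = -PM
    Ginv[np.ix_(noncore, core)] = -PM.T
    Ginv[noncore, noncore] = 1.0 / m
    return Ginv


if __name__ == "__main__":
    # Toy data (assumption): 60 animals, 500 simulated SNPs, random core of 20.
    rng = np.random.default_rng(0)
    Z = rng.binomial(2, 0.3, size=(60, 500)).astype(float)
    Z -= Z.mean(axis=0)                          # center genotypes per SNP
    G = Z @ Z.T / 500 + 0.05 * np.eye(60)        # ridge keeps G positive definite
    core = rng.choice(60, size=20, replace=False)
    Ginv_apy = apy_inverse(G, core)
    print("max |APY inverse - regular inverse|:",
          np.abs(Ginv_apy - np.linalg.inv(G)).max())
```

With every animal in the core, the function reduces to the regular inverse. The sketch stores a dense n × n result for readability; an implementation aimed at large data would presumably keep only the core inverse, the core-by-non-core coefficients, and the diagonal, which is consistent with the approximately linear cost described above when the subset is small.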
