Methods to estimate erosion factors of genomic breeding values of candidates due to long-distance linkage disequilibrium


  • Didier Boichard
  • Sebastien Fritz
  • Pascal Croiseau
  • Vincent Ducrocq
  • Thierry Tribout
  • Marine Barbat
  • Beatriz Cuyabano


Most validation studies of genomic evaluation observe inflation, i.e. regression coefficients of later phenotypes on early predictions that are smaller than one. This pattern does not reflect a bias in the evaluation model; rather, it reflects long-distance associations between markers and QTLs. Due to linkage disequilibrium (LD), SNP effects estimated from a reference data set capture non-zero contributions from distant QTLs located not only on the same chromosome but also on other chromosomes, and we show that some across-chromosome LD does exist in different French dairy cattle breeds. This LD results from limited effective population size and, more importantly, from the relationships within the reference population. Long-distance associations are partly broken and rebuilt at random at each generation. Therefore, the corresponding SNP effects are partly lost in the next generations, and we refer to this loss as erosion. Erosion can be predicted by different methods based on the following equations applied to simulated QTLs. If the breeding values are $\mathbf{P}\mathbf{q}$, with $\mathbf{P}$ the matrix of QTL genotypes and $\mathbf{q}$ the vector of their effects, the expected contribution of QTL $j$ to the estimated effect of SNP $i$ is $\mathbf{c}_i \mathbf{M}' \mathbf{P}_j q_j$, where $\mathbf{M}$ is the matrix of SNP genotypes, $\mathbf{P}_j$ is column $j$ of $\mathbf{P}$, and $\mathbf{c}_i$ is row $i$ (corresponding to SNP $i$) of $\mathbf{C} = (\mathbf{M}'\mathbf{M} + \lambda \mathbf{I})^{-1}$. Two simulation-based methods are proposed to estimate the erosion factor $r$. In Method 1, the direct genomic values (DGV) of the progeny based on SNP effects estimated in a newly simulated generation are regressed on the DGV of the same progeny based on SNP effects estimated in the reference population. In Method 2, all the QTL contributions to SNP effects are regressed based on SNP-QTL recombination rates and summed to predict the breeding value at the next generation. The regression coefficient of the DGV based on eroded contributions on the raw DGV is then also an estimate of erosion. An illustration is given with the French Normande female reference population in 2021.
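The contribution formula above can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration and not taken from the study: the toy dimensions, the 0/1/2 genotype coding with column centring, the independently drawn genotypes, and the value of the ridge parameter (`lam`). It only shows how the SNP-by-QTL contributions $\mathbf{c}_i \mathbf{M}' \mathbf{P}_j q_j$ can be computed, and that summing them over QTLs gives the SNP solutions obtained from noise-free breeding values $\mathbf{P}\mathbf{q}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n_animals, n_snps, n_qtls = 200, 50, 5   # toy sizes (hypothetical)
lam = 10.0                               # ridge parameter lambda (arbitrary value)

# Genotypes coded 0/1/2 and centred by column. SNPs and QTLs are drawn
# independently here, so any SNP-QTL association is sampling noise only.
M = rng.binomial(2, 0.5, size=(n_animals, n_snps)).astype(float)
P = rng.binomial(2, 0.5, size=(n_animals, n_qtls)).astype(float)
M -= M.mean(axis=0)
P -= P.mean(axis=0)
q = rng.normal(size=n_qtls)              # simulated QTL effects

# C = (M'M + lambda I)^-1
C = np.linalg.inv(M.T @ M + lam * np.eye(n_snps))

# contributions[i, j] = c_i M' P_j q_j (column j is scaled by q_j)
contributions = (C @ M.T @ P) * q

# SNP solutions when the records are the noise-free breeding values P q
snp_effects = C @ M.T @ (P @ q)
```

Row `i` of `contributions` splits the estimated effect of SNP `i` into one term per QTL, which is the quantity that Method 2 regresses according to SNP-QTL recombination rates.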
Method 1 is simpler to implement on a routine basis and yields good estimates of erosion over one generation. Erosion also depends on the distance between the young candidates and their reference population, and formulae are proposed to apply erosion accordingly. We recommend accounting for erosion in genetic evaluations to provide unbiased predictions for young candidates. Accordingly, erosion has been accounted for in the French Single Step bovine evaluation since March 2022.
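The regression step shared by both methods can be sketched as an ordinary least-squares slope between two DGV vectors computed on the same progeny. The function name `regression_slope` and the toy vectors are hypothetical; in particular, the second vector is built with a known slope purely to check the arithmetic and does not represent any result of the study.

```python
import numpy as np

def regression_slope(y, x):
    """Least-squares slope of y regressed on x (with an intercept)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

# Toy check with made-up numbers: a vector built with slope 0.8 is recovered.
rng = np.random.default_rng(1)
dgv_reference = rng.normal(size=100)      # DGV from reference SNP effects (toy)
dgv_new = 0.8 * dgv_reference + 3.0       # DGV from new-generation SNP effects (toy)
slope = regression_slope(dgv_new, dgv_reference)
```

In Method 1, `y` would hold the progeny DGV based on SNP effects estimated in the simulated generation and `x` the DGV of the same progeny based on SNP effects estimated in the reference population.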