The application of several genomic models for the analysis of small holder dairy cattle data


  • Raphael Mrode, Scottish Agricultural College
  • Hassan Aliloo
  • Chinyere Ekine
  • Julie Ojango
  • John Gibson
  • Mwai Okeyo


The lack of data recording in smallholder dairy cattle systems means that molecular data could offer quick wins through genomic prediction and selection. Initial studies on genomic prediction with data from smallholder dairy herds have reported promising results, with low to medium accuracies of prediction. The relatively small size of the data sets in those studies limited the range of models that could be fitted. With more data now becoming available from the African Dairy Genetic Gains (ADGG) project operating in Tanzania and Ethiopia, this paper examines the impact of fitting GBLUP models with dominance effects, a random regression model and various Bayesian methods on the accuracy of genomic predictions for smallholder dairy milk yield data from Tanzania. The data set consisted of 9193 test-day milk yields on 1930 cows from 456 herds; the cows were genotyped with the Genomic Profiler (GGP) Bovine 50K chip. The first analysis was GBLUP based on a fixed regression model consisting of the fixed effects of ward, age nested within parity and test-year-season, together with fixed curves modelled with Legendre polynomials of order four nested within the breed class by parity interaction. The random effects were herd, animal and permanent environmental (PE) effects. The second model (GBLUP-D) was the same as the fixed regression model above but with dominance effects as an additional random term. The third model was a random regression model (GBLUP-RRM) with the same fixed effects as the fixed regression model plus the random effects of herd, animal and PE, with the latter two modelled with Legendre polynomials of order two. Corrected phenotypes, or yield deviations (YDs), were then derived from the GBLUP model and used as response variables in Bayesian analyses fitting BayesA, BayesB and BayesC. The heritability estimates from GBLUP and GBLUP-D were the same at 0.14 ± 0.04, while the estimate from GBLUP-RRM was higher at 0.26. 
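To illustrate the Legendre polynomial covariates used for the fixed and random lactation curves, the following is a minimal Python sketch, not the authors' implementation. It assumes normalized Legendre polynomials, days in milk standardized over a hypothetical range of 5 to 305, and that "order" means the highest polynomial degree; none of these details is stated in the abstract, and `legendre_covariates` is an illustrative name.

```python
import math


def legendre_covariates(dim, order, dim_min=5, dim_max=305):
    """Return normalized Legendre covariates phi_0..phi_order for one test day.

    dim_min and dim_max are assumed lactation limits, not values from the study.
    """
    if order < 0:
        raise ValueError("order must be non-negative")
    # Map days in milk onto [-1, 1], where the polynomials are orthogonal.
    x = -1.0 + 2.0 * (dim - dim_min) / (dim_max - dim_min)
    # Bonnet recursion: (k + 1) P_{k+1}(x) = (2k + 1) x P_k(x) - k P_{k-1}(x)
    p = [1.0, x]
    for k in range(1, order):
        p.append(((2 * k + 1) * x * p[k] - k * p[k - 1]) / (k + 1))
    # Scale each polynomial to unit norm on [-1, 1].
    return [math.sqrt((2 * k + 1) / 2.0) * p[k] for k in range(order + 1)]


if __name__ == "__main__":
    # Covariates at mid-lactation for a second-degree random regression.
    print(legendre_covariates(155, 2))
```

Under these assumptions, `legendre_covariates(155, 2)` gives the three covariates that would multiply an animal's random regression coefficients for a test day at mid-lactation.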
The proportion of total variance due to PE effects was 0.26 ± 0.02 for both GBLUP and GBLUP-D but was slightly lower at 0.24 for GBLUP-RRM. The proportion of total variance due to dominance was low at 0.03 ± 0.08 and not significantly different from zero. Both cross-validation and forward validation were undertaken to estimate the accuracy of genomic prediction. The accuracies from cross-validation for GBLUP and GBLUP-D were low to medium (0.28 to 0.44) and were highest for cows with the lowest proportion of exotic genes. The accuracies from forward validation were higher, ranging from 0.30 to 0.43, and were accompanied by regression coefficients closer to unity. The accuracies of prediction for the Bayesian methods were generally very low, varying from 0.10 to 0.21. This was unexpected and will be examined further. In conclusion, GBLUP-RRM gave better accuracies of prediction than GBLUP, but the accuracies remained moderate. As more data accumulate from the ADGG project, these models, together with models that account for breed origin of alleles, will be examined further, including joint genomic predictions across countries to assess the impact on accuracy.
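The two validation statistics mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: accuracy is taken here as the Pearson correlation between predicted genomic breeding values and corrected phenotypes in the validation set, and the regression coefficient as the slope of corrected phenotype on prediction. The abstract does not say whether the correlation was scaled further (for example by the square root of heritability), and `validation_stats` and the example numbers are hypothetical.

```python
import math


def validation_stats(predicted, observed):
    """Return (accuracy, slope) for a validation set.

    accuracy: Pearson correlation between predictions and observations.
    slope: regression of observations on predictions; values near 1
           indicate little over- or under-dispersion of the predictions.
    """
    n = len(predicted)
    if n != len(observed) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    sxy = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    sxx = sum((p - mean_p) ** 2 for p in predicted)
    syy = sum((o - mean_o) ** 2 for o in observed)
    if sxx == 0 or syy == 0:
        raise ValueError("predictions and observations must both vary")
    return sxy / math.sqrt(sxx * syy), sxy / sxx


if __name__ == "__main__":
    # Made-up values for illustration only; not data from the ADGG project.
    gebv = [1.0, 2.0, 3.0]
    yd = [1.0, 3.0, 2.0]
    accuracy, slope = validation_stats(gebv, yd)
    print(accuracy, slope)
```

In cross-validation this function would be applied to each held-out fold, and in forward validation to the younger animals whose records were masked, with the results summarized across validation sets.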