Imputation of genetic characteristics using deep learning methods

  • Dierck Segelke
  • Lilian Gehrke
  • Jan Wabbersen


Different imputation methods are used to handle missing markers and to infer genetic characteristics. In routine genetic evaluation, most of the imputation methods in use are pedigree- and population-based. In this study, we compare these routinely used methods with innovative methods based on deep learning and other machine learning frameworks. To this end, the frameworks Keras and LightGBM, and a combination of the two, are compared with the common software tool Beagle. Imputation accuracy was analysed for four different genetic characteristics. The results show that a combination of Keras and LightGBM significantly outperforms Beagle in accuracy while drastically reducing computation time. The results also demonstrate that large datasets and the presence of closely related animals in the training set are needed. In conclusion, machine learning methods such as deep learning are powerful new tools that can improve the efficiency of breeding programs.
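
As a rough illustration of how imputation accuracy can be evaluated, the following sketch masks known genotypes, imputes them, and reports the share that is recovered correctly. The concordance metric, the 0/1/2 genotype coding, the missing-value code and the mode-based placeholder imputer are all assumptions made for illustration; the abstract does not specify the accuracy measure, and the placeholder stands in for, but does not reproduce, Beagle, Keras or LightGBM.

```python
from collections import Counter

MISSING = -1  # assumed code for a masked or missing genotype


def impute_by_mode(genotypes):
    """Fill missing entries with the most frequent observed genotype.

    Placeholder imputer for illustration only (not Beagle, Keras or LightGBM).
    """
    observed = [g for g in genotypes if g != MISSING]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if g == MISSING else g for g in genotypes]


def concordance(true_genotypes, imputed_genotypes, masked_positions):
    """Share of masked positions whose imputed genotype equals the true one."""
    hits = sum(true_genotypes[i] == imputed_genotypes[i] for i in masked_positions)
    return hits / len(masked_positions)


# Hypothetical genotypes of ten animals at one marker (0, 1 or 2 allele copies).
truth = [0, 0, 1, 0, 2, 0, 0, 1, 0, 0]
masked_positions = [1, 2, 8]  # positions hidden from the imputer
masked = [MISSING if i in masked_positions else g for i, g in enumerate(truth)]

imputed = impute_by_mode(masked)
accuracy = concordance(truth, imputed, masked_positions)
```

Any imputer with the same input and output shape could be swapped in for `impute_by_mode`, so that several methods are scored on identical masked positions.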