The use of GPUs in genomic data analysis

  • M P Coffey
  • R Mrode
  • T Krzyzelewski
Keywords: genomic selection, GPU, CUDA, matrix calculations


Modern animal breeding datasets are large and growing, due in part to the recent availability of high-density single nucleotide polymorphism (SNP) arrays and cheap sequencing technology. High-performance computing methods for efficient data warehousing and analysis are under development. Storage requirements for genotypes are modest, although full-sequence data will require much more storage. Storage requirements for the intermediate and results files of genetic evaluations are much greater, particularly when multiple runs must be stored for research and validation studies. Genomic evaluation of large datasets requires substantial computing power, particularly when large fractions of the population are genotyped. Large datasets make it harder to deliver timely genetic evaluations, a challenge that must be overcome without disrupting service provision during the transition from conventional to genomic evaluations. Processing time is important, especially as real-time systems for on-farm decisions are developed. Modern graphics processing units (GPUs) found in consumer PCs offer animal breeding a means of computing genomic breeding values in reasonable time.
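To make the "matrix calculations" in the keywords concrete, the sketch below builds a genomic relationship matrix from SNP genotypes. The abstract does not say which computations were run on the GPU, so this is an illustration only: it assumes the widely used VanRaden-style formulation G = ZZ'/(2 Σ p(1 − p)), genotypes coded as allele counts 0, 1 and 2, and a small made-up genotype table. The function name and the toy data are hypothetical. It is written in plain Python for readability; the dense product ZZ' is the kind of operation that would be handed to a GPU in practice.

```python
def genomic_relationship_matrix(genotypes):
    """Illustrative VanRaden-style G matrix (an assumption, not the paper's code).

    genotypes: list of rows, one per animal, each a list of allele counts (0, 1, 2).
    """
    n_animals = len(genotypes)
    n_snps = len(genotypes[0])

    # Allele frequency per SNP, estimated from the genotyped animals themselves.
    p = [sum(row[j] for row in genotypes) / (2.0 * n_animals) for j in range(n_snps)]

    # Centre each SNP column by twice its allele frequency.
    z = [[row[j] - 2.0 * p[j] for j in range(n_snps)] for row in genotypes]

    # Scaling factor that puts G on a scale comparable to pedigree relationships.
    scale = 2.0 * sum(pj * (1.0 - pj) for pj in p)

    # Dense product Z Z' / scale: the step that dominates run time for large data.
    return [[sum(a * b for a, b in zip(zi, zk)) / scale for zk in z] for zi in z]


# Hypothetical toy data: 4 animals genotyped at 4 SNPs.
toy_genotypes = [
    [0, 1, 2, 1],
    [1, 1, 0, 2],
    [2, 0, 1, 1],
    [1, 2, 1, 0],
]
G = genomic_relationship_matrix(toy_genotypes)
for row in G:
    print(["%.2f" % value for value in row])
```

The cost of the product grows with the square of the number of genotyped animals times the number of SNPs, which is why evaluations become demanding when large fractions of the population are genotyped.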