Abstract
Genomic datasets are steadily growing in size as more genomes are sequenced and new genetic variants are discovered. Datasets that comprise thousands of genomes and millions of single-nucleotide polymorphisms (SNPs) exhibit excessive computational demands that can lead to prohibitively long analyses, making the deployment of high-performance computational approaches a prerequisite for the thorough analysis of current and future large-scale datasets. In this work, we demonstrate that the computational kernel for calculating linkage disequilibria (LD) in genomes, i.e., the non-random associations between alleles at different loci, can be cast in terms of dense linear algebra (DLA) operations, leveraging the collective knowledge of the DLA community in developing high-performance implementations for various microprocessor architectures. The proposed approach for computing LD achieves between 84% and 95% of the theoretical peak performance of the machine, and is up to 17X faster than existing LD kernel implementations. Furthermore, we argue that the current trend of increasing the SIMD (Single Instruction Multiple Data) register width in microprocessors yields minor benefits for assessing LD, resulting in an increasing gap between the performance attainable by LD computations and the theoretical peak of the microprocessor architecture, suggesting the need for hardware support.
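To illustrate how an LD kernel can be phrased as a matrix product, the following is a minimal sketch and not the implementation evaluated in the paper. It assumes a binary matrix `G` with one row per sequence and one column per SNP (1 marks the derived allele), and it assumes the commonly used r² statistic as the LD measure; the abstract does not state which measure or data layout the authors use. The names `gram_matrix` and `ld_r2` are hypothetical. The pure-Python product stands in for the single GEMM-style call that a tuned DLA library would provide.

```python
def gram_matrix(G):
    """Return G^T G for a list-of-rows matrix G.

    Stand-in for one dense matrix-matrix product; entry (i, j) counts the
    sequences that carry the derived allele at both SNP i and SNP j.
    """
    n, m = len(G), len(G[0])
    return [[sum(G[k][i] * G[k][j] for k in range(n)) for j in range(m)]
            for i in range(m)]


def ld_r2(G):
    """Pairwise r^2 between all SNP columns of a binary matrix G (assumed layout)."""
    n, m = len(G), len(G[0])
    counts = gram_matrix(G)
    # For 0/1 entries the diagonal of G^T G holds the per-SNP allele counts.
    p = [counts[i][i] / n for i in range(m)]
    r2 = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            d = counts[i][j] / n - p[i] * p[j]          # D = P_ij - p_i * p_j
            denom = p[i] * (1 - p[i]) * p[j] * (1 - p[j])
            # r^2 is undefined when either SNP is monomorphic.
            r2[i][j] = d * d / denom if denom > 0 else float("nan")
    return r2


# Toy input: 4 sequences, 3 SNPs. SNPs 0 and 1 are identical, SNP 2 is
# independent of SNP 0.
G = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 0],
    [0, 0, 1],
]
r2 = ld_r2(G)
print(r2[0][1], r2[0][2])
```

In this formulation all pairwise haplotype counts come from one matrix product, so the bulk of the work sits in an operation that DLA libraries already optimize; the remaining per-pair arithmetic is elementwise.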
Original language | English |
---|---|
Title of host publication | Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016 |
Place of Publication | Piscataway, NJ |
Publisher | IEEE |
Pages | 418-427 |
Number of pages | 10 |
ISBN (Electronic) | 978-1-5090-3682-0 |
ISBN (Print) | 978-1-5090-3683-7 |
Publication status | Published - 18 Jul 2016 |
Externally published | Yes |
Event | 30th IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2016, Chicago, United States. Duration: 23 May 2016 → 27 May 2016. Conference number: 30 |
Conference
Conference | 30th IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2016 |
---|---|
Abbreviated title | IPDPSW 2016 |
Country/Territory | United States |
City | Chicago |
Period | 23/05/16 → 27/05/16 |
Keywords
- Dense linear algebra
- Linkage disequilibrium
- Matrix multiplication
- Population genetics