Abstract
Highly optimized library implementations of important scientific kernels can improve scientific productivity. To this end, we are developing the Phylogenetic Likelihood Library (PLL), which implements functions to compute and optimize the phylogenetic likelihood score on evolutionary trees. Here, we focus on novel techniques for orchestrating likelihood computations on large vector-like processors such as GPUs. We present a novel scheme for vectorizing computations and organizing conditional likelihood arrays (CLAs) such that they never need to be transferred between the GPU and the CPU. We compare the performance of our GPU implementation for DNA data with a highly optimized x86 version of the PLL that relies on manually tuned AVX intrinsics. Our GPU implementation accelerates the likelihood computations by a factor of two compared to what is most probably the fastest currently available x86 implementation. We conclude that a hybrid GPU-CPU version needs to be developed and integrated into the PLL to leverage the computational power of modern desktop systems and clusters.
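
As a rough illustration of the kernel the abstract refers to, the following minimal sketch computes a conditional likelihood array for one inner node from its two children and then the log likelihood at that node. It is not the PLL API and not the GPU scheme of the paper: the Jukes-Cantor (JC69) substitution model, the uniform base frequencies, and all function names (`jc69_matrix`, `tip_cla`, `update_cla`, `log_likelihood`) are assumptions chosen for brevity. The point it shows is that each alignment site is processed independently, which is what makes the computation amenable to vectorization.

```python
import math

STATES = "ACGT"  # DNA states; index order used by every vector below


def jc69_matrix(t):
    """4x4 transition probabilities under JC69 for branch length t (assumed model)."""
    e = math.exp(-4.0 * t / 3.0)
    same, diff = 0.25 + 0.75 * e, 0.25 - 0.25 * e
    return [[same if i == j else diff for j in range(4)] for i in range(4)]


def tip_cla(sequence):
    """One-hot conditional likelihood vectors for an observed DNA sequence."""
    return [[1.0 if base == s else 0.0 for s in STATES] for base in sequence]


def update_cla(left, t_left, right, t_right):
    """Combine two child CLAs into the parent CLA, one site at a time."""
    p_l, p_r = jc69_matrix(t_left), jc69_matrix(t_right)
    parent = []
    for l_site, r_site in zip(left, right):  # sites are independent of each other
        parent.append([
            sum(p_l[i][j] * l_site[j] for j in range(4))
            * sum(p_r[i][j] * r_site[j] for j in range(4))
            for i in range(4)
        ])
    return parent


def log_likelihood(root_cla):
    """Sum of per-site log likelihoods, assuming uniform base frequencies."""
    return sum(math.log(sum(0.25 * x for x in site)) for site in root_cla)


# Two tips joined at a single inner node; branch lengths are arbitrary examples.
root = update_cla(tip_cla("ACGT"), 0.1, tip_cla("ACGA"), 0.2)
print(log_likelihood(root))
```

In this sketch the loop over sites in `update_cla` is the part a vectorized implementation would map onto SIMD lanes or GPU threads; a production kernel would also need numerical scaling to avoid underflow on large trees, which is omitted here.
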
Original language | English |
---|---|
Title of host publication | Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013 |
Place of Publication | Piscataway, NJ |
Publisher | IEEE |
Pages | 530-538 |
Number of pages | 9 |
ISBN (Electronic) | 978-0-7695-4979-8 |
Publication status | Published - 2013 |
Externally published | Yes |
Event | 2013 IEEE 37th Annual Computer Software and Applications Conference, COMPSAC 2013 - Boston, MA, Japan; Duration: 22 Jul 2013 → 26 Jul 2013 |
Conference
Conference | 2013 IEEE 37th Annual Computer Software and Applications Conference, COMPSAC 2013 |
---|---|
Country/Territory | Japan |
City | Boston, MA |
Period | 22/07/13 → 26/07/13 |
Keywords
- GPU
- Maximum likelihood
- OpenCL
- Phylogenetics
- Vector intrinsics