Abstract
Driven by novel wet-lab techniques such as pyrosequencing, the volume of molecular data has grown at an unprecedented rate over the last 2-3 years, and the growth of biological sequence data has significantly outpaced Moore's law. This development poses new computational and architectural challenges for the field of phylogenetic inference, i.e., the reconstruction of evolutionary histories (trees) for a set of organisms that are represented by their molecular sequences. Phylogenetic trees are increasingly reconstructed from multiple genes or even whole genomes, a development reflected in the recently introduced term "phylogenomics". Hence, there is an urgent need to develop and deploy new techniques and computational solutions for calculating the computationally intensive scoring functions for phylogenetic trees. In this paper, we propose a dedicated computer architecture to compute the phylogenetic Maximum Likelihood (ML) function. The ML criterion represents one of the most accurate statistical models for phylogenetic inference, and computing the ML function accounts for 85% to 95% of total execution time in all state-of-the-art ML-based phylogenetic inference programs. We present the implementation of our architecture on an FPGA (Field Programmable Gate Array) and compare its performance to an efficient C implementation of the ML function on a high-end multi-core architecture with 16 cores. Our results are two-fold: (i) the initial exploratory implementation of the ML function on an FPGA, for trees comprising 4 to 512 sequences, yields an average speedup of a factor of 8.3 over execution on a single core and is faster than the OpenMP-based parallel implementation on up to 16 cores in all but one case; and (ii) we show that current FPGAs are capable of efficiently executing floating-point-intensive computational kernels.
Original language | English |
---|---|
Title of host publication | IPDPS 2009 - Proceedings of the 2009 IEEE International Parallel and Distributed Processing Symposium |
Place of Publication | Piscataway, NJ |
Publisher | IEEE |
Number of pages | 8 |
ISBN (Electronic) | 978-1-4244-3750-4 |
ISBN (Print) | 978-1-4244-3751-1 |
Publication status | Published - 2009 |
Externally published | Yes |
Event | 23rd IEEE International Parallel and Distributed Processing Symposium, IPDPS 2009 - Rome, Italy; Duration: 23 May 2009 → 29 May 2009; Conference number: 23 |
Conference
Conference | 23rd IEEE International Parallel and Distributed Processing Symposium, IPDPS 2009 |
---|---|
Abbreviated title | IPDPS 2009 |
Country/Territory | Italy |
City | Rome |
Period | 23/05/09 → 29/05/09 |