TY - GEN
T1 - Exploiting multi-grain parallelism for efficient selective sweep detection
AU - Alachiotis, Nikolaos
AU - Pavlidis, Pavlos
AU - Stamatakis, Alexandros
PY - 2012
Y1 - 2012
N2 - Selective sweep detection localizes targets of recent and strong positive selection by analyzing single nucleotide polymorphisms (SNPs) in intra-species multiple sequence alignments. Substantial advances in wet-lab sequencing technologies currently allow for generating unprecedented amounts of molecular data. The increasing number of sequences and number of SNPs in such large multiple sequence alignments cause prohibitively long execution times for population genetics data analyses that rely on selective sweep theory. To alleviate this problem, we have recently implemented fine- and coarse-grain parallel versions of our open-source tool OmegaPlus for selective sweep detection that is based on the ω statistic. A performance issue with the coarse-grain parallelization is that individual coarse-grain tasks exhibit significant run-time differences, and hence cause load imbalance. Here, we introduce a significantly improved multi-grain parallelization scheme which outperforms both the fine-grain and the coarse-grain versions of OmegaPlus with respect to parallel efficiency. The multi-grain approach exploits both coarse-grain and fine-grain operations by using available threads/cores that have completed their coarse-grain tasks to accelerate the slowest task by means of fine-grain parallelism. A performance assessment on real-world and simulated datasets showed that the multi-grain version is up to 39% and 64.4% faster than the coarse-grain and the fine-grain versions, respectively, when the same number of threads is used.
AB - Selective sweep detection localizes targets of recent and strong positive selection by analyzing single nucleotide polymorphisms (SNPs) in intra-species multiple sequence alignments. Substantial advances in wet-lab sequencing technologies currently allow for generating unprecedented amounts of molecular data. The increasing number of sequences and number of SNPs in such large multiple sequence alignments cause prohibitively long execution times for population genetics data analyses that rely on selective sweep theory. To alleviate this problem, we have recently implemented fine- and coarse-grain parallel versions of our open-source tool OmegaPlus for selective sweep detection that is based on the ω statistic. A performance issue with the coarse-grain parallelization is that individual coarse-grain tasks exhibit significant run-time differences, and hence cause load imbalance. Here, we introduce a significantly improved multi-grain parallelization scheme which outperforms both the fine-grain and the coarse-grain versions of OmegaPlus with respect to parallel efficiency. The multi-grain approach exploits both coarse-grain and fine-grain operations by using available threads/cores that have completed their coarse-grain tasks to accelerate the slowest task by means of fine-grain parallelism. A performance assessment on real-world and simulated datasets showed that the multi-grain version is up to 39% and 64.4% faster than the coarse-grain and the fine-grain versions, respectively, when the same number of threads is used.
UR - http://www.scopus.com/inward/record.url?scp=84866654437&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-33078-0_5
DO - 10.1007/978-3-642-33078-0_5
M3 - Conference contribution
AN - SCOPUS:84866654437
SN - 978-3-642-33077-3
VL - 1
T3 - Lecture Notes in Computer Science
SP - 56
EP - 68
BT - Algorithms and Architectures for Parallel Processing
A2 - Xiang, Yang
A2 - Stojmenovic, Ivan
A2 - Apduhan, Bernady O.
A2 - Wang, Guojun
A2 - Nakano, Koji
A2 - Zomaya, Albert Y.
PB - Springer
CY - Berlin, Heidelberg
T2 - 12th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2012
Y2 - 4 September 2012 through 7 September 2012
ER -