We revisit the Double Digest problem, which arises in the sequencing of large DNA strings and asks for the relative positions of the cut sites of two different enzymes. We first show that Double Digest is strongly NP-complete, improving upon previous results that established only weak NP-completeness. Even the (experimentally more meaningful) variation in which coincident cut sites are disallowed turns out to be strongly NP-complete. In a second part, we model errors in data as they occur in real-life experiments: we propose several optimization variations of Double Digest that model partial cleavage errors, and we show APX-completeness for most of these variations. In a third part, we investigate these variations under the additional restriction that coincident cut sites are disallowed, and we show that it is NP-hard even to find feasible solutions in this case, which makes it impossible to guarantee any approximation ratio at all.
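For intuition, the combinatorial core of Double Digest can be illustrated with a brute-force sketch (our own illustration, not taken from the paper; all names are ours): given the fragment-length multisets A and B from the two single digests and C from the double digest, search for orderings of A and B whose merged cut sites reproduce C. This is feasible only for tiny instances, consistent with the strong NP-completeness shown in the paper.

```python
from itertools import permutations

def cut_positions(ordering):
    """Internal cut sites implied by a left-to-right fragment ordering
    (prefix sums, excluding the final endpoint)."""
    pos, total = [], 0
    for frag in ordering[:-1]:
        total += frag
        pos.append(total)
    return pos

def double_digest(A, B, C):
    """Brute-force search for orderings of A and B whose merged cut
    sites yield exactly the double-digest fragments C.
    Returns a pair of orderings, or None if no arrangement works."""
    if sum(A) != sum(B) or sum(B) != sum(C):
        return None  # all three digests must cover the same total length
    total = sum(A)
    target = sorted(C)
    for pa in set(permutations(A)):
        cuts_a = set(cut_positions(pa))
        for pb in set(permutations(B)):
            # merge cut sites of both enzymes; a set union means
            # coincident cut sites collapse into a single site
            sites = sorted({0, total} | cuts_a | set(cut_positions(pb)))
            frags = sorted(sites[i + 1] - sites[i] for i in range(len(sites) - 1))
            if frags == target:
                return pa, pb
    return None
```

For example, with A = [2, 2, 2], B = [3, 3], and C = [1, 1, 2, 2], the search finds the arrangement with cut sites {2, 4} and {3}, whose merged sites {0, 2, 3, 4, 6} produce fragments [2, 1, 1, 2]. The set union above also shows why the no-coincident-cut-sites variant studied in the paper is a genuine restriction: coincident sites collapse, changing the fragment count of C.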
Title of host publication: Computing and Combinatorics
Subtitle of host publication: 9th Annual International Conference, COCOON 2003, Big Sky, MT, USA, July 25–28, 2003, Proceedings
Editors: Tandy Warnow, Binhai Zhu
Publication status: Published - 2003
Series name: Lecture Notes in Computer Science
Cieliebak, M., Eidenbenz, S., & Woeginger, G. (2003). Double Digest revisited: Complexity and approximability in the presence of noisy data. In T. Warnow & B. Zhu (Eds.), Computing and Combinatorics: 9th Annual International Conference, COCOON 2003, Big Sky, MT, USA, July 25–28, 2003, Proceedings (pp. 519-527). (Lecture Notes in Computer Science; Vol. 2697). Springer. https://doi.org/10.1007/3-540-45071-8_52