Fast exact gap-affine partial order alignment with POASTA.
Abstract | MOTIVATION: Partial order alignment is a widely used method for computing multiple sequence alignments, with applications in genome assembly and pangenomics, among many others. Current algorithms to compute the optimal, gap-affine partial order alignment do not scale well to larger graphs and sequences. While heuristic approaches exist, they do not guarantee optimal alignment and sacrifice alignment accuracy. RESULTS: We present POASTA, a new optimal algorithm for partial order alignment that exploits long stretches of matching sequence between the graph and a query. We benchmarked POASTA against the state-of-the-art on several diverse bacterial gene datasets and demonstrated an average speed-up of 4.1x and up to 9.8x, using less memory. POASTA's memory scaling characteristics enabled the construction of much larger POA graphs than previously possible, as demonstrated by megabase-length alignments of 342 Mycobacterium tuberculosis sequences. AVAILABILITY AND IMPLEMENTATION: POASTA is available on GitHub at . |
Year of Publication | 2025 |
Journal | Bioinformatics (Oxford, England) |
Date Published | 01/2025 |
ISSN | 1367-4811 |
DOI | 10.1093/bioinformatics/btae757 |
PubMed ID | 39752324 |