K-mer analysis of long-read alignment pileups for structural variant genotyping.

Nature Communications
Authors
Abstract

Accurately genotyping structural variant (SV) alleles is crucial to genomics research. We present kanpig, a novel method for genotyping SVs that leverages variant graphs and k-mer vectors to rapidly generate accurate genotypes. Benchmarking against the latest SV datasets shows that kanpig achieves a single-sample genotyping concordance of 82.1%, significantly outperforming existing tools, which average 66.3%. We explore kanpig's use in multi-sample projects by testing it on 47 genetically diverse samples and find that kanpig accurately genotypes complex loci (e.g., SVs neighboring other SVs) and produces higher genotyping concordance than other tools. Kanpig requires only 43 seconds to process a single sample's 20x-coverage long reads and can be run on PacBio or Oxford Nanopore long reads.

Year of Publication
2025
Journal
Nature Communications
Volume
16
Issue
1
Pages
3218
Date Published
04/2025
ISSN
2041-1723
DOI
10.1038/s41467-025-58577-w
PubMed ID
40185777