Data-driven insights to inform splice-altering variant assessment.

American journal of human genetics
Abstract

Disease-causing genetic variants often disrupt mRNA splicing, an intricate process that is incompletely understood. Thus, accurate inference of which genetic variants will affect splicing and what their functional consequences will be is challenging, particularly for variants outside of the essential splice sites. Here, we describe a set of data-driven heuristics that inform the interpretation of human splice-altering variants (SAVs) based on the analysis of annotated exons, experimentally validated SAVs, and the currently understood principles of splicing biology. We defined requisite splicing criteria by examining around 202,000 canonical protein-coding exons and 19,000 experimentally validated splicing branchpoints. This analysis defined the sequence, spacing, and motif strength required for splicing, with 95.9% of the exons examined meeting these criteria. By considering over 12,000 experimentally validated variants from the SpliceVarDB, we defined a set of heuristics that inform the evaluation of putative SAVs. To ensure the applicability of each heuristic, only those supported by at least 10 experimentally validated variants were considered. This allowed us to establish a measure of spliceogenicity: the proportion of variants at a location (or motif site) that affected splicing in a given context. This study makes considerable advances toward bridging the gap between computational predictions and the biological process of splicing, offering an evidence-based approach to identifying SAVs and evaluating their impact. Our splicing heuristics enhance the current framework for genetic variant evaluation with a robust, detailed, and comprehensible analysis by adding valuable context over traditional binary prediction tools.
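
As an illustration of the spliceogenicity measure described in the abstract (the proportion of variants at a location or motif site that affected splicing, reported only where at least 10 experimentally validated variants support it), a minimal sketch might look like the following. The function name, the record layout, and the site labels are assumptions made for this example and are not taken from the paper or from SpliceVarDB; the variant counts are invented toy data.

```python
from collections import defaultdict

# Minimum number of validated variants required per site, per the abstract.
MIN_SUPPORT = 10


def spliceogenicity(variants, min_support=MIN_SUPPORT):
    """Return {site: proportion of variants that altered splicing}.

    `variants` is an iterable of (site, altered) pairs, where `site` is a
    hypothetical label for a location or motif position and `altered` is
    True if the variant was experimentally shown to affect splicing.
    Sites with fewer than `min_support` variants are left out.
    """
    counts = defaultdict(lambda: [0, 0])  # site -> [altering, total]
    for site, altered in variants:
        counts[site][1] += 1
        if altered:
            counts[site][0] += 1
    return {
        site: altering / total
        for site, (altering, total) in counts.items()
        if total >= min_support
    }


# Toy data (invented): 12 variants at one site, 4 at another.
toy_variants = (
    [("donor+5", True)] * 9
    + [("donor+5", False)] * 3
    + [("acceptor-3", True)] * 2
    + [("acceptor-3", False)] * 2
)

result = spliceogenicity(toy_variants)
print(result)
```

In this toy data the site with 12 variants is reported, while the site with only 4 variants is dropped because it falls below the support threshold.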

Year of Publication
2025
Journal
American journal of human genetics
Date Published
03/2025
ISSN
1537-6605
DOI
10.1016/j.ajhg.2025.02.012
PubMed ID
40056912