Integrating representation learning, permutation, and optimization to detect lineage-related gene expression patterns.

Nature Communications
Abstract

Recent barcoding technologies allow reconstructing lineage trees while capturing paired single-cell RNA-sequencing (scRNA-seq) data. Such datasets provide opportunities to compare gene expression memory maintenance through lineage branching and pinpoint critical genes in these processes. Here we develop Permutation, Optimization, and Representation learning based single Cell gene Expression and Lineage ANalysis (PORCELAN) to identify lineage-informative genes or subtrees where lineage and expression are tightly coupled. We validate our method using synthetic data and apply it to recent paired lineage and scRNA-seq data of lung cancer in a mouse model and embryogenesis of mouse and C. elegans. Our method pinpoints subtrees giving rise to metastases or new cell states, and genes identified as most informative about lineage overlap with known pathways involved in lung cancer progression. Furthermore, our method highlights differences in how gene expression memory is maintained through divisions in cancer and embryogenesis, thereby providing a tool for studying cell state memory through divisions across biological systems.

Year of Publication
2025
Journal
Nature Communications
Volume
16
Issue
1
Pages
1062
Date Published
01/2025
ISSN
2041-1723
DOI
10.1038/s41467-025-56388-7
PubMed ID
39870610