scNovel: a scalable deep learning-based network for novel rare cell discovery in single-cell transcriptomics.

Briefings in bioinformatics
Abstract

Single-cell RNA sequencing has achieved great success across biological research. Discovering novel cell types from single-cell transcriptomics has proven essential in biomedicine, yet it is time-consuming and requires prior knowledge. With the unprecedented boom in cell atlases, auto-annotation tools have become more prevalent owing to their speed, accuracy and user-friendly features. However, existing tools have mostly focused on general cell-type annotation and have not adequately addressed the challenge of discovering novel rare cell types. In this work, we introduce scNovel, a deep learning-based neural network that focuses specifically on novel rare cell discovery. By testing our model on diverse datasets with different scales, protocols and degrees of imbalance, we demonstrate that scNovel significantly outperforms previous state-of-the-art novel cell detection models and achieves the best AUROC performance (it is the only method whose averaged AUROC is above 94%, up to 16.26% higher than the second-best method). We further validate scNovel on a million-scale dataset to illustrate its scalability. Applying scNovel to a clinical COVID-19 dataset, we identify three potential novel subtypes of macrophages, in which deeper analysis also shows that the COVID-related differentially expressed genes have consistent expression patterns. We believe that our proposed pipeline will be an important tool for high-throughput clinical data in a wide range of applications.
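The abstract reports AUROC without saying how it applies to novel cell detection. A minimal sketch follows, assuming each cell receives a scalar novelty score (higher means more likely novel) and that ground truth marks which cells belong to an unseen type. The function name, the scores and the labels are hypothetical illustrations. This is the standard rank-based definition of AUROC, not scNovel's own scoring or evaluation code.

```python
from typing import Sequence


def auroc(novelty_scores: Sequence[float], is_novel: Sequence[bool]) -> float:
    """AUROC as the probability that a randomly chosen novel cell receives a
    higher novelty score than a randomly chosen known cell (ties count 0.5)."""
    pos = [s for s, y in zip(novelty_scores, is_novel) if y]
    neg = [s for s, y in zip(novelty_scores, is_novel) if not y]
    if not pos or not neg:
        raise ValueError("need at least one novel and one known cell")
    wins = 0.0
    # Pairwise comparison is quadratic; fine for a sketch, not for atlas-scale data.
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores for six cells; the last two form the rare novel type.
scores = [0.05, 0.10, 0.20, 0.60, 0.40, 0.90]
labels = [False, False, False, False, True, True]
print(auroc(scores, labels))  # 7 of 8 novel/known pairs ranked correctly -> 0.875
```

Because AUROC depends only on how novel cells rank against known cells, the rarity of the novel type does not change what a given value means. That property makes it a natural metric for comparisons across datasets with different degrees of imbalance.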

Year of Publication
2024
Journal
Briefings in bioinformatics
Volume
25
Issue
3
Date Published
03/2024
ISSN
1477-4054
DOI
10.1093/bib/bbae112
PubMed ID
38555470