Deep perturbation & cell communication modeling

Mohammad Lotfollahi¹
Technical University of Munich

David Fischer²
Institute of Computational Biology, Helmholtz Zentrum München

Deep interpretable perturbation modeling in single cell genomics¹; Learning cell communication from spatial graphs of cells²

¹ Recent advances in multiplexing single-cell transcriptomics across experiments are enabling the high-throughput study of drug and genetic perturbations. However, an exhaustive exploration of the combinatorial perturbation space is experimentally unfeasible, so computational methods are needed to predict, interpret and prioritize perturbations. Here, we present the Compositional Perturbation Autoencoder (CPA), which combines the interpretability of linear models with the flexibility of deep-learning approaches for single-cell response modeling. CPA encodes and learns transcriptional drug response across different cell types, doses, and drug combinations. The model produces easy-to-interpret embeddings for drugs and cell types, allowing drug similarity analysis and predictions for unseen dosages and drug combinations. We show CPA accurately models single-cell perturbations across compounds, dosages, species, and time. We further demonstrate that CPA predicts combinatorial genetic interactions of several types, implying it captures features that distinguish different interaction programs. Finally, we demonstrate CPA allows in-silico generation of 5,329 missing combinations (97.6% of all possibilities) with diverse genetic interactions. We envision our model will facilitate efficient experimental design by enabling in-silico response prediction at the single-cell level.
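
To make the idea of compositional, interpretable embeddings concrete, the following is a minimal sketch and not the CPA implementation. It assumes that a perturbed cell's latent state is a basal state plus a cell-type embedding plus dose-scaled drug embeddings; the additive form, the `log1p` dose scaling, the random embeddings, and all names (`compose_latent`, `drug_A`, `cell_type_1`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim = 4

# Hypothetical embeddings; in a trained model these would be learned parameters.
drug_embeddings = {
    "drug_A": rng.normal(size=latent_dim),
    "drug_B": rng.normal(size=latent_dim),
}
cell_type_embeddings = {
    "cell_type_1": rng.normal(size=latent_dim),
}


def dose_response(dose):
    """Assumed monotone dose scaling; zero dose gives zero effect."""
    return np.log1p(dose)


def compose_latent(basal, cell_type, drug_doses):
    """Add a cell-type embedding and dose-scaled drug embeddings to a basal state."""
    z = basal + cell_type_embeddings[cell_type]
    for drug, dose in drug_doses.items():
        z = z + dose_response(dose) * drug_embeddings[drug]
    return z


# Usage: compose a drug combination that need not have been observed together.
basal = rng.normal(size=latent_dim)
z_combo = compose_latent(basal, "cell_type_1", {"drug_A": 1.0, "drug_B": 0.5})
```

Under this assumed additive form, the latent shift of a combination is the sum of the single-drug shifts; any non-additive interaction would have to be captured by a nonlinear decoder that maps the latent state back to gene expression.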

² I will discuss statistical dependencies between the molecular states of cells in space. In particular, I will discuss node-centric expression modeling (NCEM), a computational method based on graph neural networks, as a means of reconciling variance attribution and cell-cell communication modeling. We use these models at varying complexity across spatial assays, such as immunohistochemistry and MERFISH, and across biological systems to demonstrate that the statistical cell–cell dependencies discovered by NCEMs are plausible signatures of known molecular processes underlying cell communication. Altogether, this graphical model of cellular niches is a step towards understanding emergent tissue phenotypes. Statistically, NCEMs can be interpreted as a means of replacing the i.i.d. assumption on cellular vectors, which is commonly used in models of scRNA-seq data, with structured dependencies between cells.
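
As a rough illustration of what a node-centric model looks like, the sketch below predicts each cell's expression from its own cell type and from which cell types occur in its spatial neighborhood. It is a deliberately simple linear stand-in rather than the graph neural network described in the abstract, and the simulated data, the radius of 10 units, and all variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells, n_types, n_genes = 200, 3, 5

# Simulated spatial data: 2D positions, cell-type labels, expression vectors.
coords = rng.uniform(0, 100, size=(n_cells, 2))
cell_types = rng.integers(0, n_types, size=n_cells)
expression = rng.normal(size=(n_cells, n_genes))

# One-hot encoding of each cell's own type.
type_onehot = np.eye(n_types)[cell_types]

# Spatial graph: connect cells closer than an assumed radius, without self-loops.
dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
radius = 10.0
adjacency = (dists < radius) & ~np.eye(n_cells, dtype=bool)

# Neighborhood features: is each cell type present among a cell's neighbors?
neighbor_presence = (adjacency.astype(float) @ type_onehot > 0).astype(float)

# Node-centric linear model: expression ~ own type + neighborhood composition.
design = np.hstack([type_onehot, neighbor_presence])
coef, *_ = np.linalg.lstsq(design, expression, rcond=None)
predicted = design @ coef
```

Because the neighborhood columns enter the design matrix, each cell's prediction depends on its neighbors, which is one simple way to replace the assumption of independent cells with structured dependencies.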

Fabian Theis and Maren Buettner
Institute of Computational Biology, Helmholtz Zentrum München

Primer: Latent space learning in single cell genomics: Current approaches and challenges

Modeling cellular state as well as dynamics, e.g. during differentiation or in response to perturbations, is a central goal of computational biology. Single-cell technologies now give us easy and large-scale access to state observations at the transcriptomic, epigenomic and, more recently, spatial level. In particular, they allow resolving potential heterogeneities due to asynchronicity of differentiating or responding cells. Profiles across multiple conditions such as time points, space and replicates are being generated, with a series of implications across biology and medicine.

Most computational methods for single-cell genomics operate on an intermediate, often nonlinear representation of the high-dimensional data, such as a cell-cell kNN graph or a more general latent space. Early on, interpretation of these representations led to models of cellular differentiation, for example by pseudotemporal ordering or by mapping time information. Hence, latent space modeling and manifold learning have become popular tools for learning overall variation in single-cell gene expression, more recently also across data sets and modalities.
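
A cell-cell kNN graph of the kind mentioned above can be sketched in a few lines. The example below is illustrative only: it simulates a count matrix, uses a linear PCA projection (via SVD) as the latent space even though nonlinear representations are common in practice, and finds neighbors by brute-force Euclidean distance. The matrix sizes, the 10 components, and k = 5 are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated count matrix: 100 cells x 50 genes.
counts = rng.poisson(2.0, size=(100, 50)).astype(float)

# Log-transform and center each gene before projecting.
log_expr = np.log1p(counts)
centered = log_expr - log_expr.mean(axis=0)

# Latent space: top principal components obtained from the SVD.
u, s, vt = np.linalg.svd(centered, full_matrices=False)
n_components = 10
latent = u[:, :n_components] * s[:n_components]

# kNN graph: for each cell, the indices of its k nearest cells in latent space.
k = 5
dists = np.linalg.norm(latent[:, None, :] - latent[None, :, :], axis=-1)
np.fill_diagonal(dists, np.inf)  # exclude each cell from its own neighbor list
knn_indices = np.argsort(dists, axis=1)[:, :k]
```

Row i of `knn_indices` lists the neighbors of cell i; downstream steps such as clustering or pseudotemporal ordering typically operate on a graph of this form.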

After a short review of these approaches, I will discuss how latent space learning can be achieved using variants of autoencoders, with applications ranging from denoising and imputation to learning perturbations. I will finish with a short outlook towards spatial modeling and the interpretability of latent projections under perturbations, e.g. to identify optimal drug combinations targeting a certain cell state.