Nona, a novel multimodal masked modeling framework for functional genomics
Senior ML Scientist
ReLU,
We present Nona, a unifying multimodal masked modeling framework for functional genomics. Nona is a neural network model that operates on both DNA sequence and epigenetic tracks such as DNase-seq, ChIP-seq, and RNA-seq at base-pair resolution. By leveraging a flexible masking strategy, Nona can predict any subset of masked DNA and/or tracks from the unmasked subset. Nona supports existing sequence-to-function models, and their applications such as variant effect prediction. Beyond this, Nona enables multiple novel application modes including 1) context-aware predictions 2) functional language modeling 3) functional genotyping. Altogether, Nona is a versatile framework that extends sequence-to-function and masked language modeling to novel applications in regulatory genomics.