Deciphering the RNA structural code

Mehran Karimzadeh, Goodarzi Lab, University of California San Francisco; Vector Institute
Matvei Khoroshkin, Goodarzi Lab, University of California San Francisco

Meeting: Computational tools for deciphering the RNA structural code

We present two approaches, pyTEISER and Pythia, both inspired by context-free grammars (CFGs), that better capture structural RNA elements and their underlying regulons from transcriptomic measurements. pyTEISER scans a large sampled population of structural elements (modeled as CFGs) against the transcriptome and retains those that score highly on an information criterion as likely functional CFGs. A more exhaustive local search is then used to identify the best representation of the structural cis-regulatory elements that underlie the observed transcriptomic modulations (e.g., changes in RNA stability or processing). Pythia reimagines this concept by modeling context-free grammar rules as fixed-dilated convolutional layers in neural network architectures. This enables a neural network model to build informative context-free grammars from scratch, as opposed to scoring pre-existing ones. We show that Pythia passes this structural representation of RNA to a neural network capable of learning RNA-binding protein preferences with high accuracy and precision. Together, these frameworks allow us to reveal and interrogate the fundamental contribution of RNA structural elements to post-transcriptional regulatory programs in health and disease.
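To give some intuition for why a dilated filter can stand in for a base-pairing rule, here is a minimal sketch in plain Python. It is an illustration only and not Pythia's actual architecture: the function names, the fixed pairing table (Watson-Crick plus G-U wobble), and the hard 0/1 scoring are all assumptions made for this example. A width-2 filter with dilation d compares position i with position i + d, which is exactly the question "can these two bases pair?"; a stem is then a run of such pairs whose dilation shrinks by 2 at each step inward.

```python
# Illustrative only: a hand-fixed "filter", not a learned Pythia layer.
# Assumed pairing table: Watson-Crick pairs plus G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def dilated_pair_scan(seq, dilation):
    """Slide a width-2 filter with the given dilation along seq.

    Output position i is 1 if seq[i] and seq[i + dilation] can pair, else 0.
    """
    return [
        1 if (seq[i], seq[i + dilation]) in PAIRS else 0
        for i in range(len(seq) - dilation)
    ]


def has_stem(seq, start, outer_dilation, stem_len):
    """Check for stem_len nested pairs whose outermost pair is
    (start, start + outer_dilation).

    Each step inward shifts the left position by 1 and shrinks the
    dilation by 2, mirroring the nesting of a grammar rule S -> x S y.
    """
    for m in range(stem_len):
        d = outer_dilation - 2 * m
        i = start + m
        if d < 1 or i + d >= len(seq):
            return False
        if (seq[i], seq[i + d]) not in PAIRS:
            return False
    return True


if __name__ == "__main__":
    hairpin = "GGGAAAACCC"
    print(dilated_pair_scan(hairpin, 5))
    print(has_stem(hairpin, start=0, outer_dilation=9, stem_len=3))
```

In a learned model the filter weights would be fitted rather than fixed, so the network could discover which combinations of dilations and bases are informative instead of being told.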

Hani Goodarzi
Depts. of Biochemistry & Biophysics, Urology, University of California San Francisco

Primer: Capturing regulatory information encoded in RNA secondary structure

One in every 10 human proteins binds RNA. The sequence and structure of an RNA determine its affinity for these post-transcriptional regulators. However, our understanding of the grammar underlying RNA-protein interactions remains incomplete, in large part because the impact of RNA structure on regulatory interactions is often ignored. For more than a decade, we have focused on developing strategies that enable systematic identification of structural RNA elements. Here, we will showcase the traditional approaches that are often used to tackle cis-regulatory element discovery in general and structured element discovery in particular. We will discuss context-free grammars as versatile data structures that are ideally suited for modeling RNA elements, and how they can be leveraged for tackling this problem. We will also cover recent experimental and computational advances in the field that have prompted us to carve out new paths for tackling this problem.
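As a small illustration of why context-free grammars suit RNA elements, the sketch below encodes a toy hairpin grammar with two rules: S -> x S y, where x and y are bases that can pair, and S -> L, where L is a run of unpaired loop bases. Everything here is an assumption made for the example (the function name, the pairing table with G-U wobble, the minimum loop length of 3, and the greedy outside-in derivation); it is not the representation used by any particular tool.

```python
# Toy hairpin grammar, for illustration only:
#   S -> x S y   (x and y can base-pair)
#   S -> L       (L is a run of unpaired loop bases)
# Assumed pairing table: Watson-Crick pairs plus G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def derive(seq, min_loop=3):
    """Greedily derive a dot-bracket structure for seq under the toy grammar.

    The pairing rule is applied from the outside in for as long as the two
    end bases can pair and at least min_loop bases would remain inside;
    whatever is left is emitted as the loop.
    """
    if len(seq) >= min_loop + 2 and (seq[0], seq[-1]) in PAIRS:
        return "(" + derive(seq[1:-1], min_loop) + ")"
    return "." * len(seq)


if __name__ == "__main__":
    print(derive("GGGAAAACCC"))
```

The nesting of the recursive rule is what a regular expression or position weight matrix cannot express: the base at the 5' end of the stem constrains a base arbitrarily far away at the 3' end, and each pair wraps around the previous one.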