Prior to 2015, I had a casual relationship, at best, with targeting RNA. The bulk of my nearly three decades of experience up to that point was with drugging protein targets using a variety of modalities, but principally small molecules. This typically meant engaging with their functional pockets and thereby blocking their function. The concepts and technologies for binding to proteins and modulating their function are multifarious and mature, developed over decades of successes and failures, amounting to a truly massive toolkit. The tools in this kit fall into a few broad categories: understanding protein structures and their functional significance, identifying ligands that bind into functionally significant pockets, and developing assays that confirm target engagement along with demonstrating the anticipated impact on cell biology. Familiar tools to solve familiar problems with proteins.
Then in 2015, I became smitten and eloped with RNA, setting out to build a company devoted to bringing industrial drug discovery concepts and methods to bear on the new problem of drugging RNA with small molecules. This was thrilling, but, as I surveyed the challenges of engaging RNA with small molecules, it became apparent that every tool, concept, and element of the protein-targeted drug discovery platform would have to be tweaked significantly or, often, completely rebuilt. Welcome to the RNA world.
Let’s start at the beginning: target identification. It’s all well and good to say, “we should drug RNA,” but which RNA? There are many potential targets that can only be addressed by engaging RNA, such as lncRNAs and microRNAs, but these are generally not well validated from a pharmacologic or therapeutic perspective. So, we instead chose to focus on mRNAs and pre‑mRNAs that encode proteins that are already considered validated and thus of high value. In zeroing in on a manageable structure, the specific challenge is to identify conserved and conformationally persistent RNA structures responsible for functional biology.
In RNA target identification, we first look for low-energy solutions to the “RNA base-pairing problem.” There are numerous applications that generate such solutions, including Francois Major’s MC-Fold, which accounts well for the non-canonical base-pairing that is ubiquitous in RNA. We compute across long sequences to assess the likely impact of context and then sort the low-energy solutions by inferred dynamic proximity. Finally, we estimate the Shannon entropy of each proposed base pair and subject those data to principal components analysis to visualize the overall conformational freedom of the sub-target.
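To make the entropy step concrete, here is a minimal sketch of one way a per-base-pair Shannon entropy could be estimated from a set of predicted structures. It is an illustration under stated assumptions, not our production pipeline: the input format (dot-bracket strings with free energies in kcal/mol), the Boltzmann weighting at 37 °C, the two-state (paired versus not paired) entropy, and every function name are choices made for this example.

```python
import math
from collections import defaultdict

# Assumption: energies in kcal/mol, weighted at 37 degrees C (310.15 K).
RT_KCAL = 0.0019872 * 310.15


def parse_pairs(dot_bracket):
    """Return the set of 0-based (i, j) base pairs in a dot-bracket string."""
    stack, pairs = [], set()
    for j, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            pairs.add((stack.pop(), j))
    return pairs


def pair_probabilities(structures):
    """structures: list of (dot_bracket, energy) tuples for one sequence.

    Each structure gets a Boltzmann weight; a pair's probability is the
    summed normalized weight of the structures that contain it.
    """
    e_min = min(energy for _, energy in structures)
    weights = [math.exp(-(energy - e_min) / RT_KCAL) for _, energy in structures]
    total = sum(weights)
    probs = defaultdict(float)
    for (dot_bracket, _), weight in zip(structures, weights):
        for pair in parse_pairs(dot_bracket):
            probs[pair] += weight / total
    return dict(probs)


def binary_entropy(p):
    """Shannon entropy in bits of a paired / not-paired two-state variable."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def pair_entropies(structures):
    """Map each proposed base pair to its entropy across the ensemble."""
    return {pair: binary_entropy(p)
            for pair, p in pair_probabilities(structures).items()}


# Hypothetical two-structure ensemble with equal energies.
example = [("((...))", -3.0), ("(.....)", -3.0)]
entropies = pair_entropies(example)
```

In this toy ensemble the outer pair appears in both structures and so carries essentially no entropy, while the inner pair appears in only one of two equally weighted structures and carries the maximum of one bit. Low-entropy pairs are the conformationally persistent ones; a matrix of such values across sub-targets is the kind of input one could then hand to a principal components analysis.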
But these computational structural predictions need to be tested and refined experimentally. With proteins, there are no general methods for determining structure inside cells. In contrast, with RNA there is an extensive literature on inferring 2D structure from low-stoichiometry modification of the RNAs using chemical probes, followed by sequencing to map those events. Two commonly used methods employ either dimethyl sulfate (DMS) or various acylating agents collectively described as “SHAPE” reagents. In both cases, sequencing reveals the sites of modification because reverse transcriptase, upon encountering a modified nucleotide, either stops or misincorporates at that site. Inasmuch as probes preferentially modify folded RNA at loops and bulges, RNA-seq reveals detailed structural constraints that enable substantial refinements of computationally predicted endogenous RNA structures. With this information, we can make meaningful choices about the potential ligandability of sub‑structures inside cells. With these models in hand, we can begin to map structures onto the functional nodes of the transcript, principally splicing and translation.
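As a rough illustration of how misincorporation counts become structural constraints, the sketch below computes a background-subtracted mutation rate per nucleotide from a probe-treated and an untreated sample. This is a simplified, generic version of the readout. The function name, the depth cutoff, and the clamping of negative values to zero are assumptions made for the example, and real pipelines add normalization and quality filtering that are omitted here.

```python
def reactivity_profile(treated_mut, treated_depth,
                       control_mut, control_depth, min_depth=1000):
    """Per-nucleotide reactivity as treated minus untreated mutation rate.

    Each argument is a list with one entry per nucleotide: mutation counts
    and read depths for the probe-treated and untreated samples. Positions
    with too little coverage in either sample are reported as None.
    """
    profile = []
    for tm, td, cm, cd in zip(treated_mut, treated_depth,
                              control_mut, control_depth):
        if td < min_depth or cd < min_depth:
            profile.append(None)  # not enough reads to estimate a rate
        else:
            # Clamp at zero: background can exceed signal by chance.
            profile.append(max(0.0, tm / td - cm / cd))
    return profile


# Hypothetical counts for three nucleotides.
profile = reactivity_profile(
    treated_mut=[50, 5, 10], treated_depth=[1000, 1000, 500],
    control_mut=[5, 5, 1], control_depth=[1000, 1000, 1000],
)
```

Here the first position is clearly reactive (consistent with a loop or bulge), the second shows no signal above background (consistent with a paired base), and the third is masked for low coverage. Profiles of this kind can be used to constrain or re-rank the computed structures.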
To render these structural insights actionable, we synthesize the candidate sub-structures (Arrakis sub-targets aka ASTs) and use the same computational and probing methods to demonstrate that the synthetic RNAs recapitulate the structures of the endogenous RNAs. The structural fidelity of the ASTs gives us the confidence that screening of those RNAs will yield ligands that we might reasonably expect to also bind the endogenous RNAs.
Screening brings its own challenges. We sampled several screening platforms in the course of our efforts, including fragment screening using ligand-observed nuclear magnetic resonance (NMR), DNA-encoded libraries (DEL), and size-exclusion chromatography-mass spectrometry (SEC‑MS), also called affinity selection-mass spectrometry (AS-MS). Each of these platforms is thoroughly vetted in the protein world, but using them on RNA can yield surprising results. Each yielded some insights; for example, fragment screening gave good hits, but advancing them against the conformational dynamism of RNA was problematic, and unproductive interactions between the RNA target and the DNA barcodes complicated DEL screening. SEC‑MS, when run against good ASTs, proved a reliable source of new, drug-like, RNA-targeted small-molecule ligands. With nearly 100 screens completed, more than 80% of the targets that we have screened using SEC‑MS have provided confirmed ligands.
But do those hits bind the same structures in the wild? That is, given that the probing data above have reassured us that the screened structures accurately reflect the endogenous structures in cells, we should be able to demonstrate that the ligands we found also engage the endogenous RNA structures. To this end, we developed a photoaffinity and RNA-seq protocol termed photoaffinity evaluation of RNA ligation-sequencing (PEARL-seq) that, via competition of the ligand with a cognate photoprobe, shows whether the ligand binds (or does not!) the targeted RNA in cells or cell lysates. This information is critical in trying to understand whether the cellular phenomenology associated with the ligand can be reasonably attributed to the ligand’s engagement of the targeted RNA.
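The logic of a competition readout can be sketched generically. The example below is not the PEARL-seq analysis itself, which is not detailed here; it only assumes that photoprobe crosslinking at a site is measured as a read count, that counts are normalized by total library size, and that a competing ligand lowers the normalized signal. All names and numbers are hypothetical.

```python
def fraction_competed(probe_only_reads, probe_plus_ligand_reads,
                      probe_only_total, probe_plus_ligand_total):
    """Fraction of photoprobe signal lost at a site when ligand is added.

    Read counts at the site are divided by each library's total reads so
    that samples sequenced to different depths are comparable. Returns
    None when there is no probe-only signal to compete against.
    """
    signal_without = probe_only_reads / probe_only_total
    signal_with = probe_plus_ligand_reads / probe_plus_ligand_total
    if signal_without == 0:
        return None
    return 1.0 - signal_with / signal_without


# Hypothetical counts: 400 crosslink reads without ligand, 100 with ligand,
# each from a library of one million reads.
competed = fraction_competed(400, 100, 1_000_000, 1_000_000)
```

A value near 1 would suggest that the ligand displaces the photoprobe from the targeted site, while a value near 0 would suggest that it does not bind there under the conditions tested.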
It is perhaps worth noting that data for a ligand’s engagement of an RNA can be employed to other ends. The experiments above, using appropriate primers, are focused on the target RNA of interest. But sequencing the whole transcriptome, after exposure of the cells or cell lysates to photoprobe and photolysis, can tell us what other RNAs are engaged and serve as a measure of inter-RNA binding selectivity. However, binding un-targeted RNAs can be considered a feature rather than a bug: we use photoprobes to identify new RNAs that are engaged and may be candidate targets of interest. We term this empirical approach to target ID “PEARL-diving.”
Many compounds exhibit cellular activity, often broadly consistent with modulating the intended RNA target, but we must develop biological data packages that persuade us that the observed cellular biology can be attributed to the engagement of the RNA-targeted small molecule (rSM) with the intended RNA. PEARL-seq gives us insight into RNA engagement in the cell, but not into functional outcomes. One go-to method is to use CRISPR protocols to remove or alter the RNA sub-target; if the rSM induces the same biology in these altered cells, that’s a strong indication that the biological effect derives from an off-target pathway, whereas loss of the effect supports on-target action.
We have shared our vision from the very beginning: to find new therapeutic opportunities in the RNA world using RNA-targeted small-molecule medicines (rSMs). To achieve this, we have adapted much of the widely available protein-oriented drug discovery platform and, in many spots, invented new methods. Many elements of our platform rely on methods that have emerged only in recent years: next-generation sequencing, CRISPR, PEARL-seq, HiBiT, SEC-MS, and chemical probing of RNA. Our ambitious journey has been to integrate all of these components into a single discovery platform that we can operate every day to crack the problem of drugging RNA. As ever, we proceed fearlessly forward.