Welcome to the RBPome

RNA has consistently broken dogmas, owing to its multitude of unexpected functions. However, RNA does faithfully adhere to one rule: it always functions through interactions with proteins. The studies in this issue focus on the rapidly expanding repertoire of diverse RNA–protein interactions [1] and their functional roles and physiological consequences, both from the perspective of RNA-binding proteins (RBPs) and from the vantage point of RNAs, coding and noncoding, that interface with RBPs.

RNA–protein interactions are fascinating for many reasons, one being their role in evolution, from the earliest life forms to the most complex organisms (examples reviewed in [2–4]). For example, the interactions between pre-mRNA and proteins fine-tune alternative splicing in a manner that can gradually create new protein functionalities without the need to create additional genes and without affecting existing proteins [4–6]. Moreover, there has been an emergence of numerous noncanonical RBPs (that is, proteins not previously thought to function as RBPs) that are influenced by interactions with RNA transcripts, coding and noncoding alike [7, 8]. In fact, genome-wide footprinting articles in this issue [9–11] demonstrate a vast and diverse landscape of RNA–RBP complexes that play key regulatory roles.

With recent advances in technology, together with powerful combined experimental and computational developments, we have witnessed unprecedented new insights into the diverse and dynamic interactions that occur between RNA and proteins. These range from new functions of well-established RBPs to the molecular sequences and structures harbored in RNA that drive interactions with proteins. Despite this great progress, the present issue of Genome Biology demonstrates that there are still unresolved aspects of RBP biology.

Technology opening up new horizons of the RBPome

A major challenge in understanding RNA–protein interactions has been to remove non-specific abundant RNAs when mapping protein-binding sites in low-abundance RNAs. For instance, in order to understand how tissue specificity of splicing is determined, it was important to identify protein-binding sites on pre-mRNAs without contamination from mRNAs or ribosomal RNAs. This was achieved with the UV crosslinking and immunoprecipitation (CLIP) method, which employs stringent purification steps to identify specific binding sites to the exclusion of non-specific events [12].

As advances were made in CLIP technology [13, 14], integrative computational approaches were developed to combine CLIP with genome-wide studies of alternative splicing to define regulatory maps for specific RBPs. These maps unraveled position-dependent principles that were capable of predicting the functional binding sites of RBPs [15–17]. Moreover, the regulatory elements positioned around alternative exons were used to derive a splicing code with the capacity to predict tissue-specific splicing with reasonable accuracy [18, 19].

Technology developed to better understand RNA–protein interactions is not limited to CLIP, but is also maturing on several additional fronts [11, 14]. Indeed, presented in this issue are a number of novel genome-wide experimental and analysis techniques that yield new insights into RBPology [9, 10, 20–25]. RNA pull-downs (also known as RIP-seq) and RNA-footprinting methods are revealing key RBP interactions that do not fit the mold of classic RBPs. Technologies will therefore need to advance further if we are to better understand the interplay of noncoding transcripts and RBPs. Why are long noncoding RNA (lncRNA) transcripts alternatively spliced by RBPs like their translated mRNA counterparts? Can lncRNAs serve as decoys, scaffolds or allosteric effectors of RBPs [26]?

Future studies will need to focus on additional technologies in order to identify the specific sequences and structures of RBP–RNA complexes on a larger scale. For example, new methods that determine the genome-wide structure of RNA molecules in vivo will be instrumental in achieving this goal [27, 28]. The many other key aspects of the RBPome that remain to be solved include understanding the combinatorial interactions of proteins on a given RNA substrate. Can multiple interactions provide combinatorial control? Methods to understand the structure and interactions of RBPs on full-length RNAs will be needed in the future.

New RNA species detected by high-throughput sequencing

The ability to detect low-abundance RNAs by using high-throughput sequencing has led to the discovery of thousands of noncoding RNAs (discussed in [29]). One facet of the noncoding transcriptome that has been the focus of intensive research is lncRNAs. As perhaps expected, lncRNAs have been found to break the rules and form numerous noncanonical interactions with proteins such as chromatin regulatory complexes, cohesins and transcription factors (see, for example, [30–32]). Moreover, lncRNAs have been shown to influence the regulatory dynamics of small RNAs through sponging [33] and other mechanisms. However, it is not known whether lncRNA–protein interactions are generally required to mediate the functions of lncRNAs or, vice versa, whether lncRNAs modulate RBPs.

The advent of new techniques that can identify RNA–DNA and RNA–protein interactions is revealing a new regulatory layer of lncRNAs. These techniques, which include CHART [34], RAP [35] and ChOP [36], are conceptually similar to ChIP. Instead of mapping DNA–protein interactions, however, these methods determine the localization of RNA on DNA and, moreover, enable the study of proteins that interact with RNA at these sites. Experiments of this nature will provide missing clues to the regulatory principles of how lncRNAs arrive at their target sites, the proteins they interact with to get there and which sequences are specifically required. Collectively, lncRNA–protein interactions perhaps herald a new code for RNA localization around the genome and how it influences local and distal epigenetic aspects of nuclear architecture through these interactions. This is an area certain to be of intense development and research focus in the future.

New RNA modifications detected by high-throughput sequencing

Modified ribonucleic bases have been an area of active interest for over half a century. Yet recent genomic sequencing technologies have expanded the modified RNA space beyond classic tRNA and rRNA modifications to incorporate mRNA and lncRNA transcripts as well (reviewed in [37, 38]). The field of epitranscriptomics is in its infancy, and many questions remain to be answered. Why are these modifications so prevalent? How do they influence RNA-binding interactions and RNA structure? What is their function? Moreover, these same questions also apply to the regulatory layers formed by RNA editing [39]. Specifically, Levanon and colleagues demonstrate that only a very small fraction of sites edited by ADAR are conserved.

Understanding disease-causing mutations

Impact upon human disease is perhaps the most important new frontier in the study of the RBPome. We need to use classic genetic approaches and population studies to identify mutations in both RBPs and RNA substrates themselves. Great progress has been made in finding mutations in RBPs associated with disease risk, such as the RBPs FUS and TDP-43 in amyotrophic lateral sclerosis (reviewed in [40]). Yet more work will be needed to understand what changes occur to RNA substrates in disease and their effect on structure and function, although progress in this area is already underway (reviewed in this issue in [41]).

Several studies in this issue have taken on the challenge of examining the intersection between the RBPome and human disease. Mort, Mooney and colleagues identify single-base mutations that affect the alternative splicing of key regulatory mRNAs [25]. Kechavarzi and Janga explore the dysregulation of RBP-encoding transcripts in cancer and the resulting changes in protein interaction networks [42]. Finally, Tuschl, Wessels and colleagues use PAR-CLIP data to identify differential microRNA targeting that is correlated with breast cancer subtypes [43]. These studies are archetypes for the many future studies that will be needed to understand the influence of the RBPome on human health and disease.

The next challenge will be to integrate DNA and RNA biology to understand how the various transcriptional and post-transcriptional mechanisms cooperate to orchestrate gene expression. This knowledge will be crucial in order to benchmark the mutations that cause disease by changing gene expression.

Cometh the hour, cometh the special issue

The field’s rapid diversification and growth, together with an increasing impact on human health, makes for perfect timing for a Genome Biology issue focused on the RBPome!