
NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Madame Curie Bioscience Database [Internet]. Austin (TX): Landes Bioscience; 2000-2013.


SPLICEOSOMAL RNA INFRASTRUCTURE: The Network of Splicing Components and their Regulation by miRNAs


RNA Infrastructure and Networks edited by Lesley J. Collins.
©2011 Landes Bioscience and Springer Science+Business Media.

The RNA infrastructure model highlights the major roles played by RNA-based networks in cellular biology. One of the principal concepts behind the RNA-infrastructure is that proteins shared between RNP machineries network their processes in a temporal (over the cell cycle) and spatial (across the cell, or intercellular) manner. In order to dig deeper into the RNA-infrastructure we need to examine the networking aspects of RNPs in more detail. The eukaryotic spliceosome is an excellent example of an RNA machine that contains RNA-protein and RNA-RNA interactions, as well as temporal and spatial networking to other processes. This chapter will examine some different types of spliceosomal networks that involve RNPs and illustrate how current networking tools can be used to dissect the many faces of the RNA-infrastructure.

INTRODUCTION

RNA-based metabolism underpins the regulation and action of proteins in eukaryotic cells.1-3 Studies continue to reveal enormous genetic complexity through processes such as RNA interference (RNAi), alternative splicing and epigenetics, but we now face a massive challenge in mapping this RNA complexity onto already known molecular biology.2,4 We have regulatory RNAs (e.g., miRNAs) targeting specific genes, or in some cases thousands of genes; and we have processing RNAs (e.g., snRNAs) which interact with proteins in large complexes (for example snRNAs interact with multiple proteins to form snRNPs which themselves interact to form the spliceosome). Associated with these processes are modifying ncRNAs such as snoRNAs. These were once thought to modify only rRNA, snRNA and tRNA, but they are now suspected to be more important in the regulation of many other genes. There are also long ncRNAs (>200 nucleotides) implicated in epigenetic marking (reviewed in refs.4-6). All of these different types of RNAs work with proteins and together they make up a network called the 'RNA-Infrastructure'.1

Until recently, investigation of genetic networks was largely restricted to proteins, since protein-protein interactions (PPI) are important indicators of metabolic pathways,7 with networks often used for drug discovery,8 plant development9 and cancer studies.10,11 However, the increasing use of high-throughput technologies to gather gene expression and regulation information (e.g., see refs. 12, 13) places increasing importance on connecting RNA information to existing protein information.14 Instead of just protein-protein interactions, we also need to consider RNA-protein and RNA-RNA interactions in order to use gene networks to make more accurate predictions. Understanding the networking properties of miRNA-target information, for instance, can reveal the complicated facets of RNA-based regulation. As another example, the Snu13p spliceosomal protein of the yeast Saccharomyces cerevisiae binds to U4 snRNA, but is also involved in the processing of the U3 snoRNA, having different interactions depending on the complex in which it is present.15

The spliceosome, comprising general and stage-specific proteins (many of which are regulated by miRNAs) and medium-length snRNAs (some of which are modified by medium-length snoRNAs), makes an ideal model for studying RNA-based networks. For model species such as humans and S. cerevisiae we can examine networks produced from protein-protein interactions (e.g., from BioGrid16), RNA-RNA and RNA-protein (miRNA-target and other) information. This chapter will use examples from the literature and our own studies to highlight how network analysis becomes even more relevant to understanding biological function when RNA-based regulation is added.

THE SPLICEOSOME AS AN EXAMPLE OF THE RNA-INFRASTRUCTURE

The macromolecular spliceosome is responsible for the removal of introns from gene transcripts (pre-mRNA) prior to translation.17 About 200-300 proteins are part of the spliceosome during the splicing cycle,17,18 but it is the inclusion of 5 critical small nuclear RNAs (snRNAs) which allows catalysis in the actual splicing reactions.19,20 These catalytic snRNAs must first bind to specific proteins to form the U-class snRNPs (small nuclear ribonucleoproteins) before the spliceosome assembles. As shown in Figure 1, during splicing, the U1 and U2 snRNPs recognize the intron-exon boundaries and form the prespliceosome (called the A complex). In the next stage the U4/U6.U5 snRNP tri-complex interacts and the B complex is formed. The U6 snRNA then breaks its interactions with the U4 snRNA and instead binds to the U2 snRNA. Both the U1 snRNP and the U4 snRNP leave the spliceosome, allowing the formation of the catalytic unit. The mRNA is spliced by the action of the U2/U6 snRNAs and then the exons are ligated. At this stage a protein complex called the 'Exon Junction Complex' (EJC) is deposited on the ligated exons, which signals that splicing is complete. The spliced mRNA then goes through further checking before it is exported from the nucleus to the cytoplasm where translation takes place. We can readily see different types of RNA interactions taking place within the splicing cycle, some of them between the mRNA and splicing components (e.g., U1, U2, U5 interacting with the mRNA) and some between supporting components (e.g., snRNAs and their proteins).

Figure 1. The spliceosome cycle (based on the major spliceosomal cycle in ref. 17). The major cycle is shown here and although the minor cycle is believed to be very similar due to many of the proteins being shared, one important difference is that the U11 and U12 ...

'Major splicing' (or U2-splicing) as described above, is only one form of splicing and over the last decade the importance of so-called 'minor' splicing has become clearer.21 Although the cycle is similar, other snRNAs (U11, U12, U4atac and U6atac) replace their counterparts (U1, U2, U4, U6 respectively) and recognise a different set of introns. Although the U5 snRNA is a part of both splicing systems it is very likely that there are some different interactions. One example is that the U11 and U12 form a di-snRNP before interacting with the mRNA (in contrast, U1 and U2 snRNPs interact independently during major splicing). Minor splicing first came to light in plants but since then, it has been characterised in many groups of eukaryotes and may play an important role in alternative splicing.

The spliceosome (both major and minor versions) is a dynamic 'organelle' that displays characteristics of the RNA-infrastructure. Interactions between the RNA and protein components have both temporal and spatial aspects, especially in the biogenesis of different sub-complex components. The dynamic network of the spliceosome means that it cannot presently be investigated in its entirety. Instead we have to break it into either temporal (e.g., A, B, C complexes) or spatial (U2 snRNP-based, U4/U6 tri-snRNP-based) partitions. In a similar manner we can also break our network analysis into physical interactions (spatial) and regulatory interactions (temporal). Many proteomic studies (e.g., refs. 22-24) have investigated parts of the spliceosome using mass spectrometry and microscopy, giving us a more detailed look at how different components interact. However, we are still limited in how we can peek into the working spliceosome as most of these methods are not compatible with intact RNA. The integration of proteomics (i.e., mass spectrometry) and transcriptomics/expression data (i.e., from mRNA sequencing) is a powerful combination of two genomic-scale technologies that is opening up further analysis of macromolecular complexes such as the spliceosome. Unfortunately, many of the RNA interactions are not collated into any database (especially the non-mRNA interactions), so the literature remains the largest source for RNA-RNA and modifying RNA-protein interactions. For this reason many macromolecular studies begin with a few proteins from a sub-complex and work up from there because, as will be seen further in this chapter, RNP networks can get very complicated.

RNA-PROTEIN INTERACTION NETWORKS

Before examining the complex RNA-protein interaction networks within the spliceosome, let us first look at RNA interaction networks in general. Interaction networks imply a physical connection between interacting partners. We can find many such networks in general life (e.g., subway networks, computer networks) and mapping biological networks has become one way in which we can investigate different cellular connections. One of the most common forms is the protein-protein interaction (PPI) network.

Data for PPI networks can be gathered experimentally by a variety of methods including large-scale tandem affinity purification coupled to mass spectrometry (TAP-MS) and yeast two-hybrid methods.25 Currently the most studied PPI organism is the yeast S. cerevisiae,7 but large datasets are now available for other eukaryotes such as humans, mice, Drosophila melanogaster (fruitfly) and nematode (Caenorhabditis elegans),16 and also for some bacteria such as Bacillus anthracis, Francisella tularensis and Yersinia pestis.26

A gene interaction network is made up of the following parts (Fig. 2). Each node represents a gene (or, in PPIs specifically, a protein) and is connected to other nodes by edges representing interactions. The connectivity (or degree, k) of a node is the number of connections that node makes with other nodes. Nodes which connect to numerous other nodes (i.e., have a high degree or connectivity, e.g., node k in Fig. 2A) are often called hubs. Network edges can be weighted in different ways (with numerical or colour differences) representing the experimental procedure used to determine the interaction. This distinction between different experiments is important as interactions may differ due to experimental biases. The same approach can be applied to RNP networks where RNA-protein interactions are included. However, in these cases the RNA-associated nodes and edges are generally treated as 'special' types of protein nodes, meaning that RNA-specific features (such as specific target areas) may be lost during network construction. Current PPI networks are also rife with noise in that they contain large numbers of false positives and false negatives.27
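The terms above (node, edge, degree, hub) can be sketched in a few lines of Python. This is only an illustrative toy: the node labels echo Figure 2, but the edges, the function names and the hub threshold `min_degree` are our own assumptions and are not taken from the figure or from any dataset.

```python
from collections import defaultdict

def build_network(edges):
    """Return an undirected adjacency map {node: set of neighbours}."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency

def degrees(adjacency, nodes):
    """Degree k of every node; unconnected nodes get k = 0."""
    return {n: len(adjacency.get(n, ())) for n in nodes}

def hubs(degree_by_node, min_degree):
    """Nodes whose degree reaches an (arbitrary) hub threshold."""
    return sorted(n for n, k in degree_by_node.items() if k >= min_degree)

# Invented edges: k is wired to most nodes, F is left unconnected.
nodes = ["x", "w", "d", "k", "j", "g", "m", "F"]
edges = [("k", "x"), ("k", "w"), ("k", "d"), ("k", "j"), ("k", "g"), ("g", "m")]

adjacency = build_network(edges)
degree_by_node = degrees(adjacency, nodes)
hub_nodes = hubs(degree_by_node, min_degree=4)  # threshold is an assumption

print(degree_by_node)
print(hub_nodes)
```

There is no universal cut-off for calling a node a hub, so the threshold should be chosen relative to the degree distribution of the network at hand.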

Figure 2. Components of PPI networks. x, w, d, k, j, g, m are nodes representing genes which are connected by edges (solid lines) indicating an interaction between them. F is an unconnected node. In network A, a scale-free network, node k has a high degree of connectivity ...

PPI networks in general can have different topologies (reviewed in ref. 25). Scale-free networks (Fig. 2A) are characterised by a power-law degree distribution; that is, a few hubs have a very high degree (many interactions), distinct from the majority of nodes that have few interactions.25,28 Random networks (Fig. 2C) are often constructed for comparison with PPI networks to show the significance of hubs. The nodes in these random networks tend to have similar numbers of links, no hubs and a Poisson degree distribution. Hierarchical networks (Fig. 2B) contain hubs and defined modules and are considered to reflect biological systems more accurately.25 Hierarchical networks can be considered a different representation of a scale-free network but can be more useful in indicating interaction pathways between different protein complexes. There are also many statistics that can be used to describe PPI networks, and they are equally applicable to RNP networks. These are described in Table 1.
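To make the contrast between a hub-dominated and a Poisson degree distribution concrete, the following minimal sketch compares the observed fraction of high-degree nodes in a toy network with the Poisson expectation for the same mean degree. The toy degrees and all function names are hypothetical; real analyses would fit the distribution over thousands of nodes.

```python
import math
from collections import Counter

def degree_distribution(degree_by_node):
    """Fraction of nodes observed at each degree k."""
    counts = Counter(degree_by_node.values())
    total = len(degree_by_node)
    return {k: counts[k] / total for k in sorted(counts)}

def poisson_pmf(k, mean_degree):
    """Probability of degree k in a random network with this mean degree."""
    return math.exp(-mean_degree) * mean_degree ** k / math.factorial(k)

# Hypothetical star-like network: one hub and six single-link nodes.
toy_degrees = {"hub": 6, "a": 1, "b": 1, "c": 1, "d": 1, "e": 1, "f": 1}

observed = degree_distribution(toy_degrees)
mean_degree = sum(toy_degrees.values()) / len(toy_degrees)
expected_hub = poisson_pmf(6, mean_degree)

# A degree-6 node is far more frequent here than a Poisson model predicts.
print(observed[6], expected_hub)
```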

Table 1. Network terms and characteristics used in this chapter.

Some studies (e.g., see ref. 28) indicate that RNP networks follow a power-law degree distribution and hence fit into the scale-free and hierarchical topologies. Randomly generated RNP networks (summarised in ref. 28) display the expected Poisson degree distribution, where most nodes carry an average degree, and hence it could be surmised that the graph properties shown by RNP networks offer some selective evolutionary advantage. Another difference is that the random RNP graphs show a clustering coefficient of 0.05 (largely unclustered, see Table 1 for definition of scoring) whereas 'real world' networks (those with hierarchical and scale-free topologies) generally show higher clustering (~0.3). In our example in Figure 2 the scale-free network (Fig. 2A) has a clustering coefficient of 0.496, whereas the random network (Fig. 2C) has a clustering coefficient of 0.185, although both networks contain the same number of nodes, edges and isolated nodes (statistics calculated with Network Analyzer29).
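As a rough guide to what a clustering coefficient measures, the sketch below uses the common local definition (the fraction of a node's neighbour pairs that are themselves connected) averaged over all nodes. We assume that nodes with fewer than two neighbours score 0; the definition in Table 1 and the exact convention used by Network Analyzer may differ, and the toy graph is invented.

```python
def local_clustering(adjacency, node):
    """Fraction of neighbour pairs of `node` that are directly connected."""
    neighbours = adjacency[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # assumption: too few neighbours to form a pair
    links = sum(
        1
        for a in neighbours
        for b in neighbours
        if a < b and b in adjacency[a]
    )
    return 2.0 * links / (k * (k - 1))

def average_clustering(adjacency):
    """Mean of the local coefficients over all nodes."""
    return sum(local_clustering(adjacency, n) for n in adjacency) / len(adjacency)

# Invented example: a triangle (a, b, c) with one extra node d hanging off c.
triangle_with_tail = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

print(average_clustering(triangle_with_tail))
```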

It is interesting to see how adding ncRNAs, such as regulatory RNAs, changes PPI networks. Figure 3 shows a network of some human U2 proteins and their miRNA connections visualised with the network software Cytoscape.30-32 We can see here many patterns common to miRNA-based regulation. Some proteins (e.g., RBM8a) are regulated by a single miRNA whereas others (e.g., SIP1, SNRPE and SFRS1) are regulated by many miRNAs. It should be noted that these networks indicate overall interactions and do not factor in that some miRNAs may regulate their targets only in specific tissues or at specific times (i.e., these networks are not graphed temporally or spatially). We can see, however, that some miRNAs can connect proteins in a network (e.g., DDX5, ISY1, RHEB, HNRNPK and PNN) whereas these proteins may remain unconnected nodes in a strictly PPI network (this is not really the case with these proteins, as connecting nonsplicing proteins have been removed for this exercise).
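The way shared miRNAs can link proteins that would otherwise be unconnected can be sketched as a projection of a miRNA-target map onto its protein nodes. All identifiers below (`miR-A`, `P1`, and so on) are placeholders and deliberately do not correspond to real miRNAs, proteins or reported interactions.

```python
from itertools import combinations

def proteins_linked_by_mirnas(mirna_targets):
    """Map each protein pair to the set of miRNAs that target both."""
    links = {}
    for mirna, targets in mirna_targets.items():
        for p, q in combinations(sorted(targets), 2):
            links.setdefault((p, q), set()).add(mirna)
    return links

def regulators_per_protein(mirna_targets):
    """Number of miRNAs targeting each protein."""
    counts = {}
    for targets in mirna_targets.values():
        for protein in targets:
            counts[protein] = counts.get(protein, 0) + 1
    return counts

# Placeholder data: one-to-one and many-to-many patterns.
mirna_targets = {
    "miR-A": {"P1"},
    "miR-B": {"P2", "P3"},
    "miR-C": {"P2", "P3", "P4"},
}

links = proteins_linked_by_mirnas(mirna_targets)
counts = regulators_per_protein(mirna_targets)
print(links)
print(counts)
```

Here P1 has a single regulator and gains no new neighbours, whereas P2, P3 and P4 become connected through their shared miRNAs even if no protein-protein edge exists between them.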

Figure 3. Networks of some splicing proteins connected by miRNAs. Some miRNA-target interactions have a simple one-to-one relationship whereas other proteins are regulated by many miRNAs and some miRNAs regulate many proteins. miRNA interaction information was ...

One downside with these networks is that computational approaches are presently limited in their ability to resolve the connections in an RNA-centric manner. Typically miRNAs are known to target the 3' UTR region of a gene, but others may target the 5' UTR region.33 Although these are still all miRNA-mRNA interactions, they are quite different in their nature and hence their biological relevance. Our way to resolve this matter presently is to weight the edges for different connection types in a manner similar to determining different experiments in PPI networks.
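A minimal sketch of weighting edges by connection type is given below. The edge categories follow the text, but the numerical weights, the class and function names, and the example nodes are purely hypothetical choices made for illustration.

```python
from dataclasses import dataclass

# Hypothetical weights: real values would have to be justified biologically.
EDGE_WEIGHTS = {"3'UTR": 1.0, "5'UTR": 0.5, "protein-protein": 2.0}

@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    kind: str  # one of the keys of EDGE_WEIGHTS

    @property
    def weight(self):
        return EDGE_WEIGHTS[self.kind]

def weighted_degree(edges, node):
    """Sum of edge weights over all edges touching `node`."""
    return sum(e.weight for e in edges if node in (e.source, e.target))

def edges_of_kind(edges, kind):
    """Keep only the edges of one connection type."""
    return [e for e in edges if e.kind == kind]

# Placeholder nodes, not real miRNAs or proteins.
edges = [
    Edge("miR-A", "P1", "3'UTR"),
    Edge("miR-B", "P1", "5'UTR"),
    Edge("P1", "P2", "protein-protein"),
]

print(weighted_degree(edges, "P1"))
print(edges_of_kind(edges, "5'UTR"))
```

Keeping the connection type on the edge, rather than folding it into the node, means that RNA-specific information such as the targeted region survives network construction.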

On a larger scale, regulated protein networks (e.g., RNP networks consisting of proteins and their miRNA regulators, but not other ncRNAs) display characteristics similar to those of PPIs from transcription factor proteins.34 This study34 compared data from an early database of RNA-protein interactions from six species (bacteria, E. coli; yeast, S. cerevisiae; nematode, C. elegans; fruitfly, D. melanogaster; mouse and human) to transcription factor interactions. These networks displayed power-law behaviour (a few miRNAs regulate many proteins) in a scale-free fashion, but there appeared to be a maximal degree, meaning that very highly connected nodes were not present (for human there was a maximum of 10). Current miRNA informational databases (e.g., miRsel,35 miRBase,36 Rfam,37,38 miRecords39) have since swelled, especially with newly reported interactions from high-throughput sequencing. It will be interesting to see if the degree restriction holds on analysis of these much larger datasets.

Network comparisons can often be difficult because of network size,27 since size reflects not only the number of nodes but also the (multiple) interactions between them. Because cellular biology is hierarchical, we can examine a network by splitting it into modules and analysing these for network motifs. One example of a sub-graph motif is where two transcription factors regulate two target genes in parallel, possibly due to gene duplication (or perhaps genome duplication). For RNP networks, a duplicated protein may in fact be regulated by the original miRNA or a second miRNA may evolve. With genome duplication a more complicated situation may occur where miRNAs from both copies cross-regulate until such time as the individual miRNAs and their targets co-evolve away from each other. The identification of other sub-graph motifs40 that represent different modes of RNA-RNA and RNA-protein interactions is an important area for future research. With sub-graph motifs we can begin to model the occurrence and distribution of the different interactions for different organisms and perhaps gain insight into how such networks represent the way that each organism has evolved.

We can take the network motif concept further with the extraction of small sub-graphs or motifs from the larger network. Sub-graphs (or graphlets) are defined as sections of the network with a defined topology that are present in a higher abundance than expected from a random network with the same degree distribution.34,40 Counting small connected subgraphs especially in large PPI networks can be computationally demanding as the number of possible subgraphs of n-nodes increases exponentially with n. For this reason, calculations tend to use subgraphs with five to seven nodes.27 Additional statistics such as the Relative Graphlet Frequency (RGF) and the Graphlet Degree Distribution Agreement (GDDA) score (described in Table 1) can be used to count the occurrence of these subgraphs which in turn can be used to compare networks.27,41 These local approaches to network analysis and comparison are most often more successful than complete network analyses because of the incomplete and noisy nature of biological networks.40,41
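The idea of counting small connected subgraphs can be illustrated with the two connected three-node graphlets, the open path and the triangle. This brute-force sketch enumerates every node triple, which also shows why the cost grows so quickly with network and subgraph size. It is a simplified stand-in and not the RGF or GDDA statistics referred to above; the toy graph and function names are our own.

```python
from itertools import combinations

def count_three_node_graphlets(adjacency):
    """Count connected 3-node subgraphs: open paths and triangles."""
    paths = 0
    triangles = 0
    for a, b, c in combinations(sorted(adjacency), 3):
        n_edges = (b in adjacency[a]) + (c in adjacency[a]) + (c in adjacency[b])
        if n_edges == 3:
            triangles += 1
        elif n_edges == 2:
            paths += 1  # two edges among three nodes always form a path
    return {"path": paths, "triangle": triangles}

def relative_frequencies(counts):
    """Share of each graphlet type among all graphlets counted."""
    total = sum(counts.values())
    return {name: (n / total if total else 0.0) for name, n in counts.items()}

# Invented example: a triangle (a, b, c) with one extra node d attached to c.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

graphlet_counts = count_three_node_graphlets(toy)
print(graphlet_counts)
print(relative_frequencies(graphlet_counts))
```

Comparing such frequency profiles between a real network and degree-matched random networks is one way to decide whether a subgraph is over-represented.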

REGULATORY (EXPRESSION MODULATED) NETWORKS

Biological networks can also be used in a 'directed' fashion to represent a pathway of events. In RNP metabolism in particular we can use directed networks to indicate the immediate and downstream effects of RNA regulation. This means that the interaction between the RNA and the target goes one way (e.g., the miRNA affects the regulation of the target protein).
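Downstream effects in a directed network can be found by following edges outward from a regulator. The sketch below does this with a breadth-first search; the node names are placeholders and the function name `downstream` is our own.

```python
from collections import deque

def downstream(directed_edges, start):
    """Return all nodes reachable from `start`, in breadth-first order."""
    successors = {}
    for source, target in directed_edges:
        successors.setdefault(source, []).append(target)
    reached = []
    visited = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in successors.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                reached.append(nxt)
                queue.append(nxt)
    return reached

# Placeholder network: miR-X regulates P1, which in turn acts on P2 and P3.
directed_edges = [("miR-X", "P1"), ("P1", "P2"), ("P1", "P3"), ("P4", "P1")]

print(downstream(directed_edges, "miR-X"))
print(downstream(directed_edges, "P2"))
```

Because edges are one-way, P4 acts on P1 but is not itself downstream of miR-X.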

We know that a gene may be regulated by RNA, but it is only after we connect that protein to its interacting partners (both protein and RNA) that we can see possible large-scale and even phenotypic effects. Regulation is not only by RNA interference (i.e., miRNAs or siRNAs) but can also include alternative splicing, nucleotide modification (e.g., methylation, pseudouridylation), and alternative transcription initiation and termination.2 We concentrate here on miRNA-based regulation, but similar RNA-target and network issues apply for these other types of RNA regulation. Most methods of discovering miRNA-mRNA interactions (experimental and computational) focus on down-regulatory interactions where RNA silences the gene of interest. However, there is also RNA-mediated up-regulation and mix-regulation, where a gene is up-regulated in one instance but down-regulated in another.42 An example of this is let7 and the synthetic miRcxcr4, which up-regulate target mRNAs upon cell cycle arrest but down-regulate them in proliferating cells.42,43

In the past, regulatory RNA-target interactions were typically discovered first at the single-gene level by painstaking (or sometimes accidental) molecular biology experiments. As with all ncRNA classes, once a type of miRNA or siRNA was characterised, further members of that group could be discovered computationally, quickly building up a more genomic (i.e., multi-gene) view of the regulation network for that particular regulatory RNA. However, computational approaches are limited in their ability to resolve the true connections between the RNA and the target in a sensitive or specific way. This may be because computational algorithms for RNA-target connections are written either for specific species or for specific RNAs and thus cannot perform perfectly in a general situation. On the experimental side, we now have high-throughput sequencing, which can quickly gather genome-wide small RNA information (i.e., from 21-25 nucleotide long RNA sequences including miRNAs, siRNAs, piRNAs and other short classes). So long as we have an accurate genome to map these data against, establishing accurate connection edges becomes a little easier. In one example, MacLean et al. (2010)28 generated networks from data collected from high-throughput sequencing of Arabidopsis short RNAs. This directed graph of 39,994 short RNA nodes and 18,054 long RNA nodes (primary transcripts) was connected by 38,149 source edges (a match to the positive strand of the genome) and 140,035 target edges (a match to the complementary strand of the genome). As we would expect, the network showed a power-law (i.e., scale-free) property and a high clustering coefficient (0.32 compared with random network expectations of 0.05). One other interesting property was that of disassortativity,44 where high degree nodes connect preferentially with low degree nodes (as opposed to assortative graphs, where high degree nodes are connected to other high degree nodes, as seen in internet social networks44).
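Assortativity can be quantified as the correlation between the degrees found at the two ends of each edge: values near -1 indicate disassortative mixing and values near +1 assortative mixing. The sketch below computes this as a Pearson correlation over both orientations of every undirected edge. It is a generic illustration on invented toy graphs, not the procedure or data of the study discussed above.

```python
import math

def degree_assortativity(edges):
    """Pearson correlation of the degrees at either end of each edge."""
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    # Count every undirected edge in both orientations.
    xs, ys = [], []
    for a, b in edges:
        xs += [degree[a], degree[b]]
        ys += [degree[b], degree[a]]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when every node has the same degree
    return cov / math.sqrt(var_x * var_y)

# Toy star: one hub joined only to low-degree nodes (disassortative).
star = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d")]
# Toy graph where like connects to like: a triangle plus a separate pair.
like_with_like = [("p", "q"), ("q", "r"), ("p", "r"), ("s", "t")]

print(degree_assortativity(star))
print(degree_assortativity(like_with_like))
```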

In plants, double stranded RNA can initiate a range of sequence-specific gene silencing pathways.14 For example, short small interfering RNAs (siRNAs) are 21 nucleotides (nt) long while the 'long' siRNAs are 24 nt long with each being produced from different pathways.14 High-throughput sequencing does not distinguish between the pathways used to generate the sequences, it just produces the sequence. When these RNAs are connected to protein targets, length, target specificity and location are often key factors in estimating which process was used to produce each sequence.

However, when we take some smaller networks of splicing proteins we can see a network that displays more assortative behaviour. The SF3B complex is a multi-protein assembly that is an integral part of the U2 snRNP and also the U11/U12 snRNP in minor splicing. It consists of seven proteins (SF3B1, SF3B2, SF3B3, SF3B4, SF3B5, PHF5A and SF3B14, also known as SF3b155, SF3b145, SF3b130, SF3b49, SF3b10, SF3b14b and P14 respectively). When a network of these proteins is drawn including known first-neighbour interacting proteins and miRNAs (Fig. 4), we can see that all but one of the proteins have a host of miRNAs, most of which associate with only one of the SF3B proteins. There are some miRNAs and a few proteins (SF3A2, TCERG1, CDC5L, TCA1, SMNDC1 and RNPS1) which interact with more than one SF3B protein, whereas the SF3B proteins themselves are highly interacting. This assortative behaviour is perhaps indicative of a tightly bound multi-protein complex as opposed to a general protein interaction network.

Figure 4. The seven proteins of the SF3B complex (large circles) and their regulatory miRNAs. Six of the proteins SF3B1, SF3B2, SF3B3, SF3B4, SF3B5 and PHF5A have interactions with many miRNAs and some other proteins. However, SF3B14 has interactions only with ...

It is interesting that SF3B14 is the only one of the seven SF3B proteins not to have any known regulating miRNAs (as yet). This protein is thought to be positioned within the inner cage of the SF3B complex structure, which has very few openings.45 It is thus likely that a conformational change allows the SF3B14 protein to interact with the U2 (or U12) snRNA and the mRNA (Table 2), two extremely critical functions. It is as yet unknown what this means in terms of the regulation of SF3B14 and the SF3B complex. It is suggested that SF3B plays a critical role at or near the spliceosome catalytic core,46 since SF3a and SF3b reside around the intron branchpoint, are present prior to the first step and are completely absent prior to the second step of splicing arrestment.47 One belief is that SF3 functions to restrain the branchpoint hydroxyl from the cell's chemistry until the catalytic core of the spliceosome is properly assembled.

Table 2. Protein and RNA interactions within the U2-SF3B complex. pp: number of protein-protein interactions; pr: number of protein-RNA interactions. U2 interactions are taken from ref. 60; U12 interactions from BioGrid16 (version 3.0.68).

With a more comprehensive picture of the regulatory aspects behind our RNP interaction networks becoming available, we are discovering that there is a high level of regulatory feedback through multiple mechanisms in many species. We can also use directed networks to illustrate feedback loops between miRNAs and the targets they regulate. Take for example the splicing factor SF2/ASF (the SFRS1 gene) and one of its 40 miRNAs, the up-regulated hsa-miR-7 (miR-7).48 Figure 5A is a simple representation of how SFRS1 promotes the Drosha cleavage step of miR-7 maturation and the mature miR-7 represses SFRS1 by binding to the 3'UTR of its transcript. Another gene, EGFR (epidermal growth factor receptor), is also repressed by miR-7, leading to the downstream repression of the AKT pathway.49 Both activation and repression of the genes are indicated graphically and it is not hard to understand the pathways involved. However, the real world is very different. If we add in just a couple of other genes that are known to be repressed by miR-7, including BCL2 and MAPK3 (both very prominent in cancer studies) (Fig. 5B), we can see that many other miRNAs are involved and that some of these miRNAs regulate more than one of these genes. In the case of SFRS1 it is known that three other miRNAs (miR-29b, miR-221 and miR-222) are up-regulated by SFRS1, and not all at the Drosha-cleavage step.48 Increased levels of mature miR-29 but not pri-miR-29 were observed upon SFRS1 induction, indicating that this regulation occurs at the post-Drosha stage. SFRS1 is also known to be feedback regulated through other means including unproductive alternative splicing and inhibition of translation initiation (summarised in ref. 48). We can also see that two upstream regulators of the AKT pathway (IRS1 and IRS2)49 are regulated by miR-7 and have connections to the BCL2 and AKT1 proteins. Here in this network we can find connections to SFRS1 and the other splicing proteins to which it connects, as well as key components of the AKT pathway. With all these mechanisms to take into account, our network graphs could get even more complex if we added in 'everything'. However, as we learn more about how each type of RNA regulation affects others, we can trim down our representations to give us a less detailed, but clearer picture.
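The simple relationships described for Figure 5A can be written as a signed, directed edge list (+1 for activation, -1 for repression), from which feedback loops and the net sign of a path can be read off. The edge signs below follow the text, except that the activating edge from EGFR to the AKT pathway is our inference from the statement that repressing EGFR represses the pathway. The function names are our own.

```python
# +1 = activates, -1 = represses (signs as described in the text;
# the EGFR -> AKT pathway sign is an inference, see lead-in).
SIGNED_EDGES = {
    ("SFRS1", "miR-7"): +1,
    ("miR-7", "SFRS1"): -1,
    ("miR-7", "EGFR"): -1,
    ("EGFR", "AKT pathway"): +1,
}

def two_node_feedback_loops(signed_edges):
    """Find pairs regulating each other; loop sign is the product of signs."""
    loops = []
    for (a, b), sign_ab in signed_edges.items():
        if a < b and (b, a) in signed_edges:
            loops.append((a, b, sign_ab * signed_edges[(b, a)]))
    return loops

def net_effect(signed_edges, path):
    """Overall sign of regulation along a path of consecutive edges."""
    effect = 1
    for source, target in zip(path, path[1:]):
        effect *= signed_edges[(source, target)]
    return effect

loops = two_node_feedback_loops(SIGNED_EDGES)
akt_effect = net_effect(SIGNED_EDGES, ["miR-7", "EGFR", "AKT pathway"])

print(loops)       # a loop sign of -1 denotes negative feedback
print(akt_effect)  # -1: miR-7 has a net repressive effect on the pathway
```

Adding the further targets and miRNAs of Figure 5B only requires more entries in the edge dictionary, which is one reason directed edge lists scale better than hand-drawn diagrams.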

Figure 5. The relationship between hsa-miR-7 (miR-7) miRNA and two of its targets can be displayed in a simple manner as in (A), showing how miR-7 regulates the SFRS1 protein but in return the SFRS1 protein activates production of miR-7. Another protein EGFR is ...

RNA interaction networks have been studied using Bayesian analysis, integrating miRNA targeting information with expression profiles.42 This is because predictions based on sequence may not be sufficient to determine the complex interactions of miRNA-mRNA pairs. Adding in information about the expression state (i.e., the conditions in which the RNA sample was taken) and grouping different expression profiles based on these states aids network accuracy. In Bayesian network analysis,42 the interactions between miRNAs and mRNAs are defined as dependencies of their states encoded in a graphical representation with miRNA and mRNA nodes connected by directed edges. The presence or absence of a directed edge from a miRNA to an mRNA indicates whether the state of the mRNA is dependent on or independent of that of the miRNA, implying a regulatory relationship. Candidate relationship states (either A and B are independent or A regulates B) are scored against the expression data and the one that receives the highest score is used to represent the relationship. This method uses expression values between conditions, so we can determine if A is up, down, or mixed between states (up in one and down in another). This work was done with microarray data but can also be used for mRNA sequencing data, giving an example where incorporating other biological knowledge into protein-RNA networks can aid in the discovery of both strong and subtle interactions.42
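The idea of scoring 'A and B are independent' against 'A regulates B' can be sketched with a standard Bayesian score for discrete data (a K2-style marginal likelihood with a uniform Dirichlet prior). This is a generic textbook score chosen for illustration and is not claimed to be the scoring function of ref. 42; the discretised up/down states below are invented. Note that such a score only measures dependence, so the miRNA-to-mRNA direction is supplied as prior knowledge.

```python
from collections import Counter
from math import lgamma

def log_marginal_likelihood(child_states, parent_states=None, n_states=2):
    """K2-style log score of the child given an optional single parent."""
    if parent_states is None:
        parent_states = [None] * len(child_states)  # no parent: one context
    pair_counts = Counter(zip(parent_states, child_states))
    parent_totals = Counter(parent_states)
    score = 0.0
    for parent_value, total in parent_totals.items():
        score += lgamma(n_states) - lgamma(total + n_states)
        for (p, _child), n in pair_counts.items():
            if p == parent_value:
                score += lgamma(n + 1)
    return score

def compare_models(mirna_states, mrna_states, n_states=2):
    """Pick the higher-scoring relationship for one miRNA-mRNA pair."""
    independent = log_marginal_likelihood(mrna_states, None, n_states)
    regulated = log_marginal_likelihood(mrna_states, mirna_states, n_states)
    return "regulates" if regulated > independent else "independent"

# Invented expression states across eight conditions.
mirna = ["up", "up", "up", "up", "down", "down", "down", "down"]
anticorrelated = ["down", "down", "down", "down", "up", "up", "up", "up"]
unrelated = ["up", "down", "up", "down", "up", "down", "up", "down"]

print(compare_models(mirna, anticorrelated))
print(compare_models(mirna, unrelated))
```

The extra edge is only accepted when it explains the data well enough to outweigh the added model complexity, which is what guards against calling every sequence-predicted pair a true interaction.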

SPATIOTEMPORAL DYNAMIC RNA NETWORKS

In order to investigate how RNAs really interact in cellular mechanisms, networks need to be analysed in a spatiotemporal manner (i.e., they must take in both the cellular location of the interaction and the timing/ordering of the interactions). This is especially so when studying multicellular organisms, where there are many signals passed between the cells. This extracellular signalling not only permits the cells to be organised, it also permits signals from the environment to be processed and acted upon. However, spatiotemporal organisation is not limited to multicellular situations. The eukaryotic cell is organised into numerous sub-compartments, some of them membrane bound (e.g., mitochondria and the nucleus) and others considered more dynamic (e.g., the nucleolus and the spliceosome). Prokaryotes also have sub-cellular localisation of RNP complexes (e.g., the ribosome and the SRP (signal recognition particle), which targets signal peptides and conducts them to the protein-conducting channel on the plasma membrane (or endoplasmic reticulum membrane in eukaryotes)).50 Dynamic compartments in particular offer interesting insights into biological networks since they must form and dissociate at particular points in the cell cycle or under set cellular conditions (e.g., the formation of a nucleolus around a nucleolar organiser region).

The spliceosome is also a dynamic compartment: initial components recognise the exon-intron boundaries, then other splicing components are recruited to complete the catalytic steps of splicing and release the ligated exons. If we analyse interactions from large databases, many of the components will carry Gene Ontology annotations, a library of terms describing function and cellular localisation. However, these definitions are 'cell-wide' in that they tell us that a protein is involved in the spliceosome, but not which interactions are present only during particular splicing stages (Fig. 1). Using data from mass spectrometry experiments (Complex A,22 Complex B,23 Complex C24), PPI information downloaded from BioGRID16 and RNA-protein interactions from splicing-related publications,17,51 we can examine the dynamic nature of RNP networks (Fig. 6). The initial proteins for the network in Figure 6 are U2 snRNP-associated, since the U2 snRNP remains within the spliceosome for the majority of the splicing cycle. In the first stage of splicing the exon-intron boundaries are recognised by the U1 and U2 snRNPs (Fig. 6A). The U5 and U6 snRNPs then join the spliceosome in Complex B (Fig. 6B). All but one of the proteins are present in the B complex, when the actual working spliceosome is formed. The U4 and U1 snRNPs leave the spliceosome in Complex C (Fig. 6C), and with them many of their associated proteins. This can leave a network with supposedly unconnected components. However, to keep these networks focused, our network in Figure 6 does not show all the connections that can occur with other splicing and nonsplicing proteins. It is also likely that other connections between splicing proteins and their RNAs have not yet been found, owing to the limitations of mass spectrometry with RNA macromolecular complexes. The spliceosome is a massive complex in which, in humans, over 200 proteins can be involved. Many of these proteins link to other transcriptionally related functions such as 5' capping and 3' tailing, as well as RNA export from the nucleus. Detailed analysis of small sections of the spliceosome, such as SF3, can greatly aid in eventually forming an overall picture of the larger macromolecular spliceosome.
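The stage-by-stage filtering described above can be illustrated with a minimal Python sketch. It is not the procedure used to build Figure 6: the snRNP names and their presence in each complex follow the text, but the protein labels (`protA`, `protB`, `protC`), every edge, and the function names are hypothetical placeholders chosen only to show the idea. An interaction is treated as active in a stage only when both partners are present in that complex, and members left without any active interaction are reported as isolated.

```python
# Toy data only. snRNP presence per complex follows the text; the protein
# labels and all edges are hypothetical and carry no biological claim.
STAGE_MEMBERS = {
    "A": {"U1", "U2", "protA", "protB"},
    "B": {"U1", "U2", "U4", "U5", "U6", "protA", "protB", "protC"},
    "C": {"U2", "U5", "U6", "protB", "protC"},
}

# Stage-independent ("cell-wide") interactions, as they would appear when
# pooled from an interaction database without timing information.
CELL_WIDE_EDGES = [
    ("U1", "protB"),
    ("U2", "protA"),
    ("U2", "U6"),
    ("U4", "U6"),
    ("U5", "protC"),
    ("protA", "protB"),
]


def stage_network(edges, members):
    """Return (active_edges, isolated_nodes) for a single splicing stage."""
    # An edge is active only if both interaction partners are in the complex.
    active = [(a, b) for a, b in edges if a in members and b in members]
    connected = {node for edge in active for node in edge}
    # Members with no active edge appear as unconnected nodes in this stage.
    isolated = sorted(members - connected)
    return active, isolated


def stage_report(edges, stage_members):
    """Apply stage_network to every stage and collect the results."""
    return {stage: stage_network(edges, members)
            for stage, members in stage_members.items()}


if __name__ == "__main__":
    for stage, (active, isolated) in stage_report(CELL_WIDE_EDGES, STAGE_MEMBERS).items():
        print(stage, active, "isolated:", isolated)
```

In this toy data set `protB` remains in Complex C after both of its partners have left, so it is reported as isolated. This mirrors the 'supposedly unconnected components' noted above: such a node is unconnected only with respect to the interactions included in the network.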

Figure 6. Protein-ncRNA and protein-protein interactions change during the splicing cycle. Inactive nodes are shown in light grey. Complex A (A) is where the splice sites of the mRNA are recognized. The U5 and U6 snRNPs then join the spliceosome in Complex B (B).

CONCLUSION

The spliceosome is of great biological importance, as there are direct links between splicing proteins and medical conditions such as Alzheimer's disease,52 retinal disorders,53 spinal muscular atrophy54 and especially cancer.55 Indeed, the spliceosome is already under investigation as a target for anti-cancer treatment.56 It is also important as a model for studying how RNA interactions influence and enhance our network analysis.

There are two main issues surrounding RNP networks: visualisation and interpretation. Common graphical visualisation tools such as Cytoscape and BioLayout have evolved greatly over the years and now allow us to connect Gene Ontology (functional definitions) and metabolic pathway (e.g., KEGG) information to our networks. It is clearly important to attach as much biological inference to the interactions as possible in order for network comparisons to make sense.11,40 However, when we come to interpretation, in general the larger the network, the harder it is to visualise and then make sense of. For accurate comparisons and predictions, we require accurate and biologically relevant networks that are large enough to describe the cellular situation, but not so large that noise obscures the meaningful connections. For the moment we can concentrate on sub-networks, such as those of the spliceosome, to gain the accuracy we require before expanding to larger networks.

Within the field of systems biology, there is a growing move to improve the accuracy of the linkage between proteomic and transcriptomic data.57 Issues surrounding transcriptomic analysis58 and the sequestering of transcripts in RNA granules59 mean that the proteins discovered in a sample by mass spectrometry may not match the transcripts deemed to be present by transcriptomic sequencing. One approach has been to break down the networks by designating 'proteomic seeds'11 (proteins that are differentially expressed between two conditions) and using synergistic dysregulation11 (coordinate mRNA-level differential expression of a group of genes in the phenotype). Although this sounds elaborate, it is in essence a way of combining the output from two high-throughput technologies so that biological networks can be tackled at a truly genomic scale. If, for example, we also apply epigenetic and metabolomic12 information to our networks, this can result in layers upon layers of network complexity. Our challenge is not to let the enormity or complexity of these networks overwhelm us, but instead to concentrate on future developments in genetic information integration, network construction and graph visualisation.
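The notion of a 'proteomic seed' can be made concrete with a small Python sketch. This is a deliberate simplification and not the scoring used in ref. 11: a seed is taken here to be any protein whose abundance differs between two conditions by at least a chosen fold change, and the sub-network is simply every interaction touching a seed. The function names, the `min_fold` threshold and the protein labels `p1` to `p3` are hypothetical, and the synergistic dysregulation step is not modelled.

```python
def proteomic_seeds(abundance_a, abundance_b, min_fold=2.0):
    """Proteins whose abundance differs by at least min_fold between conditions."""
    seeds = set()
    # Only proteins measured in both conditions can be compared.
    for protein in abundance_a.keys() & abundance_b.keys():
        a, b = abundance_a[protein], abundance_b[protein]
        if a <= 0 or b <= 0:
            continue  # skip undetected proteins rather than divide by zero
        # Symmetric fold change, so up- and down-regulation are treated alike.
        if max(a / b, b / a) >= min_fold:
            seeds.add(protein)
    return seeds


def seed_subnetwork(edges, seeds):
    """Keep only the interactions that involve at least one seed protein."""
    return [(a, b) for a, b in edges if a in seeds or b in seeds]


if __name__ == "__main__":
    # Toy abundances and edges, invented purely for illustration.
    condition_1 = {"p1": 10.0, "p2": 5.0, "p3": 8.0}
    condition_2 = {"p1": 30.0, "p2": 5.5, "p3": 8.0}
    toy_edges = [("p1", "p2"), ("p2", "p3")]

    seeds = proteomic_seeds(condition_1, condition_2)
    print(seeds, seed_subnetwork(toy_edges, seeds))
```

Restricting attention to interactions around a small set of seeds is one way of keeping a network large enough to be informative without letting it grow to the point where noise dominates.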

ACKNOWLEDGMENTS

This work could not have been done without the support of Prof. David Penny and the support of the Institute of Fundamental Sciences at Massey University. This work was partly funded through the New Zealand Health Research Council.

REFERENCES

1.
Collins LJ, Penny D. RNA-infrastructure: Dark Matter of the Eukaryotic Cell? Trends in Genetics. 2009;25(3):120–128. [PubMed: 19171405]
2.
Licatalosi DD, Darnell RB. RNA processing and its regulation: global insights into biological networks. Nat Rev Genet. 2010;11(1):75–87. [PMC free article: PMC3229837] [PubMed: 20019688]
3.
Amaral PP, Dinger ME, Mercer TR, et al. The eukaryotic genome as an RNA machine. Science. 2008;319(5871):1787–1789. [PubMed: 18369136]
4.
Collins LJ, Chen XS, Schonfeld B. The Epigenetics of Non-coding RNA. In: Tollefsbol T, ed. Handbook of Epigenetics. Oxford: Academic Press; 2010:49–61.
5.
Collins LJ, Chen XS. Ancestral RNA: The RNA biology of the eukaryotic ancestor. RNA Biology. 2009;6(5):1–8. [PubMed: 19713749]
6.
Costa FF. Non-coding RNAs: Lost in translation? Gene. 2007;386(1-2):1–10. [PubMed: 17113247]
7.
Costanzo M, Baryshnikova A, Bellay J, et al. The genetic landscape of a cell. Science. 2010;327(5964):425–431. [PMC free article: PMC5600254] [PubMed: 20093466]
8.
Janga SC, Tzakos A. Structure and organization of drug-target networks: insights from genomic approaches for drug discovery. Mol Biosyst. 2009. [PubMed: 19763339]
9.
MacLean D, Elina N, Havecker ER, et al. Evidence for large complex networks of plant short silencing RNAs. PLoS One. 2010;5(3):e9901. [PMC free article: PMC2845630] [PubMed: 20360863]
10.
Li L, Zhang K, Lee J, et al. Discovering cancer genes by integrating network and functional properties. BMC Med Genomics. 2009;2:61. [PMC free article: PMC2758898] [PubMed: 19765316]
11.
Nibbe RK, Koyuturk M, Chance MR. An integrative -omics approach to identify functional sub-networks in human colorectal cancer. PLoS Comput Biol. 2010;6(1):e1000639. [PMC free article: PMC2797084] [PubMed: 20090827]
12.
Yizhak K, Benyamini T, Liebermeister W, et al. Integrating quantitative proteomics and metabolomics with a genome-scale metabolic network model. Bioinformatics. 2010;26(12):i255–260. [PMC free article: PMC2881368] [PubMed: 20529914]
13.
Tsai ZY, Singh S, Yu SL, et al. Identification of microRNAs regulated by activin A in human embryonic stem cells. J Cell Biochem. 2010;109(1):93–102. [PubMed: 19885849]
14.
Jamalkandi SA, Masoudi-Nejad A. Reconstruction of Arabidopsis thaliana fully integrated small RNA pathway. Funct Integr Genomics. 2009;9(4):419–432. [PubMed: 19802639]
15.
Dobbyn HC, McEwan PA, Krause A, et al. Analysis of premRNA and prerRNA processing factor Snu13p structure and mutants. Biochem Biophys Res Commun. 2007;360(4):857–862. [PubMed: 17631273]
16.
Breitkreutz BJ, Stark C, Reguly T, et al. The BioGRID Interaction Database: 2008 update. Nucleic Acids Res. 2008;36(Database issue):D637–D640. [PMC free article: PMC2238873] [PubMed: 18000002]
17.
Collins L, Penny D. Complex spliceosomal organization ancestral to extant eukaryotes. Mol Biol Evol. 2005;22(4):1053–1066. [PubMed: 15659557]
18.
Jurica MS, Moore MJ. Pre-mRNA splicing: awash in a sea of proteins. Mol Cell. 2003;12(1):5–14. [PubMed: 12887888]
19.
Valadkhan S. The spliceosome: a ribozyme at heart? Biol Chem. 2007;388(7):693–697. [PubMed: 17570821]
20.
Valadkhan S. Role of the snRNAs in spliceosomal active site. RNA Biol. 2010;7(3) Epub ahead of print. [PubMed: 20458185]
21.
Lorkovic ZJ, Lehner R, Forstner C, et al. Evolutionary conservation of minor U12-type spliceosome between plants and humans. RNA. 2005;11(7):1095–1107. [PMC free article: PMC1370794] [PubMed: 15987817]
22.
Behzadnia N, Golas MM, Hartmuth K, et al. Composition and three-dimensional EM structure of double affinity-purified, human prespliceosomal A complexes. EMBO J. 2007;26(6):1737–1748. [PMC free article: PMC1829389] [PubMed: 17332742]
23.
Deckert J, Hartmuth K, Boehringer D, et al. Protein composition and electron microscopy structure of affinity-purified human spliceosomal B complexes isolated under physiological conditions. Mol Cell Biol. 2006;26(14):5528–5543. [PMC free article: PMC1592722] [PubMed: 16809785]
24.
Ilagan J, Yuh P, Chalkley RJ, et al. The role of exon sequences in C complex spliceosome structure. J Mol Biol. 2009;394(2):363–375. [PMC free article: PMC2783800] [PubMed: 19761775]
25.
Yamada T, Bork P. Evolution of biomolecular networks: lessons from metabolic and protein interactions. Nat Rev Mol Cell Biol. 2009;10(11):791–803. [PubMed: 19851337]
26.
Dyer MD, Neff C, Dufford M, et al. The human-bacterial pathogen protein interaction networks of Bacillus anthracis, Francisella tularensis and Yersinia pestis. PLoS One. 2010;5(8):e12089. [PMC free article: PMC2918508] [PubMed: 20711500]
27.
Rito T, Wang Z, Deane CM, et al. How threshold behaviour affects the use of subgraphs for network comparison. Bioinformatics. 2010;26(18):i611–i617. [PMC free article: PMC2935432] [PubMed: 20823329]
28.
MacLean D, Elina N, Havecker ER, et al. Evidence for large complex networks of plant short silencing RNAs. PLoS One. 2010;5(3):e9901. [PMC free article: PMC2845630] [PubMed: 20360863]
29.
Assenov Y, Ramirez F, Schelhorn SE, et al. Computing topological parameters of biological networks. Bioinformatics. 2008;24(2):282–284. [PubMed: 18006545]
30.
Killcoyne S, Carter GW, Smith J, et al. Cytoscape: a community-based framework for network modeling. Methods Mol Biol. 2009;563:219–239. [PubMed: 19597788]
31.
Shannon P, Markiel A, Ozier O, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13(11):2498–2504. [PMC free article: PMC403769] [PubMed: 14597658]
32.
Cline MS, Smoot M, Cerami E, et al. Integration of biological networks and gene expression data using Cytoscape. Nat Protoc. 2007;2(10):2366–2382. [PMC free article: PMC3685583] [PubMed: 17947979]
33.
Moretti F, Thermann R, Hentze MW. Mechanism of translational regulation by miR-2 from sites in the 5' untranslated region or the open reading frame. RNA. 2010. Published online. [PMC free article: PMC2995410] [PubMed: 20966199]
34.
Nacher JC, Araki N. Structural characterization and modeling of ncRNA-protein interactions. Biosystems. 2010. Epub ahead of print. [PubMed: 20206662]
35.
Naeem H, Kuffner R, Csaba G, et al. miRSel: automated extraction of associations between microRNAs and genes from the biomedical literature. BMC Bioinformatics. 2010;11:135. [PMC free article: PMC2845581] [PubMed: 20233441]
36.
Griffiths-Jones S. miRBase: microRNA sequences and annotation. Curr Protoc Bioinformatics. 2010;Chapter 12(Unit 12 19):11–10. [PubMed: 20205188]
37.
Gardner PP, Daub J, Tate JG, et al. Rfam: updates to the RNA families database. Nucleic Acids Res. 2009;37(Database issue):D136–D140. [PMC free article: PMC2686503] [PubMed: 18953034]
38.
Daub J, Gardner PP, Tate J, et al. The RNA WikiProject: community annotation of RNA families. RNA. 2008;14(12):2462–2464. [PMC free article: PMC2590952] [PubMed: 18945806]
39.
Xiao F, Zuo Z, Cai G, et al. miRecords: an integrated resource for microRNA-target interactions. Nucleic Acids Res. 2009;37(Database issue):D105–D110. [PMC free article: PMC2686554] [PubMed: 18996891]
40.
Rito T, Wang Z, Deane CM, et al. How threshold behaviour affects the use of subgraphs for network comparison. Bioinformatics. 2010;26(18):i611–i617. [PMC free article: PMC2935432] [PubMed: 20823329]
41.
Przulj N, Corneil DG, Jurisica I. Efficient estimation of graphlet frequency distributions in protein-protein interaction networks. Bioinformatics. 2006;22(8):974–980. [PubMed: 16452112]
42.
Liu B, Li J, Tsykin A, et al. Exploring complex miRNA-mRNA interactions with Bayesian networks by splitting-averaging strategy. BMC Bioinformatics. 2009;10:408. [PMC free article: PMC2797807] [PubMed: 20003267]
43.
Vasudevan S, Tong Y, Steitz JA. Switching from repression to activation: microRNAs can up-regulate translation. Science. 2007;318(5858):1931–1934. [PubMed: 18048652]
44.
Redner S. Networks: teasing out the missing links. Nature. 2008;453(7191):47–48. [PubMed: 18451851]
45.
Golas MM, Sander B, Will CL, Luhrmann R, Stark H. Molecular architecture of the multiprotein splicing factor SF3b. Science. 2003;300(5621):980–984. [PubMed: 12738865]
46.
Das BK, Xia L, Palandjian L, et al. Characterization of a protein complex containing spliceosomal proteins SAPs 49, 130, 145 and 155. Mol Cell Biol. 1999;19(10):6796–6802. [PMC free article: PMC84676] [PubMed: 10490618]
47.
Lardelli RM, Thompson JX, Yates JR 3rd, et al. Release of SF3 from the intron branchpoint activates the first step of premRNA splicing. RNA. 2010;16(3):516–528. [PMC free article: PMC2822917] [PubMed: 20089683]
48.
Wu H, Sun S, Tu K, et al. A splicing-independent function of SF2/ASF in microRNA processing. Mol Cell. 2010;38(1):67–77. [PMC free article: PMC3395997] [PubMed: 20385090]
49.
Kefas B, Godlewski J, Comeau L, et al. microRNA-7 inhibits the epidermal growth factor receptor and the Akt pathway and is down-regulated in glioblastoma. Cancer Res. 2008;68(10):3566–3572. [PubMed: 18483236]
50.
Janda CY, Li J, Oubridge C, et al. Recognition of a signal peptide by the signal recognition particle. Nature. 2010;465(7297):507–510. [PMC free article: PMC2897128] [PubMed: 20364120]
51.
Collins L, Penny D. Investigating the intron recognition mechanism in eukaryotes. Mol Biol Evol. 2006;23(5):901–910. [PubMed: 16371412]
52.
Ohe K, Mayeda A. HMGA1a trapping of U1 snRNP at an authentic 5' splice site induces aberrant exon skipping in sporadic Alzheimer's disease. Mol Cell Biol. 2010;30(9):2220–2228. [PMC free article: PMC2863597] [PubMed: 20194618]
53.
Sun N, Zhao H. Reconstructing transcriptional regulatory networks through genomics data. Stat Methods Med Res. 2009;18(6):595–617. [PMC free article: PMC3666560] [PubMed: 20048387]
54.
Pedrotti S, Bielli P, Paronetto MP, et al. The splicing regulator Sam68 binds to a novel exonic splicing silencer and functions in SMN2 alternative splicing in spinal muscular atrophy. EMBO J. 2010;29(7):1235–1247. [PMC free article: PMC2857462] [PubMed: 20186123]
55.
Lee JH, Horak CE, Khanna C, et al. Alterations in Gemin5 expression contribute to alternative mRNA splicing patterns and tumor cell motility. Cancer Res. 2008;68(3):639–644. [PMC free article: PMC2678556] [PubMed: 18245461]
56.
van Alphen RJ, Wiemer EA, Burger H, et al. The spliceosome as target for anticancer treatment. Br J Cancer. 2009;100(2):228–232. [PMC free article: PMC2634708] [PubMed: 19034274]
57.
Zdobnov EM, von Mering C, Letunic I, et al. Comparative genome and proteome analysis of Anopheles gambiae and Drosophila melanogaster. Science. 2002;298(5591):149–159. [PubMed: 12364792]
58.
Bullard JH, Purdom E, Hansen KD, et al. Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments. BMC Bioinformatics. 2010;11:94. [PMC free article: PMC2838869] [PubMed: 20167110]
59.
Anderson P, Kedersha N. RNA granules. J Cell Biol. 2006;172(6):803–808. [PMC free article: PMC2063724] [PubMed: 16520386]
60.
Dybkov O, Will CL, Deckert J, et al. U2 snRNA-protein contacts in purified human 17S U2 snRNPs and in spliceosomal A and B complexes. Mol Cell Biol. 2006;26(7):2803–2816. [PMC free article: PMC1430325] [PubMed: 16537922]
61.
Behzadnia N, Hartmuth K, Will CL, et al. Functional spliceosomal A complexes can be assembled in vitro in the absence of a penta-snRNP. RNA. 2006;12(9):1738–1746. [PMC free article: PMC1557700] [PubMed: 16880538]
Copyright © 2000-2013, Landes Bioscience.
Bookshelf ID: NBK53772
