RNA splicing is a crucial step in the maturation of eukaryotic messenger RNA (mRNA) molecules. The primary transcript (pre-mRNA) synthesized from a eukaryotic gene contains both coding regions (exons), which encode the amino acid sequence of a peptide, and intervening non-coding regions (introns), which are not involved in protein coding. RNA splicing, a major component of the post-transcriptional regulation of genes, removes the introns and joins the exons to produce a mature mRNA molecule that can be translated into a functional protein.
Humans have around 20,000 protein-coding genes, each containing an average of 8 introns with a median length of approximately 1 kb. Cells therefore face a significant challenge in accurately identifying exons amid a multitude of intron sequences. Compounding this complexity, about 95% of human genes undergo alternative splicing: a single gene can generate multiple protein isoforms by including or skipping certain exons or by selecting alternative exons. This extensive alternative splicing greatly expands the proteome that can be produced from a limited set of genes and contributes significantly to the complexity observed in higher organisms. In mammals, most genes transcribed by RNA polymerase II contain introns, with a few exceptions such as histone genes. RNA splicing requires cis-acting elements, trans-acting proteins, and spliceosomes.
What is the spliceosome?
The basic chemical process of removing introns from pre-messenger RNAs (pre-mRNAs) is catalyzed by a large, complex molecular machine called the spliceosome, which is conserved across species ranging from yeast to humans.
The spliceosome is composed of small nuclear RNAs (snRNAs) and on the order of 100 proteins. Most multicellular animals (metazoans) have two spliceosomes, the major and the minor spliceosome, which function side by side in the splicing process.
Spliceosomes are not preassembled; they assemble during the process of RNA splicing. Assembly is a highly regulated, dynamic process that occurs in the nucleus. The major spliceosome contains five snRNAs, named U1, U2, U4, U5, and U6, and is responsible for removing about 99.5% of introns. The minor spliceosome is less common and removes U12-type introns, which make up around 0.5% of all introns. U1, U2, U4, and U5 snRNAs are transcribed by RNA polymerase II, and these transcripts acquire a trimethylguanosine cap. In contrast, U6 snRNA is transcribed by RNA polymerase III and acquires a γ-monomethyl phosphate cap. Sm proteins bind U1, U2, U4, and U5 snRNAs and form a ring around the U-rich Sm site near the 3' end of each. Similarly, U6 snRNA associates with LSm proteins, which form a ring-like structure. These rings are essential for the structural integrity and function of the snRNAs. Sm proteins are a group of proteins that bind specific RNA sequences; they were named after their initial identification as antigens ("Smith" antigens) in patients with systemic lupus erythematosus. "LSm" stands for "Like Sm": LSm proteins are structurally and functionally related to the Sm proteins but have distinct roles. Each snRNA, together with its specific proteins, forms a small nuclear ribonucleoprotein (snRNP) particle. These snRNPs are key components of the spliceosome.
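As a rough back-of-the-envelope check, the approximate figures quoted in this article (about 20,000 genes, about 8 introns per gene, about 0.5% U12-type introns) can be combined to estimate how many introns each spliceosome handles. The sketch below is only an order-of-magnitude illustration; the variable names are chosen here for readability, and the inputs are the rounded values from the text, not measured counts.

```python
# Order-of-magnitude estimate only: every input is a rounded figure from the text.
PROTEIN_CODING_GENES = 20_000   # approximate number of human protein-coding genes
MEAN_INTRONS_PER_GENE = 8       # approximate average number of introns per gene
MINOR_FRACTION = 0.005          # about 0.5% of introns are U12-type

total_introns = PROTEIN_CODING_GENES * MEAN_INTRONS_PER_GENE
minor_introns = round(total_introns * MINOR_FRACTION)   # handled by the minor spliceosome
major_introns = total_introns - minor_introns           # handled by the major spliceosome

print(f"total introns (approx.): {total_introns:,}")
print(f"major spliceosome:       {major_introns:,}")
print(f"minor spliceosome:       {minor_introns:,}")
```

Under these assumptions the minor spliceosome would handle only several hundred introns, compared with well over a hundred thousand for the major spliceosome.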
Cis-acting elements, including the 5' and 3' splice sites, exonic and intronic splicing enhancers (ESEs and ISEs) and silencers, the branch point sequence, and the polypyrimidine tract, are involved in recruiting the spliceosome and spliceosome-associated factors.
Trans-acting proteins of RNA splicing
The trans-acting splicing proteins are the serine/arginine-rich (SR) proteins, which bind to enhancers, and the heterogeneous nuclear ribonucleoproteins (hnRNPs), which bind to silencers. SR proteins contain a domain rich in arginine and serine (the RS domain) that enables protein-protein interactions with other splicing factors. SR proteins also have RNA-recognition motifs (RRMs) that bind to exonic splicing enhancers (ESEs) and intronic splicing enhancers (ISEs) on the pre-mRNA to recruit additional spliceosomal factors. hnRNPs function antagonistically to SR proteins and possess several kinds of RNA-binding domains (RBDs), including RRM, KH, and RGG domains.
How is the splice site recognized?
To minimize errors and keep splicing accurate, splice sites must be identified precisely. Splice sites are specific consensus sequences that mark the boundaries between exons and introns at the 5' and 3' ends of each intron. The 5' splice site (5'SS) is marked by a GU dinucleotide, while the 3' splice site (3'SS) is marked by an AG dinucleotide.
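The GU...AG rule can be illustrated with a small sketch that lists every stretch of a sequence that starts with GU and ends with AG. The function name `candidate_introns`, the `min_len` parameter, and the toy sequence are all made up for this illustration; this is not how real splice-site prediction tools work, and real introns are far longer than this example.

```python
def candidate_introns(pre_mrna, min_len=4):
    """Return (start, end) pairs where pre_mrna[start:end] begins with GU and ends with AG.

    `start` is the index of the G in GU; `end` is one past the G in AG.
    DNA-style input is accepted and converted (T is read as U).
    """
    seq = pre_mrna.upper().replace("T", "U")
    # Positions of every potential 5' splice site (GU) and 3' splice site (AG).
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GU"]
    acceptors = [i + 2 for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    return [(d, a) for d in donors for a in acceptors if a - d >= min_len]


# Toy pre-mRNA (hypothetical): exon "AUGGCC", intron "GUAAGUCUAACUUUCAG", exon "GAUUAA".
toy = "AUGGCCGUAAGUCUAACUUUCAGGAUUAA"
for start, end in candidate_introns(toy):
    print(start, end, toy[start:end])
```

Even in this short sequence the scan reports several GU...AG candidates, only one of which is the intended intron. The dinucleotides alone are therefore not sufficient, which is consistent with the role of the additional cis-acting elements (branch point sequence, polypyrimidine tract, enhancers, and silencers) described earlier.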
The splicing process begins with the binding of U1 snRNP to the 5'SS of the intron in the pre-mRNA, which results in the formation of the early (E) complex. U1 snRNP consists of the Sm protein ring and three U1-specific proteins: U1-70K, U1A, and U1C. In humans, a zinc finger domain of the U1C protein directly contacts the RNA duplex formed between U1 snRNA and the 5'SS, stabilizing the protein-RNA interaction.
U2 snRNP then binds to the branch point (BP) sequence, which lies within the intron, upstream of the 3' splice site. This step forms the prespliceosome, called the A complex. U4, U5, and U6 snRNAs and their proteins are preassembled into the U4/U6.U5 complex, called the tri-snRNP, and this large complex joins the A complex to form the fully assembled pre-B complex. In the tri-snRNP, U6 snRNA is initially paired with U4 snRNA. This pairing is crucial because U4 snRNA acts like a chaperone, keeping U6 in a pre-catalytic, inactive state. The helicase Prp28 then facilitates the transfer of the 5'SS from U1 snRNP to a specific sequence within U6 snRNA. Another helicase, Brr2, separates U4 from U6 snRNA. The separation allows U6 snRNA to fold and pair with part of U2 snRNA, forming the U6/U2 snRNA stem II and the active site of the spliceosome, which holds two catalytic metal ions.
The formation of the B complex prepares the spliceosome for the catalytic steps of splicing: the branching reaction and the subsequent exon ligation. The B complex is a dynamic structure in which the substrate (pre-mRNA) and the catalytic components are positioned for the chemical transformations that remove introns and join exons. Unwinding of the U4 and U6 snRNAs and release of the U1 and U4 snRNPs convert the B complex into the activated B complex, in which U6 serves as the catalytic center of the spliceosome.
After the spliceosome has been fully assembled and activated, the intron is removed in two steps. In the first step, the branching reaction, the 2' OH of the adenosine at the BP attacks the 5'SS, forming a lariat structure on the intron and releasing the upstream exon. This leads to the formation of the C complex. In the second step, which takes place at the C complex stage, the free 3' OH of the released exon attacks the 3' splice site (3'SS), joining the two exons and releasing the intron as a lariat.
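The two reactions described above can be mimicked on a plain string to show which pieces of the pre-mRNA end up where. This is only a bookkeeping sketch: the function name `splice`, its parameters, and the toy sequence are hypothetical, and string slicing stands in for chemistry that the spliceosome actually catalyzes.

```python
def splice(pre_mrna, five_ss, branch_point, three_ss_end):
    """Toy model of the two splicing steps on an RNA string.

    five_ss      : index of the first intron nucleotide (the G of GU)
    branch_point : index of the branch point adenosine
    three_ss_end : index one past the last intron nucleotide (the G of AG)
    """
    intron = pre_mrna[five_ss:three_ss_end]
    if not (intron.startswith("GU") and intron.endswith("AG")):
        raise ValueError("intron must start with GU and end with AG")
    if not (five_ss < branch_point < three_ss_end and pre_mrna[branch_point] == "A"):
        raise ValueError("branch point must be an adenosine inside the intron")

    # Step 1 (branching): the branch point A attacks the 5'SS.
    # The upstream exon is released; the intron is still attached to the downstream exon.
    upstream_exon = pre_mrna[:five_ss]

    # Step 2 (exon ligation): the free 3' end of the upstream exon attacks the 3'SS.
    downstream_exon = pre_mrna[three_ss_end:]
    mature_mrna = upstream_exon + downstream_exon

    # The released lariat: a loop closed at the branch point plus a linear tail.
    lariat = {
        "loop": pre_mrna[five_ss:branch_point + 1],
        "tail": pre_mrna[branch_point + 1:three_ss_end],
    }
    return mature_mrna, lariat


# Toy pre-mRNA (hypothetical): exon "AUGGCC", intron "GUAAGUCUAACUUUCAG", exon "GAUUAA".
mature, lariat = splice("AUGGCCGUAAGUCUAACUUUCAGGAUUAA", five_ss=6, branch_point=15, three_ss_end=23)
print(mature)
print(lariat)
```

In the toy example the mature mRNA is simply the two exons joined, while the intron is returned as a loop (from the 5'SS to the branch point adenosine) and a tail (from the branch point to the 3'SS).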
For this second transesterification reaction the spliceosome repositions the exons; their ligation produces the post-spliceosome (P complex). Following intron removal, the spliceosome disassembles, and its components are recycled for subsequent splicing events. The excised intron is debranched and degraded within the cell.
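The order of complexes described in this section can be summarized as a simple ordered table. The sketch below only restates the stages named in the text; the names `SPLICEOSOME_CYCLE` and `next_stage` are made up for this illustration, and the one-line event descriptions are simplified paraphrases rather than a complete account.

```python
# Stages of the major spliceosome cycle, in the order described in the text.
# Each entry: (complex name, simplified description of the defining event).
SPLICEOSOME_CYCLE = [
    ("E complex", "U1 snRNP binds the 5' splice site"),
    ("A complex", "U2 snRNP binds the branch point sequence"),
    ("pre-B complex", "U4/U6.U5 tri-snRNP joins the A complex"),
    ("B complex", "Prp28 transfers the 5' splice site from U1 to U6"),
    ("activated B complex", "Brr2 unwinds U4/U6; U1 and U4 snRNPs have left"),
    ("C complex", "branching: lariat forms, upstream exon is released"),
    ("P complex", "exon ligation: exons joined, lariat intron released"),
]


def next_stage(current):
    """Return the name of the stage after `current`, or None after the last stage."""
    names = [name for name, _ in SPLICEOSOME_CYCLE]
    i = names.index(current)  # raises ValueError for an unknown stage name
    return names[i + 1] if i + 1 < len(names) else None


for name, event in SPLICEOSOME_CYCLE:
    print(f"{name:20s} {event}")
```

After the P complex the spliceosome disassembles and its components are reused, so a new cycle would start again at the E complex on another intron.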
Figure 2: The complex structure of spliceosome arrangement and other biochemical reactions occurring in RNA splicing. (Image courtesy: http://dx.doi.org/10.1146/annurev-biochem-091719-064225)
Alternative splicing
Alternative splicing is a pivotal process in eukaryotic gene expression, enabling a single gene to produce multiple protein variants. It occurs when genes are transcribed into pre-mRNA, which includes exons and introns. The spliceosome, a complex RNA-protein machinery, facilitates the removal of introns and joining of exons. This process can vary, leading to different splicing outcomes such as exon skipping, intron retention, alternative splice site selection, and mutually exclusive exons. The regulation of alternative splicing involves splicing enhancers and silencers within the exons or introns, which interact with regulatory proteins like SR proteins and hnRNPs. These proteins' activities are often modulated by phosphorylation through kinase signaling pathways, linking splicing to cellular signals. Additionally, alternative splicing is influenced by various cellular conditions, allowing adaptive protein production in response to environmental changes. This mechanism not only contributes to protein diversity but also regulates gene expression, impacting mRNA stability and translational efficiency. However, aberrations in splicing regulation are associated with numerous diseases, including cancer and neurodegenerative disorders, highlighting its crucial role in both normal cellular functioning and disease pathogenesis.
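To illustrate how exon skipping alone multiplies the number of possible transcripts, the following sketch enumerates every isoform of a hypothetical gene in which some exons are optional ("cassette" exons). The function name `isoforms` and the exon labels are invented for this example, and the model assumes each cassette exon is included or skipped independently, which is a simplification of real splicing regulation.

```python
from itertools import product


def isoforms(exons, cassette):
    """List all exon combinations obtainable by skipping any subset of cassette exons.

    exons    : exon names in transcript order
    cassette : names of exons that may be skipped (all others are always included)
    """
    optional = [e for e in exons if e in cassette]
    results = []
    # Each cassette exon is independently included (True) or skipped (False).
    for choice in product([True, False], repeat=len(optional)):
        included = dict(zip(optional, choice))
        results.append([e for e in exons if included.get(e, True)])
    return results


# Hypothetical four-exon gene with two cassette exons.
for iso in isoforms(["E1", "E2", "E3", "E4"], cassette={"E2", "E3"}):
    print("-".join(iso))
```

With n independent cassette exons this simple model yields 2^n isoforms, so even a handful of optional exons gives a large number of potential transcripts from one gene.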
References
Akinyi, M. V., & Frilander, M. J. (2021). At the Intersection of Major and Minor Spliceosomes: Crosstalk Mechanisms and Their Impact on Gene Expression. Frontiers in Genetics, 12. https://doi.org/10.3389/fgene.2021.700744
Schellenberg, M., Ritchie, D. B., & MacMillan, A. M. (2008). Pre-mRNA splicing: a complex picture in higher definition. Trends in Biochemical Sciences, 33(6), 243–246. https://doi.org/10.1016/j.tibs.2008.04.004
Wang, E., & Aifantis, I. (2020). RNA Splicing and Cancer. Trends in Cancer, 6(8), 631–644. https://doi.org/10.1016/j.trecan.2020.04.011
Wilkinson, M. E., Charenton, C., & Nagai, K. (2020). RNA Splicing by the Spliceosome. Annual Review of Biochemistry, 89(1), 359–388. https://doi.org/10.1146/annurev-biochem-091719-064225
