The prenatal analysis covered in-house RNA-Seq data from fetal skeletal muscles (n = 20) and fetal cardiac muscles (n = 2) from 2 different fetuses (Table 1). It also included publicly available skeletal muscle data from a 19-week female and a 22-week male, and cardiac muscle data from a 28-week female and a 19-week female (for more information see ‘Methods’). For the postnatal analysis, RNA-Seq data from an internal cohort of 44 individuals were analyzed (Table 1). We studied six OBSCN isoforms, four of which were curated mRNAs with NM RefSeq IDs, i.e. ENST00000284548 (NM_052843), ENST00000422127 (NM_001098623), ENST00000570156 (NM_001271223) and ENST00000680850 (NM_001386125). We additionally analyzed isoforms ENST00000660857 and ENST00000493977, since together they featured four additional unique exons. Overall, we extracted 126 unique (or 121 non-overlapping) exons from these isoforms, of which three were annotated as alternative first exons and four as alternative last exons. In addition, the introns upstream of two exons were annotated with alternative 3’ splicing (Table 2).
Table 1 Internal cohort of individuals whose biopsies were collected for RNA-seq

Exon inclusion/skipping

We measured \(\Psi\) inclusion levels of the 126 OBSCN exons for the 45 postnatal skeletal muscle, seven postnatal heart, 20 fetal skeletal muscle and three fetal heart samples. A heatmap (coupled with hierarchical clustering using the “Euclidean” distance) and PCA of the OBSCN exon inclusion PSI values showed that the samples did not group by sex or clinical diagnosis of the studied individuals (Fig. S1A,B, Additional File 1). Overall, however, we observed a clear separation of pre- and postnatal skeletal muscles and a less clear separation of pre- and postnatal cardiac muscles (Fig. 2B, and Fig. S2, Additional File 1). We compared the samples in six different ways: muscle samples to heart samples (denoted with M), postnatal (i.e. mostly from adult individuals) samples to fetal samples (A), postnatal muscle samples to fetal muscle samples (AM/FM), postnatal heart samples to fetal heart samples (AH/FH), postnatal muscle samples to postnatal heart samples (AM/AH), and fetal muscle samples to fetal heart samples (FM/FH) (Fig. 2A). We plotted the average \(\Psi\) levels to detect loci within OBSCN where differential splicing was detected when fetal samples were compared to postnatal samples or heart samples were compared to skeletal muscles (Fig. 3). Furthermore, we used box plots to visualize the distribution of the inclusion levels of the OBSCN exons in the studied samples, for those exons for which at least two of the six comparisons (Fig. 2A) produced significant results (FDR < 0.05) with \(\Delta \Psi > 10\)% (Fig. 4). It is worth noting that for three of the studied exons (i.e. exons 48, 53 and 56) significant results were achieved in all comparisons except when postnatal hearts were compared to fetal hearts (Fig. 4, Table S1, and Fig. S3, Additional File 1).
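The exon selection rule described above (significant in at least two of the six comparisons) can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function name, the input layout and the toy numbers are hypothetical, and it assumes the \(\Delta \Psi\) threshold is applied to the absolute value, so that decreases in inclusion also count.

```python
from typing import Dict, List, Tuple

# The six comparisons named in the text (Fig. 2A).
COMPARISONS = ["M", "A", "AM/FM", "AH/FH", "AM/AH", "FM/FH"]

# exon -> comparison -> (FDR, delta PSI in percent); layout is an assumption.
ExonStats = Dict[str, Dict[str, Tuple[float, float]]]


def select_exons(stats: ExonStats,
                 fdr_cutoff: float = 0.05,
                 dpsi_cutoff: float = 10.0,
                 min_hits: int = 2) -> List[str]:
    """Return exons that pass both thresholds in at least `min_hits` comparisons."""
    selected = []
    for exon, per_comparison in stats.items():
        hits = sum(
            1
            for fdr, dpsi in per_comparison.values()
            # Assumption: the delta PSI cutoff is on the absolute value.
            if fdr < fdr_cutoff and abs(dpsi) > dpsi_cutoff
        )
        if hits >= min_hits:
            selected.append(exon)
    return sorted(selected)


# Toy values for illustration only; these are not results from the study.
toy_stats: ExonStats = {
    "exon_a": {"AM/FM": (0.001, -19.0), "A": (0.010, -15.0), "AH/FH": (0.95, -3.0)},
    "exon_b": {"AM/FM": (0.020, -12.0), "A": (0.200, -11.0), "AH/FH": (0.90, -1.0)},
    "exon_c": {"AM/FM": (0.001, -4.0), "A": (0.001, -5.0), "M": (0.001, 6.0)},
}

print(select_exons(toy_stats))
```

In the toy input, `exon_b` passes in only one comparison and `exon_c` is significant but changes by less than 10 percentage points, so only `exon_a` is kept.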
Fig. 2 Sample comparisons in our study. A The RNA-Seq data in our study were analyzed for splicing by comparing the muscle samples to heart samples (denoted with M), mostly adult postnatal samples to fetal samples (denoted with A), mostly adult postnatal muscle samples to fetal muscle samples (AM/FM), mostly adult postnatal heart samples to fetal heart samples (AH/FH), mostly adult postnatal muscle samples to mostly adult postnatal heart samples (AM/AH), and fetal muscle samples to fetal heart samples (FM/FH). B The scatterplot shows the separation of the studied samples based on OBSCN exon inclusion PSI values by plotting PC1 against PC2 (obtained from PCA). The sample types are labelled with different shapes and colours
Fig. 3 Inclusion levels of OBSCN exons. The plot illustrates the average \(\Psi\) inclusion levels of the unique exons of OBSCN in the 4 studied sample classes (i.e. postnatal muscles, postnatal hearts, fetal muscles and fetal hearts), with the exons ordered by their start position on the X-axis. Each dot shows the average of the \(\Psi\) values for an OBSCN exon within a sample class. The dots are marked by the exon numbers. For dots that are too close together to distinguish, the ranges of the exon numbers are stated. The average \(\Psi\) measurements of a sample class are connected by a line. Because their \(\Psi\) measurements are not accurate (due to the lack of exon-skipping sequence reads), the first and last exons are shown with red triangles and horizontal grey dashed lines. The variance of the average \(\Psi\) values across the different sample classes is shown as a purple line at the bottom of the figure. Exon 126 (i.e. an alternative last exon) is omitted as its start coordinate is identical to that of exon 125
Fig. 4 Significantly differentially included OBSCN exons. A-O Ordered by exon number, the boxplots illustrate the distribution of the \(\Psi\) levels of the exons that were detected as significant (FDR < 0.05) in at least three of the comparisons performed in the study. The box plots extend from the 25th to the 75th percentile, and the thick horizontal line represents the median. The whiskers of the box plots extend to 1.5 times the interquartile range, and values beyond the whiskers are shown as outliers. P-R Sashimi plots illustrate the exon-exon junctions observed in the RNA-seq data in regions flanking exons 17, 18, 48, 98 and 126. The samples with PSI values nearest to the median PSI of the exons were chosen for the sashimi plots. S-T Relative expression levels of the mRNAs that include exons 17 and 18 (S) (based on primers matching junctions Ex16-Ex17 and Ex18-Ex19), and relative expression of the long and short OBSCN isoforms (T) (based on analyzing exon-exon junctions specific to these isoforms) are shown with bar plots. These values were measured by real-time polymerase chain reaction (i.e. RT-qPCR) in the postnatal and fetal skeletal muscles. The exon-exon junction levels have been normalized to the total OBSCN mRNA levels (i.e. inferred from the Ex67-Ex68 junction in S and the Ex5-Ex6 junction in T). The sample classes include: mostly adult postnatal muscles (AM), mostly adult postnatal hearts (AH), fetal muscles (FM), and fetal hearts (FH). The significance levels in the plots are shown using asterisks: P < 0.05 (*), P < 0.01 (**) and P < 0.001 (***)
Extensive regulation of exon inclusion associated with skeletal muscle development was detected at several loci: at the 5’ end (exons 17 and 18), in the middle (exons 48–57) and at the 3’ end of the gene (exons 97 and 107) (Figs. 3, 4). The inclusion (or usage) of exon 17 (FDR(AM/FM) = 0.00104, \(\Delta \Psi\)(AM/FM) = -19.2) and exon 18 (FDR(AM/FM) = 0.00236, \(\Delta \Psi\)(AM/FM) = -18.2) was noticeably lower in postnatal muscles than in fetal muscles (Fig. 4A-B, 4S and Table S2). Interestingly, however, a similar effect was not seen in the postnatal heart compared to the fetal heart tissues (FDR(AH/FH) > 0.9, -5% < \(\Delta \Psi\)(AH/FH) < 0%). This leads us to believe that the inclusion of these exons is regulated specifically during skeletal muscle development (and not during heart development). As a result of this decrease in exon inclusion, upregulation of the canonical exon junctions chr1:228243458–228246528 (connecting exons 15 and 18) and chr1:228244571–228256651 (connecting exons 16 and 20) was observed in postnatal muscles compared to fetal muscles (P(AM > FM) = 1e-04, 0.0243% ≤ \(\Delta \Psi\) EJ(AM/FM) ≤ 0.0485%) (Fig. 5 A-B, Table S3).
Fig. 5 Normalized exon-exon junction levels of non-consecutive exons. Boxplots illustrating the distribution of the normalized canonical (A-G) and non-canonical (H-M) junction levels of non-consecutive exons. If the exon-flanking 5’ or 3’ splice site is included in the reference (i.e. GENCODE), the name of the corresponding exon begins with “EX”; otherwise it starts with “ex”. The sample classes include: mostly adult postnatal muscles (AM), mostly adult postnatal hearts (AH), fetal muscles (FM), and fetal hearts (FH). The p-value and \(\Delta EJ\) values for two sets of comparisons are listed below the box plots: postnatal muscle vs fetal muscle (AM > FM), and postnatal heart vs fetal heart (AH > FH). The Jonckheere-Terpstra test was used to test the order and extract the significant results. The significance levels are shown using asterisks: P < 0.05 (*), P < 0.01 (**) and P < 0.001 (***). The box plots extend from the 25th to the 75th percentile, and the thick horizontal line represents the median. The whiskers of the boxplots extend to 1.5 times the interquartile range, and values beyond the whiskers are shown as outliers
Located in the central region of the OBSCN gene, exons 48–56 were included significantly less in postnatal muscles than in fetal muscles (FDR(AM/FM) < 0.05, \(\Delta \Psi\)(AM/FM) < -10%) (Fig. 4). Although the inclusion levels of these exons were mostly lower in postnatal heart samples than in fetal heart samples (except for exon 52), the effects were milder and the false discovery rates were not significant (FDR(AH/FH) > 0.1, \(\Delta \Psi\)(AH/FH) < -10%) (Fig. 4, Table S1). We believe that the milder effect in heart is due to the small number of fetal heart samples, as the P-values for some of these effects are below 0.05 even though their FDR values are not (Fig. 4 K-M). Furthermore, the inclusion levels of most of these exons (i.e. exons 48, 49, 52–56, as well as exons 57 and 58) were significantly higher in human muscle samples than in human heart samples (FDR(M) < 0.05, \(\Delta \Psi\)(M) > 30%), suggesting that detecting exon inclusion variations in cardiac muscle is technically more challenging and requires more sequence reads and biological replicates (Fig. 4). Concurrent with these findings, we also noticed a significant increase of several canonical and a few non-canonical exon junctions in postnatal muscle samples compared to fetal muscle samples (Fig. 5). The upregulated canonical exon junctions were chr1:228288888–228292523 (connecting exons 47 and 49), chr1:228288888–228293353 (connecting exons 47 and 50), and chr1:228292161–228294152 (connecting exons 48 and 51) (Table S3). The upregulated non-canonical exon junctions were chr1:228293518–228294318 (overlapping exons 50 and 51), chr1:228295020–228300123 (overlapping exons 52 and 55), and chr1:228300020–228303814 (overlapping exon 56) (Table S4).
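To illustrate how a raw P-value below 0.05 can fail to reach significance after multiple-testing correction, the sketch below applies the Benjamini-Hochberg adjustment to a small set of P-values. This is only an illustration: the text does not state which FDR procedure was used, so Benjamini-Hochberg is an assumption, and the function name and the toy P-values are hypothetical.

```python
from typing import List


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy P-values for illustration only; these are not results from the study.
raw = [0.01, 0.04, 0.03, 0.5]
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(f"P = {p:.3f}  adjusted = {q:.4f}")
```

With these toy inputs, the raw values 0.03 and 0.04 are both below 0.05 while their adjusted values are slightly above it. The adjustment becomes stricter as the number of tested exons grows, and a small group size also limits how small the raw P-values can get.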
We developed an interactive visualization tool for the exon inclusion \(\Psi\) values using the R Shiny package [11], which is available at http://psivis.it.helsinki.fi:3838/OBSCN_PSIVIS/. The software allows users to zoom into specific regions within the OBSCN gene to view the distribution of the inclusion levels (i.e. \(\Psi\)) of the exons of interest and the measured statistics.
Alternative first/last exons and alternative 3’ splicing

We measured \(\Psi\) values of the four alternative final exons (Table 2) and two alternative first exons. It is worth noting that the results for exons 126 and 125 (i.e. alternative last exons) were reported together as their differences are minor (Table S5). Our results showed that exon 98 (i.e. exon 95 of the meta-transcript) was included significantly less in the mRNAs of postnatal muscles than in those of fetal muscles (FDR(AM/FM) = 3.574e-19, \(\Delta \Psi\)(AM/FM) = -36.74%) (Fig. 6). In contrast, exon 125 or 126 was included significantly more (FDR(AM/FM) = 3.419e-19, \(\Delta \Psi\)(AM/FM) = 36.69%) in the mRNAs of postnatal muscles than in those of fetal muscles (Fig. 6). As a consequence, the skipping of exon 97 (together with exon 98) was significantly upregulated in the postnatal muscles compared to fetal muscles (P(AM > FM) = 1e-04, \(\Delta\) EJ(AM/FM) = 0.34%) (Figs. 5J, 4T). These findings, together with the real-time polymerase chain reaction (RT-qPCR) results, indicate a higher abundance of the longer isoform obscurin-B in the postnatal skeletal muscles, in contrast to the higher abundance of the shorter isoform obscurin-A in the fetal skeletal muscles (Fig. 4T, Table S6).
Table 2 The studied OBSCN isoforms and exons. Each row of the table represents a unique OBSCN exon. The unique IDs based on the non-overlapping exons are listed below the “Meta-transcript exon number”. The six studied transcripts include ENST00000284548, ENST00000422127, ENST00000570156, ENST00000680850, ENST00000660857 and ENST00000493977. Below the columns labelled with these Ensembl IDs, for those exons included in the isoform, the exon numbers are stated and the corresponding cells are highlighted in green. In the columns labelled with “First exon” and “Last exon”, the cells corresponding to the alternative first and last exons are marked with TRUE and highlighted in green. The final column (i.e. on the right) includes the detailed information about the alternative splicing events and the alternative first or last exons

Fig. 6 The inclusion of the alternative last exons in OBSCN mRNAs. The boxplots illustrate the distribution of the \(\Psi\) levels of the alternative last exons. The sample classes for which the \(\Psi\) values are shown are: muscle vs heart (M), mostly adult postnatal vs fetal (A), postnatal muscle vs fetal muscle samples (AM/FM), postnatal heart vs fetal heart samples (AH/FH), postnatal muscle vs postnatal heart samples (AM/AH), and fetal muscle vs fetal heart samples (FM/FH). The significance levels are shown using asterisks: P < 0.05 (*), P < 0.01 (**) and P < 0.001 (***). They are coloured in red if \(\left|\Delta \psi \right|\ge 10\). The box plots extend from the 25th to the 75th percentile, and the thick horizontal line represents the median. The whiskers of the boxplots extend to 1.5 times the interquartile range, and values beyond the whiskers are shown as outliers. Since their start coordinates are identical, the \(\Psi\) values for exons 125 (i.e. 121a of meta-transcript, chr1: 228378618–228378874) and 126 (i.e. 121 of meta-transcript, chr1: 228378618–228378876) are indistinguishable and are therefore reported together
We also studied the alternative 3’ splicing related to the exons 122 and 123 (i.e. 119 and 119a of meta-transcript, respectively) (Table 2). The inclusion levels of exon 123 (i.e. 119a of meta-transcript) were very low and the upstream intron was rarely spliced across our studied samples, suggesting that almost all mRNAs in our samples included the alternative exon 122 (i.e. 119 of meta-transcript) (Fig. S4, Additional File 1).
The affected protein domains

Exons 17 and 18, which were frequently skipped in the adult skeletal muscle samples, are known to code for Ig domains (Table 2). Furthermore, exons 48–56, which were less included in the postnatal skeletal and cardiac muscles than in the equivalent prenatal samples, also code for Ig domains (Table 2). As mentioned earlier, our results showed a higher abundance of the longer OBSCN isoform (e.g. obscurin-B) compared to the shorter isoform (e.g. obscurin-A) in postnatal skeletal muscles, even though the shorter isoform was more abundant in fetal skeletal muscles. Compared to the short isoform (i.e. obscurin-A), the long isoform (i.e. obscurin-B) features an additional fibronectin type-III domain, two additional Ig sites and two serine/threonine type kinase sites. These variations in the domains can change the chemical/physical properties of a protein and ultimately affect its function.
Regulation of OBSCN exon inclusion by the splicing factors

Across the studied skeletal muscle samples, we examined the Spearman (rank) correlation between the expression of the splicing factors that were significantly differentially expressed (when comparing postnatal to fetal muscle samples) and the inclusion \(\Psi\) values of the OBSCN exons that were significantly differentially included (in postnatal muscle vs fetal muscle). Several significant correlations (|rho| > 0.4, P < 0.05) were detected, e.g. expression of DHX15, THOC1, PRPF1 with inclusion levels of exons 17 and 49 (Fig. 7A-J, Table S7-S10). Remarkably, however, the expression of BUB3 was significantly correlated with the inclusion levels of most of the significantly differentially included exons (Fig. 7A-I). The BUB3 gene belongs to the budding uninhibited by benomyl (BUB) protein family and is involved in mitosis, aging, carcinogenesis, as well as splicing [12, 13]. Our results suggest that BUB3 may regulate OBSCN splicing, especially during muscle development.
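The rank correlation used above can be sketched in a few lines. This is a minimal illustration with hypothetical function names and toy numbers, not the code used in the study: it computes Spearman's rho as the Pearson correlation of average ranks and applies only the |rho| > 0.4 cutoff. The P-value that the text also requires is not computed here.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation of the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Toy values for illustration only; these are not results from the study.
factor_expression = [8.1, 8.4, 9.0, 9.7, 10.2]  # e.g. normalized expression
exon_psi = [92.0, 88.0, 75.0, 70.0, 61.0]       # e.g. PSI in percent

rho = spearman_rho(factor_expression, exon_psi)
print(rho, abs(rho) > 0.4)
```

Because the toy expression values rise while the toy PSI values fall in every sample, the ranks are perfectly reversed and rho is -1 (up to floating-point rounding).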
Fig. 7 Correlation of the OBSCN exon inclusion levels with the expression of several splicing factors. A The rank correlation of the expression of the significantly differentially expressed splicing factors (comparing postnatal muscles vs fetal muscles) with the inclusion levels of the exons that were significantly differentially included (also when comparing postnatal muscles vs fetal muscles) is shown as a matrix of circles (i.e. a correlation plot). Only the significant (i.e. P < 0.05) correlations are shown. The size and the colour of the circles represent the correlation (i.e. rho value) of the corresponding splicing factor (labelled on the row) with the corresponding OBSCN exon (labelled on the column). Rho values higher than 0.4 or lower than -0.4 are also written out. B-J For a highly correlated pair (i.e. |rho| > 0.5, P < 0.05), the two lines in the plot show the PSI values of the OBSCN exon and the VST-normalized expression levels of the splicing factor (scaled to 100) across the studied samples