Lena Maria Hackl1*, Amit Fenn1,2, Zakaria Louadi1,2, Jan Baumbach1,3, Tim Kacprowski4,5, Markus List2 and Olga Tsoy1
1Institute for Computational Systems Biology, University of Hamburg, Notkestrasse 9, 22607 Hamburg, Germany
2Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Arcisstraße 21, 80333 Munich, Germany
3Computational BioMedicine Lab, University of Southern Denmark, Campusvej 50, 5230 Odense, Denmark
4Division Data Science in Biomedicine, Peter L. Reichertz Institute for Medical Informatics of TU Braunschweig and Hannover Medical School, Rebenring 56, 38106 Braunschweig, Germany
5Braunschweig Integrated Centre of Systems Biology (BRICS), TU Braunschweig, Rebenring 56, 38106 Braunschweig, Germany
olga.tsoy [at] uni-hamburg.de
Abstract
MicroRNAs (miRNAs) are small non-coding RNA molecules that regulate gene expression post-transcriptionally by binding to specific target sites. Approximately 95% of human multi-exon genes can be alternatively spliced, which enables the production of functionally diverse transcripts and proteins from a single gene. In complex diseases such as cancer, dysregulation of both genes and miRNAs plays a significant role. According to most studies, miRNAs preferentially bind to the 3’-untranslated regions of mRNAs. However, through alternative splicing, transcripts might lose exons harboring miRNA target sites and, hence, become unresponsive to miRNA regulation.
To test this hypothesis, we studied the role of miRNA target sites in both coding and noncoding regions using six cancer data sets from The Cancer Genome Atlas (TCGA). First, we predicted miRNA target sites on mRNAs from their sequence using TarPmiR. For our analysis, we focused on miRNAs whose expression was negatively correlated with gene expression (as evidence for active regulation) and on genes that were at least moderately expressed and showed evidence of alternative splicing. We chose different subsets of transcripts to distinguish the effects of target sites in different gene regions. To assess whether alternative splicing interferes with miRNA regulation, we trained linear regression models to predict miRNA expression from transcript expression. Using nested models, we compared the predictive power of transcripts with miRNA target sites to that of transcripts without target sites in the investigated gene region. For all six cancer data sets and all subsets, models containing transcripts with target sites predicted miRNA abundance significantly better.
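To illustrate the idea of a nested-model comparison, the following is a minimal sketch on synthetic data, not the analysis pipeline used in this work. It assumes that the reduced model contains only transcripts without target sites, that the full model adds transcripts with target sites, and that the two are compared with a partial F-statistic; the abstract does not specify the test, and all names (`rss`, `nested_f_statistic`, `without_sites`, `with_sites`, `mirna`) are hypothetical.

```python
import numpy as np


def rss(X, y):
    """Residual sum of squares of an ordinary least-squares fit with intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return float(resid @ resid)


def nested_f_statistic(X_reduced, X_extra, y):
    """Partial F-statistic for adding the columns of X_extra to a model on X_reduced.

    Assumption: the reduced model is nested in the full model, i.e. the full
    model uses all reduced-model predictors plus the extra ones.
    """
    n = len(y)
    X_full = np.column_stack([X_reduced, X_extra])
    rss_reduced = rss(X_reduced, y)
    rss_full = rss(X_full, y)
    df_num = X_extra.shape[1]            # number of added predictors
    df_den = n - X_full.shape[1] - 1     # residual degrees of freedom (with intercept)
    f_stat = ((rss_reduced - rss_full) / df_num) / (rss_full / df_den)
    return f_stat, df_num, df_den


# Synthetic example (illustrative values only, not TCGA data).
rng = np.random.default_rng(0)
n_samples = 200
without_sites = rng.normal(size=(n_samples, 2))  # transcripts lacking target sites
with_sites = rng.normal(size=(n_samples, 2))     # transcripts carrying target sites

# Simulated miRNA expression, negatively associated with target-site transcripts.
mirna = (-1.0 * with_sites[:, 0] - 0.5 * with_sites[:, 1]
         + rng.normal(scale=0.5, size=n_samples))

f_stat, df_num, df_den = nested_f_statistic(without_sites, with_sites, mirna)
print(f"F({df_num}, {df_den}) = {f_stat:.1f}")
```

A large F-statistic relative to the F distribution with the reported degrees of freedom would indicate that the added transcripts improve the prediction of miRNA expression beyond the reduced model.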
We conclude that alternative splicing does interfere with miRNA regulation by skipping exons with miRNA target sites within the coding region.
Keywords: alternative splicing, miRNA, machine learning, nested models, cancer