S. Kapunac1*, S. Malkov1, M. Beljanski1, G. Pavlović Lažetić1, B. Stojanović2, M. Maljković1, A. Veljković1, N. Mitić1
1Faculty of Mathematics, University of Belgrade, Studentski trg 16, 11000 Belgrade, Serbia
2Mathematical Institute SASA, Knez Mihaila 36, 11000 Belgrade, Serbia
stefan.kapunac [at] matf.bg.ac.rs
Abstract
Repeats in nucleotide sequences are connected with various genome characteristics, and RNA secondary structures are related to repeats at the primary structure level. Four types of nucleotide repeats may be identified: direct non-complementary, direct complementary, inverse non-complementary and inverse complementary. Inverse complementary tandem repeats, for example, may form hairpin secondary structures, while inverse non-complementary repeats may be recognized by proteins. On the other hand, direct complementary and/or non-complementary repeats may be reflected in protein sequence repeats if they are found in the same reading frame within a protein-coding sequence.
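The four repeat types can be illustrated with a small sketch. It is not the method used in this study; it assumes exact matches, fixed-length components, and the usual reading of the terms, in which the right component equals the left component unchanged (direct non-complementary), complemented (direct complementary), reversed (inverse non-complementary) or reversed and complemented (inverse complementary). The names `transforms` and `find_repeats` are hypothetical, and sequences are assumed to be written in the DNA alphabet (T instead of U).

```python
# Complement table; the DNA alphabet (ACGT) is an assumption of this sketch.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def transforms(left: str) -> dict[str, str]:
    """Map each repeat type to the right component expected for `left`."""
    comp = left.translate(COMPLEMENT)
    return {
        "direct non-complementary": left,
        "direct complementary": comp,
        "inverse non-complementary": left[::-1],
        "inverse complementary": comp[::-1],
    }


def find_repeats(seq: str, k: int) -> list[tuple[str, int, int]]:
    """Return (type, left_start, right_start) for exact repeats of length k.

    Only pairs whose right component starts at or after the end of the
    left component are reported, so the two components never overlap.
    """
    # Index the start positions of every k-mer in the sequence.
    positions: dict[str, list[int]] = {}
    for i in range(len(seq) - k + 1):
        positions.setdefault(seq[i:i + k], []).append(i)

    found = []
    for i in range(len(seq) - k + 1):
        left = seq[i:i + k]
        for kind, target in transforms(left).items():
            for j in positions.get(target, []):
                if j >= i + k:
                    found.append((kind, i, j))
    return found


if __name__ == "__main__":
    for repeat in find_repeats("ACGTTTACG", 3):
        print(repeat)
```

A left component that reads the same in both directions satisfies more than one definition, so this sketch reports such a pair once under each matching type.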
Here we determined and compared all four types of nucleotide repeats in the reference sequences of the SARS-CoV-1, SARS-CoV-2 and MERS-CoV viruses. In addition to the complete repeat set, we analyzed several repeat subsets: repeats with the left component within the 5′ end, repeats with the right component within the 3′ end, and repeats with at least one component within the surface glycoprotein coding sequence.
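The three subsets can be expressed as positional filters over a list of repeats. The sketch below is illustrative only: the `Repeat` record, the function names, the half-open `(start, end)` regions and the reading of "within" as "fully contained in" are all assumptions, and the toy coordinates are not taken from any of the analyzed genomes.

```python
from typing import NamedTuple


class Repeat(NamedTuple):
    """Hypothetical record: 0-based component starts and component length."""
    left_start: int
    right_start: int
    length: int


def inside(start: int, length: int, region: tuple[int, int]) -> bool:
    """True if a component lies entirely within the half-open region."""
    lo, hi = region
    return lo <= start and start + length <= hi


def subsets(repeats, five_prime, three_prime, spike):
    """Split repeats into the three positional subsets described above."""
    return {
        "left in 5' end": [
            r for r in repeats if inside(r.left_start, r.length, five_prime)
        ],
        "right in 3' end": [
            r for r in repeats if inside(r.right_start, r.length, three_prime)
        ],
        "component in S gene": [
            r for r in repeats
            if inside(r.left_start, r.length, spike)
            or inside(r.right_start, r.length, spike)
        ],
    }


if __name__ == "__main__":
    # Toy coordinates on a 100-nt sequence, chosen only for illustration.
    toy = [Repeat(2, 50, 5), Repeat(30, 95, 5), Repeat(40, 60, 5)]
    result = subsets(toy, five_prime=(0, 10), three_prime=(90, 100), spike=(35, 70))
    for name, members in result.items():
        print(name, members)
```

The subsets are not mutually exclusive: a repeat whose left component lies in the 5′ end and whose right component lies in the surface glycoprotein coding sequence appears in both.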
We found significant differences among the analyzed sequences in every repeat set considered. At this moment we can only speculate about the real consequences of these differences.
Keywords: SARS-CoV, MERS-CoV, SARS-CoV-2, nucleotide repeats