Using Singular Value Decomposition for Extracting Underlying Gene Expression Patterns in Transcriptomic Analysis

Biljana Stanković*, Mirjana Novković, Nikola Kotur

Institute of Molecular Genetics and Genetic Engineering, University of Belgrade, Belgrade, Serbia

biljana.stankovic [at] imgge.bg.ac.rs

Abstract

Singular Value Decomposition (SVD) is a mathematical approach that can be useful in the analysis of transcriptome data. SVD transforms gene expression data from the genes × arrays space to a reduced eigengenes × eigenarrays space. In high-throughput analysis, the goal is often to reduce the dimensionality of the data, exclude non-informative noise, and extract patterns that reflect biological processes. Arranging the data according to the eigenvectors provides a comprehensive overview of gene expression dynamics, in which individual genes are classified into groups of similar regulation and function, or similar cellular state and biological phenotype.

Here we applied SVD to microarray gene expression data covering 21,176 genes from ulcerative colitis patients who were either glucocorticoid-sensitive (n=20) or glucocorticoid-resistant (n=20). Our aim was to validate an SVD-based methodology as an alternative to classical differential gene expression analysis (PMID: 20941359).

The SVD process decomposes a matrix A (m × n, where m = number of genes and n = number of patients) into three matrices: U (m × m), D (m × n), and V^T (n × n), such that A = U D V^T. The V^T matrix, whose rows are the eigengenes, can be used to identify the eigengenes that differ the most between the sensitive and resistant patient groups. The greatest differences were found for eigengenes 7, 4, and 5 (t-test: p=0.007, p=0.078, and p=0.090, respectively). For each eigengene of interest, the genes with the highest absolute values of projection and correlation were listed and used for gene and disease ontology analysis. Our results showed that the top 40 genes with the highest projection on the selected eigengenes participate in the same five most important biological processes as the genes obtained from the differential gene expression analysis (PMID: 20941359).
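The workflow described above can be illustrated with a minimal sketch, which is not the authors' pipeline. It assumes synthetic data in place of the patient matrix, a hypothetical patient ordering (20 sensitive followed by 20 resistant), NumPy's thin SVD (so U is m × n rather than m × m), and Student's two-sample t-test, since the abstract does not specify the variant. All variable names are illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic stand-in for the expression matrix A (genes x patients).
# Sizes are illustrative only; the study used 21,176 genes and 40 patients.
m, n = 200, 40
A = rng.normal(size=(m, n))

# Hypothetical ordering: first 20 patients sensitive, last 20 resistant.
is_resistant = np.arange(n) >= 20

# Plant a group difference in the first 10 genes so there is a pattern to find.
A[:10, is_resistant] += 2.0

# Thin SVD: A = U @ diag(d) @ Vt. Rows of Vt are the eigengenes.
U, d, Vt = np.linalg.svd(A, full_matrices=False)

# Compare each eigengene between the two groups (one t-test per row of Vt).
res = stats.ttest_ind(Vt[:, ~is_resistant], Vt[:, is_resistant], axis=1)
p = res.pvalue
best = int(np.argmin(p))  # zero-based index of the most discriminating eigengene

# Projection of every gene onto the selected eigengene.
projection = A @ Vt[best]

# Pearson correlation of every gene's expression profile with that eigengene.
correlation = np.array([np.corrcoef(row, Vt[best])[0, 1] for row in A])

# Top 40 genes by absolute projection, as candidates for ontology analysis.
top_genes = np.argsort(-np.abs(projection))[:40]

print(f"eigengene {best}: p = {p[best]:.3g}")
print("top genes:", top_genes[:10])
```

Because the eigengenes are orthonormal, the projection of the genes onto eigengene k equals the k-th singular value times the k-th column of U, so ranking genes by projection is equivalent to ranking them by their loading in U.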

In summary, SVD is a powerful tool for gene expression analysis, capable of isolating significant biological patterns. Further validation on additional datasets is necessary to confirm the robustness of SVD compared to more commonly used methods for differential expression analysis.

Keywords: differential gene expression, dimensionality reduction, eigengenes