Omics Data Fusion for Understanding Molecular Complexity Enabling Precision Medicine

Nataša Pržulj1,2,3

1Barcelona Supercomputing Center (BSC), Barcelona 08034, Spain

2Department of Computer Science, University College London, London WC1E 6BT, UK

3ICREA, Pg. Lluís Companys 23, 08010 Barcelona, Spain

Abstract

We are flooded by increasing volumes of heterogeneous, interconnected, systems-level molecular (multi-omic) data, which provide complementary information about cells, tissues and diseases. We need to utilize these data to better stratify patients into risk groups, discover new biomarkers, repurpose known drugs, and discover new ones in order to personalize medical treatment. This is nontrivial because many of the underlying problems are computationally intractable, necessitating the development of algorithms that find approximate solutions (heuristics).

We develop a versatile data fusion (integration) machine learning (ML) framework to address key challenges in precision medicine from these data: better stratification of patients, prediction of biomarkers, and repurposing of approved drugs for particular patient groups, applied to cancer, COVID-19, rare thrombophilia and Parkinson’s disease. Our new methods stem from graph-regularized non-negative matrix tri-factorization (NMTF), an ML technique for dimensionality reduction, inference and co-clustering of heterogeneous datasets, coupled with novel network science algorithms. We utilize our new framework to develop methodologies for improving the understanding of molecular organization and disease from the omics data embedding space.
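
To make the NMTF idea concrete, the following is a minimal generic sketch and not the framework described in this abstract. It assumes a single non-negative relation matrix X (for example, patients by genes) factorized as X ≈ F S Gᵀ, with one symmetric non-negative network A (for example, a patient similarity graph) regularizing F through its graph Laplacian. It uses textbook-style multiplicative updates without orthogonality constraints. The function name, the parameters (k1, k2, lam) and the toy data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def graph_regularized_nmtf(X, A, k1, k2, lam=0.1, n_iter=200, eps=1e-9, seed=0):
    """Illustrative sketch: minimize ||X - F S G^T||_F^2 + lam * tr(F^T (D - A) F).

    X : (n, m) non-negative data matrix.
    A : (n, n) symmetric non-negative adjacency matrix over the rows of X.
    Returns non-negative factors F (n, k1), S (k1, k2), G (m, k2).
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    # Random strictly positive initialization (an assumption; other schemes exist).
    F = rng.random((n, k1)) + eps
    S = rng.random((k1, k2)) + eps
    G = rng.random((m, k2)) + eps
    D = np.diag(A.sum(axis=1))  # degree matrix; the graph Laplacian is D - A

    for _ in range(n_iter):
        # Each update multiplies by (negative gradient part) / (positive gradient part),
        # which keeps every factor non-negative.
        F *= (X @ G @ S.T + lam * (A @ F)) / (F @ (S @ G.T @ G @ S.T) + lam * (D @ F) + eps)
        G *= (X.T @ F @ S) / (G @ (S.T @ F.T @ F @ S) + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
    return F, S, G


# Toy example: two row groups and two column groups with a little noise.
rng = np.random.default_rng(1)
X = np.zeros((6, 8))
X[:3, :4] = 1.0
X[3:, 4:] = 1.0
X += 0.05 * rng.random(X.shape)

# Toy network linking rows within the same group (no self-loops).
A = np.zeros((6, 6))
A[:3, :3] = 1.0
A[3:, 3:] = 1.0
np.fill_diagonal(A, 0.0)

F, S, G = graph_regularized_nmtf(X, A, k1=2, k2=2)
rel_err = np.linalg.norm(X - F @ S @ G.T) / np.linalg.norm(X)
row_clusters = F.argmax(axis=1)  # co-clustering: assign each row to its dominant factor
col_clusters = G.argmax(axis=1)  # and each column likewise
print(f"relative reconstruction error: {rel_err:.3f}")
```

In this sketch the rows of F and G act as low-dimensional embeddings of the two entity types, and S captures how row clusters relate to column clusters. Fusing several data types would involve sharing factors across multiple relation matrices, which is not shown here.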
