Some Applications of Graph-Based Machine Learning Methods on Biological Data

Mladen Nikolić¹

¹Faculty of Mathematics, University of Belgrade, Studentski trg 16, 11000 Belgrade, Serbia

mladen.nikolic [at] matf.bg.ac.rs

Abstract

Machine learning has made considerable contributions to various fields, most notably by providing methods for predictive modeling and data analysis. Different kinds of data are usually best modeled by specialized machine learning models, tailored to the specifics of the data at hand. Graphs are an expressive data representation well suited to representing relationships between objects, such as interactions, hierarchies, and similarities, among others. Such structures can be found in many kinds of data, including biological data. The machine learning toolbox abounds with methods suitable for handling such data, and we consider several applications of graph-based machine learning methods to biological data. First, we discuss tree-like hierarchies over the target variable values and ways to account for such hierarchies in learning, with enzyme classification as a suitable application. Then, we discuss hierarchies over the target variable values that correspond to directed acyclic graphs, and graph neural networks as a suitable model for this kind of data, with protein function classification as a suitable application. Finally, we discuss the construction of similarity graphs over tabular instances, based on autoencoders and ideas from graph representation learning. We consider the application of such techniques to the exploratory analysis of biological data related to expression of schizophrenia.

Keywords: machine learning, graphs, biological data

Acknowledgement: I would like to thank my coauthors and collaborators: Jovana Kovačević, Petar Veličković, Stefan Spalević, Nevena Ćirić, Predrag Janjić, and Stefan Kapunac.
