Integrating gene regulatory pathways into differential network analysis of gene expression data

Authors

Tyler Grimes

S. Steven Potter

Somnath Datta

Published

April 2, 2019

In Scientific Reports

Abstract

Background: The analysis of gene-gene co-expression networks provides insight into the function of gene products. Exposing network irregularities offers an avenue for discovery in systems biology; these pursuits can include the study of gene function in developmental biology and understanding and treating diseases. Modern methods for differential network analysis often have two drawbacks: they implicitly rely on the selection of a relatively small subset of genes before analysis, and they are not flexible to the choice of association measure.

Methods: A general framework for integrating known gene regulatory pathways into a differential network analysis is proposed. The framework allows for any gene-gene association measure to be used, and inference is carried out through permutation testing. A simulation study investigates the performance in identifying differentially connected genes when incorporating known pathways and compares the general framework to four state-of-the-art methods. Two RNA-seq datasets are analyzed to illustrate the use of this framework in practice.

Results: The simulation study shows that incorporating pathway information can improve performance in terms of both sensitivity and true discovery rate. Furthermore, we demonstrate that the state-of-the-art methods each estimate different things and are not directly comparable – this emphasizes the fact that the choice of association measure can have a strong influence on results. In the applied examples, the analysis reveals genes and pathways that are known to be biologically significant along with new findings to motivate future research.

Conclusions: The proposed framework makes explicit two critical, but often overlooked, assumptions: the selection of a subset of genes and the meaning of gene-gene association. The results obtained from analyzing gene expression with this framework are more interpretable, and the pathway information provides context that can lead to deeper insights.

Summary

A framework for comparing gene-gene associations between two populations is proposed. Any association measure may be used (correlation, partial correlation, mutual information, etc.); this flexibility is in contrast to the rigidity of existing methods. The framework makes explicit the practice of incorporating pathway information into the analysis. A simulation study explores the effect this has on performance, and we investigate what happens when pathway information is misspecified or incomplete.
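To make the idea concrete, here is a minimal Python sketch of a permutation test for differentially connected genes within a single pathway. It is an illustration under stated assumptions, not the dnapath implementation or its API: the function names, the per-gene score (the sum of absolute differences in association between the two groups), and the toy data are all hypothetical choices made for this example. The association measure is passed in as a function, so Pearson correlation could be swapped for any other measure.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def restrict_to_pathway(samples, gene_idx):
    """Keep only the columns (genes) that belong to one pathway."""
    return [[row[j] for j in gene_idx] for row in samples]


def association_matrix(samples, measure):
    """Pairwise association between genes; rows of `samples` are samples, columns are genes."""
    p = len(samples[0])
    cols = [[row[j] for row in samples] for j in range(p)]
    return [[0.0 if i == j else measure(cols[i], cols[j]) for j in range(p)]
            for i in range(p)]


def gene_scores(group1, group2, measure):
    """Assumed score: per-gene sum of absolute association differences between groups."""
    a1 = association_matrix(group1, measure)
    a2 = association_matrix(group2, measure)
    p = len(a1)
    return [sum(abs(a1[i][j] - a2[i][j]) for j in range(p)) for i in range(p)]


def permutation_pvalues(group1, group2, measure=pearson, n_perm=200, seed=0):
    """Permute group labels and compare each gene's score to its permutation distribution."""
    rng = random.Random(seed)
    observed = gene_scores(group1, group2, measure)
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    exceed = [0] * len(observed)
    for _ in range(n_perm):
        rng.shuffle(pooled)
        permuted = gene_scores(pooled[:n1], pooled[n1:], measure)
        for i, (obs, perm) in enumerate(zip(observed, permuted)):
            if perm >= obs:
                exceed[i] += 1
    # Add-one correction keeps p-values strictly positive.
    return [(c + 1) / (n_perm + 1) for c in exceed]


# Toy data (hypothetical): genes 0 and 1 are tightly co-expressed in group 1 only;
# gene 2 is unrelated noise; gene 3 lies outside the pathway and is dropped.
data_rng = random.Random(1)
group1, group2 = [], []
for _ in range(30):
    g0 = data_rng.gauss(0, 1)
    group1.append([g0, g0 + 0.1 * data_rng.gauss(0, 1),
                   data_rng.gauss(0, 1), data_rng.gauss(0, 1)])
    group2.append([data_rng.gauss(0, 1) for _ in range(4)])

pathway = [0, 1, 2]  # hypothetical pathway membership (column indices)
pvals = permutation_pvalues(restrict_to_pathway(group1, pathway),
                            restrict_to_pathway(group2, pathway))
```

In this sketch the pathway enters only as a column filter applied before the associations are estimated, which is one simple way to make the gene-subset choice explicit. Changing the `measure` argument changes what "association" means and therefore what the test detects.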

doi: 10.1038/s41598-019-41918-3

dnapath R package available on CRAN
