TY - CHAP

T1 - A regression tree-based Gibbs sampler to learn the regulation programs in a transcription regulatory module network

AU - Qi, J.

AU - Butler, G.

AU - Michoel, T.

PY - 2010/1/1

Y1 - 2010/1/1

AB - Many algorithms have been proposed to learn transcription regulatory networks from gene expression data. Bayesian networks, and the module network method in particular, have obtained promising results. The genes in a module share a regulation program (a regression tree) consisting of a set of parents and conditional probability distributions. Sharing one program across the genes of a module significantly decreases the search space of models and consequently avoids overfitting. The regulation program of a module is normally learned by a deterministic search algorithm that performs a series of greedy operations to maximize the Bayesian score. The major shortcoming of the deterministic search algorithm is that its result may represent only one of several possible regulation programs. To account for this model uncertainty, we propose a regression tree-based Gibbs sampling algorithm for learning regulation programs in module networks. The novelty of this work is a set of tree operations for generating new regression trees from a given tree; we show that this set of operations is sufficient to generate a well-mixing Gibbs sampler even on large data sets. The effectiveness of our algorithm is demonstrated by experiments on synthetic data and real biological data.

UR - http://www.scopus.com/inward/record.url?partnerID=yv4JPVwI&eid=2-s2.0-77955593676&md5=9e4934647fd45036ef2bc6d7d67a2406

U2 - 10.1109/CIBCB.2010.5510433

DO - 10.1109/CIBCB.2010.5510433

M3 - Other chapter contribution

AN - SCOPUS:77955593676

SN - 978-1-4244-6766-2

SP - 1

EP - 8

BT - 2010 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2010

ER -