In systems biology, the ability to compare models addresses a critically important question: given two or more models, and one or more data sets, which model structure (topology) explains the data best? Calculating the Bayesian evidence for a model, and for any competing models that can be specified, is a quantitative approach to answering this question. The Bayesian evidence is the result of a summation (more formally, an integration) over parameter values. This is in contrast to a point estimate of the goodness of fit for one specific combination of parameter values, which will typically only be guaranteed to be locally optimal and therefore limits the conclusions that can be drawn about the models. To address this question, this project applied the nested sampling algorithm (Skilling, 2006), which computes the Bayesian evidence and optimises model parameters, to systems biology models. These functions have been delivered to users by incorporating them in a new version of the popular stochastic simulation tool Dizzy (Ramsey et al., 2005). The nested sampling algorithms have also been released as R and Java code in order to make the technique available to users who require the power of computational environments for systems modelling.
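As an illustration of how nested sampling accumulates the evidence, the following is a minimal sketch in Python, not the project's R or Java code. It assumes a hypothetical one-parameter toy problem (a standard normal likelihood with a uniform prior on [-5, 5]) and replaces the discarded point by simple rejection sampling from the prior, which is only practical for toy problems. The function names `log_likelihood`, `sample_prior` and `nested_sampling` are invented for this example.

```python
import math
import random


def log_likelihood(theta):
    """Toy log-likelihood (assumption): standard normal density in one parameter."""
    return -0.5 * theta * theta - 0.5 * math.log(2.0 * math.pi)


def sample_prior(rng):
    """Toy prior (assumption): uniform on [-5, 5]."""
    return rng.uniform(-5.0, 5.0)


def nested_sampling(log_like, prior_draw, n_live=100, n_iter=500, seed=0):
    """Estimate log-evidence and return the best-fitting live point.

    Works in linear (not log) space, which is adequate only because the
    toy likelihood values are far from underflow.
    """
    rng = random.Random(seed)
    live = [prior_draw(rng) for _ in range(n_live)]
    live_logl = [log_like(t) for t in live]

    evidence = 0.0
    x_prev = 1.0  # enclosed prior volume before any point is discarded
    for i in range(1, n_iter + 1):
        # The lowest-likelihood live point defines the current contour.
        worst = min(range(n_live), key=live_logl.__getitem__)
        logl_star = live_logl[worst]

        # Expected shrinkage of the enclosed prior volume: X_i = exp(-i / N).
        x_i = math.exp(-i / n_live)
        evidence += math.exp(logl_star) * (x_prev - x_i)
        x_prev = x_i

        # Replace the discarded point with a prior draw inside the contour.
        while True:
            candidate = prior_draw(rng)
            candidate_logl = log_like(candidate)
            if candidate_logl > logl_star:
                break
        live[worst] = candidate
        live_logl[worst] = candidate_logl

    # Add the contribution of the remaining live points.
    evidence += x_prev * sum(math.exp(l) for l in live_logl) / n_live

    best = max(range(n_live), key=live_logl.__getitem__)
    return math.log(evidence), live[best]


if __name__ == "__main__":
    log_z, theta_best = nested_sampling(log_likelihood, sample_prior)
    print("estimated log-evidence:", log_z)
    print("best-fitting parameter:", theta_best)
```

For this toy problem the evidence can be checked analytically: the prior density is 1/10 and almost all of the likelihood mass lies inside the prior range, so the evidence is close to 0.1. A single run therefore yields both an evidence estimate and a high-likelihood parameter value, which is the combination of model comparison and parameter optimisation described above.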
The way that molecular systems are described is changing from the traditional diagrammatic sketch of likely interactions to a set of mathematical equations linking the rate of change of the amount of one molecule to the amounts of others. This project addresses the important issue of the justification for decisions made in modelling a biological system. We might like to say that only one model describes the data, but this is not possible for any complex system. Instead, we can hope to show that one model fits the data better than another, and this is the aim of the research proposed here. We shall apply a probabilistic approach that can optimise the fit of models to data, and quantitatively compare the extent to which they fit the data. This will provide useful information to the bench biologists and the systems biologists with whom they collaborate to further our knowledge of the cell.
The key outputs of this project are a model comparison methodology for systems biology based on the calculation of the Bayesian evidence Z (also known as the marginal likelihood), R and Java code implementing the nested sampling algorithm, and the application of the methodology to concrete questions of inferring the parameters of circadian models from data. These outputs and the means by which they were achieved are exactly in line with the original project aims.
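For reference, the evidence and its use in model comparison can be written as follows. The notation is an assumption made for this illustration rather than taken from the project documentation: θ denotes the parameters of model M, D the data, L the likelihood and π the prior.

```latex
% Evidence (marginal likelihood) of model M: the likelihood averaged over the prior
Z_M = p(D \mid M) = \int L(\theta)\, \pi(\theta)\, \mathrm{d}\theta,
\qquad L(\theta) = p(D \mid \theta, M), \quad \pi(\theta) = p(\theta \mid M)

% Two models are compared through the ratio of their evidences (the Bayes factor)
B_{12} = \frac{Z_1}{Z_2}
```

A Bayes factor greater than one indicates that the data favour model 1 over model 2, given the priors chosen for each model's parameters.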