Abstract
Motivation
When analysing one-dimensional time series, scientists are often interested in identifying regions where one variable depends linearly on the other. Typically they use an ad hoc and therefore often subjective method to do so.
Results
Here we develop a statistically rigorous, Bayesian approach to infer the optimal partitioning of a data set not only into contiguous piece-wise linear segments, but also into contiguous segments described by linear combinations of arbitrary basis functions. We therefore present a general solution to the problem of identifying discontinuous change points. Focusing on microbial growth, we use the algorithm to find the range of optical density where this density is linearly proportional to the number of cells and to automatically find the regions of exponential growth for both Escherichia coli and Saccharomyces cerevisiae. For budding yeast, we are consequently able to infer the Monod constant for growth on fructose. Our algorithm lends itself to automation and high-throughput studies, increases reproducibility, and should facilitate data analyses for a broad range of scientists.
Availability and Implementation
The corresponding Python package, entitled Nunchaku, is available at PyPI: https://pypi.org/project/nunchaku.
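To make the idea of partitioning data into contiguous linear segments concrete, the following is a minimal sketch and not the Nunchaku algorithm itself. It swaps the Bayesian inference described above for ordinary least squares plus dynamic programming, and it assumes the number of segments is given rather than inferred. The names `segment_cost`, `best_partition`, and `min_len` are hypothetical and do not come from the package.

```python
import numpy as np


def segment_cost(x, y, i, j):
    """Residual sum of squares of a least-squares line fitted to points i..j-1."""
    A = np.column_stack([np.ones(j - i), x[i:j]])
    coef, *_ = np.linalg.lstsq(A, y[i:j], rcond=None)
    resid = y[i:j] - A @ coef
    return float(resid @ resid)


def best_partition(x, y, n_segments, min_len=3):
    """Split the data into n_segments contiguous pieces minimising total RSS.

    Illustrative stand-in only: a least-squares criterion, not a Bayesian one.
    Returns a list of (start, stop) index pairs, with stop exclusive.
    """
    n = len(x)
    # cost[k, j]: minimal total RSS when the first j points form k segments
    cost = np.full((n_segments + 1, n + 1), np.inf)
    back = np.zeros((n_segments + 1, n + 1), dtype=int)
    cost[0, 0] = 0.0
    for k in range(1, n_segments + 1):
        for j in range(k * min_len, n + 1):
            for i in range((k - 1) * min_len, j - min_len + 1):
                if not np.isfinite(cost[k - 1, i]):
                    continue
                c = cost[k - 1, i] + segment_cost(x, y, i, j)
                if c < cost[k, j]:
                    cost[k, j] = c
                    back[k, j] = i
    # Walk the back-pointers from the last point to recover the boundaries.
    bounds = []
    j = n
    for k in range(n_segments, 0, -1):
        i = int(back[k, j])
        bounds.append((i, j))
        j = i
    return bounds[::-1]


# Synthetic data: three linear regimes separated by discontinuous jumps.
rng = np.random.default_rng(0)
x = np.arange(30, dtype=float)
y = np.concatenate([
    x[:10],                  # slope 1
    50.0 - 3.0 * x[10:20],   # slope -3
    0.5 * x[20:] + 100.0,    # slope 0.5
]) + rng.normal(scale=0.1, size=30)

bounds = best_partition(x, y, n_segments=3)
print(bounds)
```

Unlike this sketch, the approach in the abstract also treats the choice of partitioning as a statistical inference problem and generalises from straight lines to linear combinations of arbitrary basis functions.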
Original language | English |
---|---|
Article number | btad688 |
Number of pages | 8 |
Journal | Bioinformatics |
Volume | 39 |
Issue number | 12 |
Early online date | 15 Nov 2023 |
DOIs | |
Publication status | E-pub ahead of print - 15 Nov 2023 |
Keywords
- Bayesian inference
- linear trend
- change point analysis
- growth curve
- mid-log phase
- exponential growth
Projects
- Using systems biology to determine how budding yeast coordinates carbon and nitrogen sensing for efficient growth
  Swain, P. (Principal Investigator)
  1/02/22 → 31/01/25
  Project: Research
  Status: Finished
Datasets
- Nunchaku: Optimally partitioning data into piece-wise contiguous segments
  Huo, Y. (Creator), Hongpei, L. (Creator) & Wang, X. (Creator), Edinburgh DataShare, 27 Nov 2023
  DOI: 10.7488/ds/7548, https://www.biorxiv.org/content/10.1101/2023.05.26.542406v1
  Dataset