Nunchaku: Optimally partitioning data into piece-wise contiguous segments

Yu Huo, Hongpei Li, Xiao Wang, Xiaochen Du, Peter S. Swain*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Motivation
When analysing one-dimensional time series, scientists are often interested in identifying regions where one variable depends linearly on the other. Typically they use an ad hoc and therefore often subjective method to do so.
Results
Here we develop a statistically rigorous, Bayesian approach to infer the optimal partitioning of a data set not only into contiguous piece-wise linear segments, but also into contiguous segments described by linear combinations of arbitrary basis functions. We therefore present a general solution to the problem of identifying discontinuous change points. Focusing on microbial growth, we use the algorithm to find the range of optical density where this density is linearly proportional to the number of cells and to automatically find the regions of exponential growth for both Escherichia coli and Saccharomyces cerevisiae. For budding yeast, we are consequently able to infer the Monod constant for growth on fructose. Our algorithm lends itself to automation and high-throughput studies, increases reproducibility, and should facilitate data analyses for a broad range of scientists.
Availability and Implementation
The corresponding Python package, entitled Nunchaku, is available at PyPI: https://pypi.org/project/nunchaku.
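
To make the partitioning problem concrete, the following is a minimal sketch, not the Nunchaku implementation and not its API. It swaps the Bayesian evidence described above for a plain least-squares cost and assumes the number of segments is given rather than inferred. The function name partition_linear and the parameters num_segments and min_len are hypothetical, introduced only for illustration. It uses dynamic programming over segment boundaries to find the contiguous split that minimises the total squared residual of a straight-line fit in each segment.

```python
from itertools import accumulate


def _prefix(values):
    # Prefix sums with a leading zero, so a sum over [i, j) is p[j] - p[i].
    return [0.0] + list(accumulate(values))


def partition_linear(x, y, num_segments, min_len=3):
    """Split (x, y) into contiguous segments, each fitted by its own line.

    Illustrative least-squares stand-in (hypothetical helper): returns a list
    of half-open index ranges (start, stop) minimising the summed squared
    residuals of independent straight-line fits.
    """
    n = len(x)
    if len(y) != n or num_segments < 1 or num_segments * min_len > n:
        raise ValueError("inconsistent input sizes")

    sx, sy = _prefix(x), _prefix(y)
    sxx = _prefix(a * a for a in x)
    sxy = _prefix(a * b for a, b in zip(x, y))
    syy = _prefix(b * b for b in y)

    def cost(i, j):
        # Residual sum of squares of the best-fit line through points i..j-1.
        m = j - i
        mean_x = (sx[j] - sx[i]) / m
        mean_y = (sy[j] - sy[i]) / m
        vxx = (sxx[j] - sxx[i]) - m * mean_x * mean_x
        vxy = (sxy[j] - sxy[i]) - m * mean_x * mean_y
        vyy = (syy[j] - syy[i]) - m * mean_y * mean_y
        if vxx <= 0.0:  # all x equal: the fit degenerates to a constant
            return max(vyy, 0.0)
        return max(vyy - vxy * vxy / vxx, 0.0)

    inf = float("inf")
    # best[k][j]: minimal cost of covering the first j points with k segments.
    best = [[inf] * (n + 1) for _ in range(num_segments + 1)]
    back = [[0] * (n + 1) for _ in range(num_segments + 1)]
    best[0][0] = 0.0
    for k in range(1, num_segments + 1):
        for j in range(k * min_len, n + 1):
            for i in range((k - 1) * min_len, j - min_len + 1):
                if best[k - 1][i] == inf:
                    continue
                total = best[k - 1][i] + cost(i, j)
                if total < best[k][j]:
                    best[k][j] = total
                    back[k][j] = i

    # Walk the back-pointers from the end to recover the boundaries.
    bounds, j = [], n
    for k in range(num_segments, 0, -1):
        i = back[k][j]
        bounds.append((i, j))
        j = i
    return bounds[::-1]


# Two noiseless lines with a jump between them (synthetic data).
xs = [float(i) for i in range(20)]
ys = [v if v < 10 else 50.0 - 2.0 * v for v in xs]
segments = partition_linear(xs, ys, num_segments=2)
```

Unlike this sketch, the approach described in the abstract is Bayesian and is not restricted to straight lines, so the least-squares cost here should be read only as a simple placeholder for the segment score.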
Original language: English
Article number: btad688
Number of pages: 8
Journal: Bioinformatics
Volume: 39
Issue number: 12
Early online date: 15 Nov 2023
Publication status: E-pub ahead of print - 15 Nov 2023

Keywords / Materials (for Non-textual outputs)

  • Bayesian inference
  • linear trend
  • change point analysis
  • growth curve
  • mid-log phase
  • exponential growth
