Massive lossless data compression and multiple parameter estimation from galaxy spectra

AF Heavens, R Jimenez, O Lahav

Research output: Contribution to journal › Article › peer-review


We present a method for radical linear compression of data sets where the data are dependent on some number M of parameters. We show that, if the noise in the data is independent of the parameters, we can form M linear combinations of the data which contain as much information about all the parameters as the entire data set, in the sense that the Fisher information matrices are identical; i.e. the method is lossless. We explore how these compressed numbers fare when the noise is dependent on the parameters, and show that the method, though not precisely lossless, increases errors by a very modest factor. The method is general, but we illustrate it with a problem for which it is well suited: galaxy spectra, the data for which typically consist of ∼10^3 fluxes, and the properties of which are set by a handful of parameters such as age, and a parametrized star formation history. The spectra are reduced to a small number of data, which are connected to the physical processes entering the problem. This data compression offers the possibility of a large increase in the speed of determining physical parameters. This is an important consideration as data sets of galaxy spectra reach 10^6 in size, and the complexity of model spectra increases. In addition to this practical advantage, the compressed data may offer a classification scheme for galaxy spectra which is based rather directly on physical processes.
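
The lossless property stated in the abstract can be checked numerically with a minimal sketch. The abstract does not give the weighting vectors, so the construction below is an assumption: it orthonormalises the vectors C^{-1} ∂μ/∂θ_m with respect to the noise covariance C, which is one way to obtain M linear combinations whose Fisher matrix equals that of the full data when the noise does not depend on the parameters. The toy model (random mean derivatives, diagonal noise) and all names such as `compression_vectors` are hypothetical and chosen only for illustration.

```python
import numpy as np

def compression_vectors(dmu, cov):
    """Build M weighting vectors from the mean derivatives (assumed construction).

    dmu : (M, N) array, derivative of the mean data vector w.r.t. each parameter
    cov : (N, N) noise covariance, assumed independent of the parameters
    Returns an (M, N) array B whose rows satisfy B @ cov @ B.T == identity.
    """
    cinv_dmu = np.linalg.solve(cov, dmu.T).T  # row m is C^{-1} dmu_m
    vectors = []
    for m in range(dmu.shape[0]):
        b = cinv_dmu[m].copy()
        norm2 = dmu[m] @ cinv_dmu[m]
        # Remove the information already captured by earlier vectors.
        for bq in vectors:
            proj = dmu[m] @ bq
            b -= proj * bq
            norm2 -= proj ** 2
        vectors.append(b / np.sqrt(norm2))
    return np.array(vectors)

# Toy problem: N = 1000 "fluxes", M = 3 parameters (illustrative values only).
rng = np.random.default_rng(0)
N, M = 1000, 3
dmu = rng.normal(size=(M, N))                  # hypothetical mean derivatives
cov = np.diag(rng.uniform(0.5, 2.0, size=N))   # hypothetical noise covariance

B = compression_vectors(dmu, cov)

# Fisher matrix of the full data set: F_ab = dmu_a^T C^{-1} dmu_b.
F_full = dmu @ np.linalg.solve(cov, dmu.T)

# Fisher matrix of the M compressed numbers y = B x.
# Their covariance is B C B^T (the identity), their mean derivatives are B dmu^T.
dmu_comp = B @ dmu.T
F_comp = dmu_comp.T @ np.linalg.solve(B @ cov @ B.T, dmu_comp)

print(np.allclose(F_full, F_comp))
```

In this toy setting the 1000 data points are reduced to 3 numbers with no change in the Fisher matrix. The sketch does not cover the case of parameter-dependent noise, for which the abstract reports a modest increase in errors.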
Original language: English
Pages (from-to): 965-972
Number of pages: 8
Journal: Monthly Notices of the Royal Astronomical Society
Issue number: 4
Publication status: Published - 1 Oct 2000


  • methods: data analysis
  • methods: statistical
  • galaxies: fundamental parameters
  • galaxies: statistics
