Abstract / Description of output
We develop a machine learning-based framework to predict the HI content of galaxies from optical photometry and environmental parameters. We train the algorithm on z = 0-2 outputs from the MUFASA cosmological hydrodynamic simulation, which includes star formation, feedback, and a heuristic model to quench massive galaxies that yields a reasonable match to a range of survey data including HI. We employ a variety of machine learning methods (regressors), and quantify their performance using the slope of the predicted versus true relation, its root mean square error (RMSE), and its Pearson correlation coefficient (r). Training on only Sloan Digital Sky Survey photometry, all regressors give r > 0.8 and RMSE ~ 0.3 at z = 0, led by random forests with r = 0.91, and a deep neural network (DNN) with comparable accuracy (r = 0.9). Adding near-IR photometry improves all regressors. All regressors perform worse with redshift, particularly at z ≲ 1. Slope values are generally sub-linear, so that we overpredict HI in HI-poor galaxies and underpredict it in HI-rich galaxies, because the regressors do not fully capture the scatter in the data. We test our framework on REsolved Spectroscopy Of a Local VolumE (RESOLVE) and Arecibo Legacy Fast ALFA (ALFALFA) survey data. Training on a subset of the observations, we find that our machine learning method can reasonably predict HI richnesses in the remaining data (RMSE ~ 0.28 for RESOLVE and ~0.25 for ALFALFA). Training on mock data from MUFASA to predict observed data performs worse (RMSE ~ 0.45 for RESOLVE and 0.31 for ALFALFA), with the DNN clearly outperforming the other regressors. Our method will be useful for making galaxy-by-galaxy survey predictions and incompleteness corrections for upcoming HI 21 cm surveys on Square Kilometre Array precursors such as MeerKAT, over regions where photometry is already available.
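The three performance measures named in the abstract (slope, RMSE and Pearson r of the predicted versus true relation) can be illustrated with a minimal sketch. This is not the authors' code. It assumes the slope is an ordinary least-squares fit of predicted on true values, and the function name `evaluate_regressor` and the toy numbers are hypothetical, chosen only to show how a sub-linear slope overpredicts HI-poor galaxies and underpredicts HI-rich ones.

```python
import math


def evaluate_regressor(true_vals, pred_vals):
    """Return (slope, rmse, pearson_r) for predicted versus true values.

    Assumption: the slope is the ordinary least-squares slope of
    predicted on true; the abstract does not specify the fitting method.
    """
    n = len(true_vals)
    if n != len(pred_vals) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_t = sum(true_vals) / n
    mean_p = sum(pred_vals) / n
    # Sums of squares and cross-products about the means.
    s_tt = sum((t - mean_t) ** 2 for t in true_vals)
    s_pp = sum((p - mean_p) ** 2 for p in pred_vals)
    s_tp = sum((t - mean_t) * (p - mean_p) for t, p in zip(true_vals, pred_vals))
    slope = s_tp / s_tt
    rmse = math.sqrt(sum((p - t) ** 2 for t, p in zip(true_vals, pred_vals)) / n)
    pearson_r = s_tp / math.sqrt(s_tt * s_pp)
    return slope, rmse, pearson_r


# Toy data (invented, not from the paper): log HI richness for five galaxies,
# and a predictor that shrinks every value towards the sample mean of -0.5.
true_log_hi = [-1.5, -1.0, -0.5, 0.0, 0.5]
pred_log_hi = [-0.5 + 0.8 * (t + 0.5) for t in true_log_hi]

slope, rmse, r = evaluate_regressor(true_log_hi, pred_log_hi)
print(f"slope={slope:.2f}  RMSE={rmse:.3f}  r={r:.2f}")
```

In this toy case the correlation is perfect, yet the slope is below unity: the most HI-poor galaxy is predicted too high and the most HI-rich one too low. This is why the slope is worth reporting alongside r and RMSE.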
Original language | English |
---|---|
Pages (from-to) | 4509-4525 |
Number of pages | 17 |
Journal | Monthly Notices of the Royal Astronomical Society |
Volume | 479 |
Issue number | 4 |
Early online date | 5 Jul 2018 |
Publication status | Published - 1 Oct 2018 |
Keywords / Materials (for Non-textual outputs)
- Galaxies: evolution
- Galaxies: statistics
- Methods: numerical