Measurement error in time-series analysis: a simulation study comparing modelled and monitored data

Barbara K Butland, Ben Armstrong, Richard W Atkinson, Paul Wilkinson, Mathew R Heal, Ruth M Doherty, Massimo Vieno

Research output: Contribution to journal › Article › peer-review

Abstract

Background
Assessing health effects from background exposure to air pollution is often hampered by the sparseness of pollution monitoring networks. However, regional atmospheric chemistry-transport models (CTMs) can provide pollution data with national coverage at fine geographical and temporal resolution. We used statistical simulation to compare the impact on epidemiological time-series analysis of additive measurement error in sparse monitor data as opposed to geographically and temporally complete model data.

Methods
Statistical simulations were based on a theoretical area of 4 regions, each consisting of twenty-five 5 km × 5 km grid-squares. In the context of a 3-year Poisson regression time-series analysis of the association between mortality and a single pollutant, we compared the error impact of using daily grid-specific model data as opposed to daily regional average monitor data. We investigated how this comparison was affected if we changed the number of grids per region containing a monitor. To inform simulations, estimates (e.g. of pollutant means) were obtained from observed monitor data for 2003-2006 for national network sites across the UK and corresponding model data that were generated by the EMEP-WRF CTM. Average within-site correlations between observed monitor and model data were 0.73 and 0.76 for rural and urban daily maximum 8-hour ozone respectively, and 0.67 and 0.61 for rural and urban log_e(daily 1-hour maximum NO2).
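To make the idea of such a simulation concrete, the following is a minimal sketch, not the authors' design: it ignores the region and grid structure and simply simulates a 3-year daily Poisson time series, adds classical additive error to the exposure, and compares the fitted coefficient with the true one. All names and parameter values (`fit_poisson_slope`, `simulate_attenuation`, a baseline of 20 deaths per day, `true_beta`, `sd_true`, `sd_error`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_poisson_slope(x, y, n_iter=15):
    """Fit log E[y] = b0 + b1 * x by Newton-Raphson and return b1."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.array([np.log(y.mean()), 0.0])  # start at the intercept-only fit
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        score = X.T @ (y - mu)            # gradient of the Poisson log-likelihood
        info = X.T @ (X * mu[:, None])    # Fisher information matrix
        beta = beta + np.linalg.solve(info, score)
    return beta[1]

def simulate_attenuation(n_days=1095, true_beta=0.05, sd_true=1.0,
                         sd_error=0.6, n_sims=200, seed=0):
    """Return mean(estimated beta) / true beta under classical additive error.

    All defaults are illustrative assumptions, not values from the study.
    """
    rng = np.random.default_rng(seed)
    estimates = np.empty(n_sims)
    for s in range(n_sims):
        x_true = rng.normal(0.0, sd_true, n_days)              # true exposure
        deaths = rng.poisson(np.exp(np.log(20.0) + true_beta * x_true))
        x_obs = x_true + rng.normal(0.0, sd_error, n_days)     # error-prone exposure
        estimates[s] = fit_poisson_slope(x_obs, deaths)
    return estimates.mean() / true_beta

ratio = simulate_attenuation()
print(f"estimated / true coefficient: {ratio:.2f}")
```

Under these assumed settings the ratio should fall near sd_true² / (sd_true² + sd_error²) = 1 / 1.36, i.e. roughly 0.74 up to simulation noise, which corresponds to an attenuation of about a quarter. Setting `sd_error=0.0` should return a ratio close to 1.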

Results
When regional averages were based on 5 or 10 monitors per region, health effect estimates exhibited little bias. However, with only 1 monitor per region, the regression coefficient in our time-series analysis was attenuated by an estimated 6% for urban background ozone, 13% for rural ozone, 29% for urban background log_e(NO2) and 38% for rural log_e(NO2). For grid-specific model data the corresponding figures were 19%, 22%, 54% and 44% respectively, i.e. similar for rural log_e(NO2) but more marked for urban log_e(NO2).
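One simplified way to see why averaging over more monitors reduces bias is the textbook reliability ratio for purely classical error. The sketch below assumes that every monitor measures the same true daily series with independent additive error of equal variance, which is a strong simplification and not the error structure estimated in the study; the function name and the variances used are hypothetical.

```python
def attenuation_percent(var_true, var_error, n_monitors=1):
    """Percent attenuation of a regression coefficient under classical error.

    Assumes independent, equal-variance additive errors across monitors,
    so averaging n_monitors readings divides the error variance by n_monitors.
    """
    reliability = var_true / (var_true + var_error / n_monitors)
    return 100.0 * (1.0 - reliability)

# Illustrative variances only (not estimates from the paper).
for k in (1, 5, 10):
    print(k, round(attenuation_percent(1.0, 0.4, k), 1))
```

With these made-up variances the attenuation shrinks steadily as the number of monitors grows, which mirrors the qualitative pattern reported above without reproducing the reported percentages.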

Conclusion
Even if correlations between model and monitor data appear reasonably strong, additive classical measurement error in model data may lead to appreciable bias in health effect estimates. As process-based air pollution models become more widely used in epidemiological time-series analysis, assessments of error impact that include statistical simulation may be useful.

Original language: English
Article number: 136
Journal: BMC Medical Research Methodology
Volume: 13
Publication status: Published - 13 Nov 2013
