Abstract
Estimating the size of hidden or difficult-to-reach populations is often of interest for economic, sociological or public health reasons. To estimate such populations, administrative data lists are often collated to form multi-list cross-counts and displayed in the form of an incomplete contingency table. Loglinear models are typically fitted to such data to obtain an estimate of the total population size by estimating the number of individuals not observed by any of the data sources. This approach has been taken to estimate the current number of people who inject drugs (PWID) in Scotland, with the Hepatitis C virus (HCV) diagnosis database used as one of the data sources to identify PWID. However, the HCV diagnosis data source does not distinguish between current and former PWID, which, if ignored, will lead to overestimation of the total population size of current PWID. We extend the standard model-fitting approach to allow for a data source which contains a mixture of target and non-target individuals (in this case, current and former PWID). We apply the proposed approach to data for PWID in Scotland in 2003, 2006 and 2009 and compare the results with those from standard loglinear models.
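As a rough illustration of the standard approach mentioned in the abstract (not the censored-cell extension the paper proposes), the simplest case is two data sources fitted with an independence loglinear model. In that case the estimate of the unobserved cell has a closed form: the product of the two single-source counts divided by the overlap count. The function names and all counts below are hypothetical and chosen only for illustration; they are not taken from the Scottish PWID data.

```python
def estimate_unobserved(n11, n10, n01):
    """Estimate the count seen by neither source (two-list independence model).

    n11: individuals on both lists
    n10: individuals on list A only
    n01: individuals on list B only
    """
    if n11 <= 0:
        raise ValueError("the overlap count n11 must be positive")
    # Under independence the odds ratio is 1, so n00 * n11 = n10 * n01.
    return n10 * n01 / n11


def estimate_total(n11, n10, n01):
    """Observed individuals plus the estimated unobserved cell."""
    return n11 + n10 + n01 + estimate_unobserved(n11, n10, n01)


# Hypothetical counts, for illustration only.
n11, n10, n01 = 50, 150, 100
print(estimate_unobserved(n11, n10, n01))  # 300.0
print(estimate_total(n11, n10, n01))       # 600.0
```

With more than two sources, or with interactions between sources, no closed form is generally available and the model is usually fitted as a Poisson regression on the observed cells.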
Original language  English 

Pages (from-to)  1564-1579 
Number of pages  16 
Journal  STATISTICS IN MEDICINE 
Volume  33 
Issue number  9 
Early online date  1 Dec 2013 
DOIs  
Publication status  Published - 30 Apr 2014 
Keywords
 Censoring
 Incomplete contingency table
 People who inject drugs
 Loglinear models
 Population size
 QA Mathematics
Title  Incomplete contingency tables with censored cells with application to estimating the number of people who inject drugs in Scotland 
Profiles

Ruth King
 School of Mathematics - The Thomas Bayes Chair of Statistics
Person: Academic: Research Active (Teaching)