RATIONALE AND OBJECTIVES: The goal was to investigate how the choice of metric used to estimate pulmonary nodule size affects both nodule characterization and the measured performance of computer-aided detection systems, because that performance is always qualified with respect to a given range of nodule sizes.
MATERIALS AND METHODS: This study used 265 whole-lung CT scans documented by the Lung Image Database Consortium (LIDC) using their protocol for nodule evaluation. Each inspected lesion was reviewed independently by four experienced radiologists, who provided boundary markings for nodules larger than 3 mm. Four size metrics, all based on the boundary markings, were considered: one unidimensional and two bidimensional measures on a single image slice, and one volumetric measure based on all the image slices. The radiologist boundaries were processed; nodules with markings from all four radiologists were analyzed to characterize the interradiologist variation, while nodules with at least one marking were used to examine the differences between the metrics.
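As a rough illustration of how such boundary-based size metrics might be computed, the following Python sketch makes several assumptions that the abstract does not confirm: that the unidimensional measure is the longest in-slice diameter of the outline, that the volumetric size is reported as the diameter of a sphere of equal volume (suggested by the three-dimensional sizes being given in mm), and that interradiologist variation is summarized by a per-nodule sample standard deviation. All function names are hypothetical, and the two bidimensional measures are omitted because their definitions are not given here.

```python
import math
import statistics
from itertools import combinations


def longest_diameter(boundary_mm):
    """Assumed unidimensional size: the greatest distance (mm) between
    any two points of one radiologist's in-slice boundary marking."""
    return max(math.dist(p, q) for p, q in combinations(boundary_mm, 2))


def equivalent_sphere_diameter(volume_mm3):
    """Assumed volumetric size: diameter (mm) of a sphere whose volume
    equals the volume enclosed by the markings on all slices."""
    return (6.0 * volume_mm3 / math.pi) ** (1.0 / 3.0)


def reader_std(sizes_mm):
    """Assumed variation summary: sample standard deviation (mm) of the
    size estimates that different radiologists gave for one nodule."""
    return statistics.stdev(sizes_mm)


# Hypothetical outline: corners of a 3 mm x 4 mm rectangle.
outline = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0), (0.0, 4.0)]
print(longest_diameter(outline))

# Hypothetical volume: a sphere of radius 2.5 mm.
print(equivalent_sphere_diameter(4.0 / 3.0 * math.pi * 2.5 ** 3))

# Hypothetical size estimates (mm) from four readers for one nodule.
print(reader_std([5.0, 6.0, 7.0, 8.0]))
```

Expressing the volumetric estimate as a diameter puts it in the same units as the single-slice measures, which is what allows the metrics to be compared directly in mm.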
RESULTS: Processing of the annotations yielded 127 nodules marked by all four radiologists and an extended set of 518 nodules each having at least one observation, with three-dimensional sizes ranging from 2.03 to 29.4 mm (average 7.05 mm, median 5.71 mm). Very high interobserver variation was observed for all four metrics: 95% of the estimated standard deviations fell in the following ranges for the three-dimensional, unidimensional, and two bidimensional size metrics, respectively (in mm): 0.49-1.25, 0.67-2.55, 0.78-2.11, and 0.96-2.69. A very large difference among the metrics was also observed: the widths of the 0.95 probability-coverage regions for the volumetric size estimate, conditional on a measurement of 10 mm by the unidimensional and the two bidimensional metrics, were 7.32, 7.72, and 6.29 mm, respectively.
CONCLUSIONS: The selection of data subsets for performance evaluation is strongly affected by the choice of size metric. The LIDC plans to include a single size measure for each nodule in its database. This metric is not intended as a gold standard for nodule size; rather, it is intended to facilitate the selection of unique, repeatable, size-limited nodule subsets.
- Databases as Topic
- Diagnosis, Computer-Assisted
- Image Processing, Computer-Assisted
- Imaging, Three-Dimensional
- Knowledge Bases
- Lung Neoplasms
- Observer Variation
- Radiology Information Systems
- Solitary Pulmonary Nodule
- Tomography, X-Ray Computed