Digital images obtained by laser scanning of spotted microarrays often include saturated pixel values. These arise when the scan settings are high enough that some pixel intensities exceed the 16-bit limit L = 65535 and are recorded as L instead. Failure to adjust for this censoring leads to biased estimates of gene expression levels. To impute censored values, we propose a linear model based on the principal components of uncensored spots on the same array. The method is computationally fast, adapts to the distinctive spot shapes and profiles on different arrays, and is shown to be more effective than the polynomial-hyperbolic model in correcting for the bias. An application to biological data demonstrates the potential for enhancing the dynamic range of detection. Fortran90 subroutines implementing these methods are available at http://www.bioss.ac.uk/~chris.
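The imputation idea summarised in the abstract can be illustrated with a minimal Python sketch. It is not the authors' Fortran90 implementation, and the details are assumptions made for illustration: each spot is treated as a flattened vector of pixel intensities on a common grid, the principal components are taken from a mean-centred SVD of the uncensored spots, and the coefficients for a censored spot are fitted by ordinary least squares on its unsaturated pixels only. The function names `fit_spot_basis` and `impute_spot` are hypothetical.

```python
import numpy as np

LIMIT = 65535  # saturation limit L quoted in the abstract


def fit_spot_basis(uncensored_spots, n_components):
    """Mean profile and leading principal components of uncensored spots.

    uncensored_spots: array of shape (n_spots, n_pixels), one flattened
    spot per row, none of which contains a saturated pixel.
    """
    spots = np.asarray(uncensored_spots, dtype=float)
    mean = spots.mean(axis=0)
    # Rows of vt are the principal directions of the centred data.
    _, _, vt = np.linalg.svd(spots - mean, full_matrices=False)
    return mean, vt[:n_components].T  # shapes (n_pixels,), (n_pixels, k)


def impute_spot(spot, mean, basis, limit=LIMIT):
    """Replace saturated pixels of one spot by linear-model predictions."""
    spot = np.asarray(spot, dtype=float)
    censored = spot >= limit
    if not censored.any():
        return spot.copy()
    observed = ~censored
    # Fit component scores using the unsaturated pixels only.
    coef, *_ = np.linalg.lstsq(
        basis[observed], spot[observed] - mean[observed], rcond=None
    )
    fitted = mean + basis @ coef
    imputed = spot.copy()
    # Assumption: a censored pixel is known to be at least `limit`,
    # so predictions below the limit are raised to it.
    imputed[censored] = np.maximum(fitted[censored], limit)
    return imputed
```

In this sketch the number of components must not exceed the number of unsaturated pixels in the spot being imputed, otherwise the least-squares fit is underdetermined. Because all training spots lie below the limit, the prediction for a saturated spot is an extrapolation, which is one reason to treat this as an illustration rather than a validated procedure.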
|Journal|Statistical Applications in Genetics and Molecular Biology|
|Publication status|Published - 2007|
- Gene Expression Profiling
- Image Processing, Computer-Assisted
- Models, Theoretical
- Oligonucleotide Array Sequence Analysis
- Principal Component Analysis