The effect of microarray image compression on expression-based classification
Current gene-expression microarrays carry enormous amounts of information, so compression is necessary for efficient distribution and storage. This paper examines JPEG2000 compression of cDNA microarray images and addresses the accuracy of classification and feature selection based on the decompressed images. We choose JPEG2000 over other options because it is the latest international standard for image compression and offers lossy-to-lossless compression while achieving high lossless compression ratios on microarray images. The performance of JPEG2000 has been tested on three real data sets at compression ratios ranging from lossless to 45:1, and the effects of JPEG2000 compression/decompression on differential expression detection and phenotype classification have been examined. There is less than a 4% change in differential detection at compression ratios as high as 20:1, with detection accuracy suffering less than 2% for moderate to high intensity genes, and there is no significant effect on classification at ratios as high as 35:1. The supplementary material is available at http://gsp.tamu.edu/web2/Compression.
Keywords: Microarray, Classification, Compression
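To make the two quantities in the abstract concrete, the following minimal sketch computes a compression ratio and the fraction of genes whose differential-expression call changes after decompression. The function names, the toy intensities, and the two-fold log-ratio rule are assumptions made for illustration only; the paper's actual detection procedure is not described here.

```python
import math

def compression_ratio(original_bytes, compressed_bytes):
    """Size of the original image divided by the compressed size (20.0 means 20:1)."""
    return original_bytes / compressed_bytes

def differential_calls(red, green, fold=2.0):
    """Flag a gene when |log2(red/green)| reaches log2(fold).

    Assumption: a plain fold-change rule stands in for the paper's method.
    """
    threshold = math.log2(fold)
    return [abs(math.log2(r / g)) >= threshold for r, g in zip(red, green)]

def detection_change(calls_original, calls_decompressed):
    """Fraction of genes whose call differs between the two images."""
    differing = sum(a != b for a, b in zip(calls_original, calls_decompressed))
    return differing / len(calls_original)

# Hypothetical spot intensities extracted from the original image...
red_orig, green_orig = [400, 120, 900, 50], [100, 110, 300, 200]
# ...and from the same image after lossy compression and decompression.
red_dec, green_dec = [390, 125, 590, 52], [105, 108, 310, 190]

calls_orig = differential_calls(red_orig, green_orig)
calls_dec = differential_calls(red_dec, green_dec)
change = detection_change(calls_orig, calls_dec)
```

In this toy data only the third gene changes its call, so `change` is one gene out of four. The abstract's figures are the analogous fractions measured on real arrays.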