Abstract
Lung cancer is the leading cause of death from cancer in the US and the world [1]. The high mortality rate (80–85% within 5 years) results, in part, from a lack of effective tools to diagnose the disease at an early stage [2–4]. Given that cigarette smoke creates a field of injury throughout the airway [5–11], we sought to determine if gene expression in histologically normal large-airway epithelial cells obtained at bronchoscopy from smokers with suspicion of lung cancer could be used as a lung cancer biomarker. Using a training set (n = 77) and gene-expression profiles from Affymetrix HG-U133A microarrays, we identified an 80-gene biomarker that distinguishes smokers with and without lung cancer. We tested the biomarker on an independent test set (n = 52), with an accuracy of 83% (80% sensitive, 84% specific), and on an additional validation set independently obtained from five medical centers (n = 35). Our biomarker had ∼90% sensitivity for stage 1 cancer across all subjects. Combining cytopathology of lower airway cells obtained at bronchoscopy with the biomarker yielded 95% sensitivity and a 95% negative predictive value. These findings indicate that gene expression in cytologically normal large-airway epithelial cells can serve as a lung cancer biomarker, potentially owing to a cancer-specific airway-wide response to cigarette smoke.
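For readers less familiar with the performance measures quoted in the abstract (accuracy, sensitivity, specificity and negative predictive value), the following is a minimal sketch of how they are conventionally derived from a two-by-two table of predictions against final diagnoses. The class name `ConfusionCounts` and the counts used are hypothetical and chosen only for illustration; they are not the study's data, and the code is not the authors' analysis pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a two-by-two table of test result versus true diagnosis."""
    tp: int  # cancer patients the test calls positive
    fn: int  # cancer patients the test calls negative
    tn: int  # non-cancer patients the test calls negative
    fp: int  # non-cancer patients the test calls positive

    def sensitivity(self) -> float:
        # Fraction of patients with cancer who test positive.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Fraction of patients without cancer who test negative.
        return self.tn / (self.tn + self.fp)

    def accuracy(self) -> float:
        # Fraction of all patients classified correctly.
        total = self.tp + self.fn + self.tn + self.fp
        return (self.tp + self.tn) / total

    def negative_predictive_value(self) -> float:
        # Fraction of negative test results that are truly cancer-free.
        return self.tn / (self.tn + self.fn)


# Hypothetical counts for illustration only (NOT taken from the study).
counts = ConfusionCounts(tp=40, fn=10, tn=42, fp=8)

print(f"sensitivity = {counts.sensitivity():.2f}")
print(f"specificity = {counts.specificity():.2f}")
print(f"accuracy    = {counts.accuracy():.2f}")
print(f"NPV         = {counts.negative_predictive_value():.2f}")
```

Unlike sensitivity and specificity, the negative predictive value depends on how common cancer is in the tested group, so a value reported in one cohort does not transfer directly to a population with a different prevalence.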
References
Parkin, D.M., Bray, F., Ferlay, J. & Pisani, P. Global cancer statistics, 2002. CA Cancer J. Clin. 55, 74–108 (2005).
Hirsch, F.R., Merrick, D.T. & Franklin, W.A. Role of biomarkers for early detection of lung cancer and chemoprevention. Eur. Respir. J. 19, 1151–1158 (2002).
Jett, J.R. Limitations of screening for lung cancer with low-dose spiral computed tomography. Clin. Cancer Res. 11, 4988s–4992s (2005).
Macredmond, R. et al. Screening for lung cancer using low dose CT scanning: results of 2 year follow up. Thorax 61, 54–56 (2006).
Auerbach, O., Hammond, E.C., Kirman, D. & Garfinkel, L. Effects of cigarette smoking on dogs. II. Pulmonary neoplasms. Arch. Environ. Health 21, 754–768 (1970).
Powell, C.A., Klares, S., O'Connor, G. & Brody, J.S. Loss of heterozygosity in epithelial cells obtained by bronchial brushing: clinical utility in lung cancer. Clin. Cancer Res. 5, 2025–2034 (1999).
Wistuba, I.I. et al. Molecular damage in the bronchial epithelium of current and former smokers. J. Natl. Cancer Inst. 89, 1366–1373 (1997).
Franklin, W.A. et al. Widely dispersed p53 mutation in respiratory epithelium. A novel mechanism for field carcinogenesis. J. Clin. Invest. 100, 2133–2137 (1997).
Guo, M. et al. Promoter hypermethylation of resected bronchial margins: a field defect of changes? Clin. Cancer Res. 10, 5131–5136 (2004).
Miyazu, Y.M. et al. Telomerase expression in noncancerous bronchial epithelia is a possible marker of early development of lung cancer. Cancer Res. 65, 9623–9627 (2005).
Spira, A. et al. Effects of cigarette smoke on the human airway epithelial cell transcriptome. Proc. Natl. Acad. Sci. USA 101, 10143–10148 (2004).
Postmus, P.E. Bronchoscopy for lung cancer. Chest 128, 16–18 (2005).
Mazzone, P., Jain, P., Arroliga, A.C. & Matthay, R.A. Bronchoscopy and needle biopsy techniques for diagnosis and staging of lung cancer. Clin. Chest Med. 23, 137–158 (2002).
Schreiber, G. & McCrory, D.C. Performance characteristics of different modalities for diagnosis of suspected lung cancer: summary of published evidence. Chest 123, 115S–128S (2003).
Salomaa, E.R., Sallinen, S., Hiekkanen, H. & Liippo, K. Delays in the diagnosis and treatment of lung cancer. Chest 128, 2282–2288 (2005).
Golub, T.R. et al. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286, 531–537 (1999).
Tibshirani, R., Hastie, T., Narasimhan, B. & Chu, G. Diagnosis of multiple cancer types by shrunken centroids of gene expression. Proc. Natl. Acad. Sci. USA 99, 6567–6572 (2002).
Bhattacharjee, A. et al. Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proc. Natl. Acad. Sci. USA 98, 13790–13795 (2001).
Wachi, S., Yoneda, K. & Wu, R. Interactome-transcriptome analysis reveals the high centrality of genes differentially expressed in lung cancer tissues. Bioinformatics 21, 4205–4208 (2005).
Raponi, M. et al. Gene expression signatures for predicting prognosis of squamous cell and adenocarcinomas of the lung. Cancer Res. 66, 7466–7472 (2006).
Potti, A. et al. A genomic strategy to refine prognosis in early-stage non-small-cell lung cancer. N. Engl. J. Med. 355, 570–580 (2006).
Cheng, K.W., Lahad, J.P., Gray, J.W. & Mills, G.B. Emerging role of RAB GTPases in cancer and human disease. Cancer Res. 65, 2516–2519 (2005).
Shimada, K. et al. Aberrant expression of RAB1A in human tongue cancer. Br. J. Cancer 92, 1915–1921 (2005).
Kamio, T. et al. B-cell-specific transcription factor BACH2 modifies the cytotoxic effects of anticancer drugs. Blood 102, 3317–3322 (2003).
Xie, K. Interleukin-8 and human cancer biology. Cytokine Growth Factor Rev. 12, 375–391 (2001).
Arimura, Y. et al. Elevated serum beta-defensins concentrations in patients with lung cancer. Anticancer Res. 24, 4051–4057 (2004).
Coussens, L.M. & Werb, Z. Inflammation and cancer. Nature 420, 860–867 (2002).
Gudmundsson, G. & Hunninghake, G.W. Respiratory epithelial cells release interleukin-8 in response to a thermophilic bacteria that causes hypersensitivity pneumonitis. Exp. Lung Res. 25, 217–228 (1999).
Su, A.I. et al. A gene atlas of the mouse and human protein-encoding transcriptomes. Proc. Natl. Acad. Sci. USA 101, 6062–6067 (2004).
Bolstad, B.M., Irizarry, R.A., Astrand, M. & Speed, T.P. A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics 19, 185–193 (2003).
Acknowledgements
We thank C. O'Hara for histologic review of our airway epithelial cell samples; J. Warrington, J. Palma and R. Lipshutz for their support in designing and implementing this study; M. Klempner and D. Center for their review of the manuscript; F. O'Connell, J. Lundebye and the Lung Cancer Multi-Disciplinary Team at St. James's Hospital; and the doctors and nurses of the bronchoscopy service at Boston Medical Center, St. James's Hospital and Lahey Clinic. Affymetrix Inc. provided the HG-U133A arrays for these studies. This work was supported by the Doris Duke Charitable Foundation (A.S.), US National Institutes of Health/National Institute of Environmental Health Sciences (ES10377 to J.S.B.) and National Institutes of Health/National Cancer Institute (R21CA10650 to A.S.).
Author information
Contributions
A.S. was responsible for the conception and design of this study and oversaw all aspects of the study including patient recruitment, experimental protocols and data analysis. J.E.B. contributed to the design of the analytic strategy and was responsible for the computational analysis of gene-expression data including preprocessing, class prediction and the connection to tumor tissue. V.S. contributed to the analysis of gene-expression and clinical data and optimization of the class prediction algorithm. K.S. was responsible for patient recruitment and for collection and analysis of clinical data on all subjects in this study. G.L. performed the microarray experiments and real-time PCR studies and was responsible for QRTPCR data analysis. F.S. performed the histologic studies of airway samples and the immunofluorescence studies. S.G. recruited subjects, collected samples and contributed to the analysis of clinical data on all subjects. Y.-M.D. was responsible for coordinating all patient recruitment and sample collection. P.C., J.B., C.L. and T.A. recruited subjects and collected samples at their respective institutions. P.S. contributed to the statistical analysis of the data. S.S. contributed to the development of the relational database. N.G. performed all microarray hybridizations. J.K. recruited subjects, collected samples and provided support in the design of the study. M.E.L. was responsible for conceptualizing many aspects of the analytic strategy and directed the computational analysis. J.S.B. was responsible for the conception and design of the study and oversaw the experimental studies and biological interpretation of the data. A.S., J.E.B., V.S., M.E.L. and J.S.B. were responsible for the writing of the manuscript and for the supplementary information.
Ethics declarations
Competing interests
Affymetrix Inc. provided the HG-U133A arrays and some of the reagents for these studies.
Supplementary information
Supplementary Fig. 1
Confirmation of expression differences for selected biomarker genes by RT-PCR. (PDF 18 kb)
Supplementary Fig. 2
Inflammatory gene expression in bronchial epithelial cells. (PDF 39 kb)
Supplementary Fig. 3
Biomarker accuracy is independent of the composition of the training set. (PDF 19 kb)
Supplementary Fig. 4
Comparison of bronchoscopy cytopathology and biomarker prediction accuracies in our primary dataset by (a) cancer stage or (b) cancer subtype. (PDF 24 kb)
Supplementary Table 1
Patient demographics by dataset and cancer status. (PDF 17 kb)
Supplementary Table 2
Cell type and staging information for the 60 lung cancer patients in the n = 129 primary dataset. (PDF 11 kb)
Supplementary Table 3
Functional classification of biomarker genes. (PDF 17 kb)
Supplementary Table 4
Comparing the airway biomarker to randomized biomarkers. (PDF 16 kb)
Supplementary Table 5
Primers used for real time PCR. (PDF 14 kb)
About this article
Cite this article
Spira, A., Beane, J., Shah, V. et al. Airway epithelial gene expression in the diagnostic evaluation of smokers with suspect lung cancer. Nat Med 13, 361–366 (2007). https://doi.org/10.1038/nm1556
DOI: https://doi.org/10.1038/nm1556
- Springer Nature America, Inc.