Candidate biomarkers for discrimination between infection and disease caused by Mycobacterium tuberculosis
Infection with Mycobacterium tuberculosis is controlled by an efficacious immune response in about 90% of infected individuals who do not develop disease. Although essential mediators of protection, e.g., interferon-γ, have been identified, these factors are insufficient to predict the outcome of M. tuberculosis infection. As a first step to determine additional biomarkers, we compared gene expression profiles of peripheral blood mononuclear cells from tuberculosis patients and M. tuberculosis-infected healthy donors by microarray analysis. Differentially expressed candidate genes were predominantly derived from monocytes and comprised molecules involved in antimicrobial defense, inflammation, chemotaxis, and intracellular trafficking. We verified differential expression for alpha-defensin 1, alpha-defensin 4, lactoferrin, Fcγ receptor 1A (cluster of differentiation 64 [CD64]), bactericidal permeability-increasing protein, and formyl peptide receptor 1 by quantitative polymerase chain reaction analysis. Moreover, we identified increased protein expression of CD64 on monocytes from tuberculosis patients. Candidate biomarkers were then assessed for optimal study group discrimination. Using a linear discriminant analysis, a minimal group of genes comprising lactoferrin, CD64, and the Ras-associated GTPase 33A was sufficient for classification of (1) tuberculosis patients, (2) M. tuberculosis-infected healthy donors, and (3) noninfected healthy donors.
Keywords: Mycobacterium tuberculosis; Tuberculosis; Biomarkers
At the beginning of the twenty-first century, tuberculosis (TB) remains a major cause of morbidity and mortality in humans worldwide (http://www.who.int/gtb). Yet, only 10% of about two billion individuals infected with Mycobacterium tuberculosis develop active disease. It is generally accepted that a competent immune system is crucial for protection against this pathogen, but the exact underlying mechanisms remain elusive. Therefore, identification of biomarkers of protective immunity against M. tuberculosis is critical, especially for the clinical evaluation of efficacious vaccines.
Interferon gamma (IFN-γ) production and a type 1-dominated immune response are widely regarded as biomarkers of protective immunity. Nevertheless, IFN-γ represents an insufficient correlate of protection, and therefore, additional indicators are needed to define susceptibility to TB. Candidate indicators are T cell-derived molecules like the mycobactericidal mediator granulysin, which correlates with protection and clinical improvement in mycobacterial disease, and, possibly, molecules from other immune cell populations, e.g., macrophages and granulocytes. Regarding the role of macrophages (and their precursor monocytes in the blood), several molecules are involved in the interaction between host and pathogen in TB infection. These candidates comprise molecules from distinct functional groups including pathogen receptors, e.g., toll-like receptors (reviewed in ), regulators of intracellular vesicle trafficking, molecules involved in iron metabolism, and antimicrobial effector molecules, e.g., reactive nitrogen and oxygen intermediates (reviewed in ) and defensins. Although these effector mechanisms may not be specific to TB, it is tempting to speculate that the combined analysis of a pattern of differentially expressed candidates (a “biosignature”) will allow discrimination between long-term protection and disease activation in TB.
In this study, we analyzed the gene expression profiles of peripheral blood mononuclear cells (PBMC) from TB patients and M. tuberculosis-infected healthy donors. Recent studies demonstrated the feasibility of using PBMC for gene expression analyses to discover characteristic patterns of cancer and autoimmune diseases [3, 4]. As ethical considerations and accessibility restrict the usage of affected tissue from TB patients, PBMC are the first choice as a surrogate tissue in this chronic infection.
In a first step to narrow down the choice of relevant candidates, we performed microarray analyses comparing a randomly chosen subgroup of TB patients and healthy M. tuberculosis-infected individuals. Preselected genes were then categorized to reduce the number of false-positive genes. Preselected candidate genes involved in antimicrobial processes were then analyzed by quantitative polymerase chain reaction (qPCR). Two candidates, the formyl peptide receptor 1 (FPR1) and cluster of differentiation 64 (CD64), were examined for differential protein expression on monocytes from TB patients and M. tuberculosis-infected healthy donors. In a second step, we assessed candidate genes for optimal study group discrimination in a linear discriminant analysis (LDA) approach. Classification properties for these candidate biomarkers were validated in independently measured test data sets.
Materials and methods
Patients and M. tuberculosis-infected healthy donors
TB patients were recruited at the Asklepios Center for Respiratory Medicine and Thoracic Surgery München-Gauting, Germany. Diagnosis was based on chest radiography and laboratory confirmation by mycobacterial culture. All TB patients were HIV negative and received standard chemotherapeutic treatment except for two donors with acute TB who had been included before treatment.
Table: Characteristics of tuberculosis patients and healthy controls (study groups include healthy infected donors and healthy noninfected donors; rows include age in mean years and lymph node TB; cell values not preserved).
Preparation of RNA from PBMC
Forty milliliters of heparinized peripheral venous blood were drawn from each patient and control donor, and PBMC were isolated on Ficoll gradients. PBMC were immediately mixed with TRIzol® reagent (Invitrogen, Carlsbad, CA) and frozen at −80°C until RNA was extracted according to the manufacturer’s instructions. RNA content, purity, and integrity were determined using an Agilent 2100 Bioanalyzer (Agilent Technologies, Foster City, CA).
Microarray procedures, experimental design, and analysis
Real-time qPCR analysis
RNA was reverse transcribed to cDNA as described earlier. SYBR® Green (Applied Biosystems, Foster City, CA) uptake in double-stranded DNA was measured using the ABI PRISM™ 7000 thermocycler (Applied Biosystems) according to the manufacturer’s instructions. We designed primer pairs with the ABI PRISM™ primer express Version 2.0.0 software (Applied Biosystems) for real-time qPCR analysis. Altogether, RNA of PBMC from 18 TB patients (13 men and 5 women) and 17 M. tuberculosis-infected healthy donors (9 men and 8 women) was analyzed for expression of alpha-defensins 1 and 4 (DEFA1 and DEFA4), lactoferrin (LTF), CD64, bactericidal permeability-increasing protein (BPI), and FPR1. Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) was used as an internal control.
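The text does not state how qPCR signals were normalized to the GAPDH internal control. As a minimal sketch, assuming the commonly used 2^(−ΔCt) convention with roughly 100% amplification efficiency, relative expression and a patient-versus-control fold change could be computed as follows. The function names and all Ct values are hypothetical and serve only as an illustration.

```python
def relative_expression(ct_target, ct_reference):
    """Expression of a target gene relative to a reference gene (e.g., GAPDH).

    Assumption of this sketch: 2^(-delta Ct) with ~100% amplification
    efficiency; the source does not specify the normalization method.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


def fold_change(rel_group_a, rel_group_b):
    """Ratio of two relative expression values (group A over group B)."""
    return rel_group_a / rel_group_b


# Hypothetical threshold cycle (Ct) values, for illustration only.
target_patient = relative_expression(ct_target=24.0, ct_reference=18.0)
target_control = relative_expression(ct_target=26.0, ct_reference=18.0)

# A Ct that is two cycles lower corresponds to a fourfold higher signal.
print(fold_change(target_patient, target_control))
```

A lower Ct means earlier detection and therefore more template, which is why the exponent carries a negative sign.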
We performed flow cytometry to quantify protein expression of selected candidate genes. Median fluorescence intensities of FPR1 and CD64 were determined in parallel stainings with phycoerythrin-labeled monoclonal antibodies (anti-FPR1, clone 5F1, BD Pharmingen; anti-CD64, clone 10.1, eBioscience). A single antibody staining procedure was used to avoid possible interference between different colors. Monocytes were gated according to size (forward scatter) and granularity (side scatter). In total, randomly chosen samples from 24 TB patients and 15 healthy controls were analyzed in two independent experiments.
The discriminatory power for classifying patients and healthy controls was investigated using LDA based on qPCR data of nine selected candidate genes. These candidates were DEFA1, DEFA4, LTF, CD64, BPI, FPR1, Rab13, Rab24, and Rab33A. We optimized the combination of genes with the best discriminatory power using a leave-three-out cross-validation for all possible combinations of genes. We assessed the proportion of correctly classified patients (hit rate) in the left-out group. From these analyses, we chose the combination of genes with maximal hit rate. In a validation step, we used a novel data set for temporal validation. Performance of LDA for these validation data was then assessed using the parameters from the training step. According to this procedure, we performed two approaches for donor classification. Altogether, 37 TB patients (23 donors in the training step, 10 donors in the validation step, 2 donors before treatment, and 2 former TB patients 6 months after termination of treatment) were included. These were compared on the one hand with M. tuberculosis-infected healthy donors (17 donors in the training step and 5 donors in the validation step) and, on the other hand, with a combined study group of healthy M. tuberculosis-infected and noninfected (tuberculin skin test negative) donors. The noninfected study group comprised 15 donors (10 donors in the training step and 5 donors in the validation step).
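The search over gene combinations can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' implementation: it uses a two-class Fisher discriminant with equal class priors, evaluates every possible left-out triple, adds a small ridge term for numerical stability, and runs on hypothetical data with placeholder gene names (`gene_a`, `gene_b`, `gene_c`). The source specifies neither the software nor how left-out groups were formed.

```python
from itertools import combinations


def mean_vector(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def pooled_covariance(groups, ridge=1e-6):
    # Pooled within-class covariance. The ridge on the diagonal is an
    # addition of this sketch to keep the matrix invertible on tiny data.
    dim = len(groups[0][0])
    cov = [[0.0] * dim for _ in range(dim)]
    for rows in groups:
        mu = mean_vector(rows)
        for row in rows:
            d = [x - m for x, m in zip(row, mu)]
            for i in range(dim):
                for j in range(dim):
                    cov[i][j] += d[i] * d[j]
    dof = sum(len(rows) for rows in groups) - len(groups)
    return [[cov[i][j] / dof + (ridge if i == j else 0.0) for j in range(dim)]
            for i in range(dim)]


def solve(a, b):
    # Solve a x = b by Gaussian elimination with partial pivoting.
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_lda(x0, x1):
    # Discriminant direction w = S^-1 (mu1 - mu0); threshold at the midpoint
    # of the projected class means (equal priors assumed).
    mu0, mu1 = mean_vector(x0), mean_vector(x1)
    w = solve(pooled_covariance([x0, x1]), [b - a for a, b in zip(mu0, mu1)])
    threshold = sum(wi * (a + b) / 2 for wi, a, b in zip(w, mu0, mu1))
    return w, threshold


def predict(model, row):
    w, threshold = model
    return 1 if sum(wi * xi for wi, xi in zip(w, row)) > threshold else 0


def leave_three_out_hit_rate(samples, labels, gene_idx):
    # Hit rate = proportion of left-out donors that are classified correctly.
    hits = total = 0
    indices = range(len(samples))
    for left_out in combinations(indices, 3):
        train = [i for i in indices if i not in left_out]
        x0 = [[samples[i][g] for g in gene_idx] for i in train if labels[i] == 0]
        x1 = [[samples[i][g] for g in gene_idx] for i in train if labels[i] == 1]
        model = fit_lda(x0, x1)
        for i in left_out:
            hits += predict(model, [samples[i][g] for g in gene_idx]) == labels[i]
            total += 1
    return hits / total


def best_combination(samples, labels, genes):
    # Evaluate every non-empty gene subset; the first maximum wins ties.
    results = []
    for size in range(1, len(genes) + 1):
        for combo in combinations(range(len(genes)), size):
            rate = leave_three_out_hit_rate(samples, labels, combo)
            results.append((rate, tuple(genes[g] for g in combo)))
    return max(results, key=lambda r: r[0])


# Hypothetical expression values: 5 controls (label 0), 5 patients (label 1).
gene_names = ["gene_a", "gene_b", "gene_c"]
samples = [
    [1.0, 3.1, 2.0], [1.2, 2.4, 2.9], [0.8, 3.6, 2.2], [1.1, 2.8, 3.3], [0.9, 3.3, 2.6],
    [5.0, 2.9, 2.5], [5.2, 3.4, 2.1], [4.8, 2.6, 3.0], [5.1, 3.2, 2.7], [4.9, 3.0, 2.3],
]
labels = [0] * 5 + [1] * 5
print(best_combination(samples, labels, gene_names))
```

With nine candidate genes there are 511 non-empty subsets, so an exhaustive search of this kind remains tractable; because smaller subsets are evaluated first, ties in hit rate resolve toward the smaller gene set.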
Differential expression of monocyte-derived genes in PBMC from TB patients and M. tuberculosis-infected healthy donors
We performed microarray analyses to identify differentially expressed genes in PBMC from TB patients and M. tuberculosis-infected healthy donors. The entire data set is accessible in the GEO public database (GSE6112). A modified t-test (for details, see “Materials and methods”) was applied to determine a ranking list of genes differentially expressed between these study groups. A cutoff was set according to the first gene below the 1.5 fold-change threshold (Fig. 1).
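The cutoff rule described above, walking down the ranked list and stopping at the first gene below the 1.5-fold threshold, can be sketched as follows. The gene names and fold-change values are hypothetical, and treating up- and downregulation symmetrically is an assumption of this sketch rather than something the text states.

```python
def genes_above_cutoff(ranked_genes, threshold=1.5):
    """Keep genes from the top of a ranked list until the first one whose
    fold change (in either direction) falls below the threshold.

    ranked_genes: list of (name, fold_change) pairs, already ordered by the
    test statistic; fold_change must be positive (a ratio, not a log ratio).
    """
    selected = []
    for name, fold_change in ranked_genes:
        magnitude = max(fold_change, 1.0 / fold_change)
        if magnitude < threshold:
            break  # the cutoff sits at the first gene below the threshold
        selected.append(name)
    return selected


# Hypothetical ranking, ordered by significance rather than by fold change.
ranked = [
    ("gene_1", 3.2),
    ("gene_2", 0.4),  # downregulated 2.5-fold
    ("gene_3", 1.8),
    ("gene_4", 1.3),  # first gene below 1.5-fold: list is cut here
    ("gene_5", 2.0),  # excluded because it ranks below the cutoff
]
print(genes_above_cutoff(ranked))
```

Note that a gene ranked below the cutoff is excluded even if its own fold change exceeds the threshold, because the rule truncates the ranking rather than filtering it.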
DEFA1, DEFA4, LTF, CD64, BPI, and FPR1 are differentially expressed between TB patients and M. tuberculosis-infected healthy donors
CD64 protein expression differs significantly between TB patients and M. tuberculosis-infected healthy donors
Optimal gene expression markers for discrimination between TB patients and M. tuberculosis-infected healthy donors
To examine whether expression patterns are influenced by chemotherapy, we analyzed discrimination in two TB patients before chemotherapy. Both donors were classified correctly (Fig. 5d). Two former TB patients were examined 6 months after termination of chemotherapy to determine whether the gene expression pattern had returned to normal upon recovery. Interestingly, one patient was classified as a healthy control, whereas no prediction was possible for the other one (Fig. 5d). Further experiments have to be performed to clarify whether this divergence is indeed associated with relapse risk. Four patients with extrapulmonary TB were included (three in the training set and one patient before chemotherapy in the test set) to determine the influence of different organs affected by M. tuberculosis. None of those were falsely predicted as controls (Fig. 5b and 5d). Therefore, a bias introduced by extrapulmonary TB cases on the gene expression analyses could be excluded. We conclude that gene expression analysis by qPCR of a minimal biomarker set comprising CD64, LTF, and Rab33A suffices for a biosignature that allows robust discrimination between TB patients and healthy donors.
Multiple host factors determine the outcome of M. tuberculosis infection and, thus, susceptibility and pathogenesis. We determined candidate biomarkers for classification of TB patients and healthy donors. Combining microarray analyses with qPCR in a discriminant analysis approach revealed an optimal group of genes for classification including CD64, LTF, and Rab33A. These candidates for discrimination between TB patients and healthy donors were validated in a second data set.
Further studies are ongoing to determine the role of these molecules in TB and to reveal the reasons for differential expression. As long as TB specificity is not proven, other causes, e.g., an inflammation-induced effect, cannot be excluded. These studies particularly focus on disease specificity and the prognostic or diagnostic value of these markers in comparison to IFN-γ, a marker currently being introduced into the diagnosis of TB. In this context, it remains to be determined whether differential classification within the M. tuberculosis-infected healthy donor group will represent a robust correlate of protection. Long-term follow-up studies in recently M. tuberculosis-infected donors are necessary to examine the value of our candidate biomarkers in this context.
Although discrimination between probable pathologic or protective influences of these molecules remains impossible, known functions of these markers revealed involvement of most genes in host-pathogen interactions. Notably, CD64 is capable of inducing phagocytosis, respiratory burst, and antibody-dependent cell-mediated cytotoxicity in monocytes, macrophages, and granulocytes (reviewed in ). A crucial role of CD64 in infectious diseases is supported by studies showing regulation of gene expression in macrophages and dendritic cells by cytokines such as IFN-γ and interleukin-10. IFN-γ is a key mediator in anti-mycobacterial host defense [8, 13]. It is also a major target of the survival strategy of M. tuberculosis, which blocks various transcriptional responses including induction of CD64. Interestingly, and seemingly in contradiction to this, CD64 protein expression is increased in monocytes from TB patients. We found increased CD64 expression at the RNA and cell surface protein level.
Numerous efforts have been made to characterize molecules involved in the regulation of CD64. The role of N-formyl-methionyl-leucyl-phenylalanine peptides from bacteria as ligands of FPR1 in this process remains controversial. N-formyl peptides are released from bacteria as a consequence of bacterial destruction by the immune system or by autolysis (reviewed in ). Despite their proinflammatory function, N-formyl-methionyl-leucyl-phenylalanine peptides induce downregulation of CD64 in monocytes, thus counteracting IFN-γ and interleukin-10, which induce enhanced CD64 expression. Like CD64 and FPR1, LTF is essential for antimicrobial defense. LTF is a transporter molecule with high affinity for iron, which modulates host defense likely by competing with microbes for iron. A direct effect of LTF on M. tuberculosis infection has been revealed in a murine model, in which correction of iron overload by LTF reversed increased susceptibility to TB. Increased expression of LTF in TB patients’ PBMC could restrict mycobacterial access to iron. Because LTF is a crucial host protective factor against TB, we hypothesize that increased expression of at least some candidates in TB patients indicates a fine-tuned balance between key processes in host-pathogen interactions.
The third molecule in the optimal discriminating group of genes was Rab33A. Rab33A is a member of the Ras-associated small GTPase family that is likely involved in the regulation of intracellular trafficking. Recently, we demonstrated that Rab33A is downregulated in PBMC from TB patients and that it is preferentially expressed in CD8+ T cells. The induction of Rab33A expression depends on T cell receptor activation. Further studies will clarify the biological function of Rab33A and its exact role in TB.
The bactericidal BPI and DEFA1, 3, and 4 are well-known antibacterial effector molecules. BPI is crucial in the immune response against Gram-negative bacteria, and increased serum concentrations are prevalent in patients with active TB. High titers of DEFA in bronchoalveolar lavage fluid and plasma from TB patients and specific mycobactericidal activity of DEFA have been described. Thus, the molecular biosignature identified here includes several candidates of known relevance to host defense against TB.
Accessibility restricts the use of affected pulmonary tissue for research and diagnostic purposes, demanding accessible surrogate tissue samples in TB. PBMC heterogeneity markedly confounds microarray analysis because differences in immune cell populations cannot be separated from genuine differences in RNA expression at the single-cell level. Nevertheless, PBMC have been used as surrogate tissue with diagnostic or prognostic value in different malignant and autoimmune diseases [3, 4, 5]. Our data demonstrate that it is feasible to extend this approach to chronic infections where active disease is a possible, but not inevitable, outcome of infection.
This study was supported in part by the National Genome Research Network (Germany), the EU FP6 funded IP “TBVAC”, and Grand Challenge 6 of the Bill & Melinda Gates Foundation to S. H. E. Kaufmann and M. Jacobsen. H.-J. Mollenkopf and S. H. E. Kaufmann acknowledge additional funding by the European Fund for Regional Development/State of Berlin. The authors have no conflicting financial interests. We thank M. L. Grossman for carefully revising the manuscript.
- 4. Bomprezzi R, Ringner M, Kim S, Bittner ML, Khan J, Chen Y, Elkahloun A, Yu A, Bielekova B, Meltzer PS, Martin R, McFarland HF, Trent JM (2003) Gene expression profile in multiple sclerosis patients and healthy controls: identifying pathways relevant to disease. Hum Mol Genet 12:2191–2199
- 6. Chan YH (2005) Biostatistics 303. Discriminant analysis. Singap Med J 46:54–61, quiz 62
- 10. Jacobsen M, Schweer D, Ziegler A, Gaber R, Schock S, Schwinzer R, Wonigeit K, Lindert RB, Kantarci O, Schaefer-Klein J, Schipper HI, Oertel WH, Heidenreich F, Weinshenker BG, Sommer N, Hemmer B (2000) A point mutation in PTPRC is associated with the development of multiple sclerosis. Nat Genet 26:495–499
- 23. te Velde AA, de Waal Malefijt R, Huijbens RJ, de Vries JE, Figdor CG (1992) IL-10 stimulates monocyte Fc gamma R surface expression and cytotoxic activity. Distinct regulation of antibody-dependent cellular cytotoxicity by IFN-gamma, IL-4, and IL-10. J Immunol 149:4048–4052
- 29. Westfall PH, Young SS (1993) Resampling-based multiple testing: examples and methods for p-value adjustment. Wiley
- 31. Jacobsen M, Repsilber D, Gutschmidt A, Neher A, Feldmann K, Mollenkopf H-J, Kaufmann SHE, Ziegler A (2006) Deconfounding microarray analyses: independent measurements of cell type proportions used in a regression model to resolve tissue heterogeneity bias. Methods Inf Med 45:557–563