Background

Malignant mesothelioma is an aggressive cancer caused by exposure to asbestos. The incidence is increasing worldwide and the global annual number of mesothelioma deaths was extrapolated to be 38,400 [1]. Commonly, mesothelioma is characterized by a long latency of up to 50 years and is usually diagnosed at late stages of the disease, resulting in poor survival of between nine and 13 months, depending on treatment [2]. For the detection of mesothelioma at early stages, non- or minimally-invasive methods, i.e. liquid biopsies, are preferable. To date, only the combination of the proteins calretinin and mesothelin has been validated for the early detection of mesothelioma using plasma samples taken up to 15 months before the clinical diagnosis [3]. At a fixed high specificity of 98% the marker combination revealed a sensitivity of 46%. However, additional markers are needed to further improve the panel performance. These marker candidates could either be additional proteins or derived from different molecular classes, i.e. DNA and RNA. In general, proper markers of all molecular classes need to fulfill four key characteristics, namely detectability, robustness, sufficient sensitivity, and high specificity, for the detection of malignant diseases [4]. In recent years, non-coding RNAs have been the focus of marker research. Particularly, long non-coding RNAs (lncRNAs) represent a versatile and promising group of potential markers, playing a role as oncogenes as well as tumor suppressors, and showing an altered expression in cancer [5, 6]. However, to the best of our knowledge, RP1-86D1.3 is the only known circulating lncRNA for the detection of malignant mesothelioma so far, marked by a sensitivity of 83% and a specificity of 95% [7].

The aims of this study were (i) the identification of circulating lncRNAs as candidate markers for the detection of malignant mesothelioma using previously published RNA expression profiles from Gene Expression Omnibus (GEO), (ii) the verification using mesothelioma cell lines as well as human plasma samples from mesothelioma patients and subjects formerly exposed to asbestos as controls, and (iii) the assessment of the possible benefit of adding new candidate markers to the established marker panel of calretinin and mesothelin for the detection of malignant mesothelioma in liquid biopsies.

Methods

In silico analysis

RNA expression data of nine human pleural mesotheliomas and four normal pleural specimens using Affymetrix HG-U133 Plus 2.0 microarrays were obtained from the GEO database (http://ncbi.nlm.nih.gov/projects/geo) as GSE 12345 series [8]. Raw CEL files were quantile normalized and background adjusted using RMAExpress Version 1.1.0 (http://rmaexpress.bmbolstad.com/). Using the annotated lncRNA transcripts with corresponding probe IDs on the HG-U133 Plus 2.0 microarray generated by Zhang et al. [9], Significance Analysis of Microarrays (SAM) version 5.0 (https://github.com/MikeJSeo/SAM) [10] was used with a false discovery rate (FDR) of < 10% and a fold change ≥ 1.5 to determine the differently expressed lncRNAs between pleural mesothelioma and normal human pleura.
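The quantile normalization step mentioned above can be illustrated with a minimal sketch. It is not the RMAExpress implementation (which also performs background adjustment and probe-set summarization); the function name `quantile_normalize` and the toy intensities are assumptions made for illustration only, and ties are simply broken by order of appearance.

```python
def quantile_normalize(columns):
    """Quantile-normalize a list of arrays (one list of intensities per array).

    Illustrative sketch only: every array is forced onto the same reference
    distribution, defined as the mean of the k-th smallest value across arrays.
    """
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all arrays must contain the same number of probes")

    # For each array, the probe indices ordered from smallest to largest value.
    orders = [sorted(range(n), key=col.__getitem__) for col in columns]

    # Reference distribution: average of the values sharing the same rank.
    reference = [
        sum(col[order[rank]] for col, order in zip(columns, orders)) / len(columns)
        for rank in range(n)
    ]

    # Write the reference value of each rank back to the probe holding that rank.
    normalized = []
    for order in orders:
        out = [0.0] * n
        for rank, probe_index in enumerate(order):
            out[probe_index] = reference[rank]
        normalized.append(out)
    return normalized


# Hypothetical toy data: three arrays with four probes each.
toy = [[5, 2, 3, 4], [4, 1, 4, 2], [3, 4, 6, 8]]
normalized = quantile_normalize(toy)
```

After normalization all arrays share an identical set of values, so remaining differences between arrays reflect only the rank of each probe within its array.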

Cell lines

Four mesothelioma cell lines NCI-H2452 (ATCC® (American Type Culture Collection) CRL-5946™; LGC Standards GmbH, Wesel, Germany), NCI-H28 (ATCC® CRL-5820™; LGC Standards GmbH), JL-1 (ACC 596; Deutsche Sammlung von Mikroorganismen und Zellkulturen (DSMZ), Braunschweig, Germany), and MSTO-211H (ACC 390; DSMZ) were cultivated according to the supplier’s instructions. As epithelioid and biphasic mesothelioma are the predominant histological subtypes [11], corresponding cell lines were selected for the initial analysis. Harvested cells were resuspended in 500 μl RNAlater (Thermo Fisher Scientific, Darmstadt, Germany) and frozen at − 80 °C until use.

Study population

The Molecular Marker (MoMar) cohort consists of 2769 German workers formerly exposed to asbestos with a confirmed asbestos-related disease, such as asbestosis and/or other (non-malignant) pleural diseases, but no malignancies at the beginning of the study participation.

Mesothelioma patients were recruited in participating medical practices of the MoMar study and at the Lungenklinik Heckeshorn, Helios Klinikum Emil von Behring, Berlin, Germany. The study group consisted of 22 male mesothelioma patients, including 14 (64%) epithelioid, two biphasic (9%), and two sarcomatoid (9%) mesotheliomas. The histological subtype for four cases (18%) remained unknown. Six patients underwent partial pleurectomy before blood drawing (median: 56 days, range: 24–179 days). Asbestos-exposed controls were derived from cancer-free participants of the MoMar study. The matched group consisted of 44 men formerly exposed to asbestos. Criteria for matching were age, smoking status, and time of blood collection. Characteristics of the study group are presented in Table 1.

Table 1 Characteristics of the study groups

Additionally, nine subjects (four mesothelioma patients and five asbestos-exposed subjects) were recruited in the context of the MoMar study for initial experiments, but were not used in the performance analyses.

Blood collection and storage

Peripheral blood of the subjects was collected in 9.0 ml S-Monovette EDTA gel tubes (Sarstedt, Nümbrecht, Germany). Blood samples were centrifuged at 2000 × g for 10 min at room temperature within 30 min after collection. Afterwards, plasma was separated and temporarily stored frozen in the collaborating medical practices. Samples were transported frozen to the central laboratory, thawed at room temperature, aliquoted using an automated liquid handling robot (Tecan Group Ltd., Männedorf, Switzerland), and stored at − 80 °C until use.

Isolation of RNA

RNA from cell lines was isolated using the miRVana miRNA Isolation Kit (Thermo Fisher Scientific, Darmstadt, Germany) according to the manufacturer’s instructions. Concentration of the isolated RNA was determined using the NanoDrop ND-100 spectrophotometer (Thermo Fisher Scientific). Isolated RNA of the immortalized human mesothelial cell line MeT-5A (ATCC® CRL-9444™) was purchased from tgcBIOMICS GmbH (Bingen, Germany).

RNA from 0.5 ml plasma samples was isolated using the miRVana PARIS kit (Thermo Fisher Scientific) according to the manufacturer’s instructions, modified by adding 5 μl carrier RNA MS2 (Roche, Mannheim, Germany). Amount of free hemoglobin (Hb) in plasma was measured by spectral analysis using a NanoDrop ND-100 spectrophotometer (Thermo Fisher Scientific). Absorbance was measured at 415 nm (total Hb), 450 nm (bilirubin), and 700 nm (sample turbidity). Hemoglobin concentrations were quantified using the formula Hb (mg/dl) = 154.7 × A415 − 130.7 × A450 − 123.9 × A700 [12, 13]. Hemoglobin values (g/l) of the plasma samples used in the subsequent performance analyses are presented in Additional File 3.
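The hemoglobin formula above can be written as a short calculation. The following is a minimal sketch: the function names and the absorbance readings are hypothetical, and the only inputs taken from the text are the three coefficients and the wavelengths. The conversion to g/l uses 1 mg/dl = 0.01 g/l.

```python
def free_hemoglobin_mg_dl(a415, a450, a700):
    """Free hemoglobin in mg/dl from absorbances at 415, 450, and 700 nm."""
    return 154.7 * a415 - 130.7 * a450 - 123.9 * a700


def mg_dl_to_g_l(value_mg_dl):
    """Convert a concentration from mg/dl to g/l (1 mg/dl = 0.01 g/l)."""
    return value_mg_dl / 100.0


# Hypothetical absorbance readings for a single plasma sample.
hb_mg_dl = free_hemoglobin_mg_dl(a415=0.20, a450=0.05, a700=0.01)
hb_g_l = mg_dl_to_g_l(hb_mg_dl)
print(f"free Hb: {hb_mg_dl:.2f} mg/dl = {hb_g_l:.3f} g/l")
```

Expressing the result in g/l allows a direct comparison with thresholds that are stated in g/l.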

Expression analysis

Expression analysis of lncRNAs and mRNAs was performed using a MJ Research PTC-200 Thermal Cycler (Bio-Rad Laboratories, Hercules, CA, USA) for reverse transcription (RT) and preamplification and a 7900 HT Fast Real-Time PCR System (Thermo Fisher Scientific) for quantitative real-time PCR (qPCR) according to the manufacturer’s instructions. In brief, using RNA from cell lines, RT was carried out in 25 μl reaction volume with 40 ng RNA as template. The subsequent qPCR was carried out in 20 μl reaction volume with 5 μl cDNA as template and reactions were performed in duplicate. Using RNA from plasma samples, RT was carried out in 20 μl reaction volume with 5 μl RNA as template. Intermediary pre-amplification with 14 cycles was carried out in 10 μl reaction volume with 2.5 μl cDNA as template. For pre-amplification (but not for subsequent PCR) primers were diluted 1:100 according to the manufacturer’s instructions. Lastly, qPCR was carried out in 20 μl reaction volume with 5 μl DNA of a 1:20 dilution as template and reactions were performed in duplicate. Non-template controls were included in all assays. Estimation of the cycle threshold (Ct) was performed as described elsewhere [14]. The IDs of the commercially available probe-based assays (Integrated DNA Technologies, Leuven, Belgium) are presented in Additional File 1.

The candidate references B2M, GUSB, HPRT1, PPIA, RPLP0, and TBP [15] as well as the geometric mean (GM) of various reference combinations were analyzed using RefFinder to evaluate the most stable reference [16, 17]. Different lncRNA expressions in mesothelioma cell lines in comparison to MeT-5A were calculated using the 2^−ΔΔCt method [18]. Altered expressions of lncRNAs were considered as significant for fold changes < 0.5 or > 2.0 [19]. Raw Ct values of lncRNAs and mRNAs in cell lines are presented in Additional File 2. Assays were analyzed to avoid hetero-dimers using the OligoAnalyzer tool (Integrated DNA Technologies). For group comparison between mesothelioma cases and asbestos-exposed controls, marker values in plasma were normalized and expressed as 2^−ΔCt. Raw Ct values in plasma ≥ 35 were considered to be under the detection limit and samples were excluded from further analyses. Raw Ct values of markers and references in the plasma samples used in the subsequent performance analyses are presented in Additional File 3.
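The normalization steps described in this paragraph can be sketched as follows. This is an illustration under stated assumptions, not the analysis code of the study: all function names and Ct values are hypothetical, the reference Ct is passed in as a single number (how several reference genes are combined is not shown), and only the thresholds (fold change < 0.5 or > 2.0, Ct ≥ 35) are taken from the text.

```python
def fold_change_ddct(target_ct_sample, reference_ct_sample,
                     target_ct_control, reference_ct_control):
    """Relative expression of a sample versus a control as 2^-ddCt."""
    delta_sample = target_ct_sample - reference_ct_sample
    delta_control = target_ct_control - reference_ct_control
    return 2.0 ** (-(delta_sample - delta_control))


def classify_fold_change(fold_change, low=0.5, high=2.0):
    """Label a fold change using the thresholds stated in the text."""
    if fold_change > high:
        return "up-regulated"
    if fold_change < low:
        return "down-regulated"
    return "unchanged"


def normalized_plasma_level(target_ct, reference_ct, ct_limit=35.0):
    """Return 2^-dCt, or None if either raw Ct is at or above the limit."""
    if target_ct >= ct_limit or reference_ct >= ct_limit:
        return None  # below the detection limit: sample is excluded
    return 2.0 ** (-(target_ct - reference_ct))


# Hypothetical Ct values: a cell line compared with a control cell line.
fc = fold_change_ddct(24.0, 20.0, 26.0, 20.0)
print(fc, classify_fold_change(fc))

# Hypothetical plasma sample: marker Ct 30, reference Ct 31.
print(normalized_plasma_level(30.0, 31.0))
```

Because Ct values are on a log2 scale, a difference of one cycle corresponds to a two-fold difference in the normalized value.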

Determination of calretinin and mesothelin

Enzyme-linked immunosorbent assays (ELISA) were used for the determination of calretinin and mesothelin in plasma. For calretinin the Calretinin ELISA kit (DLD Diagnostika GmbH, Hamburg, Germany) was used according to the manufacturer’s instructions. For mesothelin the Mesomark ELISA Kit (Fujirebio Diagnostics, Inc., Malvern, PA, USA) was used according to the manufacturer’s instructions with modifications as described elsewhere [20]. All samples were determined in duplicate. Optical densities were measured using a SpectraMax 384 plus plate reader (Molecular Devices, Sunnyvale, CA, USA) and the standard curves were obtained by four-parameter curve fitting using the SoftMax Pro 5.4.1 software (Molecular Devices). Values of calretinin and mesothelin in the plasma samples used in the subsequent performance analyses are presented in Additional File 3.

Statistical analyses

Box plots with median and inter-quartile range (IQR) were used to describe the distribution of marker concentrations. Whiskers depicted minimum and maximum. Mesothelioma cases and cancer-free controls were compared using the non-parametric Kruskal-Wallis test for continuous variables. Classification performances of GAS5, calretinin, and mesothelin were determined from receiver operating characteristic (ROC) curves. The accuracy of the marker performances was depicted by the area under the curve (AUC) and its 95% confidence interval (95% CI). The markers were combined in two different ways. For linear marker combination, ROC curves were calculated with the corresponding markers as independent variables in a multiple logistic regression model. Sequential combination was performed as described elsewhere [3]. In brief, the first marker was used for classification. Afterwards, only marker-negative subjects were examined with the next marker and so on. This procedure was repeated for all possible cut points to calculate the related sensitivities and specificities. The related AUC intervals depict minimum and maximum obtainable AUCs. Potential factors influencing GAS5 concentration were evaluated using a multiple linear regression model with log-transformed marker values. Estimates were given as Exp(β) with 95% CI and p-values. Here, values of Exp(β) > 1 indicate a positive and Exp(β) < 1 indicate a negative association between analyzed factor and GAS5. Statistical analyses were performed using SAS/STAT and SAS/IML software, version 9.4 (SAS Institute Inc., Cary, NC, USA). GraphPad Prism version 7.04 (GraphPad Software, La Jolla California, USA) was used to prepare graphs.
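The sequential combination described above can be sketched for one fixed set of cut points. The sketch assumes that higher marker values indicate disease and that a value strictly above its cut point counts as positive; both are assumptions, as are the function names and the toy values. Repeating the calculation over all possible cut points, as described in the text, is not shown.

```python
def sequential_positive(marker_values, cutoffs):
    """Classify one subject with markers tested in the given order.

    The next marker is only consulted if all previous markers were negative.
    """
    for value, cutoff in zip(marker_values, cutoffs):
        if value > cutoff:
            return True  # marker-positive: no further markers are examined
    return False


def sensitivity_specificity(cases, controls, cutoffs):
    """Sensitivity and specificity of the sequential rule for fixed cut points."""
    true_positives = sum(sequential_positive(s, cutoffs) for s in cases)
    true_negatives = sum(not sequential_positive(s, cutoffs) for s in controls)
    return true_positives / len(cases), true_negatives / len(controls)


# Hypothetical values of two markers per subject, in testing order.
cases = [(5.0, 0.1), (0.5, 3.0), (0.4, 0.2)]
controls = [(0.3, 0.5), (0.6, 0.4)]
sens, spec = sensitivity_specificity(cases, controls, cutoffs=(1.0, 1.0))
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy example the second case is missed by the first marker and recovered by the second one, which is the complementary behavior a marker panel is meant to exploit.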

Results

In silico lncRNA expression analysis

The Affymetrix HG-U133 Plus 2.0 arrays include 2448 probe sets representing 1988 lncRNAs (Additional File 4). Analysis of the lncRNA expression between pleural mesothelioma cases and normal pleural controls identified 40 altered lncRNAs, of which 28 lncRNAs were up-regulated and twelve lncRNAs were down-regulated in mesothelioma (Table 2).

Table 2 Differently expressed lncRNAs between pleural mesothelioma cases (N = 9) and normal pleural controls (N = 4)

Assessment of lncRNA detectability

Candidate references were determined in NCI-H2452, NCI-H28, JL-1, MSTO-211H, and MeT-5A. Using RefFinder the GM of B2M, HPRT1, and RPLP0 was identified as the most stable reference for the normalization of lncRNAs in the analyzed cell lines.

Twenty-four lncRNAs initially identified in silico were measured in the cell lines as candidate markers. Using the 2^−ΔΔCt method to assess different expressions between mesothelioma cell lines and MeT-5A as control revealed an up-regulation of AFAP1-AS1, GAS5, and LOC84856 in at least three of the mesothelioma cell lines. LOC642852 and LOC388796 showed a constant down-regulation in all cell lines, whereas CRNDE and LOC100130776 showed no altered regulation in mesothelioma cell lines. The remaining lncRNAs showed sporadic up- and down-regulation in the various cell lines (Table 3). All other lncRNAs were consistently undetectable in the cell lines.

Table 3 Fold change of long non-coding RNAs (lncRNAs) in mesothelioma cell lines. Fold changes > 2.0 represent an up-regulation and fold changes < 0.5 a down-regulation

Afterwards, the general detectability of the lncRNAs in liquid biopsies was assessed using nine plasma samples from four mesothelioma patients and five subjects formerly exposed to asbestos (Additional File 5). GAS5 was detectable in almost all plasma samples, in contrast to all other analyzed lncRNAs. Based on the obtained results, circulating GAS5 was selected as candidate marker for further analyses.

Circulating GAS5 as marker for mesothelioma

Potential references were measured in the plasma samples of the study group. Using raw Ct values statistically significant differences were revealed for PPIA (p < 0.001) and B2M (p < 0.001). No differences could be observed for RPLP0 (p = 0.516) and HPRT1 (p = 0.285), but HPRT1 was detectable only in 32 of 53 samples (60.4%). Additionally, using RefFinder RPLP0 was identified as the most stable reference. Thus, RPLP0 was selected for the normalization of circulating GAS5.

Thirteen asbestos-exposed controls were excluded from analyses because raw Ct values of GAS5 or RPLP0 in plasma were ≥ 35, resulting in 22 mesothelioma patients and 31 asbestos-exposed controls appropriate for analysis. The median plasma level of normalized GAS5 was 4.05 (IQR 2.94–8.38) in mesothelioma patients and 0.62 (IQR 0.28–0.96) in asbestos-exposed controls (Fig. 1a). The difference of circulating GAS5 in plasma between mesothelioma patients and asbestos-exposed controls was statistically significant (p < 0.0001). Using ROC analysis an AUC of 0.86 (95% CI 0.75–0.98) was calculated for circulating GAS5 (Fig. 1b).
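To illustrate what an AUC measures, the following minimal sketch computes it as the proportion of case-control pairs in which the case has the higher marker value, with ties counted as one half. This rank-based formulation is a standard equivalent of the area under the empirical ROC curve; it is not necessarily how the statistical software used in this study computes it, the confidence interval is not shown, and the function name and values are hypothetical.

```python
def auc_from_pairs(case_values, control_values):
    """Area under the empirical ROC curve via pairwise comparison."""
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0   # case ranked above control
            elif case == control:
                wins += 0.5   # ties contribute one half
    return wins / (len(case_values) * len(control_values))


# Hypothetical normalized marker values.
cases = [4.1, 2.9, 8.4, 0.7]
controls = [0.6, 0.3, 1.0, 0.9, 3.0]
print(round(auc_from_pairs(cases, controls), 2))
```

An AUC of 0.5 corresponds to a marker that does not separate the groups, and an AUC of 1.0 to perfect separation.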

Fig. 1

a Distribution of normalized GAS5 in plasma of mesothelioma patients and asbestos-exposed controls. b Receiver operating characteristic (ROC) curve of circulating GAS5

Using a predefined high specificity of 97%, allowing one false-positive test, resulted in 14% sensitivity for circulating GAS5 in plasma (Table 4).
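The relation between the number of tolerated false-positive tests and the predefined specificity can be sketched as follows. With 31 controls, one false positive corresponds to a specificity of 30/31, i.e. about 97%. The function names and marker values are hypothetical, and the sketch assumes that a value strictly above the cut point counts as positive.

```python
def cutoff_allowing_false_positives(control_values, allowed_fp=1):
    """Lowest cut point at which at most `allowed_fp` controls test positive."""
    ranked = sorted(control_values, reverse=True)
    if allowed_fp >= len(ranked):
        raise ValueError("allowed_fp must be smaller than the number of controls")
    # Controls strictly above the (allowed_fp + 1)-th largest value are positive.
    return ranked[allowed_fp]


def specificity_at_cutoff(control_values, cutoff):
    """Fraction of controls that test negative at the given cut point."""
    return sum(v <= cutoff for v in control_values) / len(control_values)


def sensitivity_at_cutoff(case_values, cutoff):
    """Fraction of cases that test positive at the given cut point."""
    return sum(v > cutoff for v in case_values) / len(case_values)


# Hypothetical marker values for 31 controls and a few cases.
controls = [i / 10 for i in range(1, 32)]
cases = [2.0, 3.05, 4.0]
cutoff = cutoff_allowing_false_positives(controls, allowed_fp=1)
print(f"specificity: {specificity_at_cutoff(controls, cutoff):.3f}")
print(f"sensitivity: {sensitivity_at_cutoff(cases, cutoff):.3f}")
```

Fixing the specificity first and reading off the sensitivity afterwards makes markers and marker combinations comparable at the same false-positive rate.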

Table 4 Performance of GAS5, calretinin, mesothelin, linear combination, and sequential combination of the markers

The impact of influencing factors on circulating GAS5 in plasma was analyzed in the study group. Pleurectomy before blood collection, age, and smoking status did not influence the GAS5 levels in plasma, whereas the target disease leads to increased values (Table 5).

Table 5 Estimates of the influence of potential factors on GAS5 in plasma

Determination of calretinin and mesothelin

The median calretinin value in mesothelioma patients was 1.17 (IQR 0.37–2.29) and in asbestos-exposed controls 0.18 (IQR 0.07–0.32). The median mesothelin level was 1.79 (IQR 1.08–11.41) in mesothelioma patients and 1.04 (IQR 0.74–1.34) in asbestos-exposed controls (Fig. 2a and b). Differences were statistically significant for calretinin (p < 0.0001) and mesothelin (p = 0.0026). Using ROC analyses, AUCs of 0.84 (95% CI 0.74–0.96), 0.75 (95% CI 0.61–0.89), and 0.88 (range 0.76–0.88) were calculated for calretinin, mesothelin, and the sequential combination of both markers, respectively (Fig. 2c). Using a predefined specificity of 97% revealed sensitivities of 55% for calretinin, 41% for mesothelin, and 64% for the sequential combination of calretinin and mesothelin (Table 4).

Fig. 2

a Distribution of calretinin in plasma of mesothelioma patients and asbestos-exposed controls. b Distribution of mesothelin in plasma of mesothelioma patients and asbestos-exposed controls. c Receiver operating characteristics (ROC) curves of calretinin, mesothelin, and sequential combination of calretinin and mesothelin

Combination of GAS5 with calretinin and mesothelin

The combination of GAS5, calretinin, and mesothelin was evaluated using a linear and a sequential approach. The ROC curves revealed an AUC of 0.88 (95% CI 0.78–0.99) for the linear combination and 0.96 (range 0.85–0.96) for the sequential combination (Fig. 3). Using a predefined specificity of 97% resulted in 73% sensitivity for the linear combination and 68% sensitivity for the sequential combination (Table 4).

Fig. 3

Receiver operating characteristics (ROC) curves of linear and sequential combination using GAS5, calretinin, and mesothelin

Discussion

Altered expression of lncRNAs has been shown in multiple human cancers [21] and the number of lncRNAs surpasses the number of protein coding genes, i.e. approximately 60,000 lncRNAs vs. 20,000 protein coding genes [22]. Thus, the large pool of lncRNAs is likely to include a greater number of potential marker candidates for the detection of cancer.

The Affymetrix Gene Chip Human Genome U133 is one of the most frequently used microarrays for human cancer profiling [9] and expression data are deposited in public gene expression repositories, e.g. GEO. The re-annotation and the classification pipeline according to Zhang et al. allows the identification and the expression analysis of lncRNAs using expression data sets from analyses primarily targeting mRNAs [9]. In this study, the in silico analysis of the lncRNA expression using pleural mesothelioma and normal human pleura revealed 40 altered lncRNAs. Notably, RP1-86D1.3 [7] was not identified as a differently expressed lncRNA. Additionally, Wright et al. detected 33 lncRNAs differentially expressed in mesothelioma cell lines compared to MeT-5A [23], but only NEAT1 was identified in both studies. Such differences might be due to different microarrays and sample types as well as relatively small numbers of analyzed samples. However, in a previous analysis using plasma samples we could not confirm NEAT1 as a circulating marker for mesothelioma (data not shown). Thus, NEAT1 was not further analyzed in the current study.

In this study, only GAS5 was reliably proven as a marker for the detection of mesothelioma using liquid biopsies. GAS5 is located on chromosome 1q25, comprising twelve exons and playing an important role in carcinogenesis as a tumor suppressor [24]. We found a significant up-regulation of GAS5 in plasma of mesothelioma patients and mesothelioma cell lines, confirming our initial in silico analysis of published expression data based on tissue samples of pleural mesothelioma. This agrees with the up-regulation of GAS5 in mesothelioma tissues shown by Renganathan et al., although in the same study a down-regulation of GAS5 was observed using primary mesothelioma cell cultures [25]. Such divergent results were also obtained for the lncRNA MALAT1, reflecting that in spite of its ubiquitous expression MALAT1 could function in a cell type- or tissue-specific manner [26], and the same might be true for GAS5. Accordingly, despite the described predominant role as tumor suppressor, it appears that GAS5 might also act as an oncogene, not only in mesothelioma but also in other malignancies, e.g. prostate and esophageal cancer [27, 28]. However, the results show unambiguously that GAS5 fulfills the first key characteristic of a diagnostic marker: to be detectable in liquid biopsies. Additionally, the observed up-regulation of GAS5 in mesothelioma tissues as well as in plasma of mesothelioma patients suggests that the presence of circulating GAS5 in plasma might be a direct effect of the tumor, e.g., via the secretion of lncRNA-containing extracellular vesicles [29]. To the best of our knowledge, this is the first study using circulating GAS5 as a diagnostic marker for the detection of malignant mesothelioma using liquid biopsies. Kresoja-Rakic et al. determined GAS5 as a prognostic marker using plasma samples from mesothelioma patients before and after chemotherapy [30]. In contrast, circulating GAS5 has repeatedly been suggested as a marker for the detection of lung cancer, showing a consistent down-regulation in cancer patients [31,32,33].

Regarding the key characteristics of a sufficient sensitivity and a high specificity, the sensitivity of candidate markers should be calculated at a fixed high specificity [34]. Generally, high specificity is needed to avoid false-positive tests, resulting in psychological stress, overdiagnosis, and needless interventions for the affected. Due to the relatively small number of analyzed subjects only a single false-positive test was allowed in this study, representing a predefined specificity of 97%. This resulted in a low sensitivity of 14% for GAS5 as a single marker. However, lower sensitivity could be balanced by the use of various markers in a panel. In theory, in an optimal panel every marker is characterized by sufficient sensitivity and the necessary high specificity, complementing each other to obtain superior diagnostic performance [35]. The potential of calretinin and mesothelin to discriminate between mesothelioma cases and asbestos-exposed controls has been confirmed in various studies [36,37,38,39] and recently, both markers have also been validated for the early detection of malignant mesothelioma [3]. Using a linear approach GAS5 was combined with calretinin and mesothelin. At a predefined specificity of 97%, the AUC of this combination decreased slightly, but the sensitivity increased from 64 to 73%, resulting in two additional true-positive tests in comparison to the combination of calretinin and mesothelin alone. Using a sequential combination of the three markers revealed a higher AUC, but at the predefined specificity the sensitivity was 68%, resulting in only one additional true-positive test. The results indicate that GAS5 might be useful as a complementary marker for the established marker combination. Because the combination of calretinin and mesothelin already works well, the performance of the panel is improved only slightly by the additional marker. Further improvements will most likely require a larger number of markers. It should be noted, however, that the current as well as previous studies analyzed marker combinations with diagnosed cases at mostly late stages of tumor development. As has been shown before, using prediagnostic samples of mesothelioma cases can result in a better performance regarding marker complementation [3].

Considering the remaining key characteristic, robustness, GAS5 seems to be relatively robust regarding obvious influencing factors. However, this assumption needs to be verified in more detail using larger study groups. Additionally, it was shown that hemolysis influences microRNA levels in plasma [40] and the same might be true for lncRNAs [41]. Thus, free hemoglobin was determined in all plasma samples, showing that no hemoglobin value exceeded the clinically significant threshold (> 0.3 g/l) [42]. However, the real impact of the hemolysis grade on lncRNA levels should be analyzed in more detail in appropriate studies, e.g. using artificial hemolysis [43].

The results of this study are based on small numbers. Thus, GAS5 should be verified in a larger and independent study group. Additionally, it is known that earlier detection of cancer can improve survival, at least for some cancer types [44]. Thus, for early diagnosis it will also be necessary to validate GAS5 as well as the previously identified RP1-86D1.3 [7] in a prospective study regarding their potential to detect mesothelioma in prediagnostic plasma samples and to complement calretinin and mesothelin. This validation procedure is an obligatory step to select appropriate candidate markers for early detection. This is exemplified by some promising candidate markers, i.e. miR-103a-3p, miR-132-3p, and miR-126-3p, that were identified in common case-control studies but ultimately failed to detect mesothelioma in prediagnostic samples [45]. Therefore, more marker candidates - preferably of all molecular classes - need to be identified and validated for the completion of a useful and reliable marker panel to detect malignant mesothelioma at early stages.

Conclusions

GAS5 was identified in silico and verified in cell lines as well as human liquid biopsies as an appropriate circulating marker to supplement calretinin and mesothelin for the detection of malignant mesothelioma. Although the sensitivity of GAS5 is too low for use as a single marker, the addition of GAS5 as a third marker improves the performance of the established marker panel. The benefit of GAS5 for the detection of mesothelioma at early stages using plasma samples taken before clinical diagnosis still needs to be validated in a prospective study.