Introduction

In recent years, breath analysis has become a research focus in the field of respiratory disease diagnosis due to its noninvasiveness, convenience and real-time analysis1. The principal component in breath is water vapor, and the remaining parts include volatile organic compounds (VOCs) and nonvolatile matters dissolved in water or contained in exhaled aerosol particles2. To date, over 3000 different VOCs have been detected in human breath3, with some successfully used in the detection of diseases including lung cancer. VOCs in breath and their applications are shown in Table 1.

Table 1 VOCs in human breath.

At present, a Tedlar® bag is often used in sampling VOCs in breath34. Collected VOCs could be tested by multiple spectrometric techniques such as gas chromatography, ion transfer reaction, ion flow tube, ion mobility and so on35, 36. These analytical techniques are sensitive and accurate, but they also have many restrictions, such as high cost, requiring professional operation and requiring pre-concentration of the breath37. In recent years, inexpensive and portable e-nose systems have been proposed and designed for respiratory disease detection, the measurement reproducibility of the e-nose system was also validated to be acceptable38. The common methodology and application of e-nose systems were described in detail elsewhere39, 40. For lung cancer detection, Mazzone et al. reported an e-nose based on colorimetric sensor arrays could show a good performance41, 42; Haick’s team from the Israel Institute of Technology developed an e-nose system with gold nanoparticle sensors for the detection of lung cancers as well. Simulations revealed the accuracy for lung cancer detection to be over 86%, and subsequent experiments proved that this e-nose system could identify many types of cancers20, 43; Blatt et al. used metal oxide semiconductor sensor arrays for lung cancer diagnosis with accuracy, sensitivity and specificity all of over 90%44. However, these electronic noses are often based on sensor arrays with similar response principles. These sensors are very similar in terms of sensitivity and response patterns. Less research has been performed on type-different sensors for the diagnosis of lung cancers. Different feature extractions and classifiers could significantly affect recognition effects of sensor arrays, but little has been reported in this area.

In the current study, we used 14 gas sensors and 2 temperature/humidity sensors to develop a small-sized e-nose system. Samples from 52 volunteers were tested using the system. A software based on C# was also programmed to control detection process and generate the “breath pattern”. Five algorithms for extracting features from the “breath pattern” were compared. Finally, a 10 fold cross validation method was used to investigate the classification performance of the e-nose system with two classifiers. Lastly, whether type-different sensor arrays could improve the recognition ability of e-nose systems was also studied.

Materials and Methods

Selection of type-different sensors

According to the major components and their concentration ranges in human breath especially from lung cancer patients (Table 1), we selected 14 gas sensors which could be classified into 4 types: metal oxide semiconductor (MOS), hot wire gas, catalytic combustion gas, and electrochemical gas sensors.

None of the four classes of sensors are gas specific sensors; they are all cross response sensors, i.e. different gas sensors responding differently to the same gas mixture and the responses of the same sensor to different gas mixtures are also different. According to this response pattern, sensor arrays could form characteristic “breath fingerprints” by exhalation components, and could diagnose diseases through pattern recognition of a “breath fingerprint”. All sensors used in this study are commercial sensors, considering they are robust and stable. These sensors could be easily obtained to carry out repetitive tests. Details of gas sensors used in this study are listed in Table 2.

Table 2 Sensors used in this study.

E-nose system design

We designed the miniature e-nose system using the above mentioned sensors. The system’s gas reaction chamber is a rectangular enclosure made from aluminum alloy, with a volume of about 220 ml. All sensor probes are embedded into the gas reaction chamber. Sampling frequency of the whole system was designed to be 10 Hz. When the gas to be tested passes through the gas chamber, the sensor arrays will respond and form voltage (or current) signals. After initial processing (such as filtering, amplification, etc.), signals are sent to the main control chip (STM32F10) for analog digital conversion and temperature/humidity compensation calculations. Finally the signals are sent to the computer through a universal serial bus (USB) for display and storage. The compensation method of temperature/humidity involves taking the means of the two groups of temperature/humidity sensors (Model:HTG3515CH, Humirel Inc., France) placed at the air inlet and air outlet for linear compensation. The electronic nose system designed for this study is shown in Fig. 1.

Figure 1
figure 1

The photo of the designed E-nose system.

In addition, in the current study, we programmed upper computer software based on C#. This software controlled detection modalities of the electronic nose system, as well as the display, storage of test data, and sample information management. The database used by this software was mySQL (Oracle®, CA, USA).

Lastly, all breath samples in this study were collected by the 2 L Tedlar® bag (E-switch®, China), and pumped into the designed e-nose system by a gas sampling pump. The detection platform is shown in Fig. 2.

Figure 2
figure 2

Overview of the breath sampling and analysis system. (a) Tedlar® bag for breath sampling. (b) Photograph of the E-nose system and software interface. (c) Block diagram of the system.

Source of test samples

Breath samples in this study included 24 cases from lung cancer patients, 5 cases from patients with other respiratory diseases (4 with COPD and 1 with Silicosis), 10 cases from healthy smokers and 13 healthy non-smokers. Among them, lung cancer patients were from the in-patient department of respiration at the Second Affiliated Hospital of Chongqing Medical University; patients with other respiratory diseases were from People’s Hospital of Jiangbei District and the Second Affiliated Hospital of Chongqing Medical University; healthy control volunteers (smokers and non-smokers) were recruited from Chongqing University. All volunteers singed informed consent after a detailed introduction of the purpose and plan of this study. Protocols including any relevant details of this study were carried out in accordance with the relevant guidelines and approved by Medical Ethics Committee of Chongqing University. Details of all volunteers for this study are listed in Table S1.

Test process

All volunteers involved in this study fasted overnight prior to sampling. Meanwhile, smokers were asked to stop smoking two hours prior to sampling. All Tedlar® bags were washed with nitrogen three times prior to sampling. The process for collection VOCs in breath is as follows:

  1. (1)

    All volunteers rested for 3–5 minutes and rinsed their mouths with clear water 3–5 times.

  2. (2)

    All volunteers put on nose clips and breathed in deeply with their mouths, exhaling into sampling bags via a disposable mouth piece until the bags were filled.

  3. (3)

    Sampling bags were then tightened and marked.

All samplings experiments were carried out in a well-ventilated room to avoid interference of other scents. All breath samples were tested within 8 hours after sampling; otherwise samples would be re-collected.

The breath detection process is as follows:

  1. (1)

    Opening the gas sampling pump to pump ambient air into the gas chamber at a rate of 6 L/min. Sensor response gradually stabilized near baseline, which normally took 30–80 s. This process can be termed a “preparation phase”.

  2. (2)

    Breath samples were collected and pumped into the gas chamber, and the sensors started to respond. This process took 10 s and can be termed the “ventilation phase”.

  3. (3)

    Disconnecting the gas sampling pump resulted in sensor responses gradually stabilizing. This process took 50 s and can be called the “response phase”.

  4. (4)

    Opening the gas sampling pump to inlet ambient air. Sensor response gradually returned to baseline. This process took 30 s and can be called “deflation phase”.

The effective detection period of each sample in this study includes a “ventilation phase”, a “response phase” and a “deflation phase”, altogether totaling 90 s. Each stage of the detection process was precisely controlled by a timing relay and electromagnetic valve. A high efficiency air filter was used when ambient air was pumped into the chamber during the “preparation phase” to dispose of solid debris that might exist in the air, avoiding sensor damage and ensuring baseline stability. All sample detections were carried out at the same place with good indoor ventilation.

Data analysis

For successful identification of disease with an e-nose system, raw data should be analyzed and processed appropriately. All data was processed using Matlab. The data analysis procedure is as follows:

Data pre-processing

All raw data measured by the electronic nose system should be pre-processed before analysis, which includes baseline processing and standardization.

Baseline processing was performed for drift compensation and contrast enhancement45. Assuming a total of NS samples were detected in this study, each sample contained ND sensor response data; the length of each sensor detection data was NT; and the length of the stabilized data at the “preparation phase” was NB (NB ≤ NT); then the response (after baseline processing) to sample S (S = 1, 2, 3, …, NS) from sensor D (D = 1, 2, 3, …, ND) at time T (T = 1, 2, 3, …, NT) would be:

$${R}_{(S,D,T)}^{B}={R}_{(S,D,T)}-\frac{1}{{N}_{B}}\sum _{t=1}^{{N}_{B}}{R}_{(S,D,t)}$$
(1)

In which, R(S, D, T) and R(S, D, t) are the actual responses to sample S from sensor D at time T and time t (t = 1, 2, 3, …, NB).

Data standardization is necessary in order to compensate for the numerical range and/or unit differences of the results measured by sensors in the gas chamber43. A standard deviation method was adopted in this study for data standardization, i.e. the average response of each sensor was 0 and the standard deviation was 1 after standardization.

Feature extraction

In this study, the sampling frequency of electronic nose system was set at 10 Hz. The data dimension obtained from each sensor in one sampling cycle was 10 × 90 = 900. For each sample, there were actually 16 sensors for detection. Therefore, the feature vector dimension consisting of sensor responses obtained from each sample was 900 × 16 = 14400, far greater than the total number of samples included in this study. There was a great amount of redundancy in these dimension data. Redundant data were not only non-conducive to the internal parameters of the calculation classification method, but also would lead to over-fitting46. Therefore, it is necessary to reduce the dimension of the obtained characteristic matrix and extract main features. In the current study, we searched for optimal dimension reduction mapping from five different dimension reduction methods, including principal component analysis (PCA), linear discriminant analysis (LDA), Laplacian Eigenmap (LE), local linear embedding (LLE), and t-Stochastic Neighbor Embedding (tSNE).

When applying the features for classification after PCA dimension reduction, it was necessary to determine the number of principal components-k. The method used in this study was: the eigenvalues of the covariance matrix of the original data set were arrayed from large to small as: λ1, λ2, …, λn, with the total variance percentage τ of the first k components occupied being:

$$\tau =\frac{{\sum }_{j=1}^{k}{\lambda }_{j}}{{\sum }_{j=1}^{n}{\lambda }_{j}}$$
(2)

We set τ = 99% in this study to calculate the parameter k.

Classifier selection

In this study, we initially selected 2 classifiers, fuzzy k-NN and support vector machine (SVM). Optimal values of near neighbor k and proportionality coefficient m in Fuzzy k-NN were determined using the 10-fold cross validation method. The kernel function used in SVM was a radial basis function. Penalty factor C and kernel function parameter σ were determined via grid search method45.

Error estimation

When classifying the breath samples, 5 samples from other respiratory disease were excluded due to small sample size. Meanwhile, healthy smokers and healthy non-smokers were classified as the healthy group. The remaining 47 samples could be classified into the following two groups: lung cancer negative group and lung cancer positive group. The confusion matrix formed by real results and predicted results is shown as Table 3:

Table 3 Confusion matrix obtained from classifier.

Then the accuracy rate (Acr), sensitivity (Tpr) and specificity (Tnr) are:

$$Acr=\frac{TP+TN}{TP+FP+TN+FN}$$
(3)
$$Tpr=\frac{TP}{TP+FN}$$
(4)
$$Tnr=\frac{TN}{FP+TN}$$
(5)

Data Availability

All data generated or analyzed during this study are included in this published article (and its Supplementary Information files).

Results and Discussion

Response curves of sensor arrays

The typical response curve of the electronic nose system to the breath sample is shown in Fig. 3. It can be seen that the curve contains the three phases of one detection period, i.e.: ventilation phase, response phase and deflation phase. The response curves of most sensors were similar within the detection period. However, the curves of some sensors appeared to be quite different, such as TGS2444, MR516 and NAP55A. These pattern differences made the “breath fingerprint” more apparent.

Figure 3
figure 3

Typical response curves of the sensor arrays. (a) Response curves of the sensor arrays before preprocess. (b) Response curves of the sensor arrays after preprocess.

Comparison of feature extraction effects

When performing feature extraction, some methods involved parameter selection, such as the near neighbor number k in LE algorithm and the Perplexity parameter in the tSNE algorithm. We compared the LE mapping results of k = 3, 5, 7, 9, 11, finding the best one to be when k = 7. Similarly, after comparison, the Perplexity parameter in tSNE was set as 5. Besides, LDA is a supervised dimension reduction algorithm, such that the mapping results of different sample labels were different. In this study, we examined dimension reduction results via LDA for 4 types of sample labeling (lung cancer group, other respiratory disease group, healthy smoking group and healthy non-smoking group) and 3 types of labeling (healthy smokers and healthy non-smokers were classified together as a healthy group). The 2D mapping results of different feature extraction methods are as follows:

It can be seen in Fig. 4 (a–e) that different feature extraction methods may result in different classifications. However, we could also conclude from the 5 dimension reduction mapping results that:

Figure 4
figure 4

Mapping results of 5 algorithms (2D). (a) Represents the 2D mapping plot of PCA, (b) is the 2D mapping result of LDA with 4 classes of labels, c is the 2D mapping plot of LE, (d) is the 2D mapping plot of LLE, (e) is the 2D mapping result of tSNE, and (f) is the 2D mapping results of LDA with 3 classes of labels. Green dots represent samples from healthy non-smokers, blue dots represent samples from healthy smokers, red hexagrams represent samples from lung cancer patients, and light blue diamonds represent other disease samples.

First, in the above 5 kinds of feature extraction methods, LDA had the best classification performance for 4 types of sample labels in this study. This was indicated by the great distances between the 4 groups and the aggregation of samples in the same group as shown in Fig. 4b. In addition, there was little overlap between different groups. The next best classification results were from LE and PCA.

Second, no matter which dimension reduction algorithm was applied, healthy smokers and lung cancer patients could all be well classified, i.e. the blue dots and red hexagrams are mutually far away and have nearly no overlap in Fig. 4a–e. This also indicates that although many lung cancer patients in this study were also smokers or former smokers, their exhaled breath were significantly different from that of healthy smokers. This result was practically significant. Indeed, the smokers are a high risk population for lung cancer. However, the e-nose system used in this study could optimally distinguish lung cancer patients from healthy smokers in this high risk population. This also suggests that the simple screening of lung cancer patients in a high risk population is practically feasible.

Third, if samples were classified by the 3 classes of labels (healthy smokers and healthy non-smokers were classified as the healthy group), then we should only consider the dots (blue and green) as one group. This way, it can be seen that there is some overlap between the healthy group and lung cancer patients in the 2D mapping results of dimension reduction by PCA, LE, tSNE and LLE. However, Fig. 4f indicates that LDA still had a good classification performance for the 3 classes of sample labels.

Figure 4a–e reveals that no matter which dimension reduction method was adopted, the e-nose system developed in this study was able to distinguish healthy smokers from healthy non-smokers, as shown in Fig. 4; blue and green dots are far away from each other with only slight overlap. This result suggests that the breath components of smokers and non-smokers might be quite different, consistent with the conclusions of related studies47.

Lastly, Fig. 4 also showed that samples from other respiratory diseases (light blue diamonds) were all distinguishable from samples of healthy smokers (Figure a-e), from samples of the healthy population (Fig. 4f), and from samples of lung cancer patients (Fig. 4f). However, due to the insufficient sample size of patients with other respiratory diseases, this conclusion should be further validated with more samples.

Estimation error analysis

As mentioned before, to carry out this part of study, we excluded samples from patients with other respiratory disease, and classified the remaining 47 samples into 2 groups: a healthy group and a lung cancer group. Besides, according to the previous dimension reduction comparison, data for classification were from feature extraction results by LDA, LE and PCA. When using the mapping results by PCA, according to the methods introduced before, the principal component number was 21, and these components explained 99.08% of the variance.

When using Fuzzy k-NN for classification, the near neighbor number k and the proportionality coefficient m should be optimized. A 10 fold cross validation method was used for optimization in this study; results are shown in Fig. 5. It can be seen in Fig. 5a that when k = 5, 9, 13, 17, 21, classification accuracy of Fuzzy k-NN algorithm remains unchanged. Therefore, k = 5 was utilized in this study. Figure 5b showed the effects of proportionality coefficient m on the classification accuracy of Fuzzy 5-NN. It can be seen that when m = 2, the classification accuracy of 3 classification method yielded relatively better results. Therefore, m was set as 2 in this study.

Figure 5
figure 5

Optimization of Fuzzy k-NN algorithm parameters (k, m).

As described before, a grid search method was used to optimize the parameters (C, σ) of SVM48. Optimized results for PCA-SVM are shown in Fig. 6. When (C, σ) = (0.0625, 0.25), the classification accuracy of PCA-SVM is the best. Similarly, optimal (C, σ) of LDA-SVM and LE-SVM were also determined by this method.

Figure 6
figure 6

Contour map of PCA-SVM parameter (C, σ) optimizing. The arrow is pointing at the optimal parameter by cross validation of the grid optimization.

After all parameters were optimized, 47 breath samples were classified using 2 classifiers based on the mapping results of 3 feature extraction methods mentioned above. Here, we also tested the classification performance by the 10-fold cross validation method. Each combined classification method was tested 50 times to obtain the confidence intervals of classification errors (shown in Table 4). It can be seen from the table that the best classification performance among the 6 classification methods were the combined methods based on LDA (LDA-Fuzzy 5-NN and LDA-SVM). This was consistent with results observed in the dimension reduction mapping plots (Fig. 4). Among them, LDA Fuzzy 5-NN showed the best classification results, which produced greater than 90% of sensitivity, specificity and accuracy.

Table 4 Distinguishing results of lung cancer samples and healthy samples by different classification methods.

Effects of type-different sensor array on lung cancer recognition

In order to study the effects of type-different sensor arrays of the e-nose on lung cancer recognition performance, we classified the 14 gas sensors into 2 groups: group K and group T (Table 5). Of them, group K included all 4 types of sensors, and group T included only MOS sensors. According to the response results (Fig. 1), the sensitivity, response patterns and principles of sensors in group K had a greater relative difference from each other than that of sensors in group T.

Table 5 Sensors grouping.

We still used the 3 dimension reduction algorithms (LDA, PCA and LE) selected earlier to extract principal features of the data obtained by these 2 groups of sensors. For comparison, 52 samples were still classified into 4 groups (lung cancer group, healthy smoking group, healthy non-smoking group and other disease group). 2D plots of the mapping results are shown in Fig. 7. It can be seen that, no matter which group the data came from, the classification performance of all the 3 feature extraction methods decreased when comparing to the mapping results in Fig. 4. However, data from sensor group K gave better classification results (Fig. 7a,c,e) than that from sensor group T (Fig. 7b,d,f) when using the same feature extraction algorithm.

Figure 7
figure 7

Mapping results of PCA, LDA and LE using data from sensor group T and group K. (a,c,e) are 2D mapping plots of PCA, LDA and LE based on data obtained by sensor group K; (b,d,f) are 2D mapping plots of PCA, LDA, and LE based on data obtained by sensors of group T.

In order to further compare the recognition performance of sensor group K and sensor group T, we used the Fuzzy k-NN, which had better performance in the previous classification to classify breath samples based on mapping results depicted in Fig. 7. Similarly, for comparison, breath samples of other diseases were also excluded, and healthy smokers and non-smokers were classified as one group, i.e. 47 samples were divided into two groups: a healthy group and a lung cancer group. After parameter optimization, accuracy of 3 classification methods (LDA 5-NN, PCA 5-NN and LE 5-NN) were obtained using 10-fold cross validation. We ran the validation 50 times to get 50 accurate readings for each classification method. Then, the recognition accuracies of sensor group K and sensor group T were analyzed using one way-ANOVA; results are shown in Fig. 8.

Figure 8
figure 8

Accuracy of 3 classification methods based on the data from sensor group K and sensor group T. (a) Comparison of classification accuracy using LDA 5-NN based on data from sensor groups K and T; (b) Comparison of classification accuracy using PCA 5-NN based on data from sensor groups K and T; (c) Comparison of classification accuracy using LE 5-NN based on data from sensor groups K and T. ***indicates significance: p < 0.01.

Similar conclusions could be drawn from results shown in Fig. 8 as in Fig. 7. Firstly, no matter which sensor group was used, the lung cancer recognition performance was not as good as the performance that all 14 gas sensors showed. Secondly, lung cancer detection accuracy was better when using data from sensor group K than that from sensor group T. When utilizing PCA 5-NN (Fig. 8b) and LE 5-NN (Fig. 8c), the recognition accuracy of sensor group K was significantly greater than that of sensor group T (n = 50, p < 0.01). Although the classification accuracy of sensor groups K and T was not significantly different (n = 50, p = 0.21) when using LDA 5-NN, the average and highest recognition accuracy of sensor group K was greater than that of sensor group T after 50 times of cross validation. In summary, type-different sensors are notably helpful for improving the lung cancer recognition ability of the e-nose system designed in this study.

Conclusion

The e-nose system developed using 4 types of commercial sensors in this study could identify relatively specific “breath fingerprints” based on human breath, which could be used to recognize volunteers in different diseased or healthy states. LDA proved to be among the best methods for “breath fingerprint” recognition in this study. When this e-nose system was used in the differentiation of lung cancer patients from healthy volunteers, the classification specificity, sensitivity and accuracy as determined by LDA fuzzy 5-NN were all above 90%, indicating a comparable performance with traditional imaging modalities. In addition, difference in the type of sensor arrays are notably helpful for the improvement of the ability of the e-nose system to detect respiratory diseases.

In a word, the designed e-nose system based on optimized algorithms was low cost, noninvasive and was potential practicable in screening lung cancers from both healthy people and lung cancer high risk populations.