Introduction

Growing consumer awareness of quality and health is pushing the olive oil industry toward products that meet consumers’ preferences. This trend has led to the adoption of assurance schemes, such as the Protected Designation of Origin (PDO) certification, as effective tools for distinguishing extra-virgin olive oils (EVOOs) originating from the Euro-Mediterranean region [1]. In today’s context, ensuring the accuracy of the PDO label is crucial for the olive oil market and its associated economic activities [2]. However, the registration of PDOs often hinges more on geographic boundaries defined by administrative criteria than on objective chemical and sensory data [3]. Consequently, validating the authenticity of PDO oils poses a significant challenge for both producers and certification bodies, with misidentification arising from variability within each PDO category. This variability occurs because oils bearing the same PDO designation can be extracted from olives of different cultivars and in varying relative proportions [4].

The difficulty of clearly delineating specific boundaries for a PDO makes physico-chemical data on the target olive oils essential for both the administration and PDO councils to support their justifications for approving or rejecting new PDOs [3]. Several analytical techniques (e.g., high resolution melting analysis, evaporative ionization mass spectrometry, laser induced breakdown spectroscopy, LC–MS and GC–MS multi-class methods, and NMR) have been proposed to verify the authenticity of olive oil PDO labels, namely PDO versus non-PDO oils [2, 4,5,6,7,8]. Despite the effectiveness of these techniques and their successful application to PDO olive oils, some of them have drawbacks: they can be invasive or destructive, requiring sample pre-treatment and the use of environmentally unfriendly solvents. Additionally, they tend to be expensive, non-portable, and dependent on skilled technicians, making in-situ implementation challenging.

As a result, there has been an effort to develop eco-friendly, cost-efficient, user-friendly, and non-destructive alternatives for analysing olive oil, such as sensor devices and, in particular, electronic noses (E-noses). E-noses, combined with other E-sense devices, have been applied to assess the physico-chemical-sensory quality of olive oil and evaluate its shelf life [9, 10]. They have proven effective in discriminating the geographical provenance of EVOO [11,12,13]. Furthermore, E-noses have been successful in monitoring the oxidation of olive oils by tracking the concentration of oxidative volatile markers and assessing the influence of filtration systems on the volatile profiles of olive oils [14]. On a sensory level, E-noses have been used to predict fruity aroma intensity [15, 16], detect olfactory defects [15], and differentiate flavoured and unflavoured olive oils [17]. Regarding PDO olive oils, to the best of the authors’ knowledge, only one study [18] has reported the use of an artificial nose, either independently or in conjunction with NIR and UV–Vis spectroscopy, to characterize PDO Chianti Classico olive oil and distinguish it from olive oils originating from other Italian PDO regions.

In this context, the primary objective was to establish, for the first time, the specific volatile chemical classes present in olive oils from three distinct Portuguese PDO regions: “Azeite do Alentejo Interior”, “Azeites da Beira Interior”, and “Azeite de Trás-os-Montes”. Additionally, the study aimed to assess the intensity of major olfactory sensations. The broader goal encompassed investigating whether data on volatile compounds and olfactory attributes could serve as potential markers for PDO classification, considering the acknowledged within-region variation in PDO olive oils due to the different olive cultivars and their relative proportions used for producing oils eligible for the same PDO certification. Lastly, the study set out to assess the feasibility of employing a custom-designed E-nose, comprising metal oxide semiconductor (MOS) sensors, to effectively identify oils based on their PDO classification as well as to quantify the volatiles’ concentrations of the oils.

Materials and methods

Olive oil samples

Commercial olive oils with PDO declarations of “Azeite do Alentejo Interior”, “Azeites da Beira Interior”, and “Azeite de Trás-os-Montes” were obtained from certified olive oil producers within each of the referred geographical regions. During the sample selection process, the aim was to ensure the representativeness of the sampled olive oils with regard to the country’s overall production area. This consideration went beyond merely accounting for the volume of oil produced, extending to encompass certified PDO brands prevalent in the market. Portugal’s olive oil production is predominantly clustered in three distinct regions, ranked in decreasing order of significance as Alentejo, Trás-os-Montes and lastly Beira Interior. To capture this diversity, one PDO from each of these regions was chosen for the study. Specifically, for Alentejo, emphasis was placed on selecting the PDO linked to the largest production area and to a substantial volume of both oil production and certification, namely “Azeite do Alentejo Interior”. In the case of the other two regions, where only one PDO exists per region, the representatives were “Azeite de Trás-os-Montes” and “Azeites da Beira Interior”. According to the producers, all selected samples were extracted in the 2021 campaign. After acquisition, the samples were coded, stored at room temperature, and protected from light exposure until analysis. From each region, 10 independent oils were studied, with three 500 mL bottles acquired for each oil. According to the legal regulation [19], the “Azeite do Alentejo Interior” PDO olive oils must be extracted from olives of cvs. Galega Vulgar (≥ 60%), Cordovil de Serpa and/or Cobrançosa (≤ 40%), although olives from other cultivars may be included (≤ 5%). The “Azeites da Beira Interior” PDO olive oils must be obtained from cvs. Galega, Verdeal Transmontana, Cobrançosa and Cordovil [20].
Finally, the “Azeite de Trás-os-Montes” PDO olive oils can be extracted from olives of cvs. Verdeal Transmontana, Cobrançosa, Madural, Cordovil, Santulhana, Borreira, Redondil, Negrinha, Carrasquenha, Bical, Lentisca and other varieties in smaller percentages [21]. In all Portuguese PDO olive oils, foreign olive cultivars are not permitted.

Volatile characterization by HS–SPME–GC–MS

The volatile fraction of the olive oils was analysed using headspace solid-phase microextraction (HS-SPME) coupled with gas chromatography-mass spectrometry (GC-MS). A fiber made of divinylbenzene/carboxen/polydimethylsiloxane (DVB/CAR/PDMS, 50/30 µm) from Supelco (Bellefonte, USA) was used. The GC-MS analysis was performed using a Shimadzu GC-2010 Plus chromatograph equipped with a Shimadzu GC-MS-QP2010 SE detector [22]. In 50 mL vials, 3 g of olive oil were spiked with 5 µL of an internal standard solution (4-methyl-2-pentanol, 98% from Sigma Aldrich) with a concentration of 0.127 mg/mL. The vials were conditioned at 40 °C for 5 min to effectively release the volatile compounds. Following this, the SPME fiber was exposed for 30 min at 50 °C, allowing for the adsorption of volatile compounds present in the headspace, while the samples were kept under agitation (350 rpm). A total of 60 chromatographic assays were conducted (3 PDOs × 10 bottles × 2 extractions × 1 injection). Peak separation was achieved using a TRB-5MS column (30 m × 0.25 mm × 0.25 µm) from Teknokroma (Spain). The injector temperature was set at 220 °C, and manual injections were performed in splitless mode. Helium (Alphagaz from Air Liquide, Portugal) was used as the mobile phase at a linear velocity of 30 cm/s and a total flow of 24.4 mL/min. The oven temperature gradient was as follows: 40 °C/1 min; 2 °C/min until 220 °C (30 min). The ionization source was maintained at 250 °C with an ionization energy of 70 electronvolts and an ionization current of 0.1 kilovolts. All mass spectra were acquired by electron ionization, and the spectral fragments were identified by comparison with the NIST 11 Library (National Institute of Standards and Technology, Gaithersburg, MD, USA) and commercial standards.
The chromatographic peak areas were determined by integrating the chromatogram reconstructed from the full-scan chromatogram using the base ion (m/z with 100% relative intensity) of each compound. For identification purposes, the minimum similarity percentage was set to 85%. For semi-quantification, the amount of each identified volatile was calculated as the ratio of its base ion peak area to the base ion peak area of the internal standard, without considering response factors, and converted to mass equivalents based on the mass of internal standard added.
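
The semi-quantification step described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name and its parameters are hypothetical, and the default values simply restate the quantities given in the text (5 µL of internal standard at 0.127 mg/mL added to 3 g of oil), with a response factor of 1 assumed for every compound.

```python
def semi_quant_mg_per_kg(area_compound, area_internal_standard,
                         is_volume_ul=5.0, is_conc_mg_per_ml=0.127,
                         oil_mass_g=3.0):
    """Internal-standard equivalents (mg per kg of oil), response factor = 1.

    Hypothetical helper: defaults restate the amounts quoted in the text.
    """
    if area_internal_standard <= 0:
        raise ValueError("internal standard peak area must be positive")
    # Mass of internal standard added to the vial: volume (mL) x concentration.
    is_mass_mg = (is_volume_ul / 1000.0) * is_conc_mg_per_ml
    oil_mass_kg = oil_mass_g / 1000.0
    # Area ratio scaled by the internal standard mass per kg of oil.
    return (area_compound / area_internal_standard) * is_mass_mg / oil_mass_kg


# A compound whose base ion peak area equals that of the internal standard
# corresponds to about 0.21 mg/kg under these assumptions.
equal_area_content = semi_quant_mg_per_kg(1.0e6, 1.0e6)
```

Under these assumptions, each unit of area ratio corresponds to 0.635 µg of internal standard per 3 g of oil, which is the scale on which the contents in mg/kg are expressed.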

Olive oil olfactory analysis

The evaluation of olive oil’s olfactory sensations was performed by eight trained panellists (comprising 3 men and 5 women), along with the panel leader, within the age range of 30–58. The sensory panel followed the methodologies outlined in the European Union standard methods [23,24,25]. The perceived intensity of olfactory sensations was graded using an unstructured continuous scale ranging from 0 (no perceived sensory sensation) to 10 (maximum perceived intensity). While gustatory sensations were also evaluated, the related data were not exploited in the present study.

E-nose analysis

The E-nose used in this study was previously custom-designed and assembled in the research group’s laboratory, and is described in detail by Teixeira et al. [16, 26]. In brief, the device comprises a heated sampling unit (28 °C) and a heated multi-sensor detection array placed in a chamber with controlled temperature (35 °C). The headspace gas phase from the sampling unit was delivered to the detection chamber using a diaphragm vacuum air pump (model SC3502PM, SKOOCOM, China). To ensure cleanliness of the system and sensors, ambient air was flowed continuously until a stable baseline was achieved. The in-house E-nose apparatus incorporated nine commercial metal oxide semiconductor (MOS) sensors (Table 1), whose electrical characteristics are sensitive to adsorption phenomena taking place on the sensors’ surface when exposed to volatile compounds.

Table 1 Target gases of the commercial MOS sensors integrated into the lab-manufactured E-nose

The sensor responses, measured in electrical resistance (ohms, Ω), were recorded using an Agilent data acquisition unit (model 34970A), controlled through Agilent BenchLink Data Logger software.

For the analysis, 0.5 mL of each olive oil sample was pipetted into a 25 mL glass vial, which was then placed inside the sampling chamber at 28 °C (the temperature recommended by the International Olive Council for sensory analysis of olive oils) for 13 min, allowing a volatile fraction representative of the sample to be generated. Simultaneously, the E-nose system underwent a 13-min cleaning process using an air flow, which allowed a stable signal baseline to be achieved. Subsequently, the gas headspace from each sample was directed into the detection chamber, where it interacted with the MOS sensors for 2.5 min. The resistance signals of each of the nine MOS sensors were recorded by a data logger at 4-s intervals. For each olive oil analysis, a total of 37 resistance values were gathered from each sensor during the volatile-sensor interaction period. Thus, for each independent olive oil under study, an initial matrix of order (37 × 9) was generated and recorded by the software and then exported as an Excel file (37 rows corresponding to the resistance values recorded by each sensor during the 2.5 min at 4-s intervals, and 9 columns corresponding to the signals generated by the 9 MOS sensors included in the E-nose device). Seven distinct feature-extraction methods were applied [27] to the initial data matrix: the last response point (LP), the integral of the response curve (INT, calculated using the composite Simpson’s 1/3 numerical integration rule), the maximum response point (MAX), the minimum response point (MIN), the sum of the response curve (SUM), the mean of the response curve (MEAN), and the standard deviation of the response curve (SD). Each feature-extraction method was applied to the initial data matrix (the resistance response curve generated during the volatile-sensor interaction period), yielding a single pre-processed output signal per applied method and sensor for each of the 30 independent PDO olive oils.
So, a new data matrix of order (30 × 63) was obtained with 30 lines corresponding to the 30 independent PDO olive oils analysed and 63 columns (9 sensors × 7 feature extraction methods) corresponding to the average of the feature-extracted outputs of duplicate E-nose assays. This latter matrix was then used as the input data matrix for the statistical qualitative and quantitative analysis of the E-nose data.
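
As an illustration of the pre-processing described above, the following Python sketch reduces one (37 × 9) resistance matrix to 63 named features. It is a sketch under stated assumptions rather than the authors' implementation: the function names are hypothetical, the standard deviation is taken as the sample standard deviation (n − 1 denominator), and the column labels follow the "S1_LP" pattern used later in the text.

```python
import statistics

SAMPLING_INTERVAL_S = 4.0  # one resistance reading every 4 s (from the text)


def simpson_13(values, step):
    """Composite Simpson's 1/3 rule for equally spaced readings."""
    n_intervals = len(values) - 1
    if n_intervals < 2 or n_intervals % 2 != 0:
        raise ValueError("Simpson's 1/3 rule needs an even number of intervals")
    odd_sum = sum(values[1:-1:2])    # interior points with odd index
    even_sum = sum(values[2:-1:2])   # interior points with even index
    return step / 3.0 * (values[0] + 4.0 * odd_sum + 2.0 * even_sum + values[-1])


def extract_features(curve, step=SAMPLING_INTERVAL_S):
    """Seven summary features of one sensor's response curve."""
    return {
        "LP": curve[-1],
        "INT": simpson_13(curve, step),
        "MAX": max(curve),
        "MIN": min(curve),
        "SUM": sum(curve),
        "MEAN": statistics.fmean(curve),
        "SD": statistics.stdev(curve),  # assumption: sample SD (n - 1)
    }


def feature_row(matrix):
    """matrix: list of rows (time points) x columns (sensors) -> dict of features."""
    n_sensors = len(matrix[0])
    row = {}
    for s in range(n_sensors):
        curve = [reading[s] for reading in matrix]
        for name, value in extract_features(curve).items():
            row[f"S{s + 1}_{name}"] = value
    return row
```

With 37 readings per sensor there are 36 intervals, an even number, so the 1/3 rule applies directly without a correction for the last interval. Averaging the rows obtained from duplicate assays of the same oil would then give one line of the (30 × 63) matrix.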

Statistical analysis

One-way analysis of variance (ANOVA), followed when appropriate by Tukey’s post-hoc multiple comparison test, was used to assess statistically significant differences in the main volatile classes and olfactory sensations among the olive oils of the three studied PDOs. In cases where a volatile class or olfactory sensation was detected or perceived in only two of the three PDOs, Student’s t-test was applied. Pattern recognition multivariate techniques, both unsupervised (principal component analysis, PCA) and supervised (linear discriminant analysis, LDA), were applied to the volatile-olfactory data (matrix of order (30 × 17), with rows corresponding to the 30 independent PDO olive oils and columns to the 17 volatile-olfactory parameters) and to the pre-processed signal profiles obtained from the lab-made E-nose-MOS device (matrix of order (30 × 63), corresponding to the 30 oils and the 63 pre-processed signals per oil). As previously mentioned, in the latter case, the dataset resulted from the application of seven feature-extraction methods to the initial resistance dataset recorded during the volatile-sensor interaction period, totalling 63 pre-processed signals for each sample (9 MOS × 7 different feature extractions) [28]. LDA was implemented with the simulated annealing (SA) algorithm to select optimum subsets of non-redundant variables (volatile-olfactory data) or pre-processed signals (E-nose) that provided the best classification performance. This evaluation was conducted using two internal cross-validation (CV) variants: leave-one-out CV (LOO–CV) and repeated K-fold-CV (4 folds × 10 repeats). In the latter variant, 25% of the dataset was used for validation at each iteration (i.e., 2–3 olive oils from each of the three regions considered), with the remaining 75% used for training.
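
The repeated K-fold scheme can be sketched in Python as follows. This is an illustrative assumption, not the authors' R code: the function name is hypothetical, and stratified fold assignment is assumed because the text states that each validation set holds 2–3 oils from each region.

```python
import random


def repeated_stratified_kfold(labels, n_folds=4, n_repeats=10, seed=0):
    """Yield (train_indices, validation_indices) for each fold of each repeat."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    for _ in range(n_repeats):
        folds = [[] for _ in range(n_folds)]
        offset = 0
        for label in sorted(by_class):
            members = by_class[label][:]
            rng.shuffle(members)
            # Deal class members round-robin; the offset keeps fold sizes balanced.
            for pos, idx in enumerate(members):
                folds[(pos + offset) % n_folds].append(idx)
            offset += len(members)
        for k in range(n_folds):
            validation = sorted(folds[k])
            train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
            yield train, validation


# 10 oils per PDO; the labels are shorthand chosen for this sketch.
pdo_labels = ["Alentejo"] * 10 + ["Beira"] * 10 + ["TrasOsMontes"] * 10
splits = list(repeated_stratified_kfold(pdo_labels))
```

For 30 oils in three classes of 10, this gives 40 train/validation splits, each validation set holding 7 or 8 oils with 2 or 3 from every PDO, which matches the fold composition described in the text.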
The classification performances were assessed through the sensitivity (i.e., the percentage of correctly classified samples), 2D plots for the first two primary discriminant functions (DFs), and respective class membership ellipses computed by Bayes’ theorem [29]. The SA algorithm is a local search meta-heuristic optimization technique with inherent probabilistic and stochastic characteristics. It employs a random search engine, namely the Markov chain, to navigate away from local optima. Embracing suboptimal solutions, the algorithm facilitates the identification of the best solution or global minimum within a predetermined random search region. This is achieved through Monte Carlo simulations guided by a predefined probability schedule. In the context of the LDA-SA approach, the algorithm effectively selects the minimum number of non-redundant variables from either the 17 volatile-olfactory parameters or the 63 pre-processed E-nose signals. These variables enabled the construction of multivariate classification models. Various subsets of variables were systematically tested, encompassing 2 to 16 or 2 to 25 selected independent variables, depending on the dataset. For the former dataset, the maximum number of variables included in the classification models was constrained by the availability of only 17 independent volatile-olfactory parameters. In the latter case involving E-nose signals, the maximum number was constrained by the degrees of freedom, calculated based on the total number of independent olive oils (30) and the number of groups (3 PDOs). A total of 15 classification models were established using the volatile-olfactory dataset, while 24 classification models were developed based on the E-nose pre-processed signals. In each case, the model exhibiting the highest classification performance (i.e., greater sensitivity) and with the minimum number of variables was selected. 
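
To make the variable-selection idea concrete, the sketch below shows a generic simulated annealing search over fixed-size subsets in Python. It is not the subselect routine used in the study: the function, its parameters (initial temperature, geometric cooling rate, number of iterations) and the swap neighbourhood are assumptions, and `criterion` is a placeholder for whatever quality measure is maximised (for example, a discriminant criterion computed from the LDA matrices).

```python
import math
import random


def anneal_subset(n_vars, k, criterion, n_iter=2000, t0=1.0, cooling=0.99, seed=0):
    """Search for the size-k subset of range(n_vars) that maximises `criterion`."""
    rng = random.Random(seed)
    current = set(rng.sample(range(n_vars), k))
    current_score = criterion(current)
    best, best_score = set(current), current_score
    temperature = t0
    for _ in range(n_iter):
        # Neighbour: swap one selected variable for one unselected variable.
        out_var = rng.choice(sorted(current))
        in_var = rng.choice(sorted(set(range(n_vars)) - current))
        candidate = (current - {out_var}) | {in_var}
        candidate_score = criterion(candidate)
        delta = candidate_score - current_score
        # Improvements are always accepted; worse moves are accepted with a
        # probability that shrinks as the temperature decreases.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = set(current), current_score
        temperature *= cooling  # geometric cooling schedule (assumed)
    return sorted(best), best_score


# Toy criterion (hypothetical): each variable has a fixed usefulness weight.
toy_weights = list(range(1, 11))
toy_best, toy_score = anneal_subset(10, 3, lambda s: sum(toy_weights[i] for i in s))
```

Running such a search once per subset size, from 2 to 16 variables or from 2 to 25 variables, yields the 15 and 24 candidate models mentioned above, from which the smallest model with the highest sensitivity would be retained.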
Finally, the use of the E-nose to quantitatively estimate the concentration of the main volatile classes detected on the olive oils was evaluated by implementing multiple linear regression models (MLRM) based on subsets comprising 2 to 25 feature extracted variables (i.e., pre-processed E-nose signals), selected by the SA algorithm. The predictive performance of the developed models was assessed from the determination coefficients (R2) and the root mean square errors (RMSE). The open-source statistical program R (version 3.6.2) was used for the analysis, at a 5% significance level. For the LDA-SA approach, the function ldaHmat() was used to compute total and between-group matrices of sums of squares and cross-product deviations, generating matrices that were used as input to the anneal() variable selection search routine. These functions are available within the subselect package. The MLRM-SA procedure was computed by implementing the lmHmat() function that computes, in linear regression, total effect matrices of sums of squares and cross-product deviations, divided by a normalizing constant. As for the previous classification approach, the generated matrices can be used as input to the anneal() variable selection search routine.
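
For reference, the two regression metrics can be written out as a short Python sketch. The function names are hypothetical, and the RMSE is assumed here to use the number of observations as the denominator; the text does not state which denominator was used.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error, in the same units as the observations."""
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(ss_res / len(observed))  # assumption: divide by n
```

Because the RMSE keeps the units of the response, it can be read directly against the volatile contents expressed in mg/kg of olive oil.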

Results and discussion

Volatiles and olfactory sensations of the studied PDO olive oils

The olive oil samples collected for this study comprised 10 independent oils from each of three (“Azeite do Alentejo Interior”, “Azeites da Beira Interior” and “Azeite de Trás-os-Montes”) of the six Portuguese PDO oils, sourced from various certified producers. The physicochemical quality of the oils was assessed to verify compliance with the EU regulations for EVOO classification [30], as well as with the specific requirements outlined by each PDO declaration, namely “Azeite do Alentejo Interior” [19], “Azeites da Beira Interior” [20], and “Azeite de Trás-os-Montes” [21]. The assessment included the evaluation of the free acidity, peroxide value, and extinction coefficients at 232 and 268 nm. The findings revealed that all oils under investigation met both the PDO stipulations and the legal thresholds established by EU regulations for EVOOs (data not presented). Moreover, the sensory panel noted no negative sensory perceptions, and all oils exhibited a fruity sensation with an intensity greater than zero.

In the PDO oils under study, 28 distinct volatile compounds were identified by HS-SPME-GC-MS. In particular, considerable variability was observed in both the number and relative abundance of the identified compounds across different oils within each region and among the three regions investigated. This intra- and inter-region variability may be attributed to several factors. Firstly, the legal specifications allow each PDO oil to be produced using olives from various cultivars and in differing proportions. Secondly, variations could arise from the distinct origins of the olive oils within each demarcated geographical region, since they came from different producers, were extracted from different batches of olives at varying ripening stages, and were obtained under different malaxation conditions (e.g., time and temperature). Overall, the identified volatiles could be grouped into seven chemical classes, namely, alcohols, aldehydes, carboxylic acids, esters, hydrocarbons, ketones, and terpenes. On the other hand, ten different olfactory sensations were perceived by the trained panellists in the studied oils (fruity green, fruity ripe, apple, banana, cabbage, dry fruits, dry herbs, fresh herbs, tomato, tomato branches), at different intensities, showing both intra- and inter-region variability, with only five of them perceived in oils from all three demarcated regions. The high intra-region variability for each PDO oil can be tentatively explained by the same reasons previously discussed. The mean contents (± standard deviation) of the seven volatile chemical classes are presented in Table 2, together with the mean intensities (± standard deviation) of the ten olfactory sensations perceived in the olive oils, listed for each type of PDO oil.

Table 2 Concentrations of the volatile chemical classes (mean ± standard deviation, mg of compound/kg of olive oil, as internal standard equivalents) and intensities of the perceived olfactory sensations (mean ± standard deviation, using an unstructured continuous scale from 0 to 10)

The analysis (Table 2) reveals that the geographic region significantly affected the intensities of the perceived olfactory sensations (P-value < 0.05), more so than the contents of the volatile chemical classes. In fact, the results indicate that each PDO oil has a characteristic olfactory fingerprint that may be used as an origin marker. It is important to note, however, that the considerable variability in volatile compounds observed within each PDO might have masked a significant regional effect. Recently, Rodrigues et al. [13] highlighted the significant influence of geographical origin on the content of volatiles and on the intensities of positive olfactory sensations perceived in non-PDO olive oils from two neighbouring Portuguese regions, namely the Côa and Douro Valleys, located in the northeast part of Portugal.

In terms of volatile compounds, the most abundant chemical classes were aldehydes, followed by hydrocarbons and alcohols (with total mean contents for all oils at 5.63, 2.92, and 2.79 mg/kg of olive oil, respectively). Rodrigues et al. [13] also found that aldehydes and alcohols were the predominant chemical classes in non-PDO Portuguese olive oils from the Côa and Douro Valleys. In the present study, carboxylic acids constituted the least abundant volatile chemical class (with total mean content for all oils at 0.015 mg/kg of olive oil). Only for terpenes a significant statistical effect of the PDO was found (P-value < 0.05), with “Azeites da Beira Interior” oils being richer in terpenes compared to oils from “Azeite do Alentejo Interior” or “Azeite de Trás-os-Montes.” Also, “Azeites da Beira Interior” oils had a significantly greater concentration (P-value < 0.05) of total volatiles (18.2 ± 4.6 mg/kg olive oil) than oils from the other two PDOs.

Concerning the olfactory profile (Table 2), it is worth highlighting that only oils from “Azeite do Alentejo Interior” were classified as “fruity ripe,” while the other oils were categorized as “fruity green”, with the “Azeite de Trás-os-Montes” oils exhibiting the highest intensities for this latter sensation. Specifically, “Azeite do Alentejo Interior” oils displayed significantly higher intensities of apple and dry herb sensations. “Azeite de Trás-os-Montes” oils exhibited greater intensities of apple, cabbage, fresh herb, and tomato sensations.

The literature has established that several chemical compounds are responsible for specific sensory descriptors perceived in olive oils [31,32,33,34,35,36]. For instance, the perception of “green” is linked to alcohols, aldehydes, esters, and terpenes, while the “apple” sensation is associated with aldehyde and ester contents. Additionally, the “banana” sensation is correlated with several alcohols and esters. In this study, a significant correlation was observed between the total mean contents of “alcohols and esters” and the perceived intensities of the banana sensation (Pearson’s r = +0.9095). Nonetheless, it is important to emphasize that non-volatile compounds, such as phenolic compounds, also play a role in influencing the aroma descriptors of olive oils [37].

Unsupervised and supervised differentiation of PDO olive oils according to the demarcated geographical region based on the volatile-olfactory profiles

Given that the contents of the seven volatile chemical classes quantified by HS-SPME-GC-MS and, in particular, the intensities of the ten positive olfactory sensations perceived by the sensory panel in the studied olive oils differed according to the type of PDO, a PCA was implemented to assess whether the volatile-olfactory data could be used as possible markers of geographical origin. A similar strategy showed that olfactory profiles could be successfully applied to distinguish non-PDO Portuguese olive oils [13].

The 2D-PCA plots demonstrate that the volatile-olfactory data (comprising 7 volatile chemical classes and 10 olfactory sensations, as outlined in Table 2) effectively distinguished among the three PDO oil types (Fig. 1). The oils from the “Azeite do Alentejo Interior” and “Azeite de Trás-os-Montes” regions exhibited distinct olfactory profiles, with the oils from “Azeites da Beira Interior” being overall richer in total volatile compounds. However, it is noteworthy that oils within each PDO exhibited noticeable dispersion, particularly pronounced in “Azeites da Beira Interior” oils, resulting in some overlap with the other two PDO oil types. The higher variability observed in the oils from the “Azeites da Beira Interior” region can be ascribed to the larger geographical coverage of Beira Interior compared to the other two regions being investigated, which can be even divided into two distinct sub-regions. Notably, no overlaps were observed between the oils from “Azeite do Alentejo Interior” and “Azeite de Trás-os-Montes”, which are also the two geographically more distant regions.

Fig. 1
PCA analysis (biplot) differentiation of PDO olive oils according to the demarcated geographical origin: “Azeite do Alentejo Interior” (Green filled dot); “Azeites da Beira Interior” (Brown filled triangle); or, “Azeite de Trás-os-Montes” (Blue filled square). A PC1 versus PC2; B PC1 versus PC3 (Color figure online)

To better evaluate the classification performance, an LDA-SA technique was applied. The SA algorithm identified the 12 most discriminant variables among the 17 used in the unsupervised analysis. The selected variables included four volatile chemical classes (carboxylic acids, alcohols, aldehydes, and hydrocarbons) and eight positive olfactory sensations (fruity ripe, apple, banana, tomato, dry fruits, cabbage, fresh herbs, and tomato branches). A linear discriminant model with two DFs (explaining, respectively, 90.2 and 9.8% of the data variance) was established, allowing the correct classification of all the oils according to the respective PDO for the training (Fig. 2) and the LOO-CV procedure (100% sensitivity and specificity). The repeated K-fold-CV internal validation variant, with 4 folds and 10 repeats, was applied to further evaluate the predictive classification performance. In this approach, 25% of the dataset, randomly selected, was left out for validation in each run. This included 2–3 oils from each PDO, totalling 7–8 oils out of the 30 oils from the three regions under study. The results showed an overall sensitivity of 98.0 ± 4.8%. Across the 300 validation runs (10 repeats × 30 oils), only four “Azeites da Beira Interior” oils were misclassified as “Azeite do Alentejo Interior” oils, along with one “Azeite de Trás-os-Montes” oil misclassified as an “Azeites da Beira Interior” oil. These findings confirm the viability of using volatile and olfactory data as markers to determine the geographic origin of EVOOs, in line with previous literature [38,39,40,41,42]. Moreover, this study extends the applicability of the volatile-olfactory approach to PDO oils, specifically from Portugal.
The results also highlighted distinct volatile-olfactory profiles of the studied PDO oils, despite the occasional administrative emphasis on registering PDOs mainly linked to demarcations of geographic areas, often overlooking the importance of an objective chemical-sensory basis [3].

Fig. 2
Supervised discrimination of EVOOs according to the PDO, based on a LDA-SA model established using 12 selected volatile-olfactory variables (volatile contents of carboxylic acids, alcohols, aldehydes, and hydrocarbons classes and intensities of fruity ripe, apple, banana, tomato, dry fruits, cabbage, fresh herbs, and tomato branches olfactory sensations): “Azeite do Alentejo Interior” (Green filled dot); “Azeites da Beira Interior” (Brown filled triangle); or, “Azeite de Trás-os-Montes” (Blue filled square) (Color figure online)

However, it should be noted that the successful discrimination achieved requires resource-intensive and costly techniques, performed by well-trained technicians or expert panellists. Implementing these methodologies in real-time or on-site settings is challenging. Hence, investigating rapid and cost-effective analytical alternatives for non-invasive and direct analysis is imperative to ensure the accurate labelling of PDO oils.

Discrimination of PDO olive oils according to the demarcated geographical region using a lab-made E-nose

Numerous research studies have documented the effective utilization of E-noses in evaluating the quality grade, detecting adulteration, identifying geographical origins, or authenticating Italian PDO olive oil [11, 13, 16, 18, 43,44,45]. This research endeavoured, for the first time, to employ a custom-made E-nose in distinguishing three Portuguese PDO olive oils: “Azeite do Alentejo Interior”, “Azeites da Beira Interior”, and “Azeite de Trás-os-Montes”. The raw resistance signals from the nine MOS sensors were recorded by the device. Subsequently, these profiles underwent pre-processing using seven feature extraction methods, namely, LP, INT, MAX, MIN, SUM, MEAN, and SD resulting in a total of 63 processed signals for each independent oil sample (9 MOS sensors × 7 feature extraction techniques).

The LDA coupled with the SA algorithm allowed a supervised classification model to be established based on 16 treated signals from eight MOS sensors (S1_LP, S8_LP, S1_INT, S2_INT, S3_INT, S3_MAX, S6_MAX, S8_MAX, S1_MIN, S5_MIN, S8_MIN, S8_SUM, S9_SUM, S2_MEAN, S9_MEAN and S4_SD), with two DFs that explained 100% of the data variability. The MOS sensors S1 and S2 (TGS 2600 and TGS 2602 commercial sensors, respectively) have previously been reported for assessing the quality grade or geographical origin of EVOOs [46]. The model allowed the correct discrimination of all studied olive oil samples (100% sensitivity and specificity) for both the original grouped data (Fig. 3) and the LOO-CV variant.

Fig. 3
E-nose-MOS-LDA-SA model performance regarding the supervised discrimination (original grouped data) of EVOOs according to the PDO: “Azeite do Alentejo Interior” (Green filled dot); “Azeites da Beira Interior” (Brown filled triangle); or, “Azeite de Trás-os-Montes” (Blue filled square) (Color figure online)

The findings underscore an enhanced recognition accuracy of the three studied Portuguese PDO olive oils in comparison to the discrimination capabilities of the same E-nose apparatus, previously documented by Rodrigues et al. [13] for non-PDO olive oils originating from two neighbouring Portuguese regions (achieving a sensitivity and specificity of 98.5% and 98.4%, respectively, for the LOO-CV variant). Given the potential for overoptimistic results with the LOO-CV variant, a more robust internal validation approach was considered. This differed from the prior approach taken by the research team in analysing non-PDO Portuguese oils [13]. Specifically, the repeated K-fold-CV variant was applied (4 folds × 10 repeats), where a greater number of independent samples were reserved for validation in each run compared to the single sample in LOO-CV. This ranged from 7 to 8 independent samples, each validation set consisting of at least 2 distinct oils, for each of the three PDOs under investigation. For this CV variant, a mean sensitivity of 99.7 ± 2.0% (varying from 87.5 to 100%) was obtained, corresponding to only one PDO olive oil of “Azeites da Beira Interior” misclassified as “Azeite de Trás-os-Montes” oil. Lastly, it should be mentioned that the predictive classification performance is slightly better than that achieved with the volatile-olfactory approach, showing that the E-nose device could be used as a complementary or even alternative method to verify the three Portuguese PDO claims. It should also be mentioned that the PDO’s discrimination predictive performances achieved with the proposed multivariate classification model (based on the 16 selected E-nose pre-processed signals) are comparable to those previously reported by Forina et al. [18]. 
These researchers used a lab-made artificial nose (headspace mass spectrometer) and established LDA classification models (based on 8 to 20 parameters selected using the heuristic stepwise algorithm) that correctly classified 92.3 to 100% of the samples when discriminating between two Italian PDO olive oils. Additionally, the predictive classification performances achieved with the E-nose-LDA-SA approach proposed in the present study (LOO-CV: 100%; repeated K-fold-CV: 99.7 ± 2.0%) are similar or slightly superior to those reported in the literature when using, for example, rapid evaporative ionization mass spectrometry (failure percentage greater than 5%) [4], or GC–MS sesquiterpene hydrocarbon fingerprints (average sensitivity of 93.6%) [8].
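The repeated K-fold-CV partitioning described above (4 folds × 10 repeats) can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function name `repeated_stratified_kfold`, the stratified dealing scheme, the random seed and the layout of 10 oils per PDO are all assumptions made for illustration (the text only states 30 oils in total).

```python
import random
from collections import Counter


def repeated_stratified_kfold(labels, n_folds=4, n_repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, valid_idx) tuples.

    Hypothetical helper: within each repeat, the samples of every class are
    shuffled and dealt round-robin to the folds, so each fold keeps roughly
    the same class proportions as the full data set.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(idx)

    for rep in range(n_repeats):
        folds = [[] for _ in range(n_folds)]
        dealt = 0  # running counter keeps total fold sizes balanced
        for lab in sorted(by_class):
            members = by_class[lab][:]
            rng.shuffle(members)
            for idx in members:
                folds[dealt % n_folds].append(idx)
                dealt += 1
        for f in range(n_folds):
            valid = sorted(folds[f])
            train = sorted(i for g in range(n_folds) if g != f for i in folds[g])
            yield rep, f, train, valid


# Assumed layout: 30 oils, 10 per PDO (the per-PDO count is an assumption).
labels = ["Alentejo"] * 10 + ["Beira"] * 10 + ["Tras-os-Montes"] * 10
splits = list(repeated_stratified_kfold(labels))

# Summarise how many validation sets of each size were produced.
sizes = Counter(len(valid) for _, _, _, valid in splits)
print(len(splits), "validation runs; validation-set sizes:", dict(sizes))
```

Under these assumptions, each of the 40 runs holds out 7 or 8 samples, with at least 2 oils per PDO in every validation set. One misclassified sample in a validation set of 8 gives a sensitivity of 7/8 = 87.5%, which is consistent with the lower bound reported above. A classifier such as LDA would then be fitted on `train` and scored on `valid` within each run.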

The study also assessed the potential application of the E-nose as a tool for quantifying concentrations of total volatiles and the three predominant chemical classes (alcohols, aldehydes, and hydrocarbons). A recent study demonstrated the ability to quantify volatiles in non-PDO Portuguese olive oils using MLRMs based on E-nose signal profiles [13]. However, in that study, the multivariate models were only validated using the LOO-CV method and required splitting oils by geographical origin, posing a limitation for practical use. In this study, the repeated K-fold-CV variant was employed to evaluate the predictive quantification potential of E-nose-based models. A single model was constructed for each volatile chemical class and for the total volatiles using data from all oils, irrespective of PDO declaration. Details regarding the established MLRMs and two common statistical regression metrics, i.e., the determination coefficients and the root mean square errors, are presented in Table 3.

Table 3 Quantification of the contents of alcohols, aldehydes and hydrocarbons released by olive oils from the three Portuguese PDOs (30 oils), and of the content of total volatiles (alcohols + aldehydes + carboxylic acids + esters + hydrocarbons + ketones + terpenes): predictive performance of the MLRMs developed based on selected sub-sets (SA algorithm) of the feature variables extracted from the electrical resistance profiles of the nine MOS sensors of the lab-made E-nose

The findings indicate that the E-nose-MLRMs developed can reliably estimate the concentrations of the three main volatile chemical classes (alcohols, aldehydes, and hydrocarbons) identified in the studied PDO olive oils, along with the total volatile contents (0.923 ± 0.101 ≤ R² ≤ 0.981 ± 0.018 and 0.29 ± 0.14 ≤ RMSE ≤ 1.32 ± 0.72 mg/kg oil, for the repeated K-fold-CV). Figure 4 corroborates the overall satisfactory predictive accuracy achieved (repeated K-fold-CV variant), underscoring the viability of the proposed E-nose-chemometric approach not only as a qualitative but also as a quantitative tool for analysing Portuguese PDO olive oils. Indeed, individual regression lines were established to correlate the concentrations predicted by the E-nose-MLRMs, for each volatile class or for the overall volatiles, with the concentrations determined by the HS–SPME–GC–MS technique. These regression lines showed a satisfactory fit, as evidenced by R² values ranging from 0.900 to 0.971. The slope values were close to one (ranging from 0.946 to 1.032), as expected for a nearly perfect fit. Moreover, the intercept values were statistically equal to zero (ranging from -0.121 to 0.510), in line with the theoretical intercept value for a perfect fit.
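The regression metrics discussed above (R², RMSE, and the slope and intercept of the predicted-versus-measured line) can be computed as in the following minimal sketch. The helper names `r2_rmse` and `fit_line` and the five concentration pairs are hypothetical and purely illustrative; they are not data from this study, and the exact R² definition used by the authors is assumed to be the usual coefficient of determination.

```python
import math


def r2_rmse(measured, predicted):
    """Return (R2, RMSE) of predictions against reference values."""
    n = len(measured)
    mean_m = sum(measured) / n
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / n)


def fit_line(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up concentrations in mg/kg oil (illustrative only, not study data).
measured = [1.0, 2.0, 3.0, 4.0, 5.0]   # reference values (e.g. from GC-MS)
predicted = [1.1, 1.9, 3.2, 3.9, 5.1]  # values a regression model might return

r2, rmse = r2_rmse(measured, predicted)
slope, intercept = fit_line(measured, predicted)
print(f"R2={r2:.3f} RMSE={rmse:.3f} slope={slope:.3f} intercept={intercept:.3f}")
```

A slope near one together with an intercept near zero indicates that the predictions track the reference values without proportional or constant bias, which is the criterion applied to the regression lines in Fig. 4.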

Fig. 4

Concentrations (in mg/kg olive oil) of volatiles (alcohols, aldehydes, hydrocarbons and total volatiles) released by the PDO oils: predicted (repeated K-fold-CV variant) by the E-nose-MLRMs versus the experimental data determined by the HS–SPME–GC–MS technique

Conclusions

The study revealed distinct volatile (in terms of the amount of the five main chemical classes) and olfactory profiles in the three Portuguese PDO olive oils studied, namely “Azeite do Alentejo Interior”, “Azeites da Beira Interior” and “Azeite de Trás-os-Montes”. These profiles allowed the accurate linear discrimination of the oils according to the correct PDO, underscoring the potential of volatiles (carboxylic acids, alcohols, aldehydes, and hydrocarbons) and olfactory sensations (such as fruity ripe, apple, banana, tomato, dry fruits, cabbage, fresh herbs, and tomato branches) to serve as reliable regional markers. Furthermore, the study demonstrated the practicality of employing an electronic nose as a fast, environmentally friendly, cost-effective, and non-invasive tool to verify the PDO designation of the three investigated Portuguese PDO olive oils. It was also confirmed that the E-nose can serve as a precise quantitative instrument for evaluating the concentrations of key volatile compounds emitted by the studied PDO olive oils, namely alcohols, aldehydes, hydrocarbons and total volatiles. These findings extend the E-nose application beyond a particular geographical region (north of Portugal) and non-PDO oils to encompass PDO oils at a national level. Finally, this sensor-based device could easily be used for in-situ assays, with the possibility of being made portable.