Introduction

Olive oil is widely used in Mediterranean countries, in part because of its health benefits. Besides its sensory features, its high concentration of oleic acid and the presence of antioxidants have contributed to an increase in its consumption in Australian and American countries as well [1, 2].

In Tunisia, olive trees are grown under climatic conditions ranging from lower semiarid to arid [3]. The adaptability of the olive tree has led to significant genetic variation among the Tunisian olive varieties that yield valued virgin olive oils (VOOs) [4]. Chetoui and Chemlali are the main varieties in Tunisian orchards. Some farmers have preserved secondary and minor olive varieties such as Oueslati, Gerboui, Sayali and Chemchali in specific areas [5].

Despite the genetic diversity of Tunisian olive varieties and the value of their oils, almost all Tunisian VOOs are exported in bulk [6]. As a result, international consumers are not aware of the characteristics and specificities of Tunisian olive oils [6].

In order to improve the consumer knowledge about Tunisian VOOs, the inventory and the study of local varieties as well as their characteristics are crucial. The stage of maturity and the geographical conditions play a major role in the determination of the olive oil quality, including its organoleptic characteristics and the oxidative stability [7].

Chetoui and Chemlali are the main Tunisian varieties. The Chemlali variety represents 56% of the total olive tree area and Chetoui covers 12% of it [8]. The other varieties, such as Oueslati, Gerboui, Sayali, Chemchali, Tounsi and Toffehi, are considered secondary and/or minor varieties. They are maintained in areas where farmers have selected olive varieties adapted to the local geographical conditions. The quantity of olives and/or oil produced from minor varieties cannot be estimated because farmers usually mix their fruits with olives from other varieties.

Sidi Bouzid and Gafsa are two regions in the center of Tunisia. Little information is available on the chemical characteristics of several minor varieties that are maintained in these regions. Consequently, research on varieties from these arid regions is needed.

The aim of this work is to characterize VOOs of some varieties from Sidi Bouzid (Sehli and Chemchali) and Gafsa (Baldi, Besbessi, Tounsi, Neb Jmel and Chemchali) by the study of their chemical composition with particular emphasis on the fatty acids, volatiles, tocopherols and polar phenols profiles.

Materials and methods

Virgin olive oil samples

The VOO samples were obtained from several olive varieties grown in the center of Tunisia. The varieties Sehli and Chemchali were harvested in the region of Sidi Ali Ben Aoun in the governorate of Sidi Bouzid and were coded as Sehli SB and Chemchali SB. The varieties Baldi, Besbessi, Tounsi, Neb Jmel and Chemchali were harvested in the region of Le Sned in the governorate of Gafsa and were coded as Baldi G, Besbessi G, Tounsi G, Neb Jmel G and Chemchali G. Chemchali G and Sehli SB olives were handpicked at both an earlier and a later maturation degree, in September and January, respectively. Baldi G, Neb Jmel G and Tounsi G samples, in contrast, were collected only at a lower maturation index, in September; these minor varieties are usually used as green table olives. Chemchali SB and Besbessi G samples were obtained from fruits with a higher maturation index collected in January; the Besbessi variety is commonly eaten brined at an advanced stage of maturation. All the olive samples were obtained during the 2014–2015 harvest season.

Chemchali is a dual-purpose variety, used both for oil and as a table olive. Its drupe fresh weight is approximately 2.26 g. Sehli and Neb Jmel are oil varieties; their fresh fruit weight is between 0.65 and 1 g. The Tounsi, Baldi and Besbessi varieties are used as table olives [8].

Three trees of each variety, grown under the same pedoclimatic conditions, were carefully chosen to collect the fruits used to produce the oil. The average age of the trees was between 60 and 70 years. The planting density was 7 × 7 m. Trees were own-rooted and were not irrigated. The region of Sidi Ali Ben Aoun is characterized by an isohumic soil, while the soil of Le Sned is a clay loam.

A lab-scale extraction system with a pilot extraction plant “Abencor MC2” (Comercial Abengoa S.A., Sevilla, Spain) was used for the extraction of the Tunisian VOOs (three samples from each variety within each harvest stage). Olive samples were processed within 24 h of collection. The paste was malaxed for 30 min at ambient temperature and then centrifuged for 5 min. Oil samples were stored in glass bottles (100 ml volume) at 4 °C in the dark until analysis. The oil samples were analyzed during the same year as the harvest.

Solvents and standards

All the solvents and standards used in the different analyses (methanol, n-hexane, potassium hydroxide, isopropanol, diphenyl ethyl, 3,4-dihydroxyphenylacetic acid, caffeic acid, 4-methyl-2-pentanone, myrcene, (E)-2-hexen-1-ol, (E)-2-hexen-1-al, hexyl acetate, butyric acid, methyl heneicosanoate, syringic acid and tyrosol) were obtained from Sigma-Aldrich (St. Louis, MO, USA). Distilled water was purified with a Milli-Q water system.

Calculation of maturation index (MI)

One hundred olives were randomly chosen from each collected olive sample (1 kg). The MI was then calculated by evaluating their skin and pulp colors [9]. The reported MI is the mean of the three values obtained for three olive samples collected on the same day from three different olive trees.
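The calculation can be sketched as follows. The text only cites [9] for the scoring scheme, so the eight colour classes (0 to 7) and the weighted-mean formula used here are an assumption based on the widely used skin-and-pulp colour index; the class counts are hypothetical.

```python
from statistics import mean

def maturation_index(class_counts):
    """Weighted mean colour class for one sample of olives.

    class_counts[i] is the number of fruits scored in colour class i.
    ASSUMPTION: an eight-class scale (0-7); the source only cites [9].
    """
    total = sum(class_counts)
    if total == 0:
        raise ValueError("no olives were scored")
    return sum(i * n for i, n in enumerate(class_counts)) / total

# Hypothetical counts for three trees of one variety (100 olives each).
tree_counts = [
    [10, 30, 40, 20, 0, 0, 0, 0],
    [5, 35, 40, 20, 0, 0, 0, 0],
    [10, 25, 45, 20, 0, 0, 0, 0],
]
per_tree_mi = [maturation_index(counts) for counts in tree_counts]
sample_mi = mean(per_tree_mi)  # value reported for the variety and date
print([round(mi, 2) for mi in per_tree_mi], round(sample_mi, 2))
```

Averaging the three per-tree values mirrors the description above, in which the reported MI is the mean over three trees sampled on the same day.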

Analysis of fatty acid composition

A Fast GC Shimadzu (Tokyo, Japan) 2010 Plus instrument equipped with a flame ionization detector (FID) was used to analyze the fatty acid composition. A solution prepared by mixing 0.05 g of oil with 1 ml of n-hexane and 1 ml of 2 N potassium hydroxide in methanol was injected (0.3 μl) into a split (1:100) GC port set at 250 °C. The capillary column was a fused-silica BPX-70 column (10 m length, 0.1 mm i.d., 0.2 μm film thickness) from SGE (Ringwood, Victoria, Australia). The carrier gas was helium at a flow rate of 0.7 ml min−1. The FID temperature was 250 °C. The oven temperature was first held at 120 °C for 20 min, then increased at 110 °C min−1 to 175 °C and held for 2 min. It was then raised at 20 °C min−1 to 220 °C and at 50 °C min−1 to 250 °C (held for 1 min). The total run time was 6.55 min. Peaks were identified by comparing their retention times with those of authentic reference compounds (mix 463 from Nucheck Prep. Inc., Elysian, MN, USA). Fatty acids were quantified by internal normalization of the chromatographic peak areas, and the composition was expressed as the relative percentage of each fatty acid.
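Internal normalization simply expresses each peak area as a share of the summed areas. A minimal sketch, assuming equal detector response for all fatty acid methyl esters and using hypothetical peak areas:

```python
def normalize_areas(peak_areas):
    """Convert raw peak areas to relative percentages of the total area."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in peak_areas.items()}

# Hypothetical integrated FID areas (arbitrary units) for one injection.
areas = {
    "C16:0": 1500.0,
    "C18:0": 250.0,
    "C18:1": 6750.0,
    "C18:2": 1250.0,
    "C18:3": 250.0,
}
percentages = normalize_areas(areas)
for name, pct in percentages.items():
    print(f"{name}: {pct:.1f}%")
```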

Analysis of tocopherols

Contents of α-, β- and γ-tocopherols were measured with an Agilent 1200 Series HPLC system chromatograph (Hewlett-Packard, Waldbronn, Germany) equipped with a fluorescence detector (FLD) set at 295 nm and 330 nm as excitation and emission wavelengths, respectively. One gram of oil was mixed with 10 ml of isopropanol, filtered through a 0.45 μm nylon filter (Whatman Inc., Clinton, NJ) and injected into a Cosmosil Nap column (Nacalai-Tesque, Kyoto, Japan, 150 mm, 4.64 mm ID, 5 μm). The elution solvents were A, methanol/water (90/10 v/v, acidified with 0.2% H3PO4), and B, acetonitrile. Samples were eluted according to the following gradient: 100% A for 22 min; 100% B for 13 min; 100% A maintained for 15 min. The flow rate was 1.0 ml min−1 and the injection volume was 20 μl. The analyses were carried out at room temperature. Tocopherols were identified by comparing their retention times with those of pure standards. A calibration curve was constructed with standard solutions of α-tocopherol (r² = 0.999) and used as the reference for the quantification of all tocopherols. Results are given in mg of α-tocopherol per kg of oil.
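External calibration of this kind amounts to a least-squares line through the standard points, which is then inverted for each sample peak. The sketch below is illustrative only: the standard concentrations and areas are hypothetical, and the conversion to mg kg−1 assumes that 1 g of oil in 10 ml of isopropanol gives a final volume of 10 ml (the oil volume is neglected).

```python
def fit_line(x, y):
    """Ordinary least-squares line; returns slope, intercept and r squared."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, sxy ** 2 / (sxx * syy)

# Hypothetical alpha-tocopherol standards: concentration (ug/ml) vs FLD area.
std_conc = [5.0, 10.0, 25.0, 50.0, 100.0]
std_area = [102.0, 198.0, 503.0, 997.0, 2001.0]
slope, intercept, r2 = fit_line(std_conc, std_area)

def oil_content_mg_per_kg(peak_area, oil_mass_g=1.0, solvent_ml=10.0):
    """Invert the calibration line and scale to the oil mass.

    ug/ml in the injected solution * ml of solvent / g of oil = ug/g = mg/kg.
    ASSUMPTION: the final volume equals the solvent volume.
    """
    conc_ug_per_ml = (peak_area - intercept) / slope
    return conc_ug_per_ml * solvent_ml / oil_mass_g

print(f"r2 = {r2:.4f}")
```

Because the curve is built with α-tocopherol only, applying it to the β and γ peaks yields α-tocopherol equivalents, which matches the units stated above.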

Polar phenolic fraction determination

The polar phenolic composition was determined by HPLC. Separation and quantification were conducted according to the International Olive Council standard [10]. One milliliter of syringic acid (Sigma-Aldrich, St. Louis, MO, USA) at a concentration of 0.015 mg ml−1 was added to 2.0 g of olive oil as an internal standard, and the mixture was shaken for 30 s. Then, 5 ml of a methanol/water 80/20 (v/v) solution was added and the mixture was shaken again for 1 min. Next, it was extracted in an ultrasonic bath for 15 min at room temperature and centrifuged at 5000 rpm for 25 min. Finally, the supernatant was collected and filtered through a 0.2 μm nylon filter (Whatman Inc., Clinton, NJ) suitable for HPLC. An HP 1100 Series instrument (Agilent Technologies, Palo Alto, CA, USA) equipped with a binary pump delivery system, degasser, autosampler, HP Diode Array UV–Vis Detector (DAD) and mass-spectrometer detector (MSD) was employed. The chromatographic separation was conducted at a controlled temperature of 40 °C on a Zorbax Eclipse XDB-C18 column (Phenomenex, Torrance, CA, USA; 5 μm particle size, 25 cm × 3.00 mm ID), with detection at 280 nm and 330 nm. The gradient elution followed the method used by Carrasco-Pancorbo et al. [11]. A volume of 5 μl of the prepared extract was injected. Mass spectra were acquired with a quadrupole MS coupled to an electrospray (ESI) interface working in positive ion mode over the m/z 50–800 range under these conditions: drying gas flow, 9 l min−1 at 350 °C; nebulizer gas pressure, 50 psi; capillary voltage, 3000 V; fragmentation energy, 60 eV. Nitrogen was used as both nebulizer and drying gas. Phenolic compounds were identified by comparing their retention times, UV–Vis spectra and mass spectra with literature data, and results were expressed in mg per kg of oil.
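The internal-standard arithmetic implied by this protocol can be sketched as below. The amounts of syringic acid (1 ml at 0.015 mg ml−1) and oil (2.0 g) come from the text; the peak areas are hypothetical, and the response factor of 1.0 is a placeholder assumption, since the factors actually applied are those of the cited standard [10].

```python
def phenol_mg_per_kg(area_analyte, area_is,
                     is_conc_mg_per_ml=0.015, is_volume_ml=1.0,
                     oil_mass_g=2.0, response_factor=1.0):
    """Content of one phenol relative to the syringic acid internal standard.

    response_factor is a placeholder (ASSUMPTION); 1.0 gives syringic acid
    equivalents.
    """
    if area_is <= 0:
        raise ValueError("internal standard peak area must be positive")
    is_mass_mg = is_conc_mg_per_ml * is_volume_ml
    oil_mass_kg = oil_mass_g / 1000.0
    return response_factor * (area_analyte / area_is) * is_mass_mg / oil_mass_kg

# A peak with the same area as the internal standard corresponds to the
# spiked level itself: 0.015 mg in 0.002 kg of oil.
spike_level = phenol_mg_per_kg(1000.0, 1000.0)
example = phenol_mg_per_kg(2000.0, 500.0)  # hypothetical peak areas
print(spike_level, example)
```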

Determination of volatile composition

SPME–GC–MS (solid phase micro extraction—gas chromatography—mass spectrometry) technique was employed for the determination of the volatile composition of olive oil.

The GC–MS was a QP2010 Ultra Shimadzu (Shimadzu Co, Kyoto, Japan) equipped with an AOC-5000 Plus autosampler. An SPME fiber coated with a divinylbenzene/carboxen/polydimethylsiloxane (DVB/CAR/PDMS) phase (50/30 μm, 2 cm long, from Supelco Ltd., Bellefonte, PA, USA) and a ZB WAX column (Phenomenex, Torrance, CA, USA; 30 m, 0.25 mm, 1.00 μm) coated with a polyethylene glycol phase were used. The column temperature was held at 40 °C for 10 min and increased to 200 °C at 3 °C min−1; the carrier gas was helium at 1 ml min−1; the split ratio was 1:10. The total analysis time was 75.33 min. The mass spectrometer operated in electron ionization mode over the mass range 30–250 m/z, with an ion source temperature of 200 °C and an interface temperature of 240 °C. The solvent cut time was 3 min. 4-Methyl-2-pentanone (Sigma-Aldrich) was added to 1 g of olive oil sample as an internal standard: 0.1 g of internal standard (concentration 5 mg kg−1) dissolved in refined peanut oil was placed in a vial, and oil was added up to 1 g. The injections were performed with the SPME autosampler by exposing the fiber to the sample headspace for 30 min at 40 °C. After extraction, the fiber was thermally desorbed for 5 min in the injection port of the GC at 240 °C. Volatile compounds were identified by comparison with authentic reference compounds. The standards 4-methyl-2-pentanone, myrcene, (E)-2-hexen-1-ol, butyric acid, (E)-2-hexen-1-al and hexyl acetate (Sigma-Aldrich), corresponding to the chemical classes of VOO volatiles (aldehydes, ketones, alcohols, esters, acids, terpenes), were used to establish the calibration curves. Refined peanut oil was used as the solvent for the different standards.
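The duration of an oven program follows directly from its holds and ramps. The helper below is a generic sketch (the function name and tuple layout are illustrative), applied to the steps stated above.

```python
def program_minutes(start_temp, initial_hold, ramps):
    """Duration of a GC oven program in minutes.

    ramps is a list of (rate_c_per_min, target_temp_c, hold_min) tuples,
    applied in order starting from start_temp.
    """
    total = initial_hold
    temp = start_temp
    for rate, target, hold in ramps:
        if rate <= 0:
            raise ValueError("ramp rate must be positive")
        total += abs(target - temp) / rate + hold
        temp = target
    return total

# Stated program: 40 degC for 10 min, then 3 degC/min up to 200 degC.
stated = program_minutes(40.0, 10.0, [(3.0, 200.0, 0.0)])
# Time not covered by the stated steps, relative to the 75.33 min total.
remainder = 75.33 - stated
print(round(stated, 2), round(remainder, 2))
```

The stated hold and ramp account for about 63.33 min, so roughly 12 min of the reported 75.33 min are not described. Reading that remainder as a final hold at 200 °C is an assumption, not something the text states.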

Statistical methods

Results are expressed as the mean and standard deviation of three values. One-way ANOVA followed by Tukey’s multiple range test at a 95% confidence level (p ≤ 0.05) was used to reveal differences between varieties. The differences between regions were also evaluated (p ≤ 0.05). Principal component analysis (PCA) was applied to the data. Hierarchical cluster analysis based on Ward’s method with squared Euclidean distance was also used, and k-means clustering was applied to validate the clustering results. The statistical analyses were carried out with SPSS 20.0 for Windows (SPSS, IBM, 20).
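For readers unfamiliar with the test, the F statistic of a one-way ANOVA can be computed by hand as in the sketch below, using hypothetical triplicate values. It is not the SPSS procedure itself: p values and Tukey’s post hoc comparisons require the F and studentized range distributions, which a statistics package provides.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of replicate groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the replicates around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical palmitic acid percentages, three replicates per variety.
replicates = [
    [18.0, 18.1, 18.2],
    [13.4, 13.5, 13.6],
    [17.3, 17.4, 17.5],
]
f_stat, df_b, df_w = one_way_anova_f(replicates)
print(f"F({df_b}, {df_w}) = {f_stat:.1f}")
```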

Results and discussion

Fatty acid composition

Fatty acid compositions of VOOs of Tunisian olive varieties from the regions of Gafsa and Sidi Bouzid are displayed in Table 1. Significant differences were obtained between varieties (p ≤ 0.05).

Table 1 Fatty acid composition (%) of VOOs of six autochthonous Tunisian varieties (Sehli, Chemchali, Baldi, Besbessi, Neb Jmel, Tounsi) from the regions of Sidi Bouzid (SB) and Gafsa (G) harvested at different maturation indices

The palmitic acid content ranged from 13.5% to 18.1%. The highest percentage was observed in Chemchali G VOO obtained from green olives (18.1%), while the lowest was noted in VOO from the Sehli SB variety at the later stage of ripening (13.5%). A high palmitic acid level was also detected in Neb Jmel VOO from the region of Gafsa (17.4%).

The level of oleic acid ranged from 62.1% in Chemchali G oils at the earlier maturation degree to 72% in Sehli SB oils from olives at the later maturation stage. Regarding linoleic acid, the highest level was detected in oils from later-stage olives of the Besbessi variety (14.8%), and the lowest in oils from earlier-stage olives of the Tounsi variety (9.5%).

The amounts of saturated (SFA), monounsaturated (MUFA) and polyunsaturated (PUFA) fatty acids in the olive oil samples were also analyzed. The SFA levels in oils from Gafsa were significantly different from those of Sidi Bouzid (p ≤ 0.01). In fact, Chemchali, Baldi, Besbessi, Neb Jmel and Tounsi VOOs were rich in total saturated fatty acids, essentially because of their high palmitic acid contents. These contents, together with the heptadecanoic acid (C17:0), arachidic acid (C20:0) and lignoceric acid (C24:0) levels, differed significantly according to the region of cultivation (p ≤ 0.02, p ≤ 0.05, p ≤ 0.01, respectively). Significant differences (p ≤ 0.01) were also observed between MUFA levels in VOOs from Gafsa and Sidi Bouzid. Oil samples from the Sehli variety had significantly (p ≤ 0.05) the highest MUFA percentages at both the earlier and later stages of olive maturation (71.3% and 73.4%, respectively), whereas the Chemchali G VOO with the lower maturation index had significantly (p ≤ 0.05) the lowest MUFA percentage (64.3%). Significant differences were also found between oils from Sidi Bouzid and Gafsa for some individual fatty acids, such as oleic (p ≤ 0.01), cis-9, cis-12, cis-15-octadecatrienoic (p ≤ 0.01) and cis-9-heptadecenoic (p ≤ 0.01) acids. Differences in PUFA levels between oils from Gafsa and Sidi Bouzid were not significant, although significant differences (p ≤ 0.05) were found in PUFA percentages between varieties. El Riachy et al. [12] found that the variety × fruit ripening interaction significantly affected the MUFA/PUFA, MUFA/SFA and PUFA/SFA ratios of Lebanese olive oil (p ≤ 0.01).
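The class totals discussed here are sums over the individual fatty acids, grouped by their number of double bonds. A minimal sketch with a hypothetical profile, written in the usual shorthand (carbon number, colon, number of double bonds):

```python
def fatty_acid_class(shorthand):
    """Classify a fatty acid such as 'C18:1' by its number of double bonds."""
    double_bonds = int(shorthand.split(":")[1])
    if double_bonds == 0:
        return "SFA"
    return "MUFA" if double_bonds == 1 else "PUFA"

def class_totals(profile):
    """Sum relative percentages per class (SFA, MUFA, PUFA)."""
    totals = {"SFA": 0.0, "MUFA": 0.0, "PUFA": 0.0}
    for name, pct in profile.items():
        totals[fatty_acid_class(name)] += pct
    return totals

# Hypothetical fatty acid profile (% of total fatty acids).
profile = {
    "C16:0": 15.0, "C16:1": 1.5, "C17:0": 0.1, "C17:1": 0.1,
    "C18:0": 2.5, "C18:1": 67.0, "C18:2": 12.0, "C18:3": 0.8,
    "C20:0": 0.4, "C20:1": 0.3, "C24:0": 0.3,
}
totals = class_totals(profile)
mufa_to_pufa = totals["MUFA"] / totals["PUFA"]
print(totals, round(mufa_to_pufa, 2))
```

Ratios such as MUFA/PUFA, mentioned in connection with El Riachy et al. [12], follow directly from these totals.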

The Besbessi VOO showed the highest PUFA level (15.6%) while the Tounsi and the Sehli olive oils had significantly (p ≤ 0.05) the lowest PUFA percentages (10.4%, 10.7% respectively).

Volatile profile

The examination of the different volatile profiles of the studied Tunisian varieties showed the diversity of the chemical classes of the VOO aroma (Table 2). However, C6 and C5 compounds resulting from the lipoxygenase (LOX) pathway were the most abundant compounds of the analyzed volatile profiles. Moreover, (E)-2-hexenal was the main volatile compound of the studied olive oil samples. It was also reported that (E)-2-hexenal, a compound characterized by green leaf and almond aroma, seemed to be predominant in high quality olive oil [13].

Table 2 C6 and C5 volatile compounds (mg 4-methyl-2-pentanone kg−1 oil) of VOOs from six autochthonous Tunisian varieties grown in Gafsa and Sidi Bouzid regions

The headspace of oil samples showed other C6 compounds: hexanal, (Z)-2-hexenal, 1-hexanol, (Z)-3-hexen-1-ol and (E)-2-hexen-1-ol that differed significantly according to the variety (p ≤ 0.05). This finding was in agreement with those of Baccouri et al. [14] and Krichene et al. [15] who found that volatile compositions of Tunisian olive oil varieties were affected by variety. The highest hexanal amounts were found in Baldi G and Besbessi G oils (4.51 mg kg−1 and 3.59 mg kg−1, respectively). This compound is characterized by green leaf and apple aroma.

Significant differences were detected in (Z)-2-hexenal and (Z)-3-hexen-1-ol according to the region (p ≤ 0.01). Ouni et al. [16] found that the chemical composition of the volatile fraction of Oueslati olive oils was quite variable, depending on the region where the trees were cultivated. Oil samples from Gafsa were significantly richer in (Z)-2-hexenal than those from Sidi Bouzid (p ≤ 0.05). Moreover, the highest (Z)-2-hexenal and (Z)-3-hexen-1-ol amounts were found in Chemchali G2 (1.22 mg kg−1) and Baldi G (2.77 mg kg−1), respectively. Caporale et al. [17] reported that (Z)-3-hexen-1-ol is associated with the green leaf and green banana aroma in olive oil.

The sum of C6-LOX compounds varied significantly (p ≤ 0.01) according to variety. Neb Jmel G oil had significantly the highest sum of C6 compounds (24.13 mg kg−1), whereas Chemchali G oils were characterized by the lowest amounts at both the earlier and later maturation degrees (5.22 mg kg−1 and 2.85 mg kg−1, respectively). The headspace of the VOOs also contained quite a large fraction of C5 compounds, dominated by 1-penten-3-ol. Significant differences were obtained in the amounts of C5 volatile compounds according to the variety (p ≤ 0.01) and the region (p ≤ 0.01). Ben Hassine et al. [18] have also reported that C5 volatile compounds changed according to the variety and the geographical origin. Oil samples from Gafsa were significantly richer in 1-penten-3-ol than those of Sidi Bouzid (p ≤ 0.01). The highest 1-penten-3-ol amount was found in Baldi G oil (2.3 mg kg−1). 1-Penten-3-one, characterized by strawberry, green and sweet notes, and (Z)-2-penten-1-ol, characterized by a wet earth attribute, were also present at different levels according to the variety. Baldi G, Neb Jmel G and Tounsi G had the highest (Z)-2-penten-1-ol amounts (0.63 mg kg−1, 0.64 mg kg−1 and 0.58 mg kg−1, respectively), whereas Chemchali G had the lowest (0.24 mg kg−1). Baldi G had the highest 1-penten-3-one content (0.27 mg kg−1), while Chemchali SB and Chemchali G had the lowest (0.07 mg kg−1).

Polar phenolic compounds

The HPLC–DAD/MSD profiles of the phenolic compounds in the analyzed oils are summarized in Table 3. Significant differences were found between varieties and according to the region (p ≤ 0.01). Uylaşer [19] observed differences in both quantitative and qualitative fractions of phenolic compounds from the studied olive fruit samples obtained from different locations.

Table 3 Phenolic compositions and total phenols (mg kg−1) of VOOs from six Tunisian varieties grown in Gafsa and Sidi Bouzid regions

The benzoic acid derivative was detected only in Baldi olive oil (7.6 mg kg−1). Moreover, ferulic and p-coumaric acids were not found at the later stage of maturation in Besbessi G and Sehli SB oils.

The major phenolic compounds identified in VOO samples from the Gafsa and Sidi Bouzid regions were secoiridoids, including oleuropein aglycone (OA), decarboxymethyl oleuropein aglycone (DOA), the oxidized form of decarboxymethyl oleuropein aglycone (OxDOA), ligstroside aglycone (LA) and decarboxymethyl ligstroside aglycone (DLA). Ben Brahim et al. [20] and Hachicha Hb et al. [21] found that secoiridoids were the most abundant group of olive oil phenols in some Tunisian varieties (Chemchali, Chétoui, Jemri, Touffehi, Fakharri and Zalmati). Moreover, Manai-Djebali et al. [22] have reported that Tunisian VOOs with a high content of phenolic secoiridoid derivatives show a higher stability. Significant differences (p ≤ 0.01) were observed between the DLA amounts in the oil samples from the two regions. Sehli SB oils at the earlier and later maturation degrees had the highest DLA contents (33.7 mg kg−1 and 31.4 mg kg−1, respectively). Moreover, the DLA amount of Chemchali SB, which had the highest maturation index (26.87 mg kg−1), was higher than those of Chemchali G at the earlier and later maturation degrees (21.5 mg kg−1 and 21 mg kg−1, respectively).

Chemchali SB, Chemchali G2 and Neb Jmel G oils had significantly (p ≤ 0.01) the highest amounts of OxDOA (86.7 mg kg−1, 85.8 mg kg−1 and 86.1 mg kg−1, respectively), while Chemchali G1 had significantly (p ≤ 0.01) the highest DOA content (127.3 mg kg−1). Furthermore, the significantly (p ≤ 0.01) highest amounts of OA were observed in Chemchali SB and Chemchali G1 oils, at earlier and later maturation degrees, respectively (100.1 mg kg−1 and 98.9 mg kg−1). Chemchali G olive oils also had the significantly (p ≤ 0.01) highest contents of LA at both the earlier and later stages of olive ripening (34.3 mg kg−1 and 33.5 mg kg−1, respectively).

The other major phenol detected in the olive oils from Gafsa and Sidi Bouzid was (+)-1-acetoxypinoresinol. This lignan was present in considerable amounts, ranging from 30.4 mg kg−1 in Besbessi G oil to 156.6 mg kg−1 in Chemchali G oil. In contrast, Loubiri et al. [23] reported that acetoxypinoresinol was not detected in Neb Jmel oils from trees cultivated in the north of Tunisia, in the region of Borj El Amri. Manai-Djebali et al. [22] have also evaluated the phenolic profiles of several Tunisian varieties. They found that acetoxypinoresinol is not abundant in the Hor Kesra, Sredki and Chladmi varieties, in contrast with the Aloui and Betsijina varieties. Owen et al. [24] have identified lignans as major components of the phenolic fraction of VOOs.

Finally, two flavones, apigenin and especially luteolin, were also identified and quantified in oils from the different varieties. Clear and significant differences (p ≤ 0.01) were observed between the luteolin content of Tounsi G oil (14.7 mg kg−1) and the amounts of this flavonoid in the other studied varieties, which ranged from 20.9 mg kg−1 in Baldi G to 39.5 mg kg−1 in Chemchali SB. Similarly, the lowest apigenin content (11.2 mg kg−1) was detected in Tounsi G oil and the highest in Chemchali SB oil (23.4 mg kg−1).

The total phenol content, calculated as the sum of all individually quantified phenolic compounds, ranged between 110.6 mg kg−1 (Besbessi G) and 350.4 mg kg−1 (Chemchali G1). These values are known to depend strongly on the variety [25].

Tocopherol contents

Figure 1 shows the α-, β- and γ-tocopherols identified and quantified in oil samples from the Gafsa and Sidi Bouzid regions. High concentrations of α-tocopherol were observed in all of the VOOs from the center of Tunisia. The values ranged between 485.1 mg kg−1 (Chemchali variety) and 756.3 mg kg−1 (Tounsi variety). As reported by Tsimidou et al. [26], α-tocopherol is the major tocopherol in olive oil (90–95% of the total tocopherol content); most olive varieties may yield VOOs with more than 200 mg α-tocopherol per kg of oil.

Fig. 1

α-, β-, γ- and total tocopherol contents of VOOs from varieties grown in the Gafsa and Sidi Bouzid regions. Data are mean ± standard deviation (n = 3). Different letters for the same parameter indicate significant differences among varieties (p ≤ 0.01). An asterisk indicates significant differences between regions (** p ≤ 0.02)

In Fig. 1, α-tocopherol amounts showed significant differences between varieties (p ≤ 0.01). These results are in agreement with those reported by Manai-Djebali et al. [22] for Hor Kesra, Sredki, Chladmi, Betsijina and Aloui varieties growing in the region of Siliana (central Tunisia). Significant differences (p ≤ 0.01) were also found in the β-, γ- and total tocopherols amounts of the different varieties. However, only γ-tocopherol contents showed significant differences (p < 0.02) between regions. Similar results have been found by other authors, suggesting that tocopherol contents were highly variety-dependent. These differences were probably linked to genotype characteristic and metabolic behavior of each variety [27].

Regarding β-tocopherol, its amounts ranged from 6.3 mg kg−1 in Sehli oil to 50 mg kg−1 in Besbessi oil. Sehli and Besbessi olive oils also had the lowest and highest γ-tocopherol levels: 3.7 mg kg−1 and 15.3 mg kg−1, respectively.

It is interesting to underline the very high levels of tocopherols in the studied varieties from the center of Tunisia. These amounts were higher than those reported for good quality oils that vary between 100 and 300 mg kg−1 [25]. This remarkably high level of tocopherol is promising in view of nutritional needs.

Chemometric analysis of compositional data

In order to identify patterns in the collected data and to replace the original variables with a few factors, PCA was applied to the polar phenol, volatile, tocopherol, maturation index and fatty acid data of the VOOs from olive varieties of the center of Tunisia (Baldi, Chemchali, Neb Jmel, Tounsi and Besbessi varieties from Gafsa, and Chemchali and Sehli varieties from Sidi Bouzid).

Because the selected indicators are expressed on different scales (percentage, mg kg−1, etc.), the data were standardized by the Z-score method. Then, the Kaiser-Meyer-Olkin (KMO) measure and Bartlett's test were computed in SPSS to determine the suitability of the data for principal component analysis.
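Z-score standardization subtracts the mean of each variable and divides by its standard deviation, so that variables measured in different units become comparable. A minimal sketch with hypothetical values, assuming the sample standard deviation (n − 1 denominator):

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize one variable to zero mean and unit sample SD."""
    m, s = mean(values), stdev(values)
    if s == 0:
        raise ValueError("a constant variable cannot be standardized")
    return [(v - m) / s for v in values]

# Hypothetical values for two variables measured on different scales.
oleic_pct = [60.0, 65.0, 70.0, 75.0]                 # % of total fatty acids
total_phenols_mg_kg = [100.0, 200.0, 300.0, 400.0]   # mg per kg of oil

z_oleic = z_scores(oleic_pct)
z_phenols = z_scores(total_phenols_mg_kg)
print([round(z, 3) for z in z_oleic])
print([round(z, 3) for z in z_phenols])
```

Both hypothetical columns map to the same standardized values, which illustrates that the original units no longer influence the analysis.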

The KMO index was higher than 0.6 and Bartlett's test of sphericity was significant (p ≤ 0.05). Therefore, the data used in this study were adequate for PCA.

The first three principal components explained 74.18% of the total variance (Fig. 2a). PC1 explained 36.65% of the total variance and was positively correlated with OxDOA, DOA, LUT, OA, LA, API, AcPIN and total phenols, discriminating the VOOs mainly on the basis of their polar phenols. PC2 explained 25.31% of the total variance and differentiated VOOs on the basis of the saturated fatty acids (SFA), the volatile compounds (especially the C5 and C6 alcohols) and the benzoic acid derivative. PC3 explained 12.22% of the total variance and differentiated VOOs on the basis of α- and γ-tocopherols. Previous studies have shown that the variability in polar phenol and fatty acid levels between oil samples justifies the use of multivariate statistical methods to describe the characteristics of oils from different varieties and to assess their differences [1, 25].
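The reported total is the running sum of the three individual shares, which can be checked directly:

```python
from itertools import accumulate

# Variance shares (%) reported in the text for PC1, PC2 and PC3.
explained = [36.65, 25.31, 12.22]
cumulative = list(accumulate(explained))

for pc, (share, total) in enumerate(zip(explained, cumulative), start=1):
    print(f"PC{pc}: {share:.2f}% (cumulative {total:.2f}%)")
```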

Fig. 2

Principal component analysis (PCA) and hierarchical cluster analysis (HCA) applied to chemical data obtained from the different olive oil varieties. a Loading plot of the PCA based on the maturation index, the phenolic compound profile, the total phenols, the fatty acids, the contents of α- and γ-tocopherols and the volatile profile of olive oils from varieties of the center of Tunisia (for each variable, the letter “Z” indicates standardization by the Z-score method). MI maturation index, SFA saturated fatty acid, MUFA monounsaturated fatty acid, OxDOA decarboxymethyl oleuropein aglycone (oxidized form), DOA decarboxymethyl oleuropein aglycone, LUT luteolin, DLA decarboxymethyl ligstroside aglycone, AcPIN (+)-1-acetoxypinoresinol, API apigenin, OA oleuropein aglycone, LA ligstroside aglycone. b Dendrogram of the cluster analysis using Ward's method

Hierarchical cluster analysis was then used to group similar varieties on the basis of the obtained PCs (Fig. 2). The squared Euclidean distance was selected as the distance criterion for Ward’s method. Ward’s method is of interest because it looks for clusters in multivariate Euclidean space [28], which is also the reference space of multivariate ordination methods, and of PCA in particular.
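Ward's criterion merges, at each step, the two clusters whose fusion increases the within-cluster sum of squares the least; for clusters A and B this increase equals |A||B| / (|A| + |B|) times the squared Euclidean distance between their centroids. The sketch below is a plain illustration of that rule on hypothetical two-dimensional PC scores, not the SPSS implementation used in the study.

```python
def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return [sum(col) / len(points) for col in zip(*points)]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist

def ward_clusters(points, k):
    """Agglomerate until k clusters remain; returns lists of point indices."""
    clusters = [[p] for p in points]
    members = [[i] for i in range(len(points))]
    while len(clusters) > k:
        # Find the pair of clusters with the smallest merge cost.
        _, i, j = min(
            (ward_cost(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] += clusters[j]
        members[i] += members[j]
        del clusters[j], members[j]
    return [sorted(m) for m in members]

# Hypothetical PC1/PC2 scores for six oil samples.
scores = [(0.0, 0.0), (0.1, 0.1), (5.0, 5.0), (5.2, 4.9), (10.0, 0.0), (9.8, 0.3)]
groups = ward_clusters(scores, k=3)
print(groups)
```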

Figure 2b presents the dendrogram of the clusters produced by the hierarchical analysis. The distance between merged clusters (0–25) increases with the level of merger. The dashed line in the dendrogram represents the selected optimal solution of four clusters:

Cluster 1 is composed of the Baldi variety from the region of Gafsa. Cluster 2 is the largest cluster and consists of three different varieties, Besbessi, Tounsi and Neb Jmel, all from Gafsa. Cluster 3 contains the Sehli variety cultivated in the region of Sidi Bouzid. Cluster 4 consists of the Chemchali variety from both regions under study (Sidi Bouzid and Gafsa).

In order to validate the classification of the varieties into four clusters by Ward’s method, and to test explicitly whether the variability attributable to differences between groups is significant, the hierarchical clusters were cross-validated by k-means clustering. The clusters found by Ward's method were also found by the k-means method.
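Two clustering methods number their clusters arbitrarily, so agreement has to be checked on the partitions rather than on the labels. A minimal sketch: the Ward labels follow the four clusters described above, while the k-means labels are hypothetical and chosen to illustrate an identical partition with different numbering.

```python
def same_partition(labels_a, labels_b):
    """True if two labelings group the samples identically,
    regardless of how the clusters are numbered."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same samples")
    a_to_b, b_to_a = {}, {}
    for a, b in zip(labels_a, labels_b):
        # Each label in one partition must map to exactly one in the other.
        if a_to_b.setdefault(a, b) != b or b_to_a.setdefault(b, a) != a:
            return False
    return True

samples = ["Baldi G", "Besbessi G", "Tounsi G", "Neb Jmel G",
           "Sehli SB", "Chemchali SB", "Chemchali G"]
ward_labels = [1, 2, 2, 2, 3, 4, 4]      # clusters as described in the text
kmeans_labels = [0, 3, 3, 3, 1, 2, 2]    # hypothetical k-means numbering
agreement = same_partition(ward_labels, kmeans_labels)
print(agreement)
```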

The statistics obtained for the clusters of the different varieties and their corresponding final cluster centers are presented in Table 4. Cluster 1 has significantly high mean values of C5 and C6 alcohols and C5 ketones (p ≤ 0.01). The significantly highest α-, β- and γ-tocopherol amounts characterized the Besbessi, Neb Jmel and Tounsi varieties in cluster 2 (p ≤ 0.01). Rekik et al. [29] found that the Besbessi and Tounsi varieties were probably distinguished by their different genetic backgrounds. Cluster 3 is distinguished from the others by its significantly highest MUFA and lowest SFA mean values (p ≤ 0.01). Finally, cluster 4, with the significantly highest mean amounts of polar phenols (p ≤ 0.01), is marked by its richness in antioxidant compounds.

Table 4 Means, standard deviations and significance (one-way ANOVA; p ≤ 0.01) of VOO compounds relative to the four clusters of varieties

To conclude, olive oil samples obtained from Gafsa and Sidi Bouzid in the center of Tunisia showed significant differences in their SFA and MUFA levels (p ≤ 0.01) according to the region. The fatty acid composition of the studied oil samples revealed the highest oleic acid level in oil of Sehli variety at the later stage of olive maturation (72%) and the highest PUFA content in olive oil of the Besbessi variety, used as green table olive. The volatile profiles of the studied oil samples dominated by the presence of the (E)-2-hexenal compound, showed significant variation in the sum of C6-LOX according to the varieties (p ≤ 0.01) and in C5 compounds according to the varieties and the region (p ≤ 0.01). Besides, the analysis of the phenolic compounds in olive oil samples from Gafsa and Sidi Bouzid demonstrated that the secoiridoids were the major compounds in all the oil samples. Further, Chemchali oil is characterized by the highest secoiridoid amounts. Moreover, all oil samples from both regions of the center of Tunisia had notable tocopherol amounts exceeding 485 mg kg−1.

Overall, the chemical analysis showed that the minor Tunisian olive varieties under study produced VOOs with interesting and peculiar compositional characteristics, especially in terms of antioxidant compounds, possibly reflecting an efficient adaptation to the arid climate of central Tunisia. Chemometric analysis applied to the compositional results seems to be an efficient tool for discriminating VOOs from different olive varieties. In particular, hierarchical cluster analysis combined with principal component analysis creates new opportunities for characterizing oils obtained from these Tunisian varieties. Four clusters were obtained using Ward’s method. The varieties of the same cluster were similar and could be distinguished from the others on the basis of the chemical composition of their olive oils (volatiles, tocopherols and fatty acids). However, a cross-validation step is necessary to validate the clustering results, and for this reason k-means clustering was applied.

Future developments could focus on the combination of different statistical methods. The methodology presented here could also be applied to other varieties, using a larger number of samples.