Introduction

Crops can be classified by traditional phenes, such as differences in morphological traits (cultivars) or variation in chemical components (chemotypes). While the phenotype is the entirety of a plant’s observable features, the genotype is its genetic makeup. A phene is introduced here as an elemental unit of the plant’s phenotype. Functional phenomics is a new field of plant science that uses sophisticated experimental techniques and large-scale statistical analyses to investigate the relationship between phenes and their physiology or function in the ecosystem. It is highly valuable for population mapping. Plants have been found to perform and adapt according to their ecological strategies, absorbing information from their environment and manifesting it in a variety of activities.

The question we raise, based on plant phenomics, is as follows: can we track the origin of a plant across ecosystem borders by using the tree’s physiology, as expressed through fruit oil production, rather than its morphological features or genetic information (De Castro 2019)? Olive oil is obtained by pressing the fruits of the olive (Olea europaea Linn) (Kaniewski 2012), which is native to the eastern Mediterranean, specifically Syria and Palestine, where ancient people used olive oil as a medicine and a cosmetic (Infante-Amate 2012). Because the olive is a crop of the Mediterranean region, which is considered a diversity panel, considerable phenotypic variation would be expected. The crop is affected by long periods of rain, drought, soil and water resources (Costa et al. 2019). The high demand for olive oil has made it a “celebrity commodity”, which has prompted competing producers and retail sellers to adulterate it to maximize their profit.

Extra virgin olive oil (EVOO) is well known to be the least processed form of olive oil, which is why it retains the natural antioxidants and vitamins that are often lost during processing (Mandal 2020). In refined olive oil, the constituents have been altered by chemical or physical refining processes, for example demucilagination and neutralization. Yet some well-known olive oil products are promoted to consumers as genuine EVOO while carrying unusual three-year expiration dates and a label that describes the contents as refined olive oil. In fact, such a label is self-incriminating: an EVOO with a three-year shelf life cannot be genuine, because EVOO should be only a few months old (Gharby 2022). This means that the bottled oil marketed as extra virgin has actually been processed.

In August 2021, the International Olive Council (IOC) stated that prices of olive oil in Spain, Italy and Greece, the most representative olive oil markets in the European Union, have an impact on the price of oils that other producing countries aim to export. This explains why over 123 virgin olive oils have been granted protected designation of origin (PDO) status. In other words, as a result of national bias in some countries, the parameters of olive oil quality are skewed in their favour, regardless of the true quality of the oil.

On the other hand, since authentic, pure olive oil is essentially produced by processing a mixture of many cultivars, its quality standard has to be intrinsically flexible so that it accommodates all tolerable variation caused by natural attributes (Reboredo-Rodríguez 2018; Ismaili 2015).

Physical descriptive parameters such as colour, taste, refractive index or viscosity are generally thought of, and most often used, as merely pictorial, nonparametric descriptors rather than as quality-binding parametric attributes. In fact, however, viscosity is a cumulative rather than a specific attribute, and refractive index varies with changes in the metabolome of the material.

Extra virgin olive oil is the pure olive fruit juice obtained by expression. Its unique colour ranges from light gold to rich green. This colour is due to the presence of several natural pigments at a specific metabolomic stage, which could be related directly to special attributes of the oil rather than to its quality. The concentration of these pigments varies with several factors such as cultivar, climate, harvesting time, storage conditions and storage time. In addition, many pigments degrade naturally on exposure to light, heat or oxygen.

In recent studies of olive oil consumer preferences, the attributes of origin, region of production and colour were found to influence the market. Consumers worldwide were more aware of quality and valued green colours, which were perceived as indicating expensive, richly flavoured and tasty oils (Peršurić 2020). Accordingly, profiling the pigments would be a feasible, fast, economical and direct approach to recognizing physiological changes in the constitution pattern of such metabolomes in olive oil produced in different ecosystems.

To test this postulate, we planned to collect olive oil samples from three distant cultivation areas (Egypt, Libya and Tunisia). Notably, the samples possessed different shades of green and yellow, which we tried to capture as colour models in order to explore their possible use as a functional phenomics indicator. This viewpoint has not been explored before. The colour model most commonly used for quantitative expression is the RGB model, which is based on the mechanism of colour formation in the human eye, where combinations of light at the red, green and blue wavelengths provide the desired response or sense of colour. RGB has been used to estimate the ripeness level of fruits (Dadwal and Banga 2012) and to detect fruit defects (Phakade et al. 2014).

Previous research on olive oil through “image analysis” includes a couple of studies that used pictures of olive oil samples to detect adulteration or expiration status (Fengxia et al. 2001).

The goal of the present study is to show how variation in functional phenes such as the colour, viscosity and refractive index of extra virgin olive oil (EVOO) can be linked to its authenticity, freshness and quality, and used to track the geographical origin of the oil, by means of image analysis and visible spectroscopy aided by multivariate analysis.

Materials and Methods

Olive Oil Samples

The experiment was designed to cover samples representing different cultivars, locations and degrees of freshness. To meet these criteria, 60 different olive oil samples were collected from three locations with a Mediterranean climate but different environments. The sampling sites spanned longitudes from 33.78 (Rafah, Egypt) to 9.36 (Siliana, Tunisia), elevations from sea level from 393 (Tarhona, Libya) to −19 (Siwa, Egypt), and latitudes from 28.93 (Hone, Libya) to 36.35 (Boarada, Tunisia). Samples from eastern Egypt were collected from Sinai (Arish, Tour and Rafah); samples from western Egypt came from Marsa Matruh, Halazine and Siwa, in addition to local markets (2 samples). Samples from eastern Libya were collected from Tobruk, Elmarj and Benghazi. One sample was from Hone in the south of Libya, and the western Libyan samples were collected from Misurata, Al-Khomes and Tarhona. Three Tunisian samples were obtained from Boarada, Sidi Bozaid and Siliana. Most samples (47) were collected in 2019 and 2020. Older samples from 2010 to 2018 (6 from 2018, 4 from 2017 and 3 from 2010) were included to observe the effect of aging. The olive oil was extracted from olives of traditionally grown trees in commercial orchards; the drupes were milled and expressed in a hydraulic press. All samples were stored in amber glass containers filled to the brim to avoid oxidation. The old samples were collected from the farms, where their producers had preserved them in glass containers.

Apparatus and Software

Visible (VIS) spectra were measured on a Perkin Elmer® UV/VIS spectrometer (Singapore) supported by UV Win Lab™ software, using plastic cuvette cells of 1-cm path length.

Viscosity was measured using a Brookfield DV2 T viscometer with cone spindle CPA-40Z; each measurement was made on a 0.5-ml oil sample at a shear of 7.5 N/s and a speed of 60 rpm, at a stable temperature of 22 °C. Refractive index was measured using a Bellingham Stanley 22–75 RFM970-C digital refractometer.

Olive oil samples were filled into an ELISA 96-well microplate (Greiner Bio-One, Germany) to be photographed, and visualization was done using a Duo-UV lamp (Desaga, Heidelberg, Germany). Images were taken using an iPhone 5 camera (1.4 µm pixels, AF). All captured images were stored as .jpg files at the maximum available quality.

Average luminosity and RGB values for each sample were measured using Adobe Photoshop 7 software. SIMCA14.1® by Umetrics, Sweden, was used for the multivariate data analysis.

Prior to data analysis, the raw visible spectral data matrix of all the samples was mean-centered, which is the default option in the software.
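
Mean-centering can be illustrated with a minimal sketch, assuming the spectra are arranged with one sample per row and one wavelength per column. The function name and the absorbance values are hypothetical and are not taken from the study; the sketch only shows the operation, not the software's internal implementation.

```python
import numpy as np

def mean_center(spectra):
    """Subtract the per-wavelength (column) mean from every spectrum (row)."""
    spectra = np.asarray(spectra, dtype=float)
    column_means = spectra.mean(axis=0)
    return spectra - column_means, column_means

# Hypothetical absorbance values: 3 samples x 4 wavelengths
example = np.array([[0.10, 0.40, 0.30, 0.05],
                    [0.12, 0.44, 0.28, 0.07],
                    [0.08, 0.36, 0.32, 0.03]])
centered, means = mean_center(example)
```

After centering, every column of the matrix has zero mean, so the subsequent model describes variation around the average spectrum rather than the average spectrum itself.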

All the viscosity and refractive index parameters were determined in triplicate, and the average values for each sample were computed and used in successive calculations.

Accelerated Stability Study

To determine whether the suggested method of analysis can detect deterioration of the samples, artificially deteriorated samples (ADS) were prepared. Ten randomly chosen samples collected from Libya and Egypt were subjected to heat deterioration by heating in a boiling water bath for 20, 40 and 60 min. An aliquot of each of the same samples was stored in a half-filled vial, exposed to air oxidation, for 12 months of real-time testing as recommended by the European Medicines Agency. All of these samples were subjected to the same methods of analysis: visible spectrophotometry, viscosity, refractive index and image analysis.

The choice of temperature for the experiment was based on the observation of Vaskova et al. that only very small amounts of carotenoids are detected after heating at 160 °C for 30 min (Vaskova 2015). Majchrzak et al. (2017) stated that the degradation of olive oil is relatively low at 60 °C for 24 h and that the oil becomes rancid at 140 °C, while the frying temperature is considered to start at 150 °C.

Capturing and Treatment of Images

A 2.5-ml portion of each representative olive oil sample was placed in a plastic cuvette, and the entire set of cuvettes (48) was captured in one image to ensure that all were subjected to the same conditions. The images were then analyzed using Photoshop. A square window of fixed size was used for every cuvette throughout the analysis to describe the colour of the sample. Two images of the same samples were taken to assess the precision and accuracy of the applied method. The mean values of luminosity and of the red, green and blue colour components were calculated by the software, recorded and arranged in a matrix. This matrix was used to build a PCA model in SIMCA (Soft Independent Modeling of Class Analogy) with six components and a 0.95 confidence level for the Q and Hotelling’s T2 limits for outliers.
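
The window-averaging step can be sketched as follows. This is an illustration only: the study used Photoshop, whereas the function below, its name, the synthetic image and the luminosity weights (0.299, 0.587, 0.114, a common luma weighting) are assumptions and not the authors' procedure.

```python
import numpy as np

def window_colour_stats(image, top, left, size):
    """Mean R, G, B and luminosity inside a square window of an H x W x 3 image."""
    window = np.asarray(image, dtype=float)[top:top + size, left:left + size, :3]
    if window.shape[0] != size or window.shape[1] != size:
        raise ValueError("window falls outside the image")
    r, g, b = window.reshape(-1, 3).mean(axis=0)
    # Assumed luminosity weighting; the formula used by the imaging
    # software is not stated in the text.
    luminosity = 0.299 * r + 0.587 * g + 0.114 * b
    return {"R": r, "G": g, "B": b, "luminosity": luminosity}

# Synthetic 100 x 100 image filled with a single olive-green tone (hypothetical)
image = np.zeros((100, 100, 3), dtype=np.uint8)
image[...] = (150, 160, 40)
stats = window_colour_stats(image, top=20, left=20, size=30)
```

Using the same window size for every cuvette, as described above, keeps the number of averaged pixels constant across samples.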

Validation of the Visible Spectral Data Model

The reproducibility of the technique was verified by repeating the analysis on the same day and on different days, and by using different spectrophotometers, to ensure the consistency and precision of the applied method.

Results and Discussion

RGB Analysis

Digital imaging allows non-invasive, low-cost analysis; it is considered a replacement for the human visual system, which is subject to ambient conditions and inconsistencies. A colour model aims to specify colour in a standard way; the most commonly used is the RGB model, which is based on the mechanism of colour formation in the human eye, where combinations of light at the red, green and blue wavelengths provide the desired response or sense of colour. RGB is the model mainly used to display digital images, in which each pixel shows a different combination of colour tones.

Previous “image analysis” work on olive oil comprises a couple of studies. In the first, pictures of olive oil samples deliberately adulterated with soybean oil were used to aid the detection of adulteration; the photos were analyzed graphically in MATLAB with a PLS model (Fengxia 2001). In the second, photos of edible oils were used to discern the expiration status of the oil from the colour change caused by oxidation on heating (Azimi 2020).

The mean RGB values of all olive oil samples, together with those of the ADS, were used to build a model. Its score scatter plot for the first two principal components is shown in Fig. S1; the ADS and the old samples are clearly grouped together, away from the cluster of new samples.

Unsupervised Pattern Recognition of the RGB Values

Initially, a first multivariate project was created in SIMCA to construct a mathematical model that relates the samples’ colours, expressed as average RGB values, to the country of origin of the oil and can consequently predict the origin to which a sample belongs. The score scatter plot of the PCA (Fig. 1) showed that samples from the three countries were grouped into three clusters.
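
How a PCA score plot is obtained from a matrix of mean RGB values can be sketched with a singular value decomposition. This is a minimal illustration under stated assumptions: it applies mean-centering only, the RGB rows are hypothetical, and it is not the SIMCA implementation used in the study.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """PCA by SVD of the mean-centered data matrix (rows = samples)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # sample coordinates
    loadings = Vt[:n_components].T                    # variable contributions
    explained = (s ** 2) / np.sum(s ** 2)             # variance fraction per PC
    return scores, loadings, explained[:n_components]

# Hypothetical mean RGB values for six samples (two per notional origin)
rgb = np.array([[150, 160, 40], [148, 158, 42],
                [170, 150, 30], [172, 149, 28],
                [120, 140, 60], [122, 141, 58]], dtype=float)
scores, loadings, explained = pca_scores(rgb, n_components=2)
```

Plotting the first column of `scores` against the second gives the score scatter plot; samples with similar colour fall close together.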

Fig. 1

Score scatter plot of the first two principal components of the PCA of RGB values of olive oil samples procured in 2019

Supervised Pattern Recognition

Another model was built using 17 olive oil samples procured only during 2019 from Egypt (6) and Libya (11), to explore whether the chosen phene detects a significant difference among the members of the model or reflects the effect of such variables on them. The Tunisian samples were excluded because of their limited number. The quality of the model was described by the goodness-of-fit R2 (0 ≤ R2 ≤ 1) and the predictive ability Q2 (Q2 ≤ 1; negative values indicate a model with no predictive ability). The PCA-class model was then validated using a prediction set of 5 samples, which were correctly categorized into their respective classes (Table 1; Fig. 2).
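
For reference, the two statistics are conventionally defined as shown below. This is the usual textbook form, given here as an assumption about what the software reports rather than as a formula from the study; y_i is an observed value, \hat{y}_i its fitted value, \hat{y}_{(i)} its cross-validated prediction and \bar{y} the mean of the observed values.

```latex
R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2},
\qquad
Q^2 = 1 - \frac{\sum_i (y_i - \hat{y}_{(i)})^2}{\sum_i (y_i - \bar{y})^2}
```

Because the numerator of Q2 uses predictions for left-out observations, Q2 is never larger than 1 but can fall below zero when the model predicts worse than the mean.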

Table 1 Correct classification rate for the prediction set samples after PCA-class
Fig. 2

PCA class of Libya and Egypt samples

RGB values not only serve to classify samples by country of production but can also cluster samples by province and city within the same country, as seen in Fig. 3 and Fig. S1 for the RGB values of the Libyan and Egyptian samples, each in a separate model, coloured by province and then by city.

Fig. 3

Scatter plot of RGB values of olive oil samples from Libya, coloured according to harvesting time; groups are highlighted by manually drawn circles

The colour of extra virgin olive oil results from pigments, specifically carotenoids and chlorophyll derivatives. Factors that affect the concentration of these pigments in the oil include the ripeness of the olives before oil production, the cultivar and the region of origin of the olives. During storage, the pigments in olive oil deteriorate naturally; this deterioration may be brought on by exposure to light, heat and oxygen (Lazzerini 2016). Pigments possess conjugated systems and can therefore be analyzed via RGB colour measurement. For this reason, the results of the RGB-based multivariate analysis described above led us to explore imaging of the samples under UV light at 366 nm to capture the fluorescence caused mostly by chlorophyll and pheophytins.

In this context, a model was constructed from the average RGB values of samples produced from 2010 to 2019, with the aim of using the date of production as another factor to label the samples (Fig. 4). The score scatter plot shows that the samples were grouped into 4 clusters according to time of production. The biplot in Fig. S2 clearly shows five samples lying near the green and blue variables, away from the characteristic red fluorescence of chlorophyll (4 of them are old samples procured in 2010, and the other was produced in April 2017 according to its label).

The score scatter plot of the RGB values of the ADS (Fig. 5) clearly shows that the samples were grouped into 5 distinct clusters. The cluster of oil degraded by oxidation, marked with the symbol (O) and coloured green, lies beside the two clusters of samples degraded by heat for 20 min (orange) and 40 min (red), whereas the samples heated for 60 min (brown) and the well-preserved samples (blue) lie away from them.

Fig. 4

a, b Score scatter plot of RGB values of olive oil samples from a Libya and b Egypt coloured according to harvesting time

Fig. 5

Score scatter plot of RGB values of ADS samples

To confirm that the colour of a sample contains embedded information about the cultivation region, the colour values of 28 samples from Libya only, with production years from 2010 to 2020, were used to construct a model. Distinct clusters based on the production year were observed, including one for the oldest samples from 2010 (Fig. 1). The same observation applies to the Egyptian samples in a similar model.

This work has shown that the RGB values of the oil were successful in predicting both the origin and the freshness of the oil. These preliminary results suggest that image processing of colour can indeed help in tracking the authenticity and/or purity of olive oil samples.

Visible Spectroscopy

To substantiate the findings above, the same models were reconstructed using the visible spectral data for the same group of samples. The visible absorption spectra of extra virgin olive oil share characteristic features: three peaks in the blue region at 390–500 nm and a sharp peak in the red region at 660–675 nm. The carotenoids lutein and β-carotene and the chlorophyll derivatives pheophytin a and b appear to be the main contributors to the absorption in these regions. Figure S3 shows three distinct clusters representing the three countries, Egypt, Libya and Tunisia, while Fig. S4 clearly shows the grouping of the Egyptian samples into four clusters, with the samples harvested in 2020 far from those of 2019. Similarly, 30 oil samples from Libya produced from 2010 to 2020 were grouped into 4 clusters according to their production date, as seen in Fig. S4.

Supervised Pattern Recognition Using OPLS-DA Analyses of the Visible Spectra

Orthogonal Projection to Latent Structures-Discriminant Analysis (OPLS-DA) was used to model the variance across the data of the Libyan and Egyptian samples; the Tunisian samples were excluded because of their small number. The score scatter plot clearly showed the between-class discrimination of samples of different geographical origin.

It is well known that OPLS-DA can easily produce statistically unreliable group separation in the absence of thorough validation. Here, the discriminant model indicators were found to be very good, with high R2X and R2Y (goodness-of-fit) and Q2 (prediction quality) values (0.996, 0.831 and 0.662, respectively), indicating the stability and reliability of the developed OPLS-DA model. The permutation plot assesses the risk that the obtained OPLS-DA model is spurious, that is, that it fits the training set well but does not predict well for new observations; the intercepts were R2 = (0.0, 0.203) and Q2 = (0.0, −0.473).
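
The idea behind permutation validation can be sketched as follows. To keep the example short and self-contained, a plain least-squares fit of a 0/1 class vector stands in for OPLS-DA, only R2 is permuted (Q2 would additionally need cross-validation), and the data are synthetic. None of this reproduces the SIMCA permutation plot; it only illustrates why a model fitted to shuffled class labels should score clearly worse than the real one.

```python
import numpy as np

def fit_r2(X, y):
    """R2 of a least-squares fit (with intercept) of a 0/1 class vector y on X."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residual = y - A @ coef
    return 1.0 - (residual @ residual) / np.sum((y - y.mean()) ** 2)

def permutation_r2(X, y, n_permutations=200, seed=0):
    """Compare the observed R2 with R2 values obtained after shuffling the labels."""
    rng = np.random.default_rng(seed)
    observed = fit_r2(X, y)
    permuted = np.array([fit_r2(X, rng.permutation(y))
                         for _ in range(n_permutations)])
    # Fraction of label shufflings that fit at least as well as the real labels
    p_value = (np.sum(permuted >= observed) + 1) / (n_permutations + 1)
    return observed, permuted, p_value

# Synthetic two-class data: 10 + 10 samples, 3 variables, class shift on the first
rng = np.random.default_rng(1)
y = np.array([0.0] * 10 + [1.0] * 10)
X = rng.normal(size=(20, 3))
X[:, 0] += 3.0 * y
observed, permuted, p_value = permutation_r2(X, y)
```

A model whose observed statistic sits well above the permuted distribution is unlikely to owe its group separation to chance.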

The model’s efficiency was defined by the goodness-of-fit R2 = 0.95%, whereas predictive ability by Q2 = 0.98 (Figs. 6, 7 and 8; Table 2).

Fig. 6

OPLS-DA

Fig. 7

Coefficient plots of OPLS-DA model

Fig. 8

Permutation plot of OPLS-DA model

Table 2 Misclassification of OPLS-DA

The horizontal component of the OPLS-DA score scatter plot captures the variation between the groups, and the vertical component captures the variation within the groups. The Libyan samples clearly resemble one another, as they clustered close together, whereas the Egyptian samples were more diverse. The loading plot was extracted to highlight the most discriminating variables; the most important wavelengths were found to be 580–590 nm, which may be due to the content of pheophytin a and chlorophyll a.

Since colour is both a phenetic feature of organisms and a feature of the environmental space they occupy (Menesatti 2012), image analysis, visible spectroscopy and physical-attribute methods should be encouraged for routine analysis, especially in low-income countries: they require no sample preparation and no highly skilled personnel, and they are available to the majority of laboratories.

Refractive Index and Viscosity Measurements

Refractive index is an intrinsic property of a material that describes its ability to refract light. The observed value varies with any change in the chemical composition of the material. Accordingly, a change in the refractive index of a material indicates that its chemical constitution has been altered in some way. The same argument can be considered valid for viscosity. Consequently, these two attributes should be efficient in reflecting any alteration in the composition of olive oil due to environmental or processing factors.

All samples except the old ones exhibited refractive index (RI) values within the internationally accepted limits for olive oil (1.4677–1.4705) (Standard for Olive Oils and Olive Pomace Oils, Codex Alimentarius 2013). The Tunisian samples had the highest value (1.4694, Table 3).
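
The comparison against the accepted range can be expressed as a small check on the mean of replicate readings. The limits are those quoted above; the function name and the replicate readings are hypothetical and serve only as an illustration.

```python
from statistics import mean

# Accepted refractive index range for olive oil, as quoted in the text
RI_LIMITS = (1.4677, 1.4705)

def mean_ri_within_limits(readings, limits=RI_LIMITS):
    """Average replicate readings and test the mean against the accepted range."""
    average = mean(readings)
    low, high = limits
    return average, low <= average <= high

# Hypothetical triplicate readings for a fresh and an aged sample
fresh_mean, fresh_ok = mean_ri_within_limits([1.4693, 1.4694, 1.4695])
aged_mean, aged_ok = mean_ri_within_limits([1.4659, 1.4660, 1.4661])
```

Averaging before comparing mirrors the triplicate measurement described in the methods.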

Table 3 Average refractive index and viscosity of olive oil samples

To test whether the refractive index of the Libyan oil samples decreases with years of storage, we compared the refractive index of the ADS with that of the original samples. We observed a decrease in refractive index on storage, which agrees with a previous study on olive oil samples from Palestine that found that refractive index decreases as a function of storage years (Bahti 2015).

Studies have suggested that the viscosity of olive oil is related to its fatty acid composition, which in turn differs according to cultivar, maturity, altitude and climate. For example, olive oil from a cooler region might be rich in monounsaturated acids such as oleic acid. According to Nierat et al., the dynamic viscosity depends on several factors: it decreases as olive oil is stored, and it also depends on altitude and amount of rain, with some of the samples they measured having higher values in areas of lower altitude and less rain than others (Nierat et al. 2014). The viscosity is also affected by the harvesting method: Sağlam et al. reported that the lowest viscosity was obtained from olives harvested by hand (Sağlam et al. 2018).

Despite its high stability, virgin olive oil is liable to oxidative processes: enzymatic oxidation during extraction, and photo-oxidation and autoxidation during processing and storage (Bendini et al. 2009). It must be noted, however, that virgin olive oil is usually protected from light from the time of harvesting until it is packaged in bottles and placed on shelves, where it begins to degrade; the packaging material must therefore be opaque to light to protect the oil from degradation (El-atla 2020).

Factors Affecting Viscosity of Olive Oil

Figure 9e suggests that viscosity increased in the oxidized ADS; this finding disagrees with that reported by Abdelraziq in his MSc thesis (Musameh 2014). Figure 9d shows the effect of the time between sample acquisition and analysis: the viscosity of olive oil increases with storage. Regarding the geographical location of the farms, the viscosity of the samples was found to increase with the longitude of the city of origin and to decrease with its latitude and elevation from sea level. Comparing the ADS with the preserved samples, we can clearly observe that viscosity increases upon thermal degradation of the oils.
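
A trend of this kind can be quantified with a simple linear fit, sketched below. Only the two end-point longitudes come from the text; the intermediate longitudes and all viscosity values are hypothetical, so the numbers illustrate the calculation and not the study's results.

```python
import numpy as np

def linear_trend(x, y):
    """Least-squares slope and intercept plus the Pearson correlation of y against x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    r = np.corrcoef(x, y)[0, 1]
    return slope, intercept, r

# End-point longitudes are quoted in the text; the rest is hypothetical
longitude = [9.36, 14.5, 20.1, 27.2, 33.78]
viscosity = [60.0, 62.5, 64.0, 67.5, 70.0]  # hypothetical values, arbitrary units
slope, intercept, r = linear_trend(longitude, viscosity)
```

A positive slope corresponds to viscosity increasing with longitude; repeating the fit with latitude or elevation as `x` would show the opposite sign if the reported decrease holds.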

Fig. 9

Change of viscosity with respect to position: a latitude, b elevation from sea level, c longitude, d with respect to collection years, e with increasing temperature

Effect of the Position of the Province or City of Oil Source on Refractive Index

Figures S4 and S5 clearly show the samples on the far right side representing western areas (Misurata, Zleiten, Khoms, Misllata, Alos, Tunisia), while the left side represents eastern areas (Arish, Tour, Matrouh, Siwa and Tobruk). Figure 10 shows that sample 52 from Libya and samples 57–60 from Tunisia, which are characterized by their elevation from sea level, have lower viscosity. Although the values of latitude, longitude and elevation from sea level were considered arbitrarily, we encourage further studies of the effect of these variables on the consistency of olive oil.

Fig. 10

a, b Biplot of refractive index and viscosity in relation to the position of the area of production (longitude, latitude and elevation from sea level), a coloured according to viscosity and b coloured according to refractive index

Conclusion

The results of this study verified that the physical attributes of the oil, namely colour, viscosity and refractive index, could serve as a non-invasive functional phene platform for clustering olive oil samples from the three countries, both to track their authenticity and to group them according to harvest and production time, through image analysis and accurate determination of the refractive index and viscosity of oils from different regions. The samples, which were profiled and collected from very distant areas, were found to cluster according to the proposed hypothesis. The discrimination ability of the methods was enhanced by the diversity of the samples used in this study.

The applied procedures were found to be successful in checking the accuracy of the labelled information (region of cultivation, freshness, purity, production date and storage conditions) and could be extended to study the position of the harvested areas.

Although the values of latitude, longitude and elevation from sea level were considered arbitrarily, we encourage further studies of the effect of these variables on the consistency of olive oil.

Finally, the applied procedures proved to be accurate and precise through multivariate analysis, in which models were constructed from carefully selected samples and challenged with internal and external samples.