Chromatographic fingerprint application possibilities in food authentication

The aim of the study was to compare the effectiveness of chromatographic fingerprints with a small number of peaks for differentiating various food products. Three groups of products were examined: unprocessed products (mushrooms, hazelnuts and tomatoes), food preparations (bread, dried herbs and tomato juice) and alcoholic beverages (vodka and two types of blended whiskey). A commercial electronic nose based on ultrafast gas chromatography (acquisition time 90 s) with a flame ionization detector was used. Static headspace was used as a green procedure to extract volatile compounds without modifying the food matrix. Individual extraction conditions were used for each product group. Similarities and differences between profiles were analyzed by Principal Component Analysis. Similarity was rated using Euclidean distances. A global model was built for recognizing the chromatographic fingerprints of food samples. The best recognition results were 100% and 89% for tomato juices, spices, separate champignon elements and hazelnuts. The worst recognition results were 56% and 77% for breads and strong alcoholic beverages.


Introduction
Food adulteration remains a current problem. One of the methods used to assess the authenticity of food is the analysis of the volatile compound profile, which is characteristic of a food under given conditions. The volatile profile of food varies over time and depends on many factors, including product freshness, food additives, food processing and preparation, preservation method and storage conditions, including the presence of other products with a strong odour in the place of storage [1][2][3][4].
The electronic nose (E-nose) is a popular tool for quickly assessing the full aroma profile of food products. This measuring technique uses various types of sensors. The most commonly used sensors are metal oxide semiconductors, conducting polymers, piezoelectric sensors, and optical or calorimetric sensors [5]. E-noses may also be based on typical analytical techniques such as gas chromatography with a flame ionization detector (GC/FID) or coupled with a mass detector (GC/MS). Gas chromatography with a mass detector dominates among the methods of analyzing volatile compounds used to detect food adulteration. The entire chromatogram may serve as a "chromatographic fingerprint" (ChF) of the volatile compound profile that characterizes the product as a whole [6]. This technique is constantly being modernized. Two-dimensional chromatography (GCxGC) systems with a time-of-flight (ToF) detector are used, enabling the separation of several hundred compounds simultaneously. However, these techniques are still expensive and require specialized service.
Many researchers use different fingerprinting methods to confirm the authenticity of food [7]. The authors of [8] used gas chromatography coupled to ion mobility spectrometry (GC-IMS) to analyze the fingerprint of volatile compounds in lamb samples. The use of chemometrics made it possible to distinguish the age of the lamb. Bing et al. [9] used an infrared spectral fingerprint in the 760-2500 nm range to detect genetically modified corn. Headspace solid-phase microextraction coupled to GC-ToF-MS was also used for profiling volatiles in different apple varieties [10]. The authors of [11] employed an advanced technique, selected ion flow tube mass spectrometry (SIFT-MS), to obtain volatile fingerprints for the identification of sheep cheeses from different producers.
As noted by the authors of [12], it is difficult to compare the results of single tests carried out on different products under different analytical conditions. Volatile compounds can be extracted from the food matrix by many methods, and the amount of volatile compounds obtained depends on the method [13][14][15]. Additionally, depending on the composition of the tested matrix, e.g. sugar, fat or protein content, the extraction conditions may lead to the formation of artifact compounds [16][17][18]. The use of instrument sets with different kinds of detectors also yields varying amounts of volatile compounds. Results of this kind are not easy to compare, and the studies are difficult to replicate.
The volatile compound profile can be obtained directly from a food sample without complicated analytical work that could change that profile. One of the simplest methods is headspace (HS) extraction, which is dedicated to volatile compounds. The most important parameters in the HS method are the temperature and thermostating time of the sample. Volatile substances migrate from the sample (in liquid or solid form) to the gas phase until equilibrium between these phases is reached. The partition coefficient limits the ability of a component to transfer from the matrix to the gas phase. HS without matrix preparation allows the extraction of highly volatile components. HS is usually used in GC analyses coupled with mass spectrometry (MS), which allows the identification of a large number of volatile compounds characteristic of a given food matrix [19]. In most works, information about volatile compounds characteristic of a given food is used to verify the authenticity of food [20][21][22].
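The equilibrium described above is commonly summarized by the standard static-headspace relation below. The symbols are not taken from the original text and are introduced here only for illustration: C_0 is the initial analyte concentration in the sample, C_S and C_G are the equilibrium concentrations in the sample and gas phases, K is the partition coefficient, and β is the phase ratio of gas volume V_G to sample volume V_S in the vial.

```latex
C_G = \frac{C_0}{K + \beta}, \qquad K = \frac{C_S}{C_G}, \qquad \beta = \frac{V_G}{V_S}
```

Under this relation, a smaller K (for example at a higher thermostating temperature) gives a larger gas-phase concentration C_G and therefore a stronger chromatographic signal, which is why temperature is treated as a key HS parameter.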
The aim of the study was to evaluate the effectiveness of using only the chromatographic fingerprint with a small number of peaks for distinguishing food products. The use of the same method for the analysis of different food products, unprocessed and processed, allows an objective evaluation of the method and its practical use. An electronic nose based on ultrafast gas chromatography (acquisition time 90 s) with a flame ionization detector was used to obtain ChF. An optimized HS method was used to extract volatile compounds without modifying the food matrix. The effectiveness of the applied method was assessed with the use of chemometric methods. The similarity and differences between the analyzed samples were assessed on the basis of the distance between the samples in the experimental space. The advantages and disadvantages of the method used were also summarized.

Samples
Three groups of products were used in this study: unprocessed food (champignons, hazelnuts and tomatoes), processed food (breads, dried herbs and tomato juices) and alcoholic beverages (vodka and two types of blended whiskey). Information on the tested products can be found in Table 1.
Six varieties of hazelnuts were collected from an orchard located in southern Poland. These varieties are listed in The Polish National List of Fruit Plant Varieties 2016. All other food products were purchased in stores in original sealed packaging to prevent loss of odour. The tests were carried out shortly after the products were purchased and transported to the laboratory. Nine samples were taken from each product. Each sample was tested in triplicate.

Chromatographic fingerprints analysis
The study used the Heracles II electronic nose (Alpha M.O.S., Toulouse, France) in accordance with the methodology described by the authors of [23]. Heracles II is based on ultrafast gas chromatography (acquisition duration 90 s) with a sorbent trap. The analysis was performed simultaneously on two parallel capillary columns of different polarity, DB-5 and DB-1701 (each 10 m long, 0.18 mm in diameter and 0.4 µm film thickness). Hydrogen was the carrier gas (flow rate: 10 ml/min). The temperature conditions were as follows: injector at 200 °C; isotherm at 40 °C for 5 s, a 4 °C/s ramp to 270 °C, and isotherm at 270 °C for 30 s; FID1/FID2 at 270 °C. Volatile compounds were extracted by an automated HS method without solvents. ChF were saved in the integration files. In the first phase of the study, the method was optimized for each type of food product in order to obtain strong and reproducible signals from the GC.
The main task was to establish the thermostating parameters for the various products. The most important factor was temperature. The partition coefficient of volatile compounds in the sample-gas phase system decreases with increasing temperature, but a higher temperature raises the risk of irreversible destruction of the sample. The method parameters were determined individually for each product by checking repeatability and the ability to distinguish chromatographic profiles. The optimized parameters of the analysis are presented in Table 2.
In the case of hazelnut samples, the lowest HS thermostating temperature (55 °C) was used because of the sensitivity of lipids to oxidation processes. At temperatures below 55 °C, the chromatogram signals were too weak and the analysis results were not repeatable (the acceptable limit of variation was ± 10%).

Statistical analysis
The similarities and differences between the chromatographic profiles were analyzed with the AlphaSoft software (Alpha M.O.S., Toulouse, France). For each integration data set, peak areas and peak heights were normalized to relative values expressed as a percentage of the total peak area and the total peak height, respectively. An unsupervised PCA (at the 95% confidence level) was used to reduce the dimensionality of the large data set. The calculations were performed on the measurement data table in the form: retention times vs. relative peak areas and relative peak heights. PC1/PC2 score plots were used for visualization. The Euclidean distance between the centroids of the defined groups and the pattern discrimination index [24] were used to assess the similarity between samples. The closer the index was to 100%, the smaller the dispersion within groups.
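The processing chain described above (normalization to relative peak areas, PCA, Euclidean distance between group centroids) can be illustrated with a minimal NumPy sketch. It is not the AlphaSoft implementation: the function names, the SVD-based PCA, the choice to measure the distance in the PC1/PC2 score space, and the peak-area values are all assumptions made for illustration.

```python
import numpy as np

def normalize_percent(peaks):
    """Scale each chromatogram (row) so that its peak values sum to 100 %."""
    peaks = np.asarray(peaks, dtype=float)
    return 100.0 * peaks / peaks.sum(axis=1, keepdims=True)

def pca_scores(data, n_components=2):
    """Project mean-centred data onto its first principal components (via SVD)."""
    centred = data - data.mean(axis=0)
    _, s, vt = np.linalg.svd(centred, full_matrices=False)
    scores = centred @ vt[:n_components].T
    explained = s**2 / np.sum(s**2)  # fraction of variance per component
    return scores, explained[:n_components]

def centroid_distance(scores, labels, group_a, group_b):
    """Euclidean distance between the centroids of two labelled groups."""
    labels = np.asarray(labels)
    centroid_a = scores[labels == group_a].mean(axis=0)
    centroid_b = scores[labels == group_b].mean(axis=0)
    return float(np.linalg.norm(centroid_a - centroid_b))

# Hypothetical peak areas: three chromatograms for each of two made-up
# products ("A", "B"), with one column per retention time.
areas = np.array([
    [10.0, 30.0, 50.0, 10.0],
    [11.0, 29.0, 51.0, 9.0],
    [9.0, 31.0, 49.0, 11.0],
    [40.0, 10.0, 20.0, 30.0],
    [41.0, 9.0, 21.0, 29.0],
    [39.0, 11.0, 19.0, 31.0],
])
labels = ["A", "A", "A", "B", "B", "B"]

relative = normalize_percent(areas)
scores, explained = pca_scores(relative)
distance = centroid_distance(scores, labels, "A", "B")
print(f"PC1 explains {100 * explained[0]:.1f} % of variance")
print(f"Centroid distance A-B: {distance:.1f}")
```

A large centroid distance relative to the scatter within each group corresponds to well-separated clusters on the score plot.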
To assess whether the analyzed samples could be recognized on the basis of their ChF, the original AlphaSoft application for building a global recognition model was used. The method is based on creating a set of reference chromatograms, which forms a 2-dimensional array (n, p), where n is the number of rows (measurements/chromatograms) associated with samples and p is the number of columns, each associated with a retention time T_j. From all measurements within a given set, a random variable X_j is generated for every retention time, and its mean m(j) and standard deviation σ(j) are calculated. The variable X_j is assumed to be normally distributed, so the admissible measurement range for each retention time is defined from m(j) and σ(j). For every unknown sample, this condition is verified in a local test for every retention time T_j. Next, the percentage of positive answers of the local tests is calculated. This value is compared with the required percentage of positive answers (recognition threshold = 95%), and on this basis it is decided whether or not the unknown sample is similar to the pattern. Two thirds of the samples were used to build the global recognition model (establish the pattern). The remaining third was used for testing to determine the classification accuracy.
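The recognition logic described above can be sketched as follows. This is only an illustration, not the AlphaSoft code: the acceptance range of m(j) ± k·σ(j) with k = 3 is an assumption (the text does not give the width of the range), and the function names and reference values are hypothetical.

```python
import numpy as np

def build_pattern(reference):
    """Return the per-retention-time mean m(j) and standard deviation sigma(j)."""
    reference = np.asarray(reference, dtype=float)
    return reference.mean(axis=0), reference.std(axis=0, ddof=1)

def recognise(sample, mean, sigma, k=3.0, threshold=95.0):
    """Run one local test per retention time, then compare the percentage of
    positive answers with the recognition threshold.

    k is an assumed multiplier for the acceptance range m(j) +/- k*sigma(j).
    """
    sample = np.asarray(sample, dtype=float)
    inside = np.abs(sample - mean) <= k * sigma  # local tests
    positive_pct = 100.0 * inside.mean()
    return positive_pct, positive_pct >= threshold

# Hypothetical reference set (the part of the data used to build the pattern):
# rows are chromatograms, columns are retention times.
reference = [
    [10, 30, 50, 10],
    [11, 29, 51, 9],
    [9, 31, 49, 11],
    [10, 30, 50, 10],
    [11, 31, 49, 9],
    [9, 29, 51, 11],
]
mean, sigma = build_pattern(reference)

similar_pct, similar_ok = recognise([10.5, 29.5, 50.5, 9.5], mean, sigma)
partial_pct, partial_ok = recognise([10, 30, 50, 20], mean, sigma)
print(similar_pct, similar_ok)   # every local test passes
print(partial_pct, partial_ok)   # one of four local tests fails
```

With a 95% threshold and only four retention times, a single failed local test is enough to reject a sample; with the tens of peaks recorded per product, the threshold tolerates a few outlying peaks.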

Champignons (Agaricus bisporus)
Studies of volatile compound profiles in champignons do not reveal many chemical compounds, as shown by the authors of [25] using the GC-MS and GC-FID methods. The authors of [26] analyzed the changes in the profile of volatile compounds during mushroom drying. They used GC/MS (acquisition time 22 min) and an electronic nose (based on metal-oxide semiconductors) with headspace to record changes in volatile compounds. Twenty-two volatile compounds were found in the fresh mushroom. The Heracles II E-nose (acquisition duration 90 s) recorded 21 and 26 chromatographic peaks in the analysis of white and brown champignons, respectively, and 21 peaks in the case of Portobello (Fig. 1). Paulauskienė et al. [27] also used the Heracles II E-nose to study volatile compounds in mushrooms. They found 11 compounds in mushrooms, which were used to characterize two types of mushroom cultivation, conventional and organic. However, Paulauskienė et al. [27] did not use the entire chromatographic profile to differentiate between types of crops.
The PCA score plot (Fig. 2) shows the distribution of the obtained chromatographic profiles in PC1 vs. PC2. The first component, explaining about 81% of the variation, differentiates all three types of mushrooms.
The most similar chromatographic profiles were found for Portobello and white champignons, while the profiles of white and brown champignons were more distant (Table 3). Additionally, within the group of brown mushrooms, the greatest variability was observed as compared to other products. The distance between the extreme samples of brown mushrooms was over 70 units in relation to the PC1 axis. In order to find the cause of the lack of homogeneity of the brown mushroom samples, the chromatographic profiles of individual mushroom elements, i.e. the cap, gills, stipe, scale, were examined. The tests carried out on four basic elements of the brown mushroom showed that the greatest diversity of volatile compounds chromatographic profiles was between the cap and the gills of the brown mushroom. Twenty chromatographic peaks for cap and stipe and 23 and 27 peaks for gills and scale, respectively, were recorded (Fig. 3).
Visualisation of ChF in PC1 vs. PC2 is presented in Fig. 4.
The first component divides the ChF of the mushroom elements into two groups. One contains the gills, stipe and scale, and the other contains only the cap, which is consistent with the results noted in Fig. 3. The distribution of the samples (Fig. 4) shows the presence of different volatile compounds in the cap than in the remaining elements of the mushrooms. The greatest Euclidean distance (169.6) was between the cap and the gills. At the same time, high homogeneity of the samples (99.7%) was observed in all elements. The brown mushrooms were of varying size, unlike the other mushrooms (white and Portobello). The differences in mushroom size probably meant that the nine randomly prepared samples contained different proportions of the individual mushroom elements. The detected cause of the lack of homogeneity of the brown mushroom samples demonstrates the high sensitivity of the ChF method to even the smallest changes in the composition of the analyzed samples. On the one hand, this is an advantage of the method, as it can be used, for example, to track the quality of a raw material or a product. On the other hand, the operator needs extensive experience in order to detect and verify possible errors in product assessment with the use of ChF.

Hazelnuts
Figure 5 shows that the first component (PC1) accounted for 99.3% of variation. The tested hazelnut cultivars were correctly divided into six groups.
Based on the Euclidean distance, the differences between the chromatographic fingerprints of the six hazelnut varieties were determined. The greatest distance (105.7) was observed between 'Kataloński' hazelnut (KAT) and 'Olbrzym z Halle' hazelnut (OzH), which are differentiated by the first component (PC1). The profiles of KAT and OzH are negatively correlated. The smallest distance was between 'Barceloński' (BAR) and 'Webba Cenny' (WC), which are differentiated by the second component (PC2). The analysis made it possible to distinguish four groups of hazelnuts. The first group consists of BAR, NOT and WC hazelnuts, whose ChF differ only slightly from each other (Euclidean distance below 10). The second group contains Cosford (COS), whose distance from the first group is between 10 and 20. The third group (KAT) and the fourth group (OzH) were characterized by significantly different profiles. Hazelnuts contain more than 50% fat, which is sensitive to oxidation processes. The rate of these processes can vary depending on the hazelnut variety. As other researchers have shown [29], hazelnut varieties differ from each other in the content of bioactive substances that influence fat oxidation. Oxidation of the lipids contained in food products is a source of volatile compounds that make up the chromatographic profile.

Tomatoes and juices
In general, tomatoes are characterised by a rich volatile compound profile. So far, about 400 volatile compounds have been identified using proton transfer reaction-mass spectrometry (PTR-MS) and SPME-GC-MS [30]. The E-nose recorded the following numbers of chromatographic peaks in the three tomato cultivars: 33 for 'Maluno', 44 for 'Pink King' and 34 for 'Zadurella' (Fig. 6).
The tomatoes were at the red-ripe maturity stage. As tomatoes ripen, the volatile compound profile changes. To ensure the comparability of the results, the tested tomatoes must be at the same stage of ripeness. Storage conditions, especially temperature, have an equally significant impact on the ChF of tomatoes [31]. Principal Component Analysis correctly groups all three tomato varieties (Fig. 7). The first component (PC1) shows clear differences between the M, Z and PK chromatographic profiles (64% of variance). The largest Euclidean distance (about 37) was found between the PK and M profiles. The smallest Euclidean distance (about 21) was found between the Z and M profiles. The second component captures subtle differences between the profiles of all three tomato varieties (22% of variance). According to [32], the volatile compound profile (obtained by GC-MS) was successfully used to determine the geographical origin of tomatoes. Two multivariate chemometric techniques were used for differentiation: Linear Discriminant Analysis (LDA) and Soft Independent Modeling of Class Analogy (SIMCA).
The next group of tested products consisted of tomato juices of different brands. In the case of juices, the number of peaks recorded was 38 for A, 35 for B, 45 for C, 32 for D, and 41 for E. A comparison of the chromatographic profiles of all juices shows that juice C has a different chromatographic profile from the other juices (Fig. 8). Dimethyl sulfide was detected in all samples. This is a typical compound for tomato juices, as shown by the authors of [33], who used the GC-MS method.
On the other hand, β-pinene was detected in juice C, but it was not present in the other juices. PCA showed that the first two components explained more than 99% of variance, including 96.5% explained by PC1 (Fig. 9). The profiles of the juices of brands A, B and D were the most similar, and they were negatively correlated with the profile of brand C juice. The largest Euclidean distance (about 104) was found between profiles A and C. A comparison of the labels shows that the composition of brand C juice differed from the other formulas. Tomato concentrate dominated in brand C juice. The composition of brand E juice was distinguished by its sea salt content. PC2, in turn, divided the juices into two groups: A, B, C and D in one group and E in the other. The juices of brands A and B as well as B and D have similar profiles, and the Euclidean distances between them are 5.5 and 7.3, respectively. However, despite these small distances, the juices are still distinguishable. In 2021 the authors of [34], using the GC-MS method, found differences in the volatile compound profile between tomato juice and tomato juice inoculated with A. alternata. Samples can be distinguished even when the differences in their ChF are very slight. The use

Breads
Only a few researchers use HS for profiling volatile compounds in bread, because this method does not yield the full profile of odourants. Extraction with solvents or HS-SPME combined with GC-MS are the most effective methods for obtaining large amounts of volatile compounds from bread [35]. On the other hand, a complete odourant profile is not necessary for sample identification. A single chromatogram carries a lot of information about the sample. The E-nose showed the following numbers of chromatographic peaks in the three types of bread: 21 for 'Polish grain' bread (PGB), 29 for wheat bread (WB) and 29 for 'village' bread (VB). The characteristic and repeatable pattern is specific enough to identify the sample. PCA (Fig. 10) gave satisfactory differentiation of the examined groups of bread.
The plot of the first two principal components (circa 68% of total variation) visualizes the differentiation of the three groups of bread. PC1 differentiates VB and PGB from WB. This is probably due to the presence of rye flour in VB and PGB. PC2 covered the differences between VB + WB and PGB. When searching for similarities between the product formulas, it should be noted that both VB and WB are made with baking yeast, while PGB does not contain yeast. Ethanol was the main volatile component in the VB and WB chromatographic profiles. However, 2,3-butanedione was present in both the VB and PGB samples. Using distances to indicate similarities between profiles, the greatest similarity of chromatographic profiles was found between the VB and PGB samples and the smallest between the WB and PGB samples (Table 3). However, these differences are not large, which could be a problem when modelling is applied to confirm sample authenticity. According to [36], the profile of volatile compounds in bread depends on the composition of the bread, e.g. on salt and yeast content. Therefore, the fast HS GC-FID method can be considered for monitoring, e.g., changes in bread formulation.

Dried herbs
Saffron, one of the most expensive spices, is used as a colouring and flavouring agent. The composition of the volatile components of saffron is also highly dependent on the extraction method and the detector used. More than 100 volatile compounds have been identified in saffron using GC-MS [37][38][39]. As early as the Middle Ages, attempts were made to adulterate this valuable spice with cut orange-yellow marigold petals. Differentiation of saffron and marigold volatile compound profiles with GC-FID is effective and fast. The E-nose recorded 48 chromatographic peaks for saffron and 30 for marigold. The score plot is shown in Fig. 11. More than 99% of total variability is explained by the first component (PC1). The Euclidean distance is more than 108, and the chromatographic profiles are negatively correlated. Both spices, which have very different volatile compound profiles, can be differentiated using the simplest method of HS extraction.

Alcoholic beverages
Published research has used the volatile compound profile to differentiate alcoholic beverages [40,41]. The chromatographic profile of volatile compounds from samples of strong alcoholic beverages is often not easy to obtain because of the dominant concentration of ethanol. Various analytical techniques used for authenticating whiskey have been described in [42]. Typically, samples are subjected to preliminary preparation, e.g. pre-concentration treatment [43]. The authors of [44] used samples diluted with water in a ratio of 3.6:1 (water:agricultural distillate) to test alcoholic strengths. Samples diluted in this way were analyzed by electronic nose and chemometrics. The method allowed good differentiation of the distillates. The authors of [45], in turn, found discrete compounds in various types of whiskey for didactic purposes, and the chromatographic profiles obtained by GC/MS were then used to identify the whiskey. The analytical procedure was based on dispersive microextraction with chloroform. Thanks to this procedure, the negative influence of ethanol in whiskey on the recognition result was eliminated. With our Heracles II device, chloroform cannot be used because of the Tenax present in the sorption trap.
The HS method without preparation of alcohol samples is rarely used. This work nevertheless examines the usefulness of the chromatographic fingerprint obtained for raw, undiluted samples. Using this method, we obtained 12 peaks for vodka samples, 35 peaks for Scotch whisky JW and 36 peaks for Scotch whisky WB on the chromatograms. Figure 12 shows the PCA score plot of chromatographic fingerprints of alcoholic beverages: vodka and two types of Scotch whisky. The first two principal components (PC1/PC2) explain only about 40% of the total variation of the input data. This is not a satisfactory transformation of the input data. Division into two groups, vodka and whisky, is not problematic. However, differences within the whisky group are too subtle to allow correct differentiation. The distance from the WA samples to the WB samples was 4.9. The maximum distance between samples, 14.3, was found for V and WB.

The recognition tests
The results of the rapid ChF recognition test using the global model are presented in Table 4. In the set of 29 food products (9 samples from each product were analyzed in triplicate), 9 products were recognized at 100% and 14 products at 89%. The worst recognition result (56%) was obtained for PGB bread samples and for brown champignon samples. The better the homogeneity within a sample group (e.g. juices, hazelnuts), the easier it was to identify the chromatographic pattern. The greater the dispersion in the group (worse homogeneity), the less effective the recognition was (e.g. brown champignons). In the case of alcoholic beverages, ethanol had the greatest impact on the quality of chromatographic profiles, as it dominated the chromatographic profiles of all beverages. Therefore, such samples are usually specially prepared for research [46].
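The reported recognition rates can be related to the size of the test set with a short calculation. The sketch below assumes that each product contributed 27 chromatograms (9 samples in triplicate) and that one third of them, i.e. 9, formed the test set; the text implies this split but does not state the test-set size explicitly, so the numbers are illustrative only.

```python
# Assumption: 9 samples x 3 replicates = 27 chromatograms per product,
# of which one third (9) are held out for testing.
n_chromatograms = 9 * 3
n_test = n_chromatograms // 3

# Recognition rate (in %) for every possible number of correctly
# recognised test chromatograms.
rates = {hits: round(100.0 * hits / n_test, 1) for hits in range(n_test + 1)}

for hits in (5, 7, 8, 9):
    print(f"{hits}/{n_test} recognised -> {rates[hits]} %")
```

Under this assumption the achievable rates move in steps of about 11 percentage points, so 100%, 89% and 56% would correspond to 9, 8 and 5 correctly recognized test chromatograms out of 9, and the 77% reported for some products is close to 7 out of 9.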
Tests of global model specificity were also performed. They verified whether a model created for one kind of sample would wrongly recognize samples of another kind. For example, the chromatographic pattern for VB bread was verified with the model created for PGB bread (Fig. 13A). In all cases 100% specificity was obtained, including for pairs with minimal Euclidean distances (Table 3), for which recognition errors could be expected. Figure 13 shows four examples of the comparison of sample chromatographic fingerprints with the pattern from the global model. Clear differences can be seen at a glance between the marigold and saffron ChF (Fig. 13B), as well as between the V and WA samples (Fig. 13C). Subtle differences can be observed between the ChF of Scotch whisky WA and WB (Fig. 13D).

Conclusions
Optimal conditions for HS extraction were determined for the food products tested in the study. Chromatographic patterns from rapid GC-FID analysis, which, unlike those from GC-MS, are not rich in peaks, are sufficient to identify the majority of the food samples tested. The electronic nose (HS UFGC method) proved to be the most effective for internally homogeneous samples and those that give strong chromatographic signals (e.g. tomato juices, spices, separate champignon elements, different varieties of hazelnuts). The best ChF recognition results (100%, 89%) were obtained for these samples. However, for two sample groups, breads and strong alcoholic beverages, easily differentiated ChF were not obtained. Authenticity confirmation for these products was at the 56% and 77% level. Using another extraction method, such as diluting strong alcohol with water, and changing the chromatographic conditions would make it possible to increase ChF differentiation. The tested ChF method has many advantages: no need to use chemical reagents (a green procedure); time saving, as no preliminary sample preparation is required; short analysis time (acquisition time 90 s); the ability to "pipeline" continuous analysis of various samples; and the possibility of easily creating a chromatographic profile database based on integration files, which, in conjunction with chemometric methods, makes it possible to create a database of 'reference material' for confirming the authenticity of newly tested samples as well as to conduct 'in-house' product inspection.
The main disadvantage of the ChF method is the variability of the volatile compound profile depending on various external and internal factors of the samples, which makes it difficult to establish a standard. It should also be remembered that the volatile compound profile varies over time and depends on the kind, homogeneity and state of the tested sample. When collecting ChF with the intention of creating a database, it is also necessary to collect detailed information about the analyzed samples. When creating a ChF database, special care should be taken to ensure the internal homogeneity of a sample. A good example here is the analysis of a champignon, as its elements have essentially different ChF. If samples with different ratios of these components are taken for tests, different ChF will be obtained for practically the same champignons.

Data availability The authors declare the availability of data and materials.

Declarations
Conflict of interest The authors declare that they have no conflict of interest.
Compliance with ethics requirements There were no experiments in the study.
Consent to participate All authors have read and agreed to the published version of the manuscript.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.