Identification of the hazelnut cultivar in raw kernels and in semi-processed and processed products

The demand for an efficient traceability system able to identify hazelnut cultivars along the entire processing chain is becoming critical for avoiding fraudulent practices and safeguarding the interests of growers, food processors and consumers. In this study, DNA was extracted from different hazelnut matrices, including plant material (leaf, kernel and kernel episperm) and processed foods (paste, grain, flour and different types of snacks containing hazelnuts). The efficiency of Simple Sequence Repeat (SSR) markers was tested to identify the hazelnut cultivar ‘Tonda Gentile’ along the entire supply chain. The analysis at 10 SSR loci was able to verify the presence/absence of the alleles of a declared cultivar contained in these matrices. The SSR analysis of DNA from raw episperm offers the possibility of identifying the mother cultivar and is suggested as an effective way to detect fraud, since DNA analysis can be performed on individual kernels. For food matrices containing hazelnuts, the presence of the mother cultivar’s DNA can be assessed based on the identification of its alleles in the sample, although the presence of multiple alleles from the pollenizers makes the interpretation of results more difficult.


Introduction
Hazelnuts (Corylus avellana L.) are mostly destined for the confectionery and baked food industry and are used to produce a large number of foods. Due to the strong demand for nuts from the food industry, the commercial interest in hazelnut cultivation has increased in the last decades, leading to the continuous growth of production and harvested area [1] in several countries, including Italy, USA, Chile and Eastern European countries [2].
The average annual world production of in-shell hazelnuts is about 1,023,500 t (mean 2018-2020), with a harvested area of 1,015,216 ha in 2020 (FAO 2022). Turkey is the leading hazelnut-producing country, accounting for more than 60% of world production with 652,000 t (mean 2018-2020) and 734,538 ha harvested in 2020, followed by Italy, which produces about 123,900 t (mean 2018-2020) with 80,280 ha harvested in 2020 [3].
Italian hazelnut production is based on traditional cultivars selected for their adaptability to soil and climate and for their excellent kernel quality, matching the demand of the processing industry. Major producing regions that account for 95.8% of Italy's hectares of hazelnut orchards are Piedmont, Campania, Latium and Sicily, but the industry is growing also in Tuscany, Veneto, Basilicata, Umbria and Calabria.
In Piedmont the total hazelnut area has increased by 70% in the last decade, reaching 25,807 ha in 2021 (ISTAT, 2021), and represents about 29% of the Italian area. The production is still based on the cultivar 'Tonda Gentile' (syn. 'Tonda Gentile delle Langhe', 'Tonda Gentile Trilobata'), which is considered worldwide as one of the most valuable cultivars for processing due to the excellent quality of its nuts. The production of this cultivar in Piedmont is protected under the Geographical Indication (PGI, Reg. Latium ranks second for total hazelnut area (24,864 ha in 2021); there the cultivar 'Tonda Gentile Romana' is much more common than its pollenizer 'Nocchione', although the latter is appreciated for its organoleptic quality. The food products from these cultivars are protected under the Protected Denomination of Origin (PDO, Reg. (CEE) no. 2081/92 and following modifications) 'Nocciola Romana'. Currently, in Latium there are also many new areas where 'Tonda di Giffoni' is cultivated alongside 'Tonda Gentile Romana' [4].
In Campania, where the total hazelnut area was 22,027 ha in 2021, the range of cultivars is wider, with three major varieties, including 'Tonda di Giffoni', a high-quality cultivar appreciated by the food industry. Production of this cultivar in Campania is protected under the PGI mark 'Nocciola di Giffoni'.
In Sicily (13,800 ha in 2021) the cultivation is less specialised and based on a dominant cultivar, known under different names in different areas, including 'Mansa', 'Santa Maria del Gesù', 'Nostrale' and 'Comune di Sicilia', but with identical DNA fingerprints [5]. Furthermore, it has the same DNA fingerprint as 'Nocchione', grown as a pollenizer in Latium.
The demand for an efficient traceability system able to identify hazelnut cultivars along the entire processing chain is becoming critical. The large difference in quality between cultivars and between production from different areas of the world causes significant price differences on the market, and this can encourage fraudulent practices. Hazelnut cultivar misidentification can economically damage the processing industry and may pose safety risks for consumers, due to the possible presence of aflatoxins in nuts of unknown origin. In addition, hazelnuts are marketed as shelled kernels, which makes it impossible to identify the cultivar or origin of the raw material by visual inspection.
Currently, DNA-based analyses represent the most valid methods for food traceability because DNA can be isolated [6] and amplified [7] from highly processed food [8,9]. In contrast, protein-based methods are poorly suited to processed food traceability because proteins are easily denatured by processing. Gryson [7] suggests that the temperatures normally used for processing degrade DNA into small fragments, but with optimized protocols for DNA extraction and PCR amplification, low-quality DNA can be successfully analyzed.
Thanks to their high level of polymorphism and reproducibility, Simple Sequence Repeat (SSR) molecular markers still represent an important and advantageous DNA-based method for food authentication [10]. Indeed, SSR markers have been used for the identification of the cultivar in processed products of several species, such as dried canned pears and pear juices [11], apple nectar and purée [12], must and wines [13][14][15], olive oil [16][17][18], diced, peeled and canned cherry tomatoes and tomato products [19,20], sweet cherry jams and biscuits [21], and many other foodstuffs.
At present, there are few references concerning the genetic traceability of hazelnut-based processed food [22,23]. Cultivar identification using DNA markers in hazelnut is critical since the edible part is the seed. Hazelnut is a wind-pollinated, monoecious species that exhibits a sporophytic self-incompatibility system [2]. This means that pollination is always carried out by a different genotype than the maternal one and that orchards require pollenizers. In fact, the main cultivar is commonly grown together with other cultivars that are used as genetically compatible pollenizers; yet, in many orchards of traditional areas pollination is performed by wild hazelnuts, abundant in the surroundings. As a result of cross-pollination, the embryo of the seed contains both maternal and paternal DNA, and this makes the identification of the maternal cultivar using nuclear DNA markers more difficult. The only maternal tissue of the seed is the episperm, i.e. the thin pellicle surrounding the embryo originated from the ovule integuments. This pellicle is rich in fibre and is always present in raw kernels, while it is removed after roasting before processing. The aims of this study were: (i) to test protocols of DNA extraction, including a commercial kit, for optimising DNA extraction from different matrices of hazelnut (kernel, episperm, paste, grain, flour and different types of snacks containing hazelnuts such as cream, cookies and chocolate); (ii) to evaluate SSR marker efficiency for cultivar identification in hazelnut food products and determine possible drawbacks that hamper the applicability of the method.

Plant material
The plant material (Table 1) was collected from a single true-to-type plant of 'Tonda Gentile' grown in Piedmont, Cuneo Province (Italy). Young fresh leaves were used as a reference to compare the DNA extraction and amplification from other matrices.
The following samples were prepared for DNA analysis:
- Raw episperm from a single kernel, collected after soaking the kernel for two days in deionized water;
- Single raw kernel without episperm;
- Episperm from a single kernel, roasted at five different temperatures.
The industrial roasting temperature is usually set at 150-160 °C. Episperm was completely removed by rubbing the seed after roasting.
All samples were stored at + 4 °C until DNA extraction.

Food material
The food matrices used for the trial are reported in Table 1.
Industrial paste (kernels without episperm chopped finely until a paste is obtained), grain (kernels deprived of episperm and coarsely chopped) and flour (roasted kernels reduced to powder) are industrial hazelnut products obtained using exclusively roasted hazelnut kernels deprived of the episperm. Cream, cookies and chocolate are products that contain hazelnut kernels and other ingredients. Paste: two hazelnut pastes made from 'Tonda Gentile' kernels (a homemade paste and an industrial roasted paste) were analysed.
The homemade paste was prepared in the laboratory using 100 kernels roasted for 20 min at 160 °C and deprived of the episperm, ground in a mixer to obtain a thick paste.
The industrial paste, labelled 'Tonda Gentile', was bought from an Italian hazelnut processing company that uses only nuts of this cultivar.
The pastes were aliquoted in 50 ml tubes, centrifuged at 1200 rpm at room temperature for 15 min to remove the oil, and then stored at + 4 °C until DNA extraction.
Grain and flour: samples of industrial hazelnut grain and flour (cultivar not declared) were obtained from a company in the sector and were stored at + 4 °C until DNA extraction.
Industrial processed hazelnut products: samples of cream, cookies and chocolate of industrial origin containing hazelnut (cultivar not declared) and other ingredients were ground at room temperature and then stored at -80 °C until DNA extraction.

DNA extraction method
DNA extractions of all samples used in the trial were performed using the Doyle and Doyle protocol [24], with appropriate changes according to the plant and food material. Three different incubation times in CTAB buffer (10, 30, 45 min) and in isopropanol (30 min, 2 h, overnight) were tested. In addition, since DNA from highly coloured matrices may contain strong PCR inhibitors [25], two DNA washing steps with 70% ethanol were carried out.
The Nucleospin ® Food (Macherey-Nagel, Düren, Germany) kit was also tested to improve the DNA extraction from food material. The extraction was performed according to the manufacturer's instructions.
The concentration and quality of the extracted DNA were checked with an Ultrospec 2100 Pro ® UV-visible spectrophotometer (Amersham Biosciences, Piscataway, New Jersey, USA). DNA was treated with RNase A (Sigma-Aldrich, Germany) to completely remove RNA traces.
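As an illustration of the spectrophotometric check, the following minimal sketch converts absorbance readings into a concentration estimate and the two purity ratios. It assumes the common convention that one A260 unit of double-stranded DNA, read with a 1 cm path length, corresponds to about 50 ng/µL. The function name, the dilution factor and the example readings are hypothetical and are not values from this study.

```python
def dna_quality(a230, a260, a280, dilution_factor=1.0, ng_per_a260=50.0):
    """Estimate dsDNA concentration (ng/uL) and purity ratios from absorbances.

    ng_per_a260=50 is the usual convention for double-stranded DNA with a
    1 cm path length; it is an assumption, not a value from the study.
    """
    if min(a230, a260, a280) <= 0:
        raise ValueError("absorbance readings must be positive")
    return {
        "conc_ng_per_ul": a260 * ng_per_a260 * dilution_factor,
        "ratio_260_280": a260 / a280,
        "ratio_260_230": a260 / a230,
    }


# Hypothetical readings on a 1:10 dilution, for illustration only.
result = dna_quality(a230=0.25, a260=0.40, a280=0.20, dilution_factor=10)
print(result)
```

By convention, a 260/280 ratio close to 1.8 and a 260/230 ratio around 2.0 are read as indicating relatively pure DNA, while lower values point to carry-over of proteins or of polysaccharides and phenolic compounds.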

PCR amplification
For the Polymerase Chain Reaction (PCR), the DNA was used at a concentration of 10 ng/µL. Samples with a higher DNA concentration were diluted to 10 ng/µL before amplification.
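The dilution to the working concentration follows the usual relation C1·V1 = C2·V2. The sketch below is only a worked example of that arithmetic; the function name, the stock concentration and the final volume are hypothetical, and only the 10 ng/µL target comes from the text.

```python
def dilution_volumes(stock_ng_per_ul, final_volume_ul, target_ng_per_ul=10.0):
    """Return (stock_ul, diluent_ul) needed to reach the target, via C1*V1 = C2*V2."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already below the target concentration")
    stock_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_ul, final_volume_ul - stock_ul


# Hypothetical extract at 80 ng/uL, diluted to 40 uL of working solution.
stock_ul, diluent_ul = dilution_volumes(stock_ng_per_ul=80.0, final_volume_ul=40.0)
print(f"{stock_ul:.1f} uL stock + {diluent_ul:.1f} uL diluent")
```

Extracts that are already below the target concentration, as can happen with processed food matrices, cannot be brought to 10 ng/µL by dilution, which is why the function raises an error in that case.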
The fluorochromes 6-FAM, HEX, and NED were used to label the forward primers and the amplification products were analysed using a 3130 Genetic Analyzer sequencer (Applied Biosystems, Foster City, California, USA). GeneMapper software v. 4.0 (Applied Biosystems, Foster City, California, USA) was used to analyse the results and the ladder GeneScan ™ 500 LIZ ® Size Standard was used to estimate the allele sizes.

DNA extraction
The parameters tested, such as the incubation times in the CTAB lysis buffer and in isopropanol, contributed to improving the quality of the extracted DNA. For leaf tissues and kernel, a 10 min incubation was sufficient to obtain good-quality DNA, while for episperm and food materials a longer incubation in CTAB buffer was needed to isolate DNA. For episperm, grain, flour, cookies, cream and chocolate the best incubation time was 30 min, while for paste it was 45 min. An extended incubation time in isopropanol improved DNA recovery from food matrices. RNA was completely removed using RNase A. The Nucleospin ® kit proved more suitable for DNA extraction from food matrices, allowing the recovery of DNA of higher concentration and quality.
The DNA quantity and integrity were evaluated both by agarose gel electrophoresis (Fig. 1) and by spectrophotometric analysis; the ratios 260/230 nm and 260/280 nm are reported in Table 3. For plant matrices, since the DNA was of good quality and quantity, it was possible to visualize its integrity on the agarose gel, showing a clear band; for roasted episperm, since the quantity and quality of extracted DNA was low, it was not visualised on agarose gel. Concerning the food matrices, homemade paste, grain and flour produced a clear DNA band, while for industrial paste, cream, cookies and chocolate, DNA bands could not be visualized on the agarose gel, probably because they contain highly degraded DNA, in part masked due to the presence of inhibitors.

SSR genotyping
PCR amplification was successful at all ten nuclear SSR loci for all samples, whether extracted with the Doyle and Doyle protocol or with the Nucleospin ® Food kit. For food material, increasing the number of PCR cycles from 28 to 34 was found to be optimal for increasing the amplification yield.
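To give a sense of scale for the six additional cycles, the short sketch below computes the theoretical gain in product under the idealised assumption of exponential amplification with a constant per-cycle efficiency. The function name and the 0.8 efficiency value are hypothetical; real reactions on degraded or inhibited templates reach a plateau and will deviate from this model.

```python
def fold_gain(cycles_before, cycles_after, efficiency=1.0):
    """Theoretical gain in PCR product from extra cycles.

    efficiency=1.0 means perfect doubling at every cycle (an idealisation).
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return (1.0 + efficiency) ** (cycles_after - cycles_before)


print(fold_gain(28, 34))                  # perfect doubling: 64.0
print(round(fold_gain(28, 34, 0.8), 1))   # reduced efficiency: roughly 34-fold
```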

Plant materials
The SSR profiles of 'Tonda Gentile' leaves corresponded to the profile of this cultivar in the database developed at DISAFA [5] at all ten SSR loci, 5 of which (Cat-B501, Cat-B504, Cat-B502, Cat-B107 and Cac-B028) are reported as examples in Fig. 2.
The correspondence with the reference profile of 'Tonda Gentile' was also confirmed for raw and roasted episperm samples. Specifically, the SSR fluorescence signals were clear and reproducible for raw episperm and for samples of episperm roasted at 110 °C and 130 °C, as shown in Fig. 3. When the roasting temperature was increased (samples roasted at 140 °C, 150 °C and 160 °C), the analysis was no longer repeatable: the SSR profiles showed weak and unclear signals, and sometimes no peaks at any SSR locus.
For raw kernel and kernel roasted at ≤ 130 °C, the DNA amplified well and the SSR profiles were clear. As expected, the SSR profiles included the pollenizer alleles that limited the correct identification of the maternal cultivar. However, in all DNA samples from raw and roasted kernels the presence of one of the two alleles of the cultivar 'Tonda Gentile' was always confirmed.
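The distinction between maternal tissue (episperm) and the embryo can be expressed as a simple comparison rule: a maternal tissue should reproduce the reference genotype exactly, whereas a kernel should share one allele with the mother at every locus. The sketch below illustrates this rule only; the function name is hypothetical and the allele sizes are invented for illustration, not the actual 'Tonda Gentile' profile.

```python
def compare_to_mother(sample, mother):
    """Classify a single-tissue SSR profile against the maternal reference.

    Both arguments map a locus name to the set of allele sizes (bp) scored
    at that locus. Returns 'identical', 'compatible' or 'excluded'.
    """
    if sample.keys() != mother.keys():
        raise ValueError("profiles must be scored at the same loci")
    if all(sample[locus] == mother[locus] for locus in mother):
        return "identical"      # expected for maternal tissue (episperm)
    if all(sample[locus] & mother[locus] for locus in mother):
        return "compatible"     # a maternal allele at every locus, as in an embryo
    return "excluded"           # at least one locus carries no maternal allele


# Hypothetical allele sizes, for illustration only.
mother = {"Cat-B501": {120, 128}, "Cat-B504": {160, 176}}
episperm = {"Cat-B501": {120, 128}, "Cat-B504": {160, 176}}
kernel = {"Cat-B501": {120, 134}, "Cat-B504": {176, 182}}
foreign = {"Cat-B501": {116, 134}, "Cat-B504": {176, 182}}

for name, profile in [("episperm", episperm), ("kernel", kernel), ("foreign", foreign)]:
    print(name, compare_to_mother(profile, mother))
```

A 'compatible' result does not prove the maternal cultivar, since an unrelated genotype may share alleles by chance; only an 'excluded' result is conclusive for a single kernel.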

Food materials
DNA samples from homemade paste, industrial paste and other processed food produced genetic profiles at all tested SSR loci. Food DNA samples isolated through the Nucleospin ® Food kit protocol produced SSR profiles  Nuclear SSR profiles of homemade and industrial paste, grain, flour, cookies, cream and chocolate were characterized by the presence of many alleles. Among the alleles found, two alleles at each locus were often dominant and matched the two alleles of the cultivar of origin in the case of matrices originating from 'Tonda Gentile'; additional alleles were due to the presence of paternal DNA from different pollenizers in the bulk of kernels that were processed (Fig. 4: homemade and industrial paste; Fig. 5: cookies and chocolate).

DNA extraction protocol
In this work, we tested the efficiency of the genomic DNA extraction protocol Doyle and Doyle [24] on hazelnut plant material (kernel and episperm), and on food materials containing only hazelnut (paste, grain, flour) or hazelnut as an ingredient (cream, cookies and chocolate).
The Doyle and Doyle extraction protocol [24] was successfully adapted and made it possible to obtain DNA from different types of plant and food material. The use of the commercial Nucleospin ® Food kit made it possible to standardize and speed up DNA extraction, as suggested by many authors [17,22,28], and it can be recommended as a quick and simple way to isolate DNA in the quality control laboratories of the food industry.
The Nucleospin ® method gave the best results for the extraction of hazelnut DNA from food matrices, in agreement with several authors [25,29]. Nevertheless, the CTAB-based method was also successful with small changes, such as a longer incubation time in CTAB buffer and an extended incubation time in isopropanol. In both cases, RNase A was used to remove RNA residues and obtain amplifiable DNA, as suggested by Costa et al. [30,31].
However, the DNA isolated from food matrices (homemade and industrial paste, grain, flour, cookies, cream and chocolate) was of low quality, as shown by the 260/280 and 260/230 absorbance ratios (Table 3), suggesting contamination with polysaccharides, proteins, lipids or other organic compounds. In addition, the amount of DNA was low and poorly or not detectable on agarose gel, due to degradation (Fig. 1). Nevertheless, a low DNA concentration and a low 260/280 ratio do not necessarily indicate that the DNA is unsuitable for PCR analysis [28].

SSR analysis
In this study, we evaluated the applicability and efficiency of 10 SSR markers for the traceability of the cultivar in different products based on hazelnuts. We have to consider that food processing conditions, including high temperature and mechanical stress, may lead to the degradation of DNA into small fragments [7,28] and that the extracted DNA, despite optimization of the protocol, can contain several PCR inhibitors.
To obtain DNA amplification, the use of BSA was required due to the presence of inhibiting substances that prevented the activity of Taq-DNA polymerase and consequently interfered with PCR amplification. Indeed, PCR can be negatively affected by polysaccharides, oils, fatty acids, polyphenols and tannins, compounds commonly present in commercial food. According to Schrader et al. [32], the addition of BSA to the PCR mixture is effective against some PCR inhibitors, which may interfere with nucleic acids, hamper the annealing of primers or inhibit/alter/degrade DNA polymerase. The allelic profile obtained from 'Tonda Gentile' leaves (Fig. 2) was used as reference to estimate the correspondence of profiles obtained from the other matrices from this cultivar.
The SSR profiles obtained through amplification of DNA isolated from raw (Fig. 3a) and roasted (Fig. 3b) episperm perfectly matched the allelic profile of the cultivar 'Tonda Gentile' at all ten loci. This result was expected since episperm is a maternal tissue. In particular, the fluorescence signals were clear-cut for raw episperm and for episperm roasted at temperatures ≤ 130 °C, but increasing the roasting temperature to 140 °C, 150 °C or 160 °C resulted in a loss of reproducibility of results. The consequences of roasting were thoroughly studied by Ogasawara et al. [33] on soybean, who found that only short DNA fragments could be amplified after roasting. Similarly, in hazelnut, roasting temperatures > 130 °C hindered the amplification of DNA from the kernel episperm, which is a thin plant tissue very easily penetrated by heat.
Therefore, when a batch of raw hazelnuts has to be identified by food companies and control authorities, the analysis of DNA from the raw episperm offers the possibility of correctly determining the cultivar of origin and could represent an effective way to detect fraud. Moreover, kernels could be roasted at ≤ 130 °C to facilitate the manual removal of episperm, avoiding contamination with kernel tissue, which may compromise the correct cultivar identification due to the presence of paternal DNA in the embryo. Indeed, removing the episperm from raw seeds is more complex than removing it from seeds roasted at low temperatures, although hazelnut cultivars differ in ease of pellicle removal: e.g. 'Tonda Gentile' is highly peelable, while in other cultivars episperm removal can be difficult even after roasting.
Clear-cut SSR profiles at all loci were also obtained for DNA extracted from homemade paste (roasted at 160 °C for 20 min). As reported by Gryson [7], although temperatures above 100 °C induce a physical DNA degradation (denaturation, significant strand scissions, irreversible loss of secondary structure), temperatures used for processing do not destroy all DNA, but rather they shear it into smaller fragments available for PCR amplification. Therefore, the results of SSR analysis confirmed the possibility to detect and amplify small DNA fragments (100-300 bp which is the range of sizes of the SSR loci used) from processed kernels, such as roasted homemade and industrial paste. In this case, the interpretation of SSR profiles is not easy because of the presence of many alleles, owing to the pollenizers' DNA, hampering the identification of the main cultivar. Yet, it was always possible to identify the two alleles of the 'Tonda Gentile' cultivar (Fig. 4).
SSR profiles at all loci were also obtained for DNA isolated from processed food products (cream, chocolate and cookies that contain hazelnut kernels mixed with additional ingredients). However, as described for hazelnut paste, SSR profiles were affected by the presence of alleles from the pollenizers, or from other cultivars when the hazelnuts used are a mixture of cultivars, and this hampered the identification.
In the case of monovarietal products with the cultivar declared on the label, in our case 'Tonda Gentile', it was always possible to identify the alleles of the reference cultivar, although the presence of the pollenizers' alleles makes it impossible to exclude that the lot is a blend of kernels from more than one cultivar.
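The presence/absence reasoning applied to bulk matrices can be sketched as a per-locus set comparison between the alleles detected in the sample and those of the declared cultivar. The function name and all allele sizes below are hypothetical and serve only to illustrate the logic; they are not data from this study.

```python
def screen_bulk_sample(sample, reference):
    """Compare a bulk-sample SSR profile with a declared cultivar.

    Both arguments map a locus name to a set of allele sizes (bp).
    Returns (missing, extra): reference alleles not detected in the sample,
    and sample alleles that do not belong to the reference, per locus.
    """
    missing, extra = {}, {}
    for locus, ref_alleles in reference.items():
        found = sample.get(locus, set())
        if ref_alleles - found:
            missing[locus] = ref_alleles - found
        if found - ref_alleles:
            extra[locus] = found - ref_alleles
    return missing, extra


# Hypothetical allele sizes, for illustration only.
reference = {"Cat-B501": {120, 128}, "Cat-B504": {160, 176}}
paste = {"Cat-B501": {116, 120, 128, 134}, "Cat-B504": {160, 176, 182}}
other = {"Cat-B501": {116, 134}, "Cat-B504": {160, 182}}

for name, profile in [("paste", paste), ("other", other)]:
    missing, extra = screen_bulk_sample(profile, reference)
    verdict = "declared cultivar not excluded" if not missing else "declared cultivar excluded"
    print(name, verdict, "| missing:", missing, "| extra:", extra)
```

An empty `missing` result only means that the declared cultivar cannot be excluded; the `extra` alleles may come from pollenizers or from other cultivars in a blend, and the two cases cannot be told apart with this comparison.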

Conclusions
SSR marker analysis is a well-established, simple and common technique used for germplasm characterization of plant species. The aim of this work was to demonstrate its applicability to hazelnut DNA traceability along the supply chain.
SSR markers proved to be efficient for hazelnut cultivar identification in plant materials, such as raw episperm and episperm roasted at ≤ 130 °C. For kernel-based matrices, they made it possible to verify the presence/absence of the alleles of a declared cultivar, but identification is hampered by the signal of the pollenizers, since it is not possible to distinguish alleles from the maternal and paternal parents. This problem is further amplified when the commercial lot is a mix of varieties, as happens for Turkish cultivars. However, this is not the case for cultivars protected under geographical denominations and commercialised in purity by the industry, such as 'Tonda Gentile'. Lang et al. [23] were able to separate European cultivars from Georgian and Azerbaijani cultivars, with Turkish ones showing an ambiguous behaviour, based on polymorphic variations in the chloroplast genome.
Polymorphisms in the chloroplast genome would be a straightforward way to distinguish productions in seeds of self-incompatible species but, at the moment, there is no evidence of molecular markers able to separate cultivars of the same geographic area, such as European hazelnut cultivars. We have already explored this possibility, but no stable polymorphism was found among European cultivars after sequencing the chloroplast DNA (unpublished data). Further studies may be able to select stable markers for distinguishing Turkish commercial provenances from productions of European cultivars, a result that would allow the recognition of fraud when Turkish semi-processed products are sold under the name of renowned cultivars of higher quality. At this stage of the research, our study demonstrated that, given a reference cultivar, it is possible to identify the SSR alleles of the mother cultivar in all the food matrices considered. If the alleles of the putative mother cultivar are not present, it can be excluded that the hazelnuts were produced by that cultivar.