Introduction

Durum wheat is a staple food for human nutrition since it is the main ingredient of pasta, couscous, bulgur, and some breads in Mediterranean areas. In Italy, a considerable part of the durum wheat is consumed in several forms of bread. Such breads may have very different characteristics and are obtained through various methods, although they share some features. They typically have a yellowish colour, a characteristic taste, smell, and aroma, a fine crumb structure, and a prolonged shelf-life, all of which give these breads high appeal to consumers (Ficco et al., 2017). This has raised consumers’ interest in durum wheat-based bread, so producers are trying to set up standardized industrial processes that improve on current production, which is still based on an artisanal approach in most cases. Nevertheless, studies on semolina dough characterization are less numerous in the literature than those on ordinary wheat flour. Although there are several studies concerning the dough rheological properties (Martín-Esparza et al., 2018; Meerts et al., 2017a, b; Mironeasa et al., 2019), very few analyze the relationship between these properties and the microstructure of the material, which is responsible for the dough mechanical properties.

In this sense, Fourier transform infrared spectroscopy (FTIR) is a promising technique for dough characterization because the measurements are easy to perform and, at the same time, can give both qualitative and quantitative chemical information about the microstructure. With this technique, it is possible to identify molecular functional groups by detecting the absorption of infrared light in the wavenumber range 400–4000 cm−1. In this regard, the dough’s main components (gluten protein, water, fat, and starch) can be easily identified due to their known infrared absorption frequencies (Karoui et al., 2010). Gluten proteins represent the protein part of the dough, mainly consisting of two types of macromolecules: the monomeric soluble gliadins and the glutenins, aggregates of insoluble proteins linked by interchain disulphide bonds (Wieser, 2007). Glutenins are able to create a three-dimensional network in which they interact with gliadins by non-covalent forces, mainly hydrogen bonds (Tuhumury et al., 2014). The characteristics of this network are of great importance in determining the quality of the final product. The water absorption capacity and water mobility in flour products greatly depend on the distribution of polar groups, on the accessibility of these groups to water, on the relative strength of water-water and water-macromolecule interactions, on the degree of crystallinity of the matrix, and on the relative humidity conditions (Wang et al., 2014). Spectroscopic methods, like infrared spectroscopy, are regarded as a valuable tool to study changes in the gluten network structure (Nawrocka et al., 2018a). In particular, the secondary structure of the proteins is studied by analysis of the characteristic Amide I band (1570–1720 cm−1) (Rumińska et al., 2020). However, this band is strongly dominated by the water OH deformation peak at approximately 1640 cm−1 (Kong & Yu, 2007), which in most cases overlaps with it (van Velzen et al., 2003). For this reason, the Amide III band (1200–1340 cm−1) is used instead, since it is less affected by water oscillations (Nawrocka et al., 2020).

As regards secondary structures, unhydrated gluten is mainly characterized by β-sheet (39%) and random (30%) conformations, which are converted into β-turns when gluten is hydrated, as occurs in dough development (Dong et al., 1990). The α-helix content, instead, increases as humidity increases, so its destruction might be related to the reduction of hydrogen bonding in the gluten (Jia et al., 2018).

Another advantage of FTIR is its capability to provide quantitative information about water populations in the bread dough (Bock & Damodaran, 2013). Free water availability is one of the essential properties affecting the building of the dough structure, and its amount should be carefully dosed (Fanari et al., 2020). Water has characteristic infrared resonances at wavenumbers of 3350 cm−1 (OH stretching), 1640 cm−1 (OH bending), and in the region beneath 800 cm−1 (OH deformation) (van Velzen et al., 2003), the latter sometimes being related to excess water (Warren et al., 2016). The analysis of water populations through IR bands can reveal a decrease or increase in the number of strong and weak hydrogen bonds between gluten proteins and water molecules, and a lack or excess of free water in the gluten network, with respect to the dough formulation or wheat characteristics (Nawrocka et al., 2017). The dehydration of the gluten matrix is regarded as an effect of the competition for water between gluten proteins and polysaccharides, which is connected with a redistribution of the water in the wheat dough (Xuan et al., 2017). This phenomenon is strongly influenced by the gluten content of the dough since a higher amount of gluten proteins is usually connected with a higher water absorption capacity (Fanari et al., 2019).

The connection between rheological properties and IR measurements can be an effective tool in the material characterization field, as demonstrated in the recent literature (Chen & Zhen, 2021; Öztürk, 2021; Radebe et al., 2021). This work correlates the possible gluten protein structures at the molecular level with the macroscopic rheological properties of the dough. The IR spectrum analysis focuses specifically on the Amide III band. A Partial Least Squares (PLS) model is proposed here to relate FTIR measurements and rheological model parameters. Chemometric approaches have recently been used to relate food properties and measurements obtained by different spectroscopic techniques (Carbas et al., 2020; Cortés et al., 2019; Cueto et al., 2018; Liu et al., 2014a, b; Wu et al., 2014).

In this work, in order to estimate the rheological properties of durum wheat dough from FTIR measurements, the Amide III band was investigated, and a data-driven model was built by considering different semolina varieties that differ significantly in gluten content, gluten index, and protein content, to cover a wide variety of possible semolina features (see details in Table 1). Dough composition was further varied by considering different water/flour ratios.

Partial least squares regression allows the estimation of the Variable Importance in Projection (VIP) scores for the regression model (Liu et al., 2014a, b). These scores identify the most significant wavenumbers, and thus the corresponding protein structures and their role in defining rheological properties. The purpose is to determine whether this technique can effectively detect changes in the dough microstructure, in view of a possible implementation of an on-line monitoring system based on rapid and non-invasive FTIR measurements.

Materials and Methods

For each sample, 300 g of semolina and distilled water in different amounts were kneaded using a measuring mixer type 350 (Brabender). The mixing time was determined as the time required to reach the maximum strength point of the dough, as observed in previous investigations (Fanari et al., 2021), and it was found to be 5–8 min, depending on the sample. Values of the mixing time are reported in Table 2. The rotational mixing speed was set to 20 rpm to avoid structure breaks in the dough network. Three different types of semolina were investigated. Two are non-commercial monovarietal semolinas obtained by milling Karalis and Cappelli grains cultivated in the “S. Michele” experimental farm of the Agricultural Research Agency of Sardinia (AGRIS) in Ussana (Italy). The third is a commercial blend purchased in German retail stores (Gold Puder-Hartweizen Grieß, Aurora Mühlen GmbH, Hamburg, Germany). Their properties are quite different in terms of protein content, gluten percentage, and gluten index, as can be seen in Table 1. Gluten index (G.I.), a parameter providing information on both gluten quality and gluten quantity, was determined by the AACC Int. 38-12A Standard method (AACC International, 2000). Table 2 reports the water content in the dough samples. The percentage of water is based on the semolina's total weight. The semolina protein content was determined through the nitrogen combustion method (ISO 16634–2:2016, 2016) using a Leco FP528 nitrogen analyzer (LECO, Stockport, U.K.). Gluten content and G.I. of semolina were determined following the ICC standard method No. 158 (ICC, 1995) by using the Glutomatic 2200 system (Perten Instruments AB, Huddinge, Sweden).

Table 1 Semolina varieties characteristics (means ± STD); values with different letters in the same column are significantly different (p-value ≤ 0.05)
Table 2 Samples water content based on semolina weight

FTIR Measurements

The IR spectrometer used for the measurements was a Vertex 70 (Bruker, Ettlingen, Germany), equipped with a Universal ATR (attenuated total reflectance) sampling device with a diamond crystal. Spectra were collected in the 600–4000 cm−1 infrared spectral range at room temperature. Each spectrum was an average of 32 scans at 2 cm−1 resolution. Data were processed by the OPUS software (Bruker, Ettlingen, Germany). Each spectrum was manually corrected with a linear baseline using ORIGIN (v.9.0 PRO, OriginLab Corporation, USA). All experiments were repeated three times on three different portions of the sample. To obtain quantitative data, the peak heights were compared after normalizing all the spectra by dividing them by the intensity of the CO/CC peak. Consequently, this peak cannot be used for quantification purposes of the CO/CC band. The second derivative of the processed spectra was calculated to identify the protein secondary structure in the Amide III band. Local minima of the second derivative, considering only negative values, were used to identify each secondary structure and assign it to its corresponding wavenumber range (Seabourn et al., 2008). Each Amide III peak was then integrated over the corresponding band interval. This interval was chosen based on the literature (intervals and references are reported in Table 3). The percentage content of each structure was estimated by dividing the integral over the interval by the integral over the entire band. Mean values and standard deviations for each sample triplet were calculated.
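The processing steps described above can be sketched as follows. This is only a minimal illustration in plain Python, not the OPUS/ORIGIN workflow used for the measurements: all function names are hypothetical, the second derivative is a plain central finite difference without smoothing, the wavenumber axis is assumed to be ascending and uniformly spaced, and the interval limits are the Amide III regions quoted from Cai & Singh (1999), which may differ from the values in Table 3.

```python
# Assumed Amide III sub-regions in cm^-1 (after Cai & Singh, 1999)
AMIDE_III_INTERVALS = {
    "beta-sheet": (1200.0, 1250.0),
    "random coil": (1250.0, 1270.0),
    "beta-turn": (1270.0, 1295.0),
    "alpha-helix": (1295.0, 1330.0),
}

def normalize_by_band(x, y, lo, hi):
    """Divide the spectrum by the maximum intensity found in [lo, hi]."""
    ref = max(yi for xi, yi in zip(x, y) if lo <= xi <= hi)
    return [yi / ref for yi in y]

def second_derivative(x, y):
    """Central finite-difference second derivative (end points set to 0)."""
    h = x[1] - x[0]
    d2 = [0.0] * len(y)
    for i in range(1, len(y) - 1):
        d2[i] = (y[i - 1] - 2.0 * y[i] + y[i + 1]) / (h * h)
    return d2

def negative_minima(x, d2):
    """Wavenumbers at which the second derivative has a negative local minimum."""
    return [x[i] for i in range(1, len(d2) - 1)
            if d2[i] < 0 and d2[i] < d2[i - 1] and d2[i] < d2[i + 1]]

def trapezoid(x, y, lo, hi):
    """Trapezoidal integral of y over the grid points with lo <= x <= hi."""
    pts = [(xi, yi) for xi, yi in zip(x, y) if lo <= xi <= hi]
    return sum(0.5 * (y0 + y1) * (x1 - x0)
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))

def structure_percentages(x, y, intervals):
    """Area of each interval as a percentage of the whole-band area."""
    band_lo = min(lo for lo, _ in intervals.values())
    band_hi = max(hi for _, hi in intervals.values())
    total = trapezoid(x, y, band_lo, band_hi)
    return {name: 100.0 * trapezoid(x, y, lo, hi) / total
            for name, (lo, hi) in intervals.items()}
```

In practice, a smoothed derivative (for example a Savitzky-Golay filter) is usually preferred on measured spectra, since finite differences amplify noise; the sketch omits this to stay self-contained.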

Table 3 Secondary structures frequency range in Amide III region, with reference to the literature (Cai & Singh, 1999; Singh, 1999)

Rheological Measurements

Rheological experiments were performed with an ARES-G2 strain-controlled rheometer (TA Instruments, New Castle, USA) equipped with a 25-mm parallel plate geometry. Immediately after the kneading process, a piece of dough was loaded on the rheometer, compressed to a gap of 2 mm, and then left at rest for 15 min to allow material relaxation, as suggested in the literature (Phan-Thien & Safari-Ardi, 1998). A layer of silicone oil was applied to the edge of the parallel plate geometry to prevent water evaporation from the sample. The measurement temperature in the rheometer was kept constant at 25 °C using a Peltier temperature control system. Frequency sweep tests were performed with frequencies ranging from 0.1 to 100 rad·s−1 at a constant strain of γo = 0.1%, which is the upper limit of the linear viscoelastic regime as evaluated through preliminary amplitude sweep tests. Complex modulus data were modelled as a function of the deformation frequency using the Weak Gel model (Gabriele et al., 2001), reported in Eq. (1):

$$G^{*}(\omega)=\sqrt{G'(\omega)^{2}+G''(\omega)^{2}}=A_{F}\,\omega^{1/z}$$
(1)

where \({G}^{*}\) is the viscoelastic modulus, \(\omega\) is the angular frequency, \({A}_{F}\) is a model parameter that is related to the strength of the network structure, and \(z\) is a model parameter linked to the extension of the three-dimensional network. In the literature, several examples of the application of the Weak Gel model on doughs can be found (Angioloni & Collar, 2013; Baldino et al., 2014; Lucas et al., 2019; Meeus et al., 2020). Each test was repeated three times for each sample, and the mean value was taken into account for the study.
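As an illustration of how the two parameters of Eq. (1) can be estimated, the sketch below fits the Weak Gel model by ordinary least squares on the log-transformed data, since log G* = log AF + (1/z) log ω is a straight line. This is an assumption made for the example: the fitting procedure actually used for the reported parameters is not stated here, and the function names are hypothetical.

```python
import math

def complex_modulus(g_storage, g_loss):
    """Magnitude of the complex modulus from G' and G''."""
    return math.hypot(g_storage, g_loss)

def fit_weak_gel(omega, g_star):
    """Estimate (A_F, z) from a straight-line fit of log(G*) versus log(omega)."""
    lx = [math.log(w) for w in omega]
    ly = [math.log(g) for g in g_star]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    slope = sxy / sxx              # slope equals 1/z
    intercept = my - slope * mx    # intercept equals log(A_F)
    return math.exp(intercept), 1.0 / slope
```

A log-log fit weights all decades of frequency equally, whereas a nonlinear fit on the raw moduli gives more weight to the high-frequency points, so the two approaches can return slightly different parameters on real data.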

Statistical Analysis

The spectral data were related to the rheological parameters of the samples by means of a PLS model (Geladi & Kowalski, 1986), which links the multivariate data matrix X (here, the IR spectra) to the rheological properties measured on the same samples. The PLS model is reported in Eq. (2), where the reference values y are the parameters AF and z estimated through the Weak Gel model.

$$\begin{array}{c}\mathbf{X}=\mathbf{T}\mathbf{P}^{T}+\mathbf{E}=\sum_{i=1}^{m}\mathbf{t}_{i}\mathbf{p}_{i}^{T}+\mathbf{E}\\ \mathbf{y}=\mathbf{T}\mathbf{q}^{T}+\mathbf{f}=\sum_{i=1}^{m}\mathbf{t}_{i}q_{i}+\mathbf{f}\\ \mathbf{T}=\mathbf{X}\mathbf{W}\left(\mathbf{P}^{T}\mathbf{W}\right)^{-1}\end{array}$$
(2)

where E (n × p) and f (n × 1) are the error matrices containing the parts of X (n × p) and y (n × 1), respectively, that the model does not explain, and n and p are the numbers of samples (rows) and variables (columns), respectively. In the present study, n = 36 and p = 74. The vector ti is the i-th column of the score matrix T (n × m), and pi and qi are the loadings that compose the loading matrices P (p × m) and q (1 × m), where m is the number of latent variables chosen to explain the significant variance of the data. The matrix W (p × m) is the weight matrix obtained by the PLS regression.
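The decomposition in Eq. (2) can be illustrated with a short single-response PLS routine based on the classical NIPALS iteration. This is a sketch under stated assumptions rather than the software used in the study: the function name and the returned dictionary are hypothetical, X and y are mean-centred inside the function, and scores, loadings, and weights are stored component by component (one list per latent variable).

```python
def pls1_nipals(X, y, m):
    """PLS1 by NIPALS: X is a list of n rows with p values, y a list of n values."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centred X
    f = [v - y_mean for v in y]                                # centred y
    T, P, W, q = [], [], [], []
    for _ in range(m):
        # weight vector: direction in X of maximum covariance with y
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = sum(v * v for v in w) ** 0.5
        w = [v / norm for v in w]
        # scores, X loadings, and y loading for this latent variable
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        p_a = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q_a = sum(f[i] * t[i] for i in range(n)) / tt
        # deflation: remove the part explained by this component
        E = [[E[i][j] - t[i] * p_a[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q_a * t[i] for i in range(n)]
        T.append(t); P.append(p_a); W.append(w); q.append(q_a)
    y_fit = [y_mean + sum(q[a] * T[a][i] for a in range(m)) for i in range(n)]
    return {"T": T, "P": P, "W": W, "q": q, "y_fit": y_fit, "residual": f}
```

After the loop, the deflated E and f play the role of the unexplained parts in Eq. (2), and the fitted response is the mean of y plus the sum of the score vectors weighted by their y loadings.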

After the PLS regression, the focus was on discriminating the wavenumbers at which the absorption signal most influences the changes in y (i.e., the rheological parameters) from those with no discrimination power. The relative importance of each wavenumber in the model can be described by the VIP scores. For the j-th variable, the VIP score in a PLS model with m latent variables can be calculated as:

$${VIP}_{j}=\sqrt{p\frac{{\sum }_{a=1}^{m}\left({q}_{a}^{2}{t}_{a}^{T}{t}_{a}{\left(\frac{{w}_{ja}}{\Vert {w}_{a}\Vert }\right)}^{2}\right)}{{\sum }_{a=1}^{m}{q}_{a}^{2}{t}_{a}^{T}{t}_{a}}}$$
(3)

where \({t}_{a}\) is the a-th column vector of the score matrix T, \({q}_{a}\) is the a-th element of the regression coefficient vector q of T, \({w}_{a}\) is the a-th column vector of the weighting matrix W, and \({w}_{ja}\) is its j-th element (Mehmood et al., 2012). Wavenumbers at which the VIP scores were above a threshold value of 1.0 were considered significant (Eriksson et al., 2006).
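Equation (3) translates directly into code. The sketch below assumes that the weights W and scores T are available as lists of m vectors (one per latent variable) and q as a list of m coefficients, as returned by any single-response PLS routine; the function names are hypothetical. A useful sanity check is that the squared VIP scores sum to p, so their mean square is 1, which is the rationale for the threshold of 1.0.

```python
def vip_scores(W, T, q):
    """VIP score of each of the p variables, following Eq. (3)."""
    m, p = len(W), len(W[0])
    # q_a^2 * t_a^T t_a: variation of y explained by latent variable a
    ss = [q[a] ** 2 * sum(t * t for t in T[a]) for a in range(m)]
    total = sum(ss)
    norms2 = [sum(w * w for w in W[a]) for a in range(m)]
    vip = []
    for j in range(p):
        num = sum(ss[a] * W[a][j] ** 2 / norms2[a] for a in range(m))
        vip.append((p * num / total) ** 0.5)
    return vip

def significant_variables(vip, threshold=1.0):
    """Indices of the variables whose VIP score exceeds the threshold."""
    return [j for j, v in enumerate(vip) if v > threshold]
```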

Results

FTIR Spectra, Peak Assignments, and Comparison

In Fig. 1, the FTIR spectrum of the CAP50 sample, considered the reference for the whole series of IR spectra, is reported. Six main peaks were detected.

Fig. 1 FTIR absorbance spectrum of CAP50 sample, normalized with respect to CO/CC peak intensity

Going from 4000 to 600 cm−1, the first high-intensity peak was assigned to the stretching of the OH in polymers, OH-H hydrogen bonds, and free OH (Socrates, 2001). The center of this peak is located in the range 3250–3400 cm−1, varying from one sample to the other. Unfortunately, it is not possible to distinguish which fraction of the OH groups is hydrogen-bonded. This region is followed towards lower wavenumbers by a very small peak, typical of the C-H stretching (CH) of CH and CH2 groups (Alvarez & Vázquez, 2006); this band, approximately situated between 2800 and 3000 cm−1, suggests the presence of aliphatic groups like methyl or methylene, and in general hydrophobic groups. Thus, it can be linked to the presence of unsaturated lipids and the minor contribution of small carbohydrates. This assignment may explain its limited intensity, since the intensity reflects both the concentration and the specific dipole moment of the bond under investigation. Going to lower wavenumbers, the next identified band is in the region 1600–1710 cm−1, usually associated with the C=O stretching of the amide group (Fevzioglu et al., 2020) and called “Amide I” (A1), in combination with the OH bending of water (1640 cm−1) (Kong & Yu, 2007). This area contains several peaks, and it is frequently subjected to deconvolution because this procedure can give information on the amount of the different protein structures. In all the samples investigated here, the peak centered at around 1635 cm−1 and linked to the -OH in-plane bending of water was visible (Wang et al., 2001). Another peak region linked to proteins was identified at 1400–1550 cm−1. This is the typical location of the so-called “Amide II” (A2) band, associated with N–H bending and with gluten protein C-N and C–C stretching (van Velzen et al., 2003). The “Amide III” band (A3), at 1200–1340 cm−1, which mainly arises from N–H bending and C-N stretching vibrations (Fevzioglu et al., 2020), was also identified. Although its intensity is limited, a more detailed analysis of this band can be informative for the characterization of the protein structure, as already discussed in the introduction section. Another spectral region (CO/CC), located at 900–1200 cm−1, can be associated with the coupled C-O and C–C stretching vibrations of polysaccharide molecules (Sivam et al., 2012), mainly starch in this case. In particular, two interesting peaks are distinguishable: the first one, at about 1020 cm−1, is related to amorphous regions of starch, and the second one, at about 1080 cm−1, is linked to crystalline starch, according to Almeida and Chang (2013). Finally, the last peak, at around 700 cm−1 (COH), was assigned to the out-of-plane bending of the hydroxyl groups, sometimes linked to free water molecules (Célino et al., 2014). A summary of the identified peaks, their positions, and their assignments is reported in Table 4.

Table 4 Summary of the detected peaks, band wavenumbers, and assignments

Figure 2 compares the FTIR spectra of samples obtained from the same semolina but with different water amounts. Figure 2a shows the FTIR spectra of the samples with Cappelli (CAP), Fig. 2b those of the samples with Karalis (KAR), and Fig. 2c those of the samples obtained with the commercial flour (COM). Table 5 reports the peak intensities and integral values for all the samples. In general, water mainly affects the peak height and area, which increase with water content. The peak position is affected to a lesser extent.

Fig. 2 FTIR absorbance spectra of CAP (a), KAR (b), and COM (c) samples with different water amounts, 40% (blue), 50% (black), and 60% (red). All spectra were normalized with respect to CO/CC peak intensity

Table 5 Center (cm−1) and height of peaks and band area for spectra normalized with respect to CO/CC peak intensity (means ± std)

Concerning the impact of water content, the most influenced peaks are OH, COH, the peak in the Amide I band at about 1640 cm−1, which is linked to OH bending, and, to a lesser extent, the peak at about 1550 cm−1 in the Amide II band. KAR and COM samples present the most significant changes when the water quantity is increased from 40 to 50%: the OH peak intensity increases by about 70% and 25%, respectively, and the COH intensity by about 50% and 15%, respectively. In contrast, the peak heights of CAP samples change by 10–15% at most when the water quantity increases from 40 to 50%, and the most significant variations occur when water goes from 50 to 60%, with an increase of 40–45% for the OH and COH peaks. KAR samples, instead, show very small changes (less than 5%) in peak intensity when water goes from 50 to 60%. Figure 3 compares three samples prepared with the same water amount (50%) but different semolina varieties. This figure shows similar spectra for the KAR50 and COM50 samples, especially in the water-linked peaks, while CAP50 presents quite different intensities, particularly for the OH, CH, A2, and COH peaks. In more detail, the CAP50 sample shows significantly lower intensities in the OH and COH peaks and slightly higher ones in the CH and A2 peaks.

Fig. 3 FTIR absorbance spectra, comparison among CAP50 (black), KAR50 (blue), and COM50 (red). All spectra were normalized with respect to CO/CC peak intensity

Amide III Band Analysis

Amide III band analysis was performed to identify the protein structure bands by means of the second derivative of the spectra. The second derivative is shown in Fig. 4 for one of the CAP50 samples; comparable results in terms of peak positions were also obtained for the other samples. The negative peaks of the second derivative were associated with hidden peaks in the band, representative of the protein structure conformations. From Fig. 4, it can be seen that seven peaks are detectable. The peaks at 1242, 1263, and 1284 cm−1 can be assigned to β-sheet (β-S), random coil (RC), and β-turn (β-T) structures, respectively, while the peaks at 1302, 1315, and 1335 cm−1 can be associated with the α-helix (α-H) structure, according to several previous works (Cai & Singh, 1999; Nawrocka et al., 2017, 2018b; Wang et al., 2015). The peak at 1302 cm−1 is not always clearly visible, as it presents very small values in the second derivative. The peak at 1207 cm−1 is often assigned to tyrosine (Ngarize et al., 2005) or the associated conformation of phenylalanine (phe) and tyrosine (tyr) (Frushour & Koenig, 1975; Xie et al., 2004). Phenylalanine is one of the most abundant amino acids in the ω-gliadin fraction (Seilmeier et al., 2001). Tyrosine is mainly contained in high molecular weight glutenins (Peña et al., 2006). It might be involved in forming covalent interactions like tyr-tyr crosslinks between gluten polypeptide chains, influencing the gluten network structure (Kłosok et al., 2021). Moreover, tyrosine residues are able to create hydrogen bonds that improve the stability of the dough during the kneading process (Nawrocka et al., 2016). The involvement of tyrosine residues in hydrogen bond formation possibly accounts for intermolecular hydrogen bond formation between β-sheets acting as junction zones in stabilizing the gel network (Wang & Damodaran, 1991). This can explain the position of this peak inside the β-sheet region, confirming that the 1207 cm−1 peak can be related to the capacity of amino acids, tyrosine in particular, to create intermolecular bonds between protein chains.

Fig. 4 Amide III band second derivative for the CAP50 sample

Concerning the distribution of the protein structures, the integration results, in terms of the percentage contribution of each structure and the total peak area of the Amide III band, are reported for all samples in Fig. 5. In general, increasing the water content is reflected in a slight increase of β-sheets and a slight decrease of α-helices in CAP and KAR samples. However, comparing the absolute values among the samples prepared with different durum wheat varieties, KAR has a significantly higher amount of α-helices than the other varieties. Moreover, KAR is the variety with the lowest percentage of β-sheets and random structures. On the other hand, CAP and COM varieties have similar percentages of the four structures, with no significant differences, especially when the added water amount is 40–50%. The β-turn fraction, instead, does not show substantial differences among the three varieties.

Fig. 5 Secondary structures percentage contribution (means ± STD) in Amide III Region for CAP (black), KAR (blue), and COM (red) samples as a function of water content

Rheological Measurements

The rheological measurement results and the Weak Gel model fits are reported in Figs. 6, 7, and 8. To show the typical response of dough to sinusoidal deformation, the storage and loss moduli are reported as a function of frequency in Fig. 6, together with tanδ data, for the CAP50 sample, chosen as representative of the entire data set. The results do not change qualitatively for the other samples. The trends of the moduli are almost linear on the logarithmic scale. Concerning tanδ, a slight increase with frequency can be observed. The tanδ parameter is smaller than 1 in the frequency range investigated, implying a rheological response mainly dominated by the elastic contribution. Furthermore, the two moduli vary similarly, following two almost parallel linear trends, which justifies the use of the Weak Gel model (Gabriele et al., 2001).
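The two observations above (tanδ below 1, and nearly parallel trends of the two moduli on the logarithmic scale) can be checked numerically as in the following sketch. The frequency-sweep values are made up purely for illustration and are not measured data; the function names are hypothetical, and the slope is taken simply between the first and last points of the sweep.

```python
import math

def loss_tangent(g_storage, g_loss):
    """tan(delta) = G''/G' at each frequency."""
    return [g2 / g1 for g1, g2 in zip(g_storage, g_loss)]

def loglog_slope(omega, modulus):
    """End-to-end slope of log(modulus) versus log(omega)."""
    return ((math.log(modulus[-1]) - math.log(modulus[0]))
            / (math.log(omega[-1]) - math.log(omega[0])))

# Hypothetical frequency sweep (illustrative numbers only)
omega = [0.1, 1.0, 10.0, 100.0]               # rad/s
g_storage = [2000.0, 3500.0, 6200.0, 11000.0]  # G'
g_loss = [800.0, 1450.0, 2650.0, 4850.0]       # G''

tan_delta = loss_tangent(g_storage, g_loss)
# elastic contribution dominates when tan(delta) < 1 at every frequency
elastic_dominated = all(t < 1.0 for t in tan_delta)
# a small gap between the two slopes means nearly parallel trends
slope_gap = abs(loglog_slope(omega, g_storage) - loglog_slope(omega, g_loss))
```

When the two slopes coincide exactly, tanδ is constant with frequency; a small positive gap in favour of the loss modulus corresponds to the slight increase of tanδ with frequency.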

Fig. 6 Storage and loss modulus (red squares and blue triangles respectively), and tan(δ) values measured for CAP50 sample, reported as a function of frequency

Fig. 7 Comparison among |G*| of samples with an amount of water of 40% (blue), 50% (black), and 60% (red) for each variety investigated, CAP (a), KAR (b), COM (c). The black lines represent the Weak Gel model fitting curve

Fig. 8 |G*| comparison among CAP50 (black), KAR50 (blue), and COM50 (red); the black lines represent the Weak Gel model fitting curve

Figure 7 shows |G*| as a function of frequency for each variety and at different water amounts. CAP, KAR, and COM samples are reported in Fig. 7a–c, respectively. Figure 8, instead, shows the influence of the semolina variety on |G*|. Table 6, in the Appendix, reports the values of the Weak Gel model parameters, their confidence intervals, and the adjusted R2 value for the regression. In Fig. 9, these values are reported as means with confidence intervals. From Fig. 9, one can see that the rheological properties of CAP samples change less when the water amount changes, whereas COM samples are the most sensitive.

Fig. 9 Weak Gel model parameters (means and confidence intervals) reported for CAP (black), KAR (blue), and COM (red) samples as a function of water content

In more detail, when water goes from 40 to 50%, all samples registered a decrease in the network strength of about 25–30%, but when water is increased from 50 to 60%, KAR and COM samples show a decrease of AF parameter of about 55–60%. In comparison, CAP samples decrease their network strength by about 35%, showing an almost linear dependence. Moreover, the influence of water on the extension of the network, observable in Fig. 9, is clearly visible only for KAR samples. Regarding Fig. 8, it is possible to notice that the CAP50 sample has the highest G' and G" values, so the Cappelli semolina variety is the strongest one, followed by KAR. At the same time, COM turns out to be the weakest, in rheological terms, with the lowest strength of the network (Table 6). Concerning the network extension, explained by the parameter z, as reported in Fig. 9, KAR has a considerably higher network extension than the other two semolina doughs.

Relationship Between Rheological Measurements and Amide III Band

A PLS regression model relating the rheological AF and z parameters with the spectra of the A3 band was developed to assess a possible correlation between rheological properties and changes in the Amide III band composition. The number of latent variables chosen for the models was 5. The regression results are reported in Fig. 10, where the parameter values predicted by the PLS regression model are shown as a function of those obtained from the Weak Gel model regression. A good correlation between the AF parameter predicted by the PLS model and the experimental rheological data was found, as confirmed by the high value of R2 for the regression, equal to 0.832. On the other hand, the PLS technique failed to establish a solid correlation between z and the Amide III spectra, as seen from the low value of R2 for the regression (0.550). Therefore, it is reasonable to assume that the extension of the network, which is strictly related to the z-value, is not clearly influenced by the conformations of the protein structures and their quantities, at least for the experimental conditions investigated here.
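For reference, the coefficient of determination used above to judge the quality of the PLS predictions can be computed from paired reference and predicted values as in this minimal sketch. Whether the reported R2 values refer to calibration or to cross-validation is not stated here, so the function makes no assumption about it; the function name is hypothetical.

```python
def r_squared(y_ref, y_pred):
    """Coefficient of determination between reference and predicted values."""
    mean_ref = sum(y_ref) / len(y_ref)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_ref, y_pred))  # residual sum of squares
    ss_tot = sum((a - mean_ref) ** 2 for a in y_ref)           # total sum of squares
    return 1.0 - ss_res / ss_tot
```

A value of 1 means the predictions reproduce the reference values exactly, while a value of 0 means they do no better than predicting the mean of the reference values.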

Fig. 10 AF data (a) and z data (b) predicted by the PLS regression reported as a function of the ones estimated by the Weak Gel model (Gabriele et al., 2001)

Since the PLS regression proved effective in predicting AF, the VIP scores were used to establish which secondary structures in the Amide III band are more relevant in defining the rheological properties. Wavenumbers showing VIP scores higher than 1 were considered more significant in the definition of the AF parameter in this case.

VIP scores as a function of the wavenumbers are reported in Fig. 11a and compared to the second derivative (Fig. 11b) to analyze which intervals of the Amide III band are the most important. The positions of the local VIP score maxima are marked with arrows in Fig. 11a. As previously asserted (Cai & Singh, 1999), the Amide III band can be divided into four spectral regions, corresponding to 1200–1250, 1250–1270, 1270–1295, and 1295–1330 cm−1, assigned to β-sheets, random coils, β-turns, and α-helices, respectively. In this case, the VIP scores are higher than 1 at the second derivative minima occurring at 1207 cm−1 and 1242 cm−1, which are well within the β-sheet interval, and at the minimum at 1335 cm−1, belonging to the α-helix interval. Thus, one can conclude that β-sheet and α-helix configurations have a higher importance in defining the rheological strength of the network. On the other hand, the random coil and β-turn regions show lower significance in this sense. For the sake of completeness, the calculation of VIP scores concerning the z parameter has also been reported in the appendix (Fig. 12). The graph, which compares the VIP scores with the second derivative, shows that the most influential regions in the case of z are comparable to those identified for AF.

Fig. 11 Comparison between VIP scores of the PLS regression (a) and second derivative of the Amide III spectrum (b) for the AF parameter. Figure 11a is a representative example corresponding to the sample CAP50-3

Discussion

The spectra analysis reveals that the OH and COH peaks are positively related to the water amount. Indeed, looking at the differences in the OH peaks in the spectra of CAP50, KAR50, and COM50, it is possible to hypothesize that CAP50 has a lower amount of free water than the other two semolina samples. This hypothesis is supported by the higher amount of proteins and gluten of this variety, which gives it the capacity to bind higher quantities of water. Moreover, it was observed that CAP50 samples show more pronounced peaks (in terms of height and area) in the regions of CH and Amide II. Thus, one can hypothesize that these spectral bands might be related to the bound water in the network.

Since KAR has a higher G.I. value and a lower gluten quantity, it can absorb a lower amount of water (Dhaka & Khatkar, 2015). So, when water is added at 40%, the dough has limited water content, and probably most of this water is bound in the network. When the added water quantity increases to 50% and then to 60%, the network is not able to completely absorb such large amounts of water. Thus, a considerable part of the water stays in the dough as free water, significantly enhancing both OH and COH peaks. CAP, on the contrary, has a higher gluten content and a low G.I., which means that it can absorb larger amounts of water in the network and requires shorter mixing times. Consequently, free water accumulates significantly in the dough only from an added water content of 60%, at which the OH and COH peaks are enhanced.

The differences in water binding are also confirmed by the changes in network strength, which are more pronounced for KAR and COM samples when water is added up to 60%. The quantity of free water is an important parameter for predicting the final rheological properties of the dough: CAP50, which has a lower amount of free water, proves to be the strongest dough, with the strongest gluten network, as previously stated. This finding originates from the composition of its gluten network. As previously discussed, CAP has the highest percentage of β-sheets, which are considered the main component contributing to the formation of the network structure and, consequently, to the elastic properties of gluten proteins (Li et al., 2006). The COM and CAP semolina varieties have similar proportions of β-sheets and α-helices. Still, COM, owing to its lower amount of total gluten, shows less resistance to deformation, as can be observed in the rheological curves. KAR dough, on the contrary, despite its low percentage of β-sheets and its low water absorption capacity, still presents a higher consistency than COM. This leads to the hypothesis that the high amount of α-helices, which is related to the high content of glutenins (Shewry et al., 2002), is able to counterbalance the lower amount of β-sheets. This assumption is supported by the results of the VIP scores analysis.

Concerning the influence of water, a slight increase of β-sheets and a slight decrease of α-helices are observed in CAP and KAR samples. Since these differences are very small, however, it is difficult to relate them to the changes in rheological properties as a function of water amount. More likely, the water distribution in the dough is the dominant factor controlling the rheological response.

Regarding the chemometric technique proposed in this work, the PLS-VIP method combined with the analysis of the second-derivative spectra appears to be a promising tool for understanding the role of the protein configuration in the strength and extension of the gluten network, based on the information provided by the rheological measurements. In detail, the parameter AF of the Weak Gel model (linkable to the strength of the network) might be estimated from the Amide III band spectra. Moreover, the VIP scores analysis showed that β-sheet and α-helix structures have a higher importance in defining gluten network strength. This is also consistent with results recently obtained by Wang et al. (2022) through a different approach, which related the ratio of β-sheet to α-helix to the dough consistency. By contrast, the extension of the network, represented by the z parameter, did not depend on the changes in the protein structure contribution. The low variability of the z parameter among the different samples, together with the VIP scores analysis, which identified the same regions as the most significant for both AF and z, suggests that a more in-depth analysis would be necessary to better investigate the link between the extension of the network and protein conformations.
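For readers unfamiliar with VIP scores, the following Python sketch shows one common way of computing them for a single-response PLS model fitted with the NIPALS algorithm. It is a minimal illustration under stated assumptions, not the implementation used in this work: the data are random synthetic numbers (not spectra), the response is made to depend on two arbitrarily chosen columns (10 and 40), and the function names `pls1_nipals` and `vip_scores` are hypothetical.

```python
import numpy as np


def pls1_nipals(X, y, n_components):
    """Fit a single-response PLS model with NIPALS.

    Returns the weights W (p x A), the scores T (n x A) and the y-loadings q (A).
    """
    X = X - X.mean(axis=0)  # mean-centre predictors
    y = y - y.mean()        # mean-centre response
    n, p = X.shape
    W = np.zeros((p, n_components))
    T = np.zeros((n, n_components))
    q = np.zeros(n_components)
    for a in range(n_components):
        w = X.T @ y
        w /= np.linalg.norm(w)           # unit-norm weight vector
        t = X @ w                        # scores of this component
        tt = t @ t
        p_load = X.T @ t / tt            # X-loadings
        q[a] = y @ t / tt                # y-loading
        X = X - np.outer(t, p_load)      # deflate X
        y = y - q[a] * t                 # deflate y
        W[:, a], T[:, a] = w, t
    return W, T, q


def vip_scores(W, T, q):
    """Variable Importance in Projection for each predictor."""
    p = W.shape[0]
    ss = q ** 2 * np.sum(T ** 2, axis=0)              # y-variance explained per component
    w_norm2 = (W / np.linalg.norm(W, axis=0)) ** 2    # squared normalised weights
    return np.sqrt(p * (w_norm2 @ ss) / ss.sum())


# Synthetic demonstration: the response depends on columns 10 and 40 only.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 50))
y_demo = 3.0 * X_demo[:, 10] - 2.0 * X_demo[:, 40] + 0.1 * rng.normal(size=200)

W, T, q = pls1_nipals(X_demo, y_demo, n_components=2)
vip = vip_scores(W, T, q)
important = np.flatnonzero(vip > 1.0)  # usual "VIP > 1" selection rule
print(important)
```

By construction the mean of the squared VIP scores equals 1, which is why a threshold of 1 is commonly used to separate influential from less influential variables.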

Conclusions

In the present work, durum wheat doughs were studied by combining rheological and infrared spectral analysis. The goal was to infer rheological information from indirect spectral measurements by resorting to a PLS data-driven model. To this end, different kinds of semolina were tested on samples with different water amounts in order to cover a wide range of dough consistency. Satisfactory results were found regarding the prediction of the gluten network strength. Additionally, the methodology gave valuable insights into the protein conformations that mainly affect the rheological behavior. The future availability of larger datasets, including other semolina varieties, may help to better estimate the dependence of the rheological parameters on IR spectra. This would also allow the model to be validated using experimental points not used in the PLS calibration step.
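One possible form of the validation mentioned above is leave-one-out cross-validation, in which each sample is predicted by a PLS model calibrated on all the others. The Python sketch below is only an illustration under stated assumptions: leave-one-out is chosen here as an example and is not a scheme prescribed by this work, the data are synthetic with a hypothetical two-factor latent structure, and the function names are introduced for illustration.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model (NIPALS); return (x_mean, y_mean, coef)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = E.T @ f
        w /= np.linalg.norm(w)
        t = E @ w
        tt = t @ t
        p = E.T @ t / tt
        q_a = f @ t / tt
        E = E - np.outer(t, p)   # deflate predictors
        f = f - q_a * t          # deflate response
        W.append(w)
        P.append(p)
        q.append(q_a)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)  # regression vector in the original X space
    return x_mean, y_mean, coef


def pls1_predict(model, X):
    """Predict the response for new rows of X."""
    x_mean, y_mean, coef = model
    return (X - x_mean) @ coef + y_mean


def leave_one_out(X, y, n_components):
    """Return (Q2, predictions), each sample being predicted without itself."""
    preds = np.empty(len(y), dtype=float)
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        model = pls1_fit(X[keep], y[keep], n_components)
        preds[i] = pls1_predict(model, X[i:i + 1])[0]
    press = np.sum((y - preds) ** 2)  # prediction error sum of squares
    q2 = 1.0 - press / np.sum((y - y.mean()) ** 2)
    return q2, preds


# Synthetic data with two latent factors (hypothetical, not measured spectra).
rng = np.random.default_rng(1)
latent = rng.normal(size=(24, 2))
X_val = latent @ rng.normal(size=(2, 60)) + 0.05 * rng.normal(size=(24, 60))
y_val = latent @ np.array([2.0, -1.0]) + 0.05 * rng.normal(size=24)

q2, preds = leave_one_out(X_val, y_val, n_components=2)
print(f"Q2 = {q2:.3f}")
```

A Q2 value close to the calibration R2 would suggest that the model is not overfitted. With larger datasets, a separate test set could be held out instead.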

In conclusion, the method could be promising for the future development of FTIR-based prediction systems for on-line monitoring of the rheological and structural properties of dough in the baking industry, given the fast response times of FTIR measurements.