Introduction

Coffee is one of the most popular beverages since millions of people all over the world consume it several times a day. Roasting is an essential step in the coffee production process, as it enables the development of colour, aroma, and flavour, which are crucial for coffee quality. During roasting, the coffee beans undergo a steady weight loss, due to initial dehydration and then to a complex set of chemical and physical processes (Iaccheri et al. 2015). At the same time, the beans undergo a significant increase in volume, essentially caused by internal formation of gas (CO2 and other thermal decomposition products) (Franca et al. 2005). During roasting, a large number of chemical transformations take place simultaneously, leading to the formation of thousands of different molecules (De Maria et al. 1996). The chemical composition of roasted beans is related to the variety and geographical origin of green coffee, and it has a relevant influence on the sensorial properties of the beverage. Furthermore, the roasting conditions and technological, agronomic and physical parameters can deeply affect the final coffee quality (Esteban-Díez et al. 2007). Thus, the roasting process must be controlled accurately to meet the consumers’ expectations, which are accustomed to a constant and consistent level of coffee quality.

For this reason, the assessment of the optimal roasting degree during the roasting process represents a powerful screening tool to quickly evaluate the overall performance of the process, and to ensure its reproducibility. This important indicator is closely related to the sensorial properties of the final coffee beverage, and it was found to be highly affected not only by both the characteristics and the species of the green coffee beans, but also by the roasting process conditions (Andueza et al. 2007). The coffee roasting degree has been experimentally assessed using different analytical approaches, mainly based on off-line analysis of coffee batches, i.e. gathering this information only at the end of the process. In this context, earlier research studies focused on detecting the variations in the chemical composition profile of the gas developed during roasting using gas chromatography-flame ionization detector (GC-FID) systems (Dutra et al. 2001), whereas in more recent studies, the volatile component profile of the coffee beans was analysed under different process temperatures using solid-phase microextraction (SPME) and gas chromatography-mass spectrometry (GC–MS) (Franca et al. 2009), or by monitoring the chemical composition of the roasting off gases by online mass spectrometry (Heide et al. 2020; Czech et al. 2020). All these techniques remain relatively distant from real-time and on/in-line process control setups.

During the last decades, the use of spectroscopic approaches, mainly based on near-infrared (NIR) spectroscopy, has proven to be undoubtedly advantageous with respect to other classical analytical techniques, due to the reduced time between performing the analyses and obtaining the results, as well as to the reduced or even unnecessary sample pre-treatment. Nowadays, NIR spectroscopy associated with chemometrics is a widespread and well-known methodology already adopted to monitor crucial quality parameters of food products and processes, leading to the implementation of process analytical technology (PAT) approaches in the food industries (Grassi and Alamprese 2018). In the framework of coffee roasting, NIR spectroscopy has been successfully applied to predict important coffee quality parameters and to determine the roasting degree itself. Different off-line applications have been proposed to predict sensory attributes of espresso coffees from the NIR spectra of roasted coffee beans (Esteban-Díez et al. 2004) and to estimate roasting colour and caffeine content (Pizarro et al. 2007). As far as it concerns roasting, different approaches have been investigated for predicting relevant process variables such as weight loss, density and moisture (Alessandrini et al. 2008; Shan et al. 2015), roasting degree (Bertone et al. 2016; Pires et al. 2021) and the effects of different roasting conditions on coffee samples having different geographical origins (De Luca et al. 2016). NIR has been also investigated, using on-line or even in-line acquisition probes, for the quantification of acidity (Santos et al. 2016a), sugar content and colour (Santos et al. 2016b) and antioxidant properties (Catelani et al. 2017) of coffee beans during the roasting process. An interesting study by Catelani et al. (Catelani et al. 2018) evaluated the effectiveness of an approach based on the combination of NIR spectroscopy and multivariate data analysis for the early detection of roasting process disturbances. Moreover, in situ NIR spectroscopy was successfully applied by Yergenson and Aston for the real-time determination of coffee roasting degree based on the prediction of first and second crack events during the roasting process (Yergenson and Aston 2020b, a), by combining the chemical information of the NIR spectra with the audio recordings of the cracking events, thus not continuously tracking the roasting process.

All the above-mentioned research studies clearly highlighted that the changes occurring during roasting are ascribable to many different chemical and physical parameters, which are potentially correlated with each other and can be determined using fast and non-invasive analytical techniques like NIR spectroscopy.

In the present study, the potential of an approach based on in-line Fourier transform near-infrared (FT-NIR) spectroscopy coupled with chemometric modelling was investigated at laboratory scale to track the coffee roasting evolution and to assess the process performance in real-time from an overall viewpoint, i.e. by considering many different chemical-physical parameters at the same time.

The in-line FT-NIR spectra were acquired on the entire coffee beans by inserting a NIR probe into a laboratory coffee roaster. Several roasting experiments were performed considering different coffee species, roasting temperature settings and roasting time durations. The FT-NIR spectra were modelled using principal component analysis (PCA) and multivariate curve resolution-alternating least squares (MCR-ALS) (de Juan and Tauler 2006), with the aim of obtaining structured information about the roasting progress. Furthermore, three kinetics critical parameters of the process (maximum rate, acceleration and deceleration) were derived from the MCR-ALS concentration profiles, and the potential effects of roasting temperature or coffee variety on these kinetic parameters were investigated and discussed.

The final aim of this study was to explore the reliability of a roasting monitoring tool based on NIR technology to be implemented in the coffee roasting chain for automating the real-time control of the roaster’s parameters.

Materials and Methods

Setup of the Roasting System

The roasting experiments were conducted using a laboratory coffee roaster (Sample PROBAT PRZ 2/4/6 Barrel Roaster, CoffeeTec, Germany) kindly provided by a leading Italian coffee industry. The roaster was mounted on a wood base, and it was electrically powered. It consisted of a cylindrical rotating drum with a circular window on the front end and a hopper to load the coffee beans on the back end. The drum had a dedicated air temperature display to allow simple adjustments to the heat dial. Temperature control was realized by means of a PID controller, which regulated two resistors positioned on the bottom of the drum. During the roasting process, the cylinder was kept in rotation to ensure even roasting for all the coffee beans.

Preliminary roasting tests were performed to identify the best setup for the in-line acquisition of the NIR spectra during the process. In this context, some relevant parameters had to be considered. First, the appropriate stream of air to flux inside the roaster during the process had to be determined. This parameter could be manually regulated in each drum by means of a lever. Its adjustment is important to avoid the accumulation of vapour and combustion fumes, which normally develop during the roasting process and that can deeply affect the quality of the acquired NIR spectra. As a result of the preliminary tests, the air flow was set at a constant level of medium intensity to remove the gaseous components while avoiding, at the same time, excessive cooling of the coffee beans during the process.

A second parameter that had to be accurately controlled is the air temperature: four different temperature settings were considered (hereafter referred to as TS1, TS2, TS3 and TS4). An optical temperature probe (TS Temperature Sensors, Micronor LLC, USA) was employed to measure the actual temperature of the coffee beans in real time (one acquisition per second) throughout the roasting process.

In-line NIR Acquisition System

A total of 1636 in-line NIR spectra were collected during 24 roasting experiments. The NIR spectra were acquired in diffuse reflectance mode using an “Emission Head” NIR probe coupled to a MATRIX-F FT-NIR spectrometer (Bruker Optics, Germany). The Emission Head allowed collecting the NIR spectra by placing the probe on a stand near the open end of the roaster. An angle of incidence of 45° downwards between the probe and the window of the roaster was found to be the optimal solution to properly focus the NIR light beam on the coffee beans during the roasting process. The NIR spectra were acquired with a resolution of 8 cm–1, 64 co-added scans and considering the wavenumber range between 10,265 and 4135 cm–1 (974–2418 nm). The FT-NIR spectrometer was controlled via OPUS software (version 7.5, Bruker Optics, Germany). The considered NIR acquisition system allowed collecting one spectrum every 15 s. Before starting the NIR spectra acquisition during each roasting experiment, a background spectrum was acquired using a 100% reflectance reference material (PTFE Teflon).

Coffee Samples and Roasting Conditions

The coffee samples treated in this study originated from the same harvesting period and were kindly provided by a leading Italian coffee industry. The coffee beans showed worldwide geographical origins, and both the two most relevant commercial species, i.e. Arabica and Robusta, were considered. The roasting experiments were carried out within 48 h from the sample delivery and were planned considering an experimental procedure covering coffee species and roasting temperature levels.

Concerning the coffee species, all experiments were performed considering Arabica and Robusta separately. Four levels of increasing roasting temperature (from TS1–TS4) were considered to span a large enough range of temperature values with respect to the common industrial setup (200–220 °C). Finally, the roasting process lasted from a minimum of 650 to a maximum of 1580 s, depending on the temperature setting. Three different roasting batches were performed considering the Arabica and Robusta species separately at each temperature level, thus for a total of 24 experiments (2 species × 4 temperatures × 3 analytical replicates).

The actual average temperature profiles in the roaster along the process recorded by the optical temperature probe are reported in Figure S1 (strong temperature drops occurred during the roaster loading and discharge steps, marking the beginning and the end of the roasting process, and were excluded from the charts for clarity). All the temperature profiles were featured by an initial rapid temperature increase after coffee loading, followed by the approaching of a plateau, that was however reached only in the case of the long-lasting TS1 tests. The temperature always went over 210 °C during the tests, even if after longer times for the low roasting temperature experiments. Moreover, the maximum temperature was significantly higher for the TS3 and TS4 (~ 231 °C and 238 °C at the end time of the shortest roasting processes of the set, respectively) compared to the TS1 and TS2 (~ 218 °C and 212 °C, respectively), despite in the former cases the process was usually stopped earlier. Finally, good reproducibility of the measured temperature profiles was achieved, as shown by the low standard deviation calculated for the average profiles (Figure S1), pointing out that the coffee batches roasted at the same temperature level were subjected to very similar operative conditions.

Execution of the Roasting Experiments

Before beginning each roasting experiment, 120 g of green (i.e. unroasted) coffee beans was weighed (VWR scale 611–2602, accuracy of ± 0.01 g; VWR, Radnor), while their moisture analysis was carried out by means of a thermobalance (RADWAG MAC 210/NH; Radom, Poland). The same analysis was performed after roasting to evaluate the sample weight loss.

When the temperature set on the roaster was reached, the coffee sample was introduced into one of the roaster’s drums. At the same time, the temperature and NIR probes, previously directed towards the inlet lid of the roaster drum, started recording temperature data and spectra. When the process ended, according to the scheduled time of the corresponding experiment, the coffee was discharged into a container and cooled down by a stream of cool air, while the acquisition of the NIR spectra and the coffee temperature were promptly halted, and the data saved.

The coffee colour was determined on 25–30 g of roasted coffee sample using the single-beam Color Test II (Neuhaus Neotec, Germany), which performs a reflectance acquisition at 900 nm, with a ± 0.01 Arbitrary Unit (A.U.) accuracy. The same instrument was used to determine the colour of ground roasted coffee in our previous work (Bertone et al. 2016).

The details of the randomized roasting experiments, in terms of coffee species, roasting conditions, colour, initial moisture and weight loss, are reported in the Supplementary Materials in Table S1.

Multivariate Data Analysis

Chemometric elaboration of the NIR spectra was performed under MATLAB environment (ver. 9.2, The Mathworks, Inc., Natick, USA) using appropriate software packages, i.e. the “PLS_Toolbox” (ver. 8.5, Eigenvector Research, Inc., USA), the “MCR-ALS 2.0 Toolbox” (freely downloadable from https://mcrals.wordpress.com/download/mcr-als-2-0-toolbox last access 24/02/2023) and additional in-house generated MATLAB scripts. The 1636 in-line NIR spectra that were collected during the 24 roasting experiments were organized in a single dataset, where the spectra belonging to each roasting experiment were labelled and paired with all the additional information related to the process conditions and the coffee samples, as listed in Table S1.

Pre-processing of NIR Spectra and Exploratory Data Analysis

Different data pre-processing techniques were tested to remove potential artefacts from the NIR spectra, to correct nonlinear behaviours and to improve the signal-to-noise ratio. In this context, the “one-dimensional median filter” (“medfilt1” MATLAB function) algorithm was applied to remove, when needed, spikes from the acquired NIR spectra. Subsequently, the performances of the Savitzky-Golay smoothing algorithm were investigated, alone or in combination with the standard normal variate (SNV) or multiple scattering correction (MSC) algorithms. Regardless of the applied pre-processing, all the data were subjected to mean centering prior to any multivariate data analysis.

The exploratory data analysis was carried out by principal component analysis (PCA) (Wall et al. 2005). The optimal number of principal components (PCs) to be retained in the PCA model was chosen from the analysis of the corresponding scree plot (Bro and Smilde 2014). First, PCA was used to identify and remove the outliers, which mainly corresponded to initial spectra recorded before the coffee roaster was completely loaded. Subsequently, different PCA models were built to gather a preliminary overview of spectral variability in the multivariate space. In this last case, all the combinations of the above-mentioned pre-processing methods, coupled with mean centering, were investigated.

MCR-ALS Modelling

Multivariate curve resolution-alternating least squares (MCR-ALS) (de Juan and Tauler 2006) was employed to get a deeper insight into the evolution of the roasting process at each considered roasting temperature. MCR is a decomposition technique which allows extracting deconvoluted “pure” spectral profiles and their relative concentrations in each sample, using an ALS procedure. Once the number of components to be modelled is defined, the output of MCR is a set of resolved spectra (one for each component) and a set of relative concentrations (one for each component). The pure spectral profiles can be inspected to identify all relevant signals, thus allowing for proper chemical interpretation of each modelled component. The relative concentrations can be inspected to find similarities and differences among the samples, and if these correspond to specific time points (like in the present study) the evolution in time can be easily inspected.

In this context, the whole “clean” spectral dataset (i.e. after removing the outliers) was split into four subsets, each one including the experiments performed using both the Arabica and the Robusta coffee samples, and at the same roasting temperature level. On each subset, a separate MCR-ALS model was thus obtained.

The models were initially developed by considering the whole spectral range (10,265–4135 cm–1). However, to further evaluate the performance of this modelling approach and in view of the industrial use of simplified NIR instruments with a reduced spectral range, models based on the selection of a limited number of spectral variables were also implemented. In detail, a reduced spectral range (range 1, 6120–4961 cm–1) was considered according to the regions identified by Catelani et al. (Catelani et al. 2018) as the most suitable to monitor and control the roasting process. Moreover, a second spectral range was tested (range 2, 9777–7417 cm–1), which is mainly related to the physical properties of the coffee beans, such as their size, shape and density, as suggested by Catelani et al. (Catelani et al. 2018).

In all cases, the optimal number of MCR components to be used in the model was determined by means of PCA. An initial estimate of the pure spectral matrix being calculated, the non-negativity constraint was imposed on both the concentration (C) and the profile (S) matrices using the fast non-negative least squares algorithm (“fnnls” option in the toolbox (Bro and De Jong 1997)), whereas normalization of the S matrix was performed using the “Euclidean” criterion (de Juan and Tauler 2006). The ALS optimization was performed considering 100 iterations and 0.1 as the convergence criterion, while the model performance was evaluated in terms of fitting error (lack of fit, LOF) of explained variance and standard deviation of the residuals.

Kinetic Fitting of the MCR-ALS Concentration Profiles

The concentration profiles of each model were fitted as a function of roasting time, using the sigmoid Eq. (1) implemented in Table Curve software (v. 4.0, Jandel Scientific, San Rafael, CA, USA) as reported by Grassi et al. (Grassi et al. 2013):

$$y=a+\mathrm{b exp}\left(-exp\left(-\left(\frac{(x-d1n\left(1n\left(2\right)\right)-c)}{d}\right)\right)\right)$$
(1)

The first and second derivatives of the calculated sigmoid functions were used to identify the kinetic critical points, i.e. time-related to the maximum rate and acceleration and deceleration of the roasting phenomena. Two-way univariate analysis of variance (ANOVA) was performed on the kinetic indexes obtained for all the models, to assess the influence of roasting temperatures and the differences among the spectral ranges used in the MCR-ALS models. In both cases, coffee bean variety was considered as the second independent factor, together with the other factor, i.e. temperature or spectral range. Furthermore, PCA models were built to better visualise the effect of the temperature and the variety on the kinetic parameters derived from the MCR-ALS models developed for each spectral range considered, as reported in Fig. 5.

Results and Discussion

Exploratory Data Analysis

The PCA model developed on the mean-centred NIR spectra led to successful detection of the outliers (mainly spectra acquired when the toaster was not completely loaded) as shown in Fig. 1a, with colour coding reflecting the roasting temperature (TS1–TS4). The coffee-related samples are grouped into a very compact cluster located on the top-left section of the plot, whereas the outliers appear to be randomly scattered within the fourth quadrant. Regarding the coffee-related cluster, the sample distribution along PC2 can be ascribed to the NIR spectra variability during the roasting process. To better explain the samples’ distribution in the score plot, the loadings of PC1 and PC2 are reported in Fig. 1b. It was found that for both components, most of the loading values resulted different from zero, meaning that almost all the spectral regions contributed to defining the observed variability.

Fig. 1
figure 1

PCA model results of the raw NIR spectra. a PC1 vs PC2 score plot coloured according to the roasting temperature level. b PC1 and PC2 loading plot. c From the same PCA model, the Hotelling T.2 values of each spectrum collected during the experiments with Arabica coffee samples (plot coloured according to each individual single roasting experiment)

To confirm the outliers’ identification, the Hotelling T2 statistics was inspected. This parameter corresponds to Student’s t-statistic for multivariate data and measures the distance of each sample within the model. This approach is commonly used in the frame of the multivariate statistical process control (MSPC) to get an earlier detection of extreme, atypical and not well-modelled sampled values leading to strange process behaviours (Li Vigni et al. 2013; Catelani et al. 2018). Figure 1c reports the Hotelling T2 statistic calculated for each collected spectrum, grouped according to the experimental run and ordered according to the roasting time passing. The plot, together with the inset figure, clearly shows how the spectra collected immediately at the beginning and at the end of each roasting experiment have higher T2 values than the remaining spectra belonging to the same experiment (with 95% of confidence limit). Similar results were also obtained from the roasting experiments performed with Robusta coffee samples (data not shown). Therefore, the Hotelling T2 statistic demonstrated to be a satisfactory tool for the rapid and automatic detection of the spectra to be considered outliers. For a graphical example of problematic spectra at the beginning and the end of the roasting process, please see Figure S2.

After removing the outliers, only the 1501 coffee-related spectra were subsequently explored with PCA. The best model was obtained by applying the following pre-processing methods: median filter (order 30), Savitzky-Golay smoothing (window width 11, polynomial order 2, derivative order 0) and mean centering.

A graphical representation of the PCA results is given in Fig. 2. The score plot of the first two PCs coloured according to the roasting time is reported in Fig. 2a, and it can be deduced that PC1 (which corresponds to the largest amount of the dataset’s variability 84.63%) is related to the roasting time. The spectra collected on coffee samples at the beginning of the roasting process are located at positive values on PC1, and by moving towards more and more negative PC1 score values can be interpreted as the process evolving until the end is reached. No trends of separation among roasting experiments performed under the same temperature level were instead observed, as shown in Fig. 2b, which shows the same PC1–PC2 score plot coloured according to the roasting temperature. Moreover, no differences according to the coffee species were found (data not shown). The corresponding loading plot (Fig. 2c) shows that, also in this case, many spectral regions contributed to explaining the spectral variability on the considered PCs. In fact, some peaks can be identified by looking at the loading profile of PC1, for example the ones between 8500 and 8000 cm–1, at approximately 7000 cm–1 and in the range 5500–5000 cm–1. All the variables belonging to these spectral regions pull the sample distribution towards positive PC1 score values. Therefore, these spectral regions can be related to the presence of chemical compounds whose NIR signal is more noticeable in the early stages of the roasting process. The spectral signals in the regions 7000 cm–1 and 5500–5000 cm–1 are mostly linked to the water content (first overtone and combination bands of O–H bonds stretching and bending in hydroxyl groups (Alessandrini et al. 2008; Santos et al. 2016b, a; Yergenson and Aston 2020b)), whereas the range 8500–8000 cm–1 is linked to C–H vibrations, associated with not only the content of organic acids, mainly chlorogenic and quinic acids (Esteban-Díez et al. 2004), but also physical features of the coffee beans, like dimension, shape and density (Catelani et al. 2017).

Fig. 2
figure 2

PCA model results of the coffee NIR spectra. a PC1 vs PC2 score plot coloured according to roasting time (seconds). b The same PC1 vs PC2 score plot coloured according to the roasting temperature level. c PC1 and PC2 loading plot

MCR-ALS Models

All MCR-ALS models well described the roasting process (explained variance > 99.9%, LOF < 0.63%, and standard deviation of the residuals lower than 0.0072).

Figure 3a reports the two spectral profiles (referred to as sopt1 and sopt2, respectively) corresponding to the two components identified by MCR-ALS at each roasting temperature level. All the spectral profiles obtained by the temperature-based MCR-ALS models show superimposable peaks and trends, demonstrating that the different tested roasting temperatures mainly bring the same physical and chemical modifications to the coffee beans. Few differences were noticed for the extreme temperatures (TS1 and TS4).

Fig. 3
figure 3

MCR-ALS spectral profiles and FT-NIR spectra. a Spectral profiles (sopt1, sopt2) of the first and second component (solid and dashed line, respectively) computed by the MCR-ALS model at the different temperature levels: TS1 (blue), TS2 (green), TS3 (yellow), TS4 (red). b Example of a typical evolution of the NIR spectra of the coffee collected in real time from the beginning (blue curves) to the end (red curves) of the roasting process (experiment 17)

Moreover, the comparison of the calculated profiles with the raw NIR spectra of a representative experiment (Fig. 3b) highlights the similarity, both in terms of vibrational pattern and shape, of the first and second spectral profiles to the initial and final spectra recorded during the roasting process, respectively. Such observation suggests that the MCR-ALS models identify two components corresponding to the typical spectra of the green and the roasted coffee beans.

Concerning the vibrational bands, both spectral profiles show a clear peak in the region 7500–6700 cm–1, which is mostly related to the water content (6900 cm–1, first overtone of the O–H stretching). The water signal is a broad band in the first spectral profiles, in agreement with the higher water content and the lower temperature at the beginning of the roasting process, whereas in the second profiles, representing the roasted beans, the band becomes sharper, and a shoulder appears at 6685 cm–1.

In both profiles, within the region 6699–5400 cm–1, essentially capturing the chemical information concerning C–H and S–H bonds (Catelani et al. 2017), two peaks at 5800 and 5700 are present, related to CH2 first overtones present in long-chain fatty acids (Esteban-Díez et al. 2004).

The peak at 5200 cm–1 in the first spectral profiles, related to water content, is highly reduced in the second spectral profiles and it almost disappears in the one related to the trials performed at TS1. Indeed, the trials performed at TS1 reached very high weight losses (18.53% on average), comparable only to the experiments at TS4 (18.59%), being the roasting time 1500 s. The decrease of the peak at 5200 cm–1 resulted in a visible small peak at 5170 cm–1 for the second spectral profile of the experiments performed at TS4.

In the region 4999–4000 cm–1, containing chemical information related to C–H, N–H and C–O vibrations (Catelani et al. 2017), a broad band at 4810 cm–1 is present, assigned by Esteban-Díez et al. (Esteban-Díez et al. 2004) to carbohydrates. Additionally, two small peaks emerge in the first spectral profiles at 4343 cm–1 and at 4254 cm–1, whose absorption increases according to roasting temperature in the roasted beans profiles, especially for the peak at 4343 cm–1. The appearance of these two peaks, attributable to combination modes between different vibrations of the C–H bond in CH2 groups, was previously reported by several works (Alessandrini et al. 2008; Catelani et al. 2017) and may be related to the migration of oily compounds (mainly fatty acids) to the surface of the coffee beans, a phenomenon known to occur as a consequence of the formation of porosity during the roasting process (Schenker et al. 2000; Yergenson and Aston 2020b). The accumulation of fatty acids at the surface of the beans could have an impact also on the 6699–5400 cm–1 spectral range, whose absorbance indeed increases in the second profiles.

It is worth noticing that the reduced absorption of water after the roasting process influences the intensity of all the other vibrational bands, which are less covered by the water-related bands and resulted well resolved by the second MCR-ALS component. Moreover, the results confirmed what was observed by Gowen et al. (Gowen et al. 2013), i.e. that variations of the hydrogen-bonded network between water and other constituents have an impact on both components, as a multi-state system.

Beyond the chemical changes, a relevant modification in coffee bean roasting is also linked to changes in the beans’ density. The second MCR-ALS profile is highly affected by coffee beans’ physical changes, visible in the marked slope between 10,000 and 7300 cm–1 (Catelani et al. 2017). The increased background in this range also masks the actual variations that the vibrational bands between 8000 and 9000 cm–1, assigned to the second overtone of C–H stretching of carbohydrates, lipids and amino acids, may undergo during the roasting process (Giraudo et al. 2019; Munyendo et al. 2021). As already observed in the raw spectra of experiment 17 (Fig. 3b), the curve slope increases with roasting time, indicating a change in the physical properties of the sample, which, in coffee roasting, could be ascribed to the change in density of the beans (Bustos-Vanegas et al. 2018). This behaviour was observed for all the considered batches.

The slope of the spectral profiles in this region is also linked to the roasting temperature. Indeed, the highest slope is visible for the batches roasted at TS1, whereas the lowest slope was recorded for the batches roasted at TS4. Intermediate temperatures (TS2 and TS3) are quite similar and are characterised by intermediate slopes. However, a correlation with the roasting time could also be hypothesized, as the TS1 and TS4 trials were the longest and shortest experiments, respectively.

The MCR-ALS concentration profiles of the first and second components (referred to as copt1 and copt2, respectively) relative to the whole spectral range (10,265–4135 cm–1) are reported in Fig. 4 for the four considered temperature profiles. It is possible to notice that all the first profiles (filled markers) describe an inverse sigmoid: the first part of the process is characterised by almost constant concentration values extended for different times, according to the different temperature levels considered, then a decrease of the concentration leads the profiles to values tending towards an asymptote, except for the TS4 model. The second profiles (empty markers) are characterised by an opposite behaviour resulting in a sigmoidal trajectory for all the investigated temperatures, except for TS4 concentration profiles, which do not reach a plateau within the registered times. Two trials (TS2_R_2 and TS2_R_3; Fig. 4b) are characterised by a mild slope if compared to the steep one recorded for the other Robusta and Arabica trials performed at the same temperature. It is worth noting that the inspection of the raw spectra reveals that this difference can be attributed to the reduced spectral variation during these two roasting processes, and not to the variety characteristics.

Fig. 4
figure 4

MCR-ALS results, concentration profiles (copt) for trials at TS1 (a), TS2 (b), TS3 (c), TS4 (d). In all panels, full-coloured markers represent the trend of the first profile, edge-coloured markers represent the trend of the second profile. Different markers represent different trials: batch A_1 = black triangle; batch A_2 = black circle; batch A_3 = black squrea; batch R_1 = black inverted triangle; batch R_2 = asterisk; batch R_3 = diamond

Kinetic Fitting of the MCR-ALS Concentration Profiles

With the purpose of uncovering the kinetic critical points of the roasting process, the MCR-ALS concentration profiles were fitted with the sigmoid function described in the “Kinetic Fitting of the MCR-ALS Concentration Profiles” section, from which the maximum rate, acceleration and deceleration of the roasting phenomenon were calculated. The effects of temperature on the kinetic indexes calculated for the second concentration profiles were investigated by a two-way ANOVA also considering the coffee variety as a factor (A, Arabica, and B, Robusta). Table 1 reports the kinetic critical points obtained by the MCR-ALS models calculated considering both the whole spectral range (10,265–4135 cm–1) and two reduced ranges (i.e. range 1, 6120–4961 cm–1, and range 2, 9777–7417 cm–1). When considering the whole spectral range, it is possible to notice that the lower temperature profile tested (TS1) resulted in a significant delay of the process in terms of maximum acceleration (p < 0.01), velocity (p < 0.001) and deceleration (p < 0.001), as expected. However, the increase in the temperature did not show any significant difference (p > 0.05) due to the variability of the different trials performed for each temperature level. The most relevant kinetic index to be considered is the maximum deceleration, as it can be useful to estimate the end of the roasting process, thus the point after which prolongation of the process could lead to an over-roasting of the coffee beans, which can negatively affect their sensory properties. The maximum deceleration of the second concentration profile for the batches roasted at TS1 occurred between 728 and 989 s, whereas for all the other tested temperatures it appeared between 534 and 679 s. No differences (p > 0.05) were instead ascribable to the variety factor, indeed Arabica and Robusta responded in the same way to the roasting process for all the temperature levels tested.

Table 1 Characteristic kinetic indexes calculated from the MCR-ALS concentration profiles describing the roasting behaviour models obtained considering the whole spectral range or the reduced ranges (range 1, 6120–4961 cm–1; range 2, 9777–7417 cm–1)

When considering the concentration profiles of the MCR-ALS models obtained on the two reduced ranges, no statistically significant difference was uncovered for the maximum deceleration rate calculated for all the tested temperature profiles (Table 1, uppercase letters). However, the deceleration rates obtained by the two selected ranges differed from the ones obtained modelling the whole spectral range. In fact, as can be seen in Table 1, the kinetic indexes calculated for the whole spectral range showed to be higher than the ones referring to the reduced ranges. Nevertheless, the kinetics calculated for range 2 resulted in linearly correlated with the ones obtained by whole range modelling (y = 0.895x, R2 = 0.753, p < 0.01).

To better visualise the relation among samples according to the kinetic parameters obtained from the MCR-ALS models, the computed kinetic values of each experimental dataset were organized into new data matrices composed of 24 samples and 3 variables (i.e. the kinetics parameters) and were inspected by PCA. The score plot obtained for the kinetic indexes calculated from the MCR-ALS model considering the whole spectral range (Fig. 5a) well represents the group of samples roasted at the lowest temperature (TS1), whereas samples roasted at higher temperatures resulted heterogeneously positioned in the left part of the plot. The score plot obtained considering the spectral range 1 (6120–4961 cm–1; Fig. 5b) visually confirmed the lack of ability of this range to discriminate differences in the kinetic of the process according to temperature and variety. The second spectral range (9777–7417 cm–1; Fig. 5c) seems to be the most affected by kinetic variations along the process when different operative conditions are considered: indeed, good groupings according to the roasting temperature were found. The increase in the temperature resulted in a decrease in PC1 values, even if some degrees of confusion can be noticed between samples roasted at TS2 and TS3. The visual inspection does not mark any systematic distributions of the samples according to the varieties.

Fig. 5
figure 5

PCA score plots of the kinetic parameters calculated from the MCR-ALS models. a Plot of the whole spectral range model. b Plot of the spectral range 1 model. c Plot of the spectral range 2 model. Letters “A” and “R” close to the samples’ markers refer to Arabica and Robusta species, respectively

The differences in describing the process kinetics according to all spectral ranges considered are undoubtedly related to the physicochemical information retained by each considered interval. Indeed, range 1 contains information concerning the major chemical changes involving C–H and S–H vibrations and C═O plus O–H and O–H vibrations (Catelani et al. 2018), whereas range 2 mainly considers physical changes, such as beans density, which contribute to the changes in slope from 9777 to 7300 cm–1. Ideally, retaining both chemical and physical changes is advisable to describe the roasting progress; thus, it would be more advisable to consider the whole spectral range (10,265–4135 cm–1) to follow the process in deep. However, in vision of simplified systems, range 2 (9777–7417 cm–1) could be selected for the development of an ad hoc instrumentation. Indeed, the information retained in this region proved to be satisfactorily correlated with the information provided by the whole range considered and sensitive to process changes due to temperature modifications. Moreover, instruments working in this region could be developed with cheaper light sources and detectors, resulting in affordable sensors for the coffee industry.

Conclusions

The present study demonstrated the feasibility of a rapid, non-invasive, and automated method, based on NIR spectroscopy and multivariate data analysis, to perform real-time monitoring of the coffee roasting process.

The results, especially their visualization, demonstrate that PCA is a good diagnostic tool to identify and automatically remove NIR spectra not related to the coffee roasting process (Fig. 1a, c) and, subsequently, to clearly describe the spectra (i.e. the time points) according to the roasting time (Fig. 2a). Further implementations of an automatic monitoring system could be equipped with a PCA step based on Hotelling’s T2 to filter out those spectra with strong experimental defects.

The application of MCR-ALS provided precious insights about the roasting dynamics under different process conditions: the spectral changes occurring during the roasting process could be efficiently modelled, and it is interesting to notice that these results were obtained with very simple models (only two components). The main changes that the coffee beans undergo during the roasting process could therefore be described from both the chemical and physical viewpoints, by interpreting the MCR-ALS resolved spectral profiles. Furthermore, the MCR concentration profiles were successfully fitted with a sigmoid function for estimating some noticeable kinetic steps, i.e. the maximum rate, acceleration, and deceleration of the coffee roasting process. From the interpretation of the time evolution of the roasting process, a significant (p < 0.1) delay was found for the roasting tests performed at the lowest temperature level with respect to those performed at the three higher temperature levels. In all cases, no significant differences were found based on the coffee varieties. A real-time implementation of an MCR-ALS-based system could envisage the use of both historical data collected under the same conditions (roaster, experimental settings, coffee variety and quality) and “live” modelling of the evolving collected NIR information.

These considerations are coherent with the PAT paradigm, since the proposed approach would allow taking valuable process decisions in real-time and in a fully automatic way. Overall, the presented approach can be considered an important step towards the optimization of the coffee roasting process at the industrial scale since it allows taking valuable process decisions in real-time and in a fully automatic way. Moreover, in vision of simplified systems, a reduced spectral range (9777–7417 cm–1) has proven promising for the development of simplified ad hoc instrumentation, potentially affordable also for smaller artisanal roasting plants.