Wood structure, which is made up of cellulose fibrils embedded in a matrix of hemicelluloses and lignin, provides a high strength to weight ratio but makes wood vulnerable to hygroscopic changes and dimensional instability. These weaknesses can be compensated by various chemical and thermal treatments routinely used to improve wood properties for different applications. One alternative is wood acetylation, where acetic anhydride is used for replacing accessible hydroxyl groups (–OH) on wood polymers with acetyl groups (–CH3CO) generating acetic acid as a by-product [1]. Acetylation decreases the number of primary sorption sites for water molecules and limits the accessibility of unmodified hydroxyls due to the larger size of the acetyl groups [2, 3]. The degree of wood acetylation is generally measured gravimetrically by weighing a sample before and after the modification procedure. The obtained weight percent gain (WPG) provides an indirect measure of acetylation of the entire sample and is highly correlated with the improvement in wood properties [4, 5]. However, the WPG alone does not provide information on the potential variations in acetylation degree within the bulk sample. Although acetylation has been industrially used for decades, the distributions of acetyl groups within wood still remain poorly understood.

Chemical imaging provides tools to combine the spatial information of an image with the chemical information of a spectrum. Different spectroscopic techniques can be used for acquiring spectral information from a range of spectral variables either from a single pixel, a line of pixels or an entire image scene at a time. Hyperspectral near-infrared (NIR) imaging records diffuse reflectance or transmittance on a continuous range of several hundred different wavelengths in the NIR region of the electromagnetic spectrum [6]. The NIR region provides information on the overtones and combinations of the vibrational modes of polar molecular bonds making it extremely well-suited for analyzing lignocellulosic materials and their derivates. Modern push broom instruments are fast and can be used with minimal sample preparation due to the lower absorptivity of the overtones and combinations of fundamental vibrational modes from the mid-infrared [7]. However, many of these peaks in NIR spectra overlap and multivariate image analysis based on chemometric techniques is generally required to identify the spatial variation in the chemical properties of a sample. Raman microspectroscopy, on the other hand, operates by mapping Raman spectra generated by the inelastic scattering of monochromatic light from a preselected region on the sample surface one pixel at a time. The theoretical maximum resolution is determined by wavelength of the excitation laser and the numerical aperture of the microscopy objective [8]. Although the spatial resolution achieved in practical applications is often below the theoretical maximum, resolutions in the range of micrometers can easily be achieved [9]. The majority of chemometric techniques for spectral image interpretation are based on dimensionality reduction using latent variables and enable visualizing the important variations in an image dataset in just a few interpretable dimensions. The use of latent variables can also enhance selectivity by taking into account information of a specific analyte from multiple different wavelengths and avoids problems of multicollinearity in calibration and regression if the number of spectral variables exceeds the number of available samples or objects.

Chemical imaging is a promising tool for determining the chemical properties of wood. Hyperspectral NIR images have previously been used for evaluating the chemical composition of wood [10, 11], identifying different wood species [12], predicting wood moisture content and physical properties [13, 14], and visualizing the level of wood extractives [15]. The use of NIR images has also been reported in predicting vapor-phase acetylation of wood [16]. The authors recorded NIR images within 1050–2550 nm and obtained coefficients of determination in the range 0.89–0.90 for predicting the concentration of acetyl groups on wood surfaces. However, although the authors were able to visualize the spatial variation in the concentration of acetyl groups independent of wood moisture content, the results were based on a multiple linear regression model on second derivative absorbance values at 1727 nm [16]. Raman microspectroscopy has also been extensively utilized in studying wood chemistry, with typical applications involving the visualization of cellulose and lignin distributions on a cellular and subcellular level in different types of natural and treated samples [9, 17]. Raman images have also been used to study the distributions of wood extractives [18] and various wood modification agents [19,20,21].

We have recently shown that hyperspectral NIR imaging and image regression can successfully determine the penetration of acetic anhydride into the wood structure after wood acetylation [22]. Our method was based on calibrating NIR images with external WPG values measured from dried samples after confined one-sided acetylation of solid wood. Now we extend this work by evaluating the spatial differences on wood surfaces after unconfined acetylation followed by wood conditioning under different relative humidities. This approach enables visualizing the acetylation-induced changes on wood surfaces practically independent of equilibrium moisture content. We also show how the most important wavelengths that correspond with wood acetylation can be systematically determined by selecting subsets of spectral variables based on interval regression. The results show that considerably higher WPG values were found in the earlywood parts of wood surfaces. As the spatial resolution of most hyperspectral NIR cameras is limited to the macrostructural level, we have also illustrated the differences in earlywood and latewood acetylation on the cellular level using unsupervised classification of microscopic Raman images.

Experimental section

Sample preparation

Defect-free samples of dimensions 15 × 15 × 15 mm3 were cut from kiln-dried Scots pine (Pinus sylvestris L.) sapwood and prepared according to an experimental design, where acetylation and sample conditioning were varied on different levels. The initial central composite design included three levels for both variables, which resulted in WPG values in the range 0–18% and sample moisture contents of 0–29%, Fig. 1a. Additional calibration and validation samples were also prepared inside the design range. These samples were not evenly distributed within the design but provided additional variable levels for image calibration. All samples were prepared in duplicate or triplicate, which provided a total of 36 imaging samples distributed across 13 experimental locations.

Figure 1
figure 1

a An illustration of the experimental design, b an image of similar wood samples under illumination of the NIR camera, c and grayscale images of a sample (darker gray) and the selected region of interest, ROI (lighter gray). The values in c denote image pixel dimensions

The samples were first extracted in a Soxhlet apparatus with acetone for 6 h and then oven dried at 103 °C to determine their initial mass. The acetylation was performed using neat acetic anhydride. The samples were first impregnated with the anhydride under vacuum at room temperature and then transferred to a reaction flask, where they were treated with the acetic anhydride for 0, 20, 100, 180 or 360 min at 120 °C under reflux for adjusting acetylation degree. At the end of the treatment, the reaction flask was transferred onto ice to stop the reaction and the anhydride was replaced with acetone. The acetylated samples were again Soxhlet extracted with acetone for 6 h to remove leftover anhydride and reaction by-products and then oven dried at 103 °C to determine their modified mass and the WPG.

The acetylated samples were then conditioned in different relative humidities for determining the effects of acetylation on sample moisture content. Silica gel, salt solutions of calcium chloride, potassium iodide, potassium chloride or pure water were used as reagents for generating relative humidities 0, 33, 70, 85 and > 97%, respectively [23,24,25]. The acetylated samples were placed on sample holders above the reagents in plastic containers, which were sealed and stored in room temperature 20 °C for approximately 4 weeks.

Hyperspectral NIR imaging

The hyperspectral NIR images were taken from the sawn, transverse surfaces directly after the wood samples had been acetylated and conditioned in different RHs. The imaging setup has been previously been described in ref. [26] and consisted of a Specim SWIR 3 camera (Specim, Spectral Imaging Ltd.) equipped with a 105 mm OLES macro lens provided by Specim. As illustrated in Fig. 1b, two opposite rows of quartz halogen lamps generated polychromatic light and the reflected wavelengths were separated by a grating-prism monochromator followed by a HgCdTe detector array. The images were taken in line-scanning mode where a line of 384 pixels was continuously recorded on different wavelengths. A spectral range of 1000–2500 nm was used with a sampling interval of 5.6 nm, which provided 267 spectral variables. The lens field of view was 10 mm which resulted in a nominal pixel size of 26 × 26 \(\mu\)m2 [26]. The images were taken in reflectance mode and converted to absorbance units after correction with Spectralon white reference and dark current intensities. After imaging, the samples were placed back into the sample containers and weighed after all the samples had been imaged.

Raman microspectroscopy

Raman images were obtained from earlywood and latewood sections in a sample with 8.9% WPG and 9.2% moisture content located in the center of the experimental design (Fig. 1a). A cross section with a thickness of approx. 20 \(\mu\)m was cut from a water-saturated block on a sledge microtome (Lab-Microtome, Swiss Federal Research Institute WSL). The cross section was then placed on a microscope slide with a drop of deionized water, covered with a glass coverslip and edge-sealed with nail polish. The Raman images were acquired using a WITec alpha 300 RA Raman microscope equipped with a 532 nm frequency-doubled Nd:YAG laser operated at 30 mW, a 100 × immersion oil objective with a numerical aperture 1.25, and a DU970-BV EMCCD camera behind a 600 lines mm−1 grating. The image size was 50 × 50 \(\mu\)m2 with 180 lines per image and 180 points per line. The integration time was set to 0.3 s for collecting a Raman spectrum in each image pixel. Separate images were acquired from earlywood and latewood cells within the same cross section.

Image and data analysis

The obtained NIR images were interpreted using exploratory image analysis and hyperspectral image regression. As illustrated in Fig. 1c, square regions of interest (ROIs) of 384 × 384 pixels were selected from the middle of each sample image and median filtered to remove the effects of dead pixels in the camera detector. One sample ROI from each experimental design location was selected and combined into a mosaic of 9 sample ROIs. The mosaic was decomposed using a principal component analysis (PCA) [27, 28] model after the pixel spectra had been preprocessed with standard normal variate (SNV) [29] correction and mean centering and unfolded into a two-way array, Eq. (1):

$$ {\mathbf{X}} = {\mathbf{t}}_{\mathbf{1}} {\mathbf{p}}_{\mathbf{1}}^{{\mathbf{T}}} + \ldots + {\mathbf{t}}_{{\mathbf{n}}} {\mathbf{p}}_{{\mathbf{n}}}^{{\mathbf{T}}} + {\mathbf{E}}_{{\mathbf{n}}} $$

where X denoted an unfolded matrix of preprocessed and mean-centered pixel spectra, t and p the orthogonal score and orthonormal loading vectors, respectively, and En a residual matrix after n principal components. The principal component score vectors t were then refolded back to the mosaic dimension to illustrate the results through score images.

Image calibration was performed using partial least squares (PLS) regression [30, 31] on measured WPG values. The model vector b in the general regression equation, Eq. (2):

$$ {\mathbf{y}} = {\mathbf{Zb}} + {\mathbf{e}} $$

where y denoted a vector of measured and mean-centered WPG values, Z a matrix of SNV transformed and mean-centered calibration spectra, and e a vector of model residuals, was determined using the SIMPLS algorithm [32]. The image ROIs were divided into separate calibration and test sets that represented the entire experimental design range and six average spectra were obtained from each sample ROI for calibration and validation. Root-mean-squared errors (RMSE) of calibration (RMSEC) and prediction of the test set (RMSEP) were calculated based on Eq. (3):

$$ {\text{RMSE}} = \sqrt {\frac{{\mathop \sum \nolimits_{j}^{n} \left( {y_{j} - \hat{y}_{j} } \right)^{2} }}{n}} $$

where \(y_{j}\) and \(\hat{y}_{j}\) denoted the measured and predicted values, respectively, and n the number of calibration or test objects. In addition, a separate RMSEP parameter was determined based on the pixel populations of the test set ROIs to evaluate image over-fitting [33], Eq. (4):

$$ {\text{RMSEP}}_{{{\text{im}}}} = \sqrt {\frac{{\mathop \sum \nolimits_{i}^{k} \mathop \sum \nolimits_{p}^{m} \left( {y_{pi} - \hat{y}_{pi} } \right)^{2} }}{km}} $$

where \(y_{pi}\) and \(\hat{y}_{pi}\) denoted the measured sample and predicted pixel values, respectively, and m the number of image pixels in k test set images.

The measured Raman images were interpreted using PCA and PCA-based pixel cluster analysis after the earlywood and latewood images were combined into an image mosaic. Wavelengths outside 500–3500 cm−1 were excluded to remove unnecessary noise, and the pixel spectra were preprocessed by baseline removal using a second-order polynomial followed by normalization to unit length to correct for intensity differences between the two images. Random spikes were removed with a median filter applied in the image dimension. Principal component scores and loadings were determined as in Eq. (1) after preprocessing, mean-centering and mosaic unfolding. The obtained PCA scores were used for pixel clustering based on a k-means algorithm that determined similarities between individual pixels and the cluster centroids based on correlation (cij) in the principal component score space, Eq. (5):

$$ c_{ij} = {\mathbf{t}}_{{\mathbf{i}}} \cdot {\mathbf{t}}_{{\mathbf{j}}} = \left| {{\mathbf{t}}_{{\mathbf{i}}} } \right|\user2{ }\left| {{\mathbf{t}}_{{\mathbf{j}}} } \right| cos\theta $$

where ti and tj denote the score vectors of pixels i and j normalized to unit length after n retained principal components (Eq. 1). As illustrated in Eq. (5), for vectors which are unit length, correlation described the angle and not the distance between the determined pixel scores in principal component space. A given number of cluster centroids was initially chosen based on largest distances from the center of the score space and updated upon the assignment of each pixel. All data analyses were performed with in-house Matlab® (Version 2018b, The MathWorks, Inc.) routines that utilized commercial functions from the PLS (Version 8.7) and MIA toolboxes (Version 3.0.7) from Eigenvector Research, Inc. The initial NIR image segmentation was done in Breeze (Version 2019.02, Prediktera AB), and the final results were plotted in OriginPro (Version, OriginLab Corp.).


We acetylated and then conditioned wood samples in different relative humidities based on an experimental to also determine the effects of acetylation and WPG on moisture uptake based on hyperspectral NIR images. The determined PCA model showed that the effects of acetylation and sample conditioning were clearly visible in the NIR image mosaic. The first three principal components explained 96% of the variation in the preprocessed and mean-centered spectra and provided a meaningful chemical interpretation for the differences within the mosaic. The first principal component explained 79% of data variation and mainly separated the samples based on relative humidity during conditioning. As illustrated in Fig. 2, samples conditioned under higher relative humidity showed an increase in the pixel score values. In general, positive scores in PCA are associated with increased absorbance on positive loadings, which indicated that sample conditioning increased absorbance at approximately 1410 and 1930 nm based on the loading vector. These are the main NIR peaks for water-related vibrational modes and are generated by the first overtone of the O–H stretching vibration generally detected at 1414 nm and the O–H stretching vibration and O–H deformation vibration from H2O itself within 1916–1942 nm [34]. The first score image also suggested that the acetylated samples had lower final moisture contents as the score values within a specific relative humidity decreased towards increasing acetylation degree. It should be noted that small deviations from NIR peaks reported in the literature are normal in hyperspectral image analysis due to the comparatively lower spectral resolution of imaging spectrographs.

Figure 2
figure 2

Score images and the respective loading vectors of the first three principal components based on the NIR image mosaic. The color scales denote pixel score values

The second and third principal components each explained 8–9% of the remaining data variation and illustrated the effects of wood acetylation within the mosaic. Especially, the earlywood parts of the sample surfaces showed significant changes with acetylation (Fig. 2). The second principal component represented a combination of acetylation and sample conditioning based on the score image. The negative loading peak identified at approximately 1150 nm most likely corresponded with the second overtone of the C-H stretching vibration of methyl groups and aromatic moieties in lignin or methyl groups of acetyl esters in hemicelluloses generally seen at 1143 and 1157 nm, respectively [34]. The second overtone of the C=O stretching vibration in hemicellulose is generally seen at 1910 nm [34], which suggested an increase in acetyl groups with negative pixel scores. The differences across the sample scores also highly resembled the first score image, which further illustrated the correlation between acetylation and sample moisture content. The second acetylation peak at circa 2260 nm was likely caused by O–H and C-O stretching vibrations of acetyl groups generally seen in acetylated wood at 2255 nm [34].

The third principal component provided the clearest distinction in acetylation based on the mosaic. A clear trend in the pixel scores was seen towards the direction of increasing sample acetylation. The positive loading peaks at approximately 1160, 1680, 1720, 1900 and 2255 nm were consistent with an increase in the methyl groups of acetyl esters in hemicelluloses (reported at 1157 and 1175 nm), acetyl groups in acetylated wood based on the first overtone of C-H stretching vibration (1681 and 1721 nm), carbonyl groups in hemicelluloses based on the second overtone of the C=O stretching vibration (1907 and 1910 nm), and acetyl groups in acetylated wood based on O–H and C–H stretching vibrations (2255 nm) [34]. The remaining broader loading peak at circa 1410 nm was likely a combination of two separate peaks. The first overtone of O–H stretching vibrations from phenolic hydroxyl groups or wood lignin is generally seen at 1410 nm [34], of which the latter was in line with the general increase in lignin in earlywood compared with latewood [35]. The peak also showed a clear shoulder on the left at approximately 1350 nm, which suggested an increase in the first overtone of C–H stretching vibrations and C–H deformation vibrations from acetylated wood components (1350 nm) [34].

After PCA, we divided the 36 sample images into separate calibration and prediction sets with 23 and 13 objects, respectively. Six average spectra were then extracted from each sample image, and calibration models were determined for gravimetrically measured WPG values based on SNV transformed and mean-centered calibration spectra. The effects of spectral preprocessing on the extracted calibration spectra are illustrated in Fig. S.1 (Supplementary material). Average prediction errors based on model calibration (RMSEC), prediction of the test set (RMSEP), and prediction of the test set image pixels (RMSEPim) were then used for determining the correct number of latent variables for the PLS calibration model. As illustrated in Fig. 3a, no clear shoulders were observed in the determined diagnostics as RMSEC and RMSEP monotonically decreased as more model components were used. Even RMSEPim did not clearly suggest a correct number of latent variables through increasing prediction errors based on the test set image pixels that would have indicated potential model overfitting [33].

Figure 3
figure 3

a Calibration and prediction errors as a function of PLS latent variables, b minimum average test set prediction errors on five latent variables based on iPLS, c calibration and prediction errors with 9 intervals based on iPLS, d measured and predicted WPG values of the calibration and test sets with three latent variables and 9 intervals based on iPLS, and e the PLS model vector with three latent variables and 9 intervals based on iPLS

However, we initially used the entire wavelength range within 1000–2500 nm and not all wavelengths are necessarily equally important for a spectral calibration model. Thus, the best combination of spectral variables for predicting WPG and minimizing RMSEP was determined based on interval PLS (iPLS) [36]. One variable (1005 nm) was first omitted and the wavelength range was divided into 14 equal intervals, which each consisted of 19 spectral variables. All possible combinations of using 1–14 wavelength intervals were then applied for model calibration and determining RMSEP based on five latent variables. Five latent variables were initially chosen based on RMSEPim, which indicated that adding more model components did not provide significant improvement in model performance (Fig. 3a). A total of 16,383 calibration models were determined, and the combinations with the lowest average test set prediction errors values are illustrated in Fig. 3b.

Based on the iPLS results, using 9 out of the original 14 intervals provided the lowest average prediction errors on five latent variables and decreased RMSEP from 0.87 to 0.49% WPG (Fig. 3b). As shown in Figs. 3c, using these 9 intervals also provided a clearer indication of the correct number of latent variables based on RMSEPim which suggested that using four or more model components increased prediction errors based on the pixel populations of the test images. Three latent variables provided the lowest RMSEPim and led to satisfactory prediction of the calibration and test sets with average prediction errors of 0.75% and 0.91% WPG based on extracted mean spectra, respectively (Fig. 3d). The final model vector illustrated in Fig. 3e also included the main peaks affected by sample acetylation as was earlier indicated by the third principal component during the interpretation of the NIR sample mosaic.

The calibration model on three latent variables was then used for predicting the WPG values of the test images and examples are illustrated in Fig. 4. Clear differences were visible in the predicted WPG values of the earlywood and latewood regions of the wood surfaces especially with the samples that were situated in the center of the experimental design (Fig. 4b). The results indicated that the earlywood regions had significantly higher WPG values and had thus been more severely acetylated than the latewood regions of the wood surface. These earlywood and latewood differences were also clearly visible in the pixel histogram, which suggested that the WPG values of the individual pixels could be described with two different distributions (Fig. 4b). The differences were less pronounced but still visible in the heavily acetylated samples (Fig. 4c).

Figure 4
figure 4

Predicted weight percent gains (WPG, %) of the test images (above) and the respective pixel histograms (below) for a a non-acetylated sample, b an acetylated sample from the center of the design, and c a heavily acetylated sample. The red vertical lines in the histograms denote gravimetrically measured WPG values and the blue lines the respective model predictions expressed as an average of the pixel populations

The spatial resolution in hyperspectral NIR imaging is commonly limited to the macrostructure level, which enables separating the earlywood and latewood parts of the wood surface but does not allow determining chemical differences on the cell level. For determining cell level differences between earlywood and latewood acetylation, we recorded Raman images of wood cells separated from different wood sections from the sample illustrated in Fig. 4b. The Raman images were combined into a mosaic and preprocessed using baseline removal, normalization and median filtering. Pixels corresponding to cell lumens were then removed, and the remaining mosaic was then decomposed using PCA. The first four principal components shown in Fig. 5 explained 91% of data variation and the first, third and fourth components provided a meaningful interpretation of the chemical differences within the earlywood and latewood cell walls. The second principal component mainly represented measurement uncertainties and a loss of focus due to differences in sample height on the focal plane. After four principal components, the noise in the loadings also increased significantly.

Figure 5
figure 5

Score images and respective loading vectors of the first four principal components based on the Raman mosaic showing earlywood (left) and latewood (right) cells. The color scales denote pixel score values

The first component explained 61% of data variation within the Raman mosaic and separated the image pixels into areas of high lignin and high carbohydrate contents. As illustrated in Fig. 5, these areas showed increasing intensities from the aromatic ring stretch band at 1600 cm−1 and the carbohydrate C–H stretch at 2880 cm−1 [37], respectively. The effects of wood acetylation appeared primarily in the third principal component, which explained 9% of data variation. The score images showed positive score values in the cell walls and the middle lamellae in earlywood, whereas in latewood positive scores were only seen in the inner cell wall layers. The loading vector showed corresponding positive bands at 2940 cm−1 and 1660 cm−1 due to the C–H stretch of O–CH3 groups and the C=O of acetyl groups previously observed in the Raman spectra of acetylated lignin [37]. The loading also includes a positive band at 1725 cm−1, which was likely due to the carbonyl stretching of acetyl groups previously reported for acetylated lignin and wood [37, 38]. The fourth component explained 5% of the remaining data variation and mainly illustrated the outer layers of the cell walls based on a strong contribution from the orientation sensitive cellulose band at 1090 cm−1 [39].

The score values of the first, third and fourth principal components were then used for unsupervised classification of the mosaic pixels. The classification was based on a k-means clustering algorithm which evaluated similarity based on correlation in principal component space. A given number of cluster centroids was initially chosen, and the final number of clusters were determined based on average correlation between the pixels and the cluster centroids. We finally used five clusters with an average correlation of 0.91 as using six or more provided only minute improvements in average correlation (Fig. S.2) and did not enhance interpretation of the results. The obtained class image is illustrated in Fig. 6. Average mean-centered spectra of the corresponding classes are given in Fig. S.3 for brevity.

Figure 6
figure 6

Class image of earlywood and latewood cells based on k-means clustering of principal component scores determined based on the Raman image mosaic

As illustrated by the results, the effects of acetylation were associated with classes 2 and 4 which represented earlywood cell walls and middle lamellae (Fig. 6). The average spectrum of class 2 showed a negative band of the carbohydrate C-H stretch region at 2890 cm−1 (Fig. S.3). The class 4 spectrum featured a positive band at 2890 cm−1 and negative bands at the aromatic ring stretch region at approximately 1600 cm−1. The average spectra of both classes also showed positive acetylation marker bands at approximately 2940 and 1720 cm−1 which indicated that classes 2 and 4 represented acetylated low and high carbohydrate regions, respectively. In earlywood, these regions encompassed the entire cell wall and most of the middle lamellae, while the cell corner region was represented by class 1. The average spectrum of class 1, however, contained no acetylation marker bands, which suggested that their intensities were close the mean intensity of the entire data set.

In latewood, the cell corner regions were also represented by class 1, but the acetylated regions represented by classes 2 and 4 extended only halfway through the cell wall (Fig. 6). The remaining central cell wall layer was represented by class 5, while the outer cell wall layers and parts of the middle lamellae were represented by class 3. The mean-centered average spectra of these classes showed either no or featured negative acetylation marker bands, which indicated that the outer cell wall regions were less acetylated in latewood cells.


Our objective was to determine the spatial differences in wood acetylation based on hyperspectral NIR image regression. Already the exploration of the NIR image mosaic based on PCA suggested that there were significant differences in the acetylation and resulting moisture content of earlywood and latewood and proved useful for visualizing the chemical differences within the experimental domain. NIR image calibration based on extracted mean spectra provided a reliable calibration model for predicting acetylation through sample WPG. The variable selection procedure enabled choosing the most important wavelength combinations and improved estimating the pseudorank of the final model, which resulted in average prediction errors of 0.91% WPG based on the test set mean spectra and three latent variables. Such a prediction error within a range of 17.6% WPG makes our spectral calibration model suitable for quality control applications based on the general rules of thumb in the NIR field [7]. Our capability to predict WPG within a broad humidity range further demonstrated the suitability of the calibration model.

Once the reliability of the calibration model had been established, we predicted the WPG values of the individual test set image pixels which showed clear differences in the acetylation of earlywood and latewood. These differences earlywood and latewood acetylation, however, decreased in the heavily acetylated samples, which suggests that they were caused by the different rates of acetylation in earlywood and latewood. Since the rate of acetylation is generally controlled by anhydride diffusion in the wood cell walls [41], the thin-walled earlywood can be expected to acetylate faster than the thick-walled latewood. A significantly higher reactivity of acetic anhydride has previously been found for lignin compared with hemicelluloses and cellulose based on isolated cell wall compounds [40, 42]. As earlywood generally has higher lignin contents than latewood [35, 43], this difference in chemical composition may also have contributed to a higher acetylation rate in earlywood.

We have recently observed similar differences in the acetylation rate of earlywood and latewood when the acetylation was initiated after a confined, one-sided flow of acetic anhydride into the wood [22]. However, earlier studies in the wood field, in which wood samples were completely soaked in acetic anhydride prior to acetylation, have found little to no difference in the WPGs of earlywood and latewood. The results from these earlier studies stand in contrast to our present study, which can likely be explained by differences in experimental methodologies. As an example, Moon et al. [44] and Sadeghifar et al. [45] both performed bulk measurements on mechanically separated and heavily acetylated earlywood and latewood with WPGs in the range 15–22%. It is possible that the authors have been unable to detect the narrow low-WPG regions that persisted in heavily acetylated latewood based on our results. With increasing acetylation time and the resulting high WPG values, the differences between earlywood and latewood acetylation are, however, likely to disappear completely.

We also determined the effects of acetylation on the wood cell level using additional Raman measurements. Unsupervised classification of the Raman images enabled choosing the chemically meaningful principal components for pixel classification after PCA decomposition of the Raman mosaic. The classification results supported the NIR results and revealed that the degree of acetylation decreased from the cell lumens towards the cell corner regions in the thick-walled latewood cells. This explains the observed difference in acetylation degree between earlywood and latewood and supports our hypothesis that the acetylation of the latewood cell walls was limited by the diffusion of acetic anhydride within the wood cells.


We have illustrated how hyperspectral imaging coupled with chemometrics revealed wood acetylation on the wood surface and cellular levels. These results extend our recent findings where we predicted the WPG values of image pixels based on cross-sectional hyperspectral NIR images after confined one-sided acetylation of wood samples. Our current results showed that this approach is not limited to evaluating the penetration depth of acetic anhydride and can be successfully extended to take into account changes in wood equilibrium moisture contents. Unsupervised classification of Raman images obtained from acetylated earlywood and latewood cells supported our hypothesis that the observed differences in earlywood and latewood acetylation on the wood surfaces were caused by differences in the diffusion of acetic anhydride within the earlywood and latewood cells. Overall, our findings pave way for developing novel machine vision methods for the wood field and are an important step for understanding how chemical treatment affects the different hierarchical structures of wood and their hygroscopic properties on different spatial scales.