Introduction

Italy has a rich papermaking history, having been one of the first great centres of European papermaking. Commercial mills were established as early as 1268, and medieval mills in Northern Italy supplied large quantities of paper to Central and Northern Europe until the 15th century [1,2,3]. Fundamental changes were introduced compared to papermaking in the Near East, including the use of stamping hammers to macerate rag fibres, the inclusion of watermarks and the introduction of animal glue as a sizing agent, replacing starch. Historically, European paper was widely made from macerated linen and hemp fibres, and later cotton, beaten by hand or by mechanical beaters, and sheets were formed on large sieve-like screens [1].

As demand for paper increased exponentially in the late 1800s, papermakers were unable to source an adequate quantity of rags. In response, lignin-containing wood fibres were introduced into the pulp mixture from the 19th century on, and groundwood pulp became the main fibre source for commercial paper. In the 20th century, bleached pulp largely replaced groundwood pulp and a shift in sizing took place. Pure animal glue was predominantly used from the 13th to the 17th century, when alum was introduced and mixed into the glue. In the 19th century, rosin gradually replaced gelatine as the main commercial sizing agent [4], following Moritz Illig’s invention of rosin-alum sizing in 1807. Sizing with rosin-alum is now rare; reactive sizing agents that work under neutral or alkaline conditions, such as alkylketene dimer or alkenyl succinic anhydride, have become far more common.

There is mounting concern about the deterioration of many book and paper collections. It is well known that the degradation of paper-based materials is a function of various properties of paper, as well as storage and use conditions, e.g., temperature (T) and relative humidity (RH) [5]. The effect of the internal and external factors is largely dependent on the instability of the raw materials used; the presence of lignin and acidic sizing, such as rosin, greatly accelerates the deterioration of paper objects. As estimated by numerous large-scale condition surveys [6,7,8,9,10,11,12,13,14,15,16,17,18,19,20], considerable proportions of items in libraries and archives are already deteriorated, and an alarming percentage (70–85%) of paper materials produced between 1850 and 1950 is prone to fast degradation and is not likely to remain accessible to readers in the 22nd century [21].

Collection surveys are important tools as they assess the extent of deterioration in collections, following which appropriate conservation programmes can be planned. In the past, many survey tests were destructive or invasive [6,7,8,9,10,11,12,13,14,17,19]. Manual fold tests, where a page corner is folded back and forth until it breaks, were occasionally used to determine the brittleness of the paper. The acidity of paper was commonly measured using pH testing pens. The destructive nature of these tests has prevented such condition surveys from being applied to rare and historical books. In fact, most of the literature addresses surveys carried out on academic or test collections, while historical books [22,23,24] have rarely been analysed. As a result, there is a lack of information regarding the current chemical and physical properties of rag paper collections.

The cultural and historical value of such collections restricts the choice of methods of analysis, and non-destructive and non-invasive methods such as near-infrared (NIR) spectroscopy are preferred. NIR spectroscopy in combination with chemometric data evaluation to characterise paper [25, 26] allows rapid determination of the main chemical and physical paper properties [27], and was calibrated and validated using a reference set of 1400 historical and modern European papers, characterised using destructive methods, such as viscometry to measure the degree of polymerisation of cellulose [28, 29].

The present study was undertaken to address the current material state of the paper-based collections in the historical Classense Library in Ravenna, Italy. This presents an opportunity to quantitatively survey a large number of historic rag papers for the first time. The items included incunabula, manuscripts and printed books from the 14th century until the first half of the 20th century. Visual assessment was used as it has been demonstrated that physical deterioration such as missing pieces influences readers/visitors’ subjective judgments of fitness-for-use to a greater extent than discolouration or tears, which have little or no influence [30]. In Part II [31], the results will be further elaborated with the support of a dose–response function recently modelled for historical paper [32], in order to assess potential preservation scenarios.

Materials and methods

Classense Library

The Classense Library is located in the urban area of Ravenna (Northern Italy). The 16th-century building was originally designed as a monastery and converted into a library in 1803 to house the monastery collection [33, 34]. Currently, it covers about 28,000 m² and houses more than 800,000 items [35], including ancient and modern printed books, parchment and paper manuscripts, photographs and maps [36,37,38].

In the present study, only paper-based materials were surveyed. The paper-based collection of the Classense Library is divided into two sub-collections. The most valuable items are housed in a secure room called the Caveau, where the environmental conditions have been mechanically controlled since 2012 (C collection). The C collection consists of about 1000 items from the 14th century to the late 19th century. Access to the Caveau is restricted to library staff and special permissions are required to view the items held there. All other items are housed in the numerous rooms of the Classense Library on open shelving, where the environment is not mechanically controlled (NC collection) and most of the books are freely accessible to the public. The NC collection includes about 54,000 items of interest for this survey, about half of which are books printed from the 16th to the 19th century, the other half being dated between 1901 and 1950.

Sampling strategy

Stratified random sampling was carried out. The minimum sample size for each collection was calculated according to the equations reported by Daniel [39], considering a confidence level of 95% and a tolerance of ± 10%. As both the C and NC collections include books that span a wide range of publication dates, the date of publication was used as the stratification criterion, both to reveal possible changes in papermaking and to model degradation. The number of books surveyed in each stratum was based on a proportional strategy [39], modified to ensure a minimum number of books (12 for the NC collection) in the least numerous strata. This led to a total of 297 items analysed, 145 from the C collection and 152 from the NC collection. These numbers are more than the 192 books suggested by Drott’s method [40] with the same confidence level and tolerance. Table 1 reports the strata and the number of books analysed for the C and NC collections.
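The minimum sample size per collection can be illustrated with a short sketch. It assumes that the equation taken from Daniel [39] is the standard finite-population formula for estimating a proportion, with the most conservative proportion p = 0.5 and z = 1.96 for 95% confidence; the source does not print the equation, and the function name is hypothetical.

```python
import math

def minimum_sample_size(population, tolerance=0.10, z=1.96, p=0.5):
    """Finite-population sample size for a proportion (assumed formula).

    n = N z^2 p (1 - p) / (d^2 (N - 1) + z^2 p (1 - p))
    """
    variance_term = z ** 2 * p * (1 - p)
    n = population * variance_term / (tolerance ** 2 * (population - 1) + variance_term)
    return math.ceil(n)

# Approximate collection sizes quoted in the text.
n_c = minimum_sample_size(1000)     # C collection, about 1000 items
n_nc = minimum_sample_size(54000)   # NC collection, about 54,000 items
print(n_c, n_nc)
```

Under these assumptions, the 145 and 152 items actually surveyed are both above the computed minimum for their collection.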

Table 1 Sample sizes as the number of items analysed in each stratum of the C and NC collections

The stratum sizes in terms of years differ between the C and NC collections. The strata were defined separately for each collection because the accuracy of the publication dates differs between the two. The overlap between strata of the C collection (i.e., 1301–1400 and 1351–1450, 1401–1500 and 1451–1500) is due to these different accuracies. For dated volumes, the age of paper was approximated with an estimated standard error of 2.5 years, assuming that paper was used within 5 years of its production. For books that can only be dated to a century, the approximated publication date was the middle of the century, and the estimated standard error was 50 years (e.g., for a book dated to the 15th century, 1450 ± 50).
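The dating rule described above can be written down in a few lines. This is only a sketch of the stated convention; the function name and its arguments are hypothetical and do not come from the survey workflow.

```python
def approximate_publication_date(year=None, century=None):
    """Return (approximate date, standard error in years).

    Dated volumes keep their year with SE = 2.5 years; volumes dated only
    to a century are placed at mid-century with SE = 50 years.
    """
    if year is not None:
        return float(year), 2.5
    if century is not None:
        return (century - 1) * 100 + 50.0, 50.0
    raise ValueError("either year or century must be given")

dated = approximate_publication_date(year=1896)
century_only = approximate_publication_date(century=15)
print(dated, century_only)
```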

Non-destructive analysis

Due to management restrictions, the survey could only be conducted using non-destructive techniques. Therefore, visual assessment and NIR spectroscopy were carried out in a secure room in September 2017, mostly by two assessors. Visually observed changes which may affect book functionality (e.g., detached cover, tears, missing pieces) and its aesthetic appreciation (discolouration) were recorded and reported in Additional file 1, as the present paper mainly focuses on the analysis of the quantitative variables.

Spectroscopic analysis

The SurveNIR instrument (Lichtblau e.K., Germany) [27,28,29] is a reflectance spectrometer with an InGaAs diode array detector (256 pixels). Reflectance spectra were collected in the interval of 1101 to approximately 2199 nm at a 2 nm resolution. The penetration depth of NIR radiation in a matrix of organic substances is typically 1–3 mm [41]. It has been previously estimated that information is returned from up to ca. 0.5 mm depth in paper-like materials [25]. Eight measurements, each an average of 500 accumulated spectra, were taken on different pages of each book in an area without ink or visible signs of localised degradation. The spectra were automatically filtered and averaged by the dedicated SurveNIR SUSO software running on a connected laptop. Predictions of physical and chemical properties of the paper were supplied within a few seconds using the chemometric analysis of the spectra. SurveNIR provides identification of the paper type as well as pH, degree of polymerisation (DP), tensile strength (TS), tensile strength after folding (TSF) [42], lignin content, protein content, and rosin content values. In addition to the measurement outputs, the NIR spectrum for each measurement was recorded and a photograph of the measurement location was captured. Table 2 shows the evaluated SurveNIR instrumental errors for the various quantities measured in each paper type.

Table 2 SurveNIR instrumental errors for each paper type

Statistical analysis

Different statistical methods were employed to characterise the surveyed collections. Principal component analysis (PCA) [43,44,45] and multiple linear regression (MLR) [46] were performed using OriginPro 2017 (Origin Lab Corporation). The PCA and MLR were built on a single dataset (i.e., analysed rag paper books). The pairwise correlations (see Additional file 1) were carried out on the quantitative variables for all the observations (i.e., the analysed books).

Results and discussion

After pre-assembling the books, 297 items were analysed by two assessors in about 49 h, at ~ 10 min per book; bibliographic information, including title, author and date of publication, was also recorded (see Additional file 2). The survey primarily reports data on historical rag papers, as the majority of the measured books pre-date the 19th century. Two large-scale studies have previously investigated the chemical and physical properties of historic rag papers [22, 24], and the present study offers additional data which allows an in-depth analysis of rag paper degradation. A summary of the methodology and results of the visual assessment can be found in Additional file 1.

As the collections were separated into climate-controlled and non-controlled collections by the library only 6 years ago, they have been grouped together for parts of this study, as the last 6 years of degradation likely represent an insignificant period compared to the age of the objects (1–5% of the object’s lifetime). The collections will be separately dealt with in Part II [31] as the effects of the different environments on the future condition of books will be modelled.

Quantitative survey data evaluation

Paper typology

The SurveNIR system discriminates between four European paper types, i.e., rag paper, groundwood pulp paper, bleached wood pulp and coated paper, using the chemometric models obtained from a reference sample set of 1400 European papers [28, 29]. Three paper types were present in this survey: rag paper (82%), groundwood pulp paper (13%), and bleached wood pulp (5%). The change in fibres follows the general evolution of European papermaking [47]; the oldest book made of bleached pulp paper found in this study was dated 1896, while the oldest book made of groundwood paper may be dated to 1834.

Acidity, protein and rosin content

It is well known that acidity plays a very important role in paper degradation: the more acidic the paper is, the faster the degradation proceeds [5, 48]. Figure 1 shows the pH for the analysed books as a function of publication date and paper type. The estimated pH values cover a wide range; 75% of all the analysed books are close to neutral (pH 6–8), and most of these are rag paper. Groundwood pulp papers are characterised by low pH, while the books made of bleached pulp paper have a broad spread of pH values.

Fig. 1 pH of rag (squares), groundwood (circles), and bleached pulp (triangles) papers as a function of the approximated publication date. The error bars indicate the instrumental errors for each paper type

Compared to Barrow’s large-scale study of rag paper [22], which measured the acidity of 1470 book papers made between 1507 and 1949, the results are on average similar. Historic rag paper has an average pH of 6.1 in Barrow’s study and 6.6 in this study. Both studies observe a sharp increase in acidity from the mid-19th century onwards. Gelatine (derived from collagen, the connective tissue in skin, cartilage, sinews and ossein of animals) was the most common sizing agent used in Italian papermaking from 1337 until acidic alum-rosin sizing was introduced in 1807. During the 17th century aluminium potassium sulfate was added to gelatine size to stabilise the viscosity of the size at various concentrations and temperatures, to inhibit biological growth, and to increase the resistance of the size to ink penetration [49, 50].

It has been shown that gelatine is beneficial to paper, decreasing its degradation rate, as the protein can act as a physical barrier and chemical buffer [51, 52]. The presence of gelatine sizing was measured by protein content (%) in this study and has been described as a function of publication date and paper type in Fig. 2. The measurements show a continual decrease of protein content towards the early 19th century when alum-rosin size became widely used. Similar trends for protein use in Italian papermaking were recently observed by Barrett et al. [24], who found higher concentrations of protein in pre-1500 paper, as papermakers were attempting to imitate parchment.

Fig. 2 Protein content (%) of rag (squares), groundwood (circles), and bleached pulp (triangles) papers as a function of the approximated publication date. The error bar indicates the instrumental errors for all paper types

The effects of the addition of alum to gelatine, discussed extensively in Barrow’s research [22] as a factor that increased acidity in papers produced from the mid-17th century, could not be clearly observed in this study as the presence of aluminium was not tested during this survey. However, the use of alum was not recorded in the technical data included in early Italian papermaking descriptions [23] and high contents of protein (> 4%) were measured until the late 18th century in this study. It is possible that most of the measured rag papers did not contain high concentrations of alum as high acidity was not observed. High contents of protein could have also acted as a buffer, as observed by Barrett et al. [24].

From the early 19th century on, rosin was added to the pulp and precipitated with aluminium sulfate to size paper internally. Figure 3 describes rosin content (mg/g) as a function of publication date and paper type from 1800 onwards. The results indicate a steady increase of rosin content in the books printed after 1850, as expected. It has previously been shown that high quantities of alum are detrimental to the long-term stability of paper [53]. The average pH of rosin-containing paper (> 2 mg/g) measured in this survey was 4.4, resulting in chemically unstable post-1850 books.

Fig. 3 Rosin content (mg/g) of rag (squares), groundwood (circles), and bleached pulp (triangles) papers as a function of the approximated date of publication post-1800. The error bar indicates the instrumental errors for all paper types

Lignin content

Lignin is a polymeric structural constituent of wood and other plant tissues and is undesirable as it may cause discolouration [54, 55]. The lignin content in paper varies considerably depending on the original fibres and the pulping process; e.g., the amount of lignin in groundwood is much higher than in rag paper or bleached pulp paper. The fibre sources of historic rag paper within Italy, including linen and later cotton rags, naturally contained little or no lignin, as observed in Fig. 4, which shows lignin contents (mg/g) measured in the analysed books as a function of publication date and paper type. As expected, there is a notable difference between rag, groundwood and bleached pulp paper. The lignin content of the books made of rag paper is typically below 25 mg/g, while the books made of groundwood paper have lignin contents between 73 and 303 mg/g. The books made of bleached pulp paper have less lignin than those made of groundwood paper, the lignin content of the bleached pulp paper ranging from 31 to 98 mg/g.

Fig. 4 Lignin content (mg/g) of rag (squares), groundwood (circles), and bleached pulp (triangles) papers as a function of the approximated publication date. The error bar indicates the instrumental errors for all paper types

Mechanical properties

The mechanical strength of paper reflects the chemistry, morphology and the fibre network structure of paper. Mechanical strength depends on the paper type and the environment, primarily relative humidity. Tensile strength (TS) and tensile strength after folding (TSF) were measured: TS reflects the strength of a sheet of paper not subjected to mechanical stress, while TSF represents the strength after folding using the Bansa-Hofer method [42]. Both are reported in terms of nominal force (N). Figure 5 displays the TS and the TSF as a function of publication date and paper type.

Fig. 5 TS (left) and TSF (right) of rag (squares), groundwood (circles), and bleached pulp (triangles) papers as a function of the approximated publication date. The error bars indicate the instrumental errors for each paper type

The structure of the paper and properties of individual fibres are reflected in the TS values. The mean TS values of groundwood and bleached pulp paper books are similar to each other, 35 and 38 N, respectively, and the mean TS value for rag paper is 51 N, which is statistically significantly different even given the measurement uncertainties (Table 2). It has been reported [23] that calcium carbonate was produced from the lime used during sheet formation, creating alkaline deposits in traditional Italian papers without affecting the fibre strength, and in addition, gelatine sizing may also improve the strength of the paper sheet [51, 52]. As expected, changes in papermaking processes result in a wider range of TS values for the 19th-century papers, where 49% of papers measure < 45 N, which is similar to Barrow’s study [22]. A lack of folding endurance can be the result of short fibre length or lack of inter-fibre bonding [56].

Rag paper data analysis

Experimental studies of paper degradation have mostly focused on chemical and physical analysis to understand the complex relationship between a limited set of experimentally controlled parameters. Collection surveys can offer significant and complementary datasets; however, statistical analysis is required to draw conclusions from data that often show higher scatter and uncertainties. The specific advantage of the data collected during the survey of the Classense Library collection is that it offers an unprecedented historic rag paper dataset covering a 600-year period from 1300 to 1900, and further data analysis focuses specifically on this dataset.

Degree of polymerisation

Since the most common analytical techniques to measure the DP, namely viscometry and size exclusion chromatography, require destructive sampling [57], condition surveys rarely report DP values. Figure 6 displays the inverse of the average degree of polymerisation (1/DP) [58,59,60] as a function of age for the measured rag papers only.

Fig. 6 1/DP of rag papers as a function of age, with the regression line fitted using York regression [61], taking into account the standard errors for age and 1/DP

The regression line in Fig. 6 shows that degradation of rag paper broadly follows the Ekenstam equation [58], where the rate constant of degradation in year⁻¹ can be calculated from the regression line slope. There is significant data scatter; however, both the slope and the intercept are statistically significant. The error bars represent the uncertainties of DP estimation (Table 2) and of paper age, and we used York regression [61] to account for the reported uncertainties for both x and y values (N = 243).
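A straight-line fit with uncertainties in both variables can be sketched as follows. This is a minimal pure-Python version of the iterative York scheme for uncorrelated x and y errors, not the implementation used for Fig. 6; the ages, 1/DP values and their standard errors are synthetic and purely illustrative, and the function name is hypothetical.

```python
import math

def york_fit(x, y, sx, sy, tol=1e-12, max_iter=100):
    """Fit y = a + b*x with errors in x and y (assumed uncorrelated).

    Returns (a, b, se_a, se_b).
    """
    n = len(x)
    wx = [1.0 / s ** 2 for s in sx]
    wy = [1.0 / s ** 2 for s in sy]
    # Ordinary least squares provides the starting slope.
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    for _ in range(max_iter):
        w = [wxi * wyi / (wxi + b * b * wyi) for wxi, wyi in zip(wx, wy)]
        sw = sum(w)
        xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
        ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
        u = [xi - xbar for xi in x]
        v = [yi - ybar for yi in y]
        beta = [wi * (ui / wyi + b * vi / wxi)
                for wi, ui, vi, wxi, wyi in zip(w, u, v, wx, wy)]
        b_new = (sum(wi * bi * vi for wi, bi, vi in zip(w, beta, v))
                 / sum(wi * bi * ui for wi, bi, ui in zip(w, beta, u)))
        converged = abs(b_new - b) <= tol * abs(b_new)
        b = b_new
        if converged:
            break
    a = ybar - b * xbar
    # Standard errors from the adjusted x values.
    x_adj = [xbar + bi for bi in beta]
    x_adj_bar = sum(wi * xi for wi, xi in zip(w, x_adj)) / sw
    se_b = math.sqrt(1.0 / sum(wi * (xi - x_adj_bar) ** 2 for wi, xi in zip(w, x_adj)))
    se_a = math.sqrt(1.0 / sw + x_adj_bar ** 2 * se_b ** 2)
    return a, b, se_a, se_b

# Synthetic, noise-free example (NOT survey data): 1/DP = 1/2360 + 4.2e-7 * age.
ages = [100.0, 200.0, 300.0, 400.0, 500.0, 600.0]
inv_dp = [1.0 / 2360.0 + 4.2e-7 * t for t in ages]
se_age = [2.5, 2.5, 50.0, 2.5, 50.0, 50.0]   # dated vs century-dated volumes
se_inv_dp = [3e-5] * len(ages)               # hypothetical uncertainty of 1/DP
intercept, slope, se_intercept, se_slope = york_fit(ages, inv_dp, se_age, se_inv_dp)
print(slope, 1.0 / intercept)
```

In this form the slope is the rate constant in year⁻¹ and the reciprocal of the intercept is the initial DP.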

Based on the value of the intercept, average DP0 (i.e., immediately after production, at t = 0) in this study was 2360 ± 130. Although the error interval for this value is asymmetrical due to the inverse function, the difference is small (< 5%) compared to both the instrumental error (Table 2) and to the usually observed inhomogeneity of rag paper and the associated uncertainty of DP, which is typically ~ 10% [62]. Therefore, the negative and the positive error were averaged to give the value of 130. Although DP 2360 appears small compared to the DP of native cellulose, it is comparable to the DP of other processed cellulosic fibres, and it is useful to remember that rag fibres were substantially pre-processed and pre-degraded before being used for paper production [1].
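The asymmetry of the DP0 error interval follows directly from inverting the intercept, as the sketch below illustrates. The standard error of the intercept is not reported in the text, so the value used here is a hypothetical one chosen only to give an interval of a similar order to the reported ± 130; the function name is also hypothetical.

```python
def dp0_with_asymmetric_error(intercept, se_intercept):
    """Convert the intercept (1/DP0) and its SE into DP0 with lower/upper errors."""
    dp0 = 1.0 / intercept
    upper = 1.0 / (intercept - se_intercept) - dp0   # positive error
    lower = dp0 - 1.0 / (intercept + se_intercept)   # negative error
    return dp0, lower, upper, (lower + upper) / 2.0

# Illustrative inputs: intercept matching DP0 = 2360, hypothetical SE of 2.3e-5.
dp0, err_minus, err_plus, err_mean = dp0_with_asymmetric_error(1.0 / 2360.0, 2.3e-5)
print(round(dp0), round(err_minus), round(err_plus), round(err_mean))
```

The positive error is always the larger of the two, because 1/x is convex; averaging the two gives a single symmetric value.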

The rate constant for chain scission is (4.2 ± 0.6)·10⁻⁷ year⁻¹, as deduced from the slope of the regression line in Fig. 6. It represents the first experimentally observed rate constant for historical rag paper, and it is valid for all rag paper, as the methods of production were similar outside Italy. In spite of the significant data scatter, the value found for the rate constant is consistent with that reported [63] for historical Korean Hanji paper without beeswax coatings (2.17·10⁻⁷ year⁻¹), considering that the rate of cellulose chain scission of Hanji paper was found to be about two times slower than that of rag papers [63]. The Collections Demography dose–response function [32] may help us validate this result, and in Fig. 7 we estimate the range of T and RH values where such a rate would be expected for paper with an average pH of 6.6, as was calculated for the 243 samples in this study. The standard error (SE) was calculated on the basis of the SE of the regression line slope in Fig. 6, and reflects data scatter and uncertainties of estimations of DP and age. It does not reflect the uncertainties of the dose–response function itself, or the uncertainties of pH estimation, so it is likely that the true SE is substantially bigger, although numerical error estimation is outside the scope of this article. The estimated SE of 0.6·10⁻⁷ year⁻¹, i.e., 15%, is comparable to the estimated errors for experimentally determined degradation rates [57], which is remarkable, given that the declared instrumental uncertainties and general data scatter are high.
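To make the reported values concrete, the Ekenstam relation 1/DP(t) − 1/DP0 = k·t can be evaluated with the fitted parameters. The sketch below uses only numbers quoted in the text (k, DP0 and the Hanji rate constant); the function names and the 500-year example are illustrative choices.

```python
K_RAG = 4.2e-7      # rate constant for rag paper, year^-1 (from the text)
SE_K_RAG = 0.6e-7   # its standard error, year^-1 (from the text)
K_HANJI = 2.17e-7   # rate constant reported for Hanji paper, year^-1
DP0 = 2360.0        # average initial DP (from the text)

def dp_after(years, dp0=DP0, k=K_RAG):
    """DP predicted by the Ekenstam equation after a given number of years."""
    return 1.0 / (1.0 / dp0 + k * years)

def years_to_reach(dp_target, dp0=DP0, k=K_RAG):
    """Time needed for DP to fall from dp0 to dp_target at a constant rate."""
    return (1.0 / dp_target - 1.0 / dp0) / k

dp_500 = dp_after(500)              # average rag paper after 500 years
rate_ratio = K_RAG / K_HANJI        # rag paper vs Hanji paper
relative_se = SE_K_RAG / K_RAG      # relative SE of the rate constant
print(round(dp_500), round(rate_ratio, 2), round(100 * relative_se, 1))
```

The rate ratio of just under 2 matches the statement that chain scission in Hanji paper is about two times slower than in rag paper.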

Fig. 7 Range of RH and T values that could lead to the observed rate of rag paper degradation (average pH 6.6) of 4.2·10⁻⁷ year⁻¹, based on the Collections Demography dose–response function for historic paper [32]. The green range represents ± 1SE interval for the rate, the yellow one ± 2SE, and the orange one ± 3SE

The coloured areas in Fig. 7 thus represent the potential past average storage conditions that the surveyed samples may have endured since production in order for paper to decay at the observed rate. Since the majority of the surveyed rag papers have been stored in the Classense Library building since their acquisition, it is reasonable to assume that the current average indoor climate reflects the past indoor climate reasonably well. The 2014 averages in the non-climate controlled areas were measured to be 7–17 °C and 50–70% RH in December and 24–28 °C and 53–63% RH in July/August [64, 65]. The external walls of the building are composed of fired clay bricks. The data measured during two monitoring campaigns shows that the Classense Library has a high thermal inertia [64]. Therefore, while August may be the hottest month indoors, December may not be the coldest one, as the annual minimum temperature in Ravenna is in January, and allowing for a time lag due to the thermal mass of the building, the coldest month indoors could be February. On the other hand, the average annual temperature over the last 10 years (from January 2009 to January 2019) in Ravenna has been 15 °C [66], meaning that a passive storage area in a building with a large thermal mass could have a lower annual average temperature than that. In addition, the climate has been warming up since the Little Ice Age in the 16th century [67], meaning that the past temperatures experienced by the collection may have been significantly lower than those measured in the same building today, and thus the above calculated rate of rag paper degradation may be a valid estimation.

PCA and MLR

Using PCA to visualise the relationships between all the data of rag paper books, the first two principal components account for 83% of the cumulative variance, which provides a useful approximation of the relationship between variables. TSF, lignin and rosin content were not included in PCA. TSF is very strongly correlated to TS (see Additional file 1: Fig. S6), the lignin content in rag paper is very small, and rosin is present only in books dated post-1800. Figure 8 shows the resulting biplot where the observations and positions of the considered variables are displayed.

Fig. 8 PCA of the observation data for rag paper

TS, pH and protein have similarly heavy loadings for PC1, as do TS, pH, and publication date for PC2. DP contributes little to the first component but has the maximum loading for PC2. In addition, protein content is negatively correlated to publication date, while vectors corresponding to TS and pH are positively correlated, as reflected in pairwise correlations (see Additional file 1: Fig. S6). It is known that tensile strength is influenced by intra- and inter-fibre bonding [68], which could be represented by DP and protein content, respectively. However, their contributions have never been quantitatively examined, and MLR could help us evaluate this relationship. As all of the observed variables depend on date, this was also included in the MLR analysis.

Figure 9 and Table 3 show the model outputs where age, DP and protein are considered as independent variables to determine tensile strength. The model explains ~ 60% of data variability, and indicates that all three variables are statistically significant (p < 0.05), p value of DP being the smallest, as reported in Table 3. The relationship between tensile strength and individual independent variables within the multiple-regression model can be visualised through partial leverage plots (Fig. 9), which are constructed from the residuals of tensile strength (Y) and the independent variable (X).

Fig. 9 Partial leverage plots of the TS multiple regression model with age, DP and protein classified as independent variables

Table 3 Outputs of the TS multiple regression model with age, DP and protein as independent variables

Age shows the largest spread of data within its partial leverage plot and it can be concluded that within the measured sample set, age of the paper has the least effect on tensile strength. Both DP and protein content can be viewed as significant predictors of tensile strength. The partial leverage plots (Fig. 9) show that while both DP and protein content have a strong linear relationship to tensile strength, the X and Y residuals of the DP partial leverage plot are more closely correlated, indicating that DP has a greater effect on tensile strength than protein content. In agreement, Zou et al. [69] found an almost linear correlation between DP and TS in artificially aged Whatman paper. The present results suggest that intra-fibre bonding is the principal driver for increased tensile strength of rag papers, and inter-fibre bonding is of secondary importance, which is a significant conclusion related to the use of DP as a general proxy for rag paper mechanical strength.
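The construction of a partial leverage plot can be sketched in pure Python: both tensile strength and the predictor of interest are regressed on the remaining predictors, and the two sets of residuals are plotted against each other. The data below are synthetic and purely illustrative, the function names are hypothetical, and this is not the OriginPro procedure used in the study.

```python
def solve_linear(a, b):
    """Solve a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(columns, y):
    """Least-squares fit of y on the given predictor columns (with intercept)."""
    n, k = len(y), len(columns)
    means = [sum(c) / n for c in columns]
    y_mean = sum(y) / n
    xc = [[v - mu for v in c] for c, mu in zip(columns, means)]
    yc = [v - y_mean for v in y]
    xtx = [[sum(p * q for p, q in zip(xc[i], xc[j])) for j in range(k)] for i in range(k)]
    xty = [sum(p * q for p, q in zip(xc[i], yc)) for i in range(k)]
    slopes = solve_linear(xtx, xty)
    return y_mean - sum(s * mu for s, mu in zip(slopes, means)), slopes

def residuals(columns, y):
    intercept, slopes = ols(columns, y)
    return [y[i] - intercept - sum(s * c[i] for s, c in zip(slopes, columns))
            for i in range(len(y))]

def partial_leverage(columns, y, j):
    """X and Y residuals for predictor j, plus the slope of the line through them."""
    others = [c for idx, c in enumerate(columns) if idx != j]
    rx, ry = residuals(others, columns[j]), residuals(others, y)
    slope = sum(p * q for p, q in zip(rx, ry)) / sum(p * p for p in rx)
    return rx, ry, slope

# Synthetic values (NOT survey data): age (years), DP, protein (%), TS (N).
age = [120.0, 180.0, 250.0, 310.0, 400.0, 470.0, 520.0, 600.0]
dp = [2100.0, 1500.0, 1900.0, 1700.0, 1300.0, 1800.0, 1200.0, 1400.0]
protein = [4.0, 1.5, 3.5, 2.0, 4.5, 1.0, 3.0, 2.5]
ts = [58.0, 41.0, 54.0, 46.0, 47.0, 44.0, 40.0, 42.0]

predictors = [age, dp, protein]
_, full_slopes = ols(predictors, ts)
partial_slopes = [partial_leverage(predictors, ts, j)[2] for j in range(3)]
print(full_slopes, partial_slopes)
```

The slope of each partial leverage plot equals the corresponding coefficient of the full regression model, which is why the scatter around that line shows how well a single variable explains tensile strength once the others are accounted for.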

Conclusions

This study investigated the chemical and physical properties of the paper-based collections housed at the historical Classense Library. A total of 297 books, including incunabula, manuscripts and books from the 14th to the 20th century, were analysed in a non-destructive and non-invasive way using visual assessment and NIR spectroscopy with multivariate data analysis. An age-stratified survey was performed to obtain a good spread of data on historical collections useful for modelling of degradation. For the first time, the survey reported parameters such as pulp type, pH, DP, TS, TSF, lignin, protein and rosin contents, for collections housed in a historical library. The results form the basis of an evidence-based evaluation of the permanence of historical papers and allowed for in-depth paper characterisation of the measured Classense books. This is specifically important as such an in-depth survey has never been performed on a substantial rag paper collection. The visual survey produced a baseline assessment of the current state of degradation caused by factors such as chemical degradation, biological growth, water, and use. These can be used both to assess the needs for conservation interventions and to estimate the effect of use on further mechanical degradation.

Analysis of the overall trends of physical and chemical properties of paper with age shows the expected rapid changes usually observed in paper produced between 1850 and 1950, mainly due to the introduction of acidic sizing. A comparison of the TS and TSF values indicates that the books made of groundwood are less mechanically stable. Lignin, protein and rosin contents measured are generally consistent with the development and the chronology of the European papermaking processes, as well as pulp types.

For the first time, this survey provided a set of quantitative data for rag paper, over a time period of 600 years, from 1300 to 1900. PCA shows that tensile strength is correlated to both protein content and DP, and following MLR it was established that DP was more closely correlated to TS than protein content, indicating that intra-fibre bonding could have a greater effect on the overall tensile strength of rag paper than inter-fibre bonding.

The analysis of DP changes over time provided an estimation of the average initial DP at time of paper production (DP 2360 ± 130), as well as the rate constant for chain scission (4.2 ± 0.6)·10⁻⁷ year⁻¹, which is broadly in agreement with predictions provided by the Collections Demography dose–response function taking into account the past climate in Ravenna. This is the first “real-time” experimental determination of the rate of degradation of rag paper.

In Part II, we will further elaborate some measured chemical properties to evaluate the effect of future environmental management scenarios, with the support of the Collections Demography dose–response function. Using isochrones and demography plots, we will assess the possible outcomes of preservation of the two collections housed in the Classense Library. Jointly, assessment of the current and future state of the historic collection will represent the first comprehensive evaluation of a historic collection and thus provide heritage managers with evidence for informed preservation decision making.