Introduction

The fragmentation levels, or breakage rates, differ from one artifact group to another due to the formation processes in the ground while the artifact remains buried (Orton 2000: 51–53). Therefore, according to Orton and other authors of archaeological method literature (e.g. Bowman and Wilson 2009; Renfrew and Bahn 2008; Shennan 1997), the more accurate measure of comparison is the weight of a found artifact type rather than the number of found sherds. Although Orton, for example, referred to using weight as a comparative variable for a single artifact group, using weight and percentage of total material to report the amounts of different found materials has become a widely adopted way of presenting archaeological material assemblages (see e.g., Clarkson et al. 2015: 54; Oikarinen 2012: 32, 37; Ossowski and Badura 2014: 241; Sarajärvi 2012: 37).

However, especially in the European late medieval and early modern periods, the number of different commodities grew rapidly. In these later periods, a wide range of different artifact types became available, each with different qualities in size, breakage rate, and density. All scholars have an intuitive feel that a sherd of stoneware pottery weighs more than a sherd of glass of the same size, and it is common knowledge that metals are denser substances than ceramics. For elements like iron, the densities are widely available in the literature (Rumble 2017), but for ceramic and glass artifacts and other man-made objects, especially those made in the medieval and post-medieval periods, this is not so. Therefore, the questions arise as to how much these artifact groups differ from one another in their material densities and whether the difference is significant enough to affect the results of quantitative analyses. If it does affect the analysis, the question then is: how can it be standardized?

These questions are more relevant than ever as the datasets are growing larger (Gutmann et al. 2018; Villazon-Terrazas et al. 2015), as new quantitative tools are being adopted (see e.g., Brughmans 2013) and with the ever increasing computing power, even in archaeology. Larger datasets also mean more sources of error in which case accuracy and therefore reliability become an important aspect to consider (Binford 1987: 393).

In this paper, I explore the answers to these questions in the form of statistical analysis (Drennan 2009; Earl et al. 2013; Field 2009; Shennan 1997; VanPool and Leonard 2010). The densities of different material groups were measured and analyzed to see which of the artifact groups differed statistically enough to have an effect on the comparative studies. I will also present a simple corrective solution to the aforementioned problem: a density derived corrective key ratio that can be applied retrospectively, thus eliminating the need for additional measurements for older studies.

Interestingly enough, this topic seems to have instigated no previous discussion. Material densities have been measured previously (e.g., for redware, Elson and Craig 1992: 11–14; in osteology, Currey 1984; Stein 1989; Wall 1983), but not as a problem of artifact group comparison. The reason for this is unclear; however, it might be related to the fact that researchers often specialize in a specific artifact group, and therefore the challenges of intercomparability have not been relevant. Moreover, as a number of artifact types were manufactured from mixed material, the densities of several artifact types are currently unknown. In this study I will also present the results of these measurements.

This paper is related to my doctoral study about the development of medieval trade in the Baltic Sea area and how the changes that took place emerge in the material culture. As it is a big data (for definition, see Boyd and Crawford 2012; Gutmann et al. 2018) and quantitatively oriented study, it is important to minimize the error caused by the density of the material, in order to avoid additional errors when the data are analyzed with statistical and other tools. In this study, I present a short case study to highlight the benefits of density correction. In this case, importing consists of all the material brought from overseas to Finland, including goods from continental Sweden.

Material

A 100-piece sample was selected to be measured from each artifact group for which the densities were not available in the literature. These were redware, stoneware, clay pipes (white clay), dish glass, and plate glass. Thus, a total of 500 finds were initially selected to be measured.

The samples were selected from the 2001 excavations of Rettiginrinne (TMM 22196) conducted by the Provincial Museum of Turku (now the Museum Centre of Turku). Rettiginrinne is located in the heart of Turku, a town in southwest Finland founded around 1300 CE (Saloranta and Seppänen 2002). The city has a rich medieval and post-medieval material composition. The measurements were made at the Museum Centre of Turku using the displacement method. In the displacement method, the measured object is immersed in a graduated cylinder partly filled with water, and the difference between the water levels before and after immersion is recorded as the volume of the object (Rapp 2002: 21). From this reading the density was calculated as follows:

$$ p=\frac{m}{V} $$

where p is density, m mass, and V volume. Four different-sized graduated cylinders were used: 500 ml, 250 ml, 100 ml, and 50 ml. The cylinders have a stated margin of error at 20 °C, but in this case it did not affect the measurements, as the readings were taken before and after the immersion. The 500 ml cylinder had marked lines every 5 ml; the 250 ml and 100 ml cylinders every 2 ml; and the 50 ml cylinder every 1 ml. The measurements were accordingly taken to the nearest half of a marked interval (2.5 ml, 1 ml, and 0.5 ml).
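The calculation can be sketched in a few lines of Python. This is only an illustration of the formula above, not the procedure used in the study: the function names, the example sherd, and the simple model of reading uncertainty (each reading off by at most half the reading resolution) are assumptions introduced here.

```python
def density_from_displacement(mass_g, reading_before_ml, reading_after_ml):
    """Density p = m / V, with V taken as the rise of the water level."""
    displaced_ml = reading_after_ml - reading_before_ml
    if displaced_ml <= 0:
        raise ValueError("the reading after immersion must be higher than before")
    return mass_g / displaced_ml


def density_range(mass_g, reading_before_ml, reading_after_ml, resolution_ml):
    """Lowest and highest density compatible with the reading resolution.

    Assumption (not from the paper): each of the two readings can be off by
    half the resolution, so the displaced volume can be off by one full step.
    """
    displaced_ml = reading_after_ml - reading_before_ml
    if displaced_ml <= resolution_ml:
        raise ValueError("object too small for this cylinder's resolution")
    lowest = mass_g / (displaced_ml + resolution_ml)
    highest = mass_g / (displaced_ml - resolution_ml)
    return lowest, highest


# Hypothetical sherd: 14.8 g, water level rising from 30.0 ml to 40.0 ml
# in the 50 ml cylinder, where readings were taken every 0.5 ml.
density = density_from_displacement(14.8, 30.0, 40.0)
low, high = density_range(14.8, 30.0, 40.0, resolution_ml=0.5)
print(f"density {density:.2f} g/ml, plausible range {low:.2f} to {high:.2f} g/ml")
```

The range widens quickly as the displaced volume approaches the reading resolution, which is one way to see why the smallest finds could not be measured accurately.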

Water can initiate a disintegration process in conserved finds (in this case glass finds) (Davison 2011: 269). Therefore, the glass finds were covered with a thin layer of plastic before the immersion. It is hard to evaluate the margin of error of using plastic wrapping and therefore all finds were protected with the plastic in order to keep the error constant. However, there is a reason to believe that the error caused by this procedure is minimal as the plastic wrapping was left open from the top to allow air to escape from the pressure of the water. The use of a plastic wrapping, the scale of measurement of the graduated cylinders as well as the sample sizes formed the total margin of error.

The sizes of the graduated cylinders affected which finds were possible to measure. From the original 100 pieces per artifact group, the samples in Table 1 were the right size/shape to be measured. In the cases of glass finds, some finds were too small to be accurately measured.

Table 1 The n, means and standard deviations for measured artifact/material groups. For values acquired from the literature with a range, the mean is used in calculations. For clay pipes and plate glass, medians are also presented as the measures were not normally distributed

The sample sizes of both glass find categories are not ideal, but considering that the difference between redware and clay pipes was the only one statistically significant from the measured finds, there was no need to increase the number of measurements. Additionally, the dish glass sample was normally distributed.

The confidence interval, or the margin of error, with 95% confidence level for the study was ±5.38%. This does not include the aforementioned factors also affecting the margin of error as they cannot be calculated precisely (Fig. 1).

Fig. 1

The study material and the setup for the measurements

The densities are presented here in g/ml, which is equal to g/cm3 (1 g/ml corresponds to 1000 kg/m3). The resultant means are also presented in Table 1.

As can be seen from Fig. 2 and also from Table 1, the deviations in the measurements vary. Redware especially has the majority of its measurements close to the mean value, demonstrating little deviation in the measurements. This indicates that the mixes for redware pottery clay have stayed relatively constant density-wise, as the material is homogeneous throughout the European late Middle Ages and in post-medieval times. For example, it seems that the amount, or proportion, of materials (for example, sand) in the mix has not varied significantly, although varying it is a common procedure when modifying the properties of the clay mixture (Pihlman 1995). In addition, clay pipe makers used a mixture of their own (Ainasoja 2001: 38), but the narrow deviation indicates that the clay materials in clay pipes also had very similar densities.

Fig. 2

Boxplot graph of the densities of the various common materials found from an archaeological context. There is no deviation in groups iron–to–schist as the values were collected from the literature

The other measured groups have more deviation and outliers, but overall the deviations indicate that the measurements have been consistent and therefore reliable.

The densities of the material groups gathered from the literature were iron, copper, lead, amber, limestone, flint, sandstone, and schist (Rumble 2017: 12–203, 15–43), all available in the Rettiginrinne excavations. Two other archaeologically important material groups had to be excluded: leather and bones. Leather proved to be too flexible and fragile to be measured with the tools used for the other material groups. Although the average density of leather is available from the literature (Rumble 2017: 15–43), there is too much uncertainty about the effects of different tanning methods on the density. Additionally, the species from which the leather is obtained may affect the density. Bones have a similar type of problem. As a product of an organic, living animal (or human), the environment in which the individual has grown up, as well as diseases (e.g., osteoporosis) have too great an effect on the density of the bone. Moreover, other research also indicates that the densities vary between species (Lyman 1994: 239). Gold and silver were also excluded from this study, as they are usually found in very small quantities.

This study does not take dating into account, even though the manufacturing methods may have varied at different times, and thus may have had an effect on density. However, this was acknowledged and taken into account when selecting the samples. Therefore, the sample in this study represents the average value of density over time per material group. Otherwise the samples were drawn randomly; style or decorations did not affect the sampling process.

Results

As statistical tests, one-way ANOVA and the Mann-Whitney U-test were used to assess the differences between the samples. In short, ANOVA compares the means of two or more samples using the F-distribution (VanPool and Leonard 2010: 153). The U-test is a non-parametric alternative that compares two samples at a time. The non-parametric test is somewhat less powerful than the parametric one, but it does not depend on the symmetry of the distribution (Mann and Whitney 1947). Altogether, the densities of five different artifact group samples with densities that were not mentioned in previous literature were measured, and the densities of 10 other materials commonly found in European excavations were collected from the literature (see Table 1).

The data were analyzed using SPSS 23 software (Field 2009; Meyers et al. 2013). First, the five measured groups were analyzed. The test for normality revealed that the distributions of the clay pipe and plate glass samples were skewed and could not be corrected by transformations. This, therefore, had an effect on choosing the correct statistical tests. First, ANOVA revealed that there was a significant difference within the pairs redware-plate glass and redware-clay pipes. However, clay pipes and plate glass both had a skewed distribution, which required further analysis. A non-parametric Mann-Whitney U-test with Bonferroni correction was used for these two pairs, which showed that the difference in density in the pair redware-plate glass was not, in fact, significant (p = 0.148). For the pair redware-clay pipes the difference was still significant (p = 0.007).
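For readers without access to SPSS, the logic of the non-parametric step can be sketched with the Python standard library. This is a minimal illustration, not a reproduction of the SPSS procedure: it uses the normal approximation without a tie correction, and the density values below are hypothetical, not the measurements of this study.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # ranks are 1-based
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def mann_whitney_u(sample_a, sample_b):
    """Two-sided Mann-Whitney U test (normal approximation, no tie correction)."""
    n_a, n_b = len(sample_a), len(sample_b)
    ranks = average_ranks(list(sample_a) + list(sample_b))
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    u = min(u_a, n_a * n_b - u_a)
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - mean_u) / sd_u  # never positive, because u is the smaller U
    return u, min(1.0, 2 * NormalDist().cdf(z))


def bonferroni(p_value, number_of_comparisons):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * number_of_comparisons)


# Hypothetical densities (g/ml) for two groups; two pairwise comparisons assumed.
redware = [1.38, 1.40, 1.41, 1.42, 1.43, 1.44]
clay_pipes = [1.45, 1.47, 1.50, 1.52, 1.55, 1.60]
u_value, p_raw = mann_whitney_u(redware, clay_pipes)
print(u_value, p_raw, bonferroni(p_raw, number_of_comparisons=2))
```

With samples as small as in this toy example the normal approximation is rough; for real data an exact or tie-corrected implementation from a statistics package would be the safer choice.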

This was continued by further comparative analysis with the densities of the artifact groups acquired from the literature. The results are presented in Table 2. As can be seen, iron, copper, and lead had a significant difference in comparison to all the other material groups, as had been expected. Of the stones, flint had a significant difference compared to redware (p = 0.014), stoneware (p = 0.014), dish glass (p = 0.011), amber (p = 0.019), and, although narrowly, also clay pipes (p = 0.048). Schist had significant results (p between <0.001 and 0.002) when compared to all the other artifact groups. Surprisingly, limestone and sandstone did not exhibit a significant difference in comparison with any of the material groups measured.

Table 2 The comparison matrix and the results of the statistical analysis of how the densities vary between different artifact material groups

Overall, the results of the statistical analysis support the hypothesis of density having an effect on the comparison between different materials (in addition to the obvious difference to metals and some of the stones), and therefore the use of corrective key ratios. Consequently, the answer to the question “are the differences significant enough that they affect the results of quantitative analyses?” is yes.

Formulating the Key Ratios

The most effective way to account for the error caused by material density is to standardize it in the form of key ratios for each artifact group. The key ratios were formulated based on the measurements. Key ratios are preferable to simply dividing the weight of an artifact group by the density of the substance it is made of. Although dividing by density is mathematically and statistically viable, it distorts the apparent volume of the assemblage and makes the result counterintuitive. Relative values do not change whether key ratio correction or “dividing by density” is used, but for absolute values (also referred to as real values or par values) of material composition the key ratio correction is generally less distortive. In Fig. 3, the key ratio corrected values do not change significantly (excluding metals) compared to the actual values, whereas the values corrected by dividing by the material density alone are further away from the par values. It is important to remember that the aim of the correction is to remove the bias caused by the different material densities, not to create values that float without any connection to the par values (see Fig. 3), similar to the concept of floating exchange rates in economics (MacDonald 1988).
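A short derivation may make the claim about relative values concrete. The symbols are introduced here for illustration only: m_x is the par weight of material group x, p_x its density, and p_ref the density of the reference group whose key ratio is 1 (stoneware in this paper). With the key ratio taken as p_ref divided by p_x, the key ratio corrected value is the divided-by-density value multiplied by a constant:

$$ m_x\cdot \frac{p_{ref}}{p_x}={p}_{ref}\cdot \frac{m_x}{p_x} $$

The constant cancels when the share of group x in the whole assemblage is calculated:

$$ \frac{p_{ref}\,{m}_x/{p}_x}{\sum_y{p}_{ref}\,{m}_y/{p}_y}=\frac{m_x/{p}_x}{\sum_y{m}_y/{p}_y} $$

Both corrections therefore give identical proportions, but only the key ratio version leaves the reference group at its par value and keeps the other groups on a comparable scale.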

Fig. 3

Comparison of material found at the Turku Rettiginrinne plot excavations. Actual quantity, key ratio corrected and plain divided by density correction. The attempt to correct by dividing with density distorts the quantity of all material groups compared to the actual quantity; the key ratio correction adjusts the difference and thus makes it possible to compare the actual quantities and key ratio corrected values with each other within the margin of error

After calculating the total average density of the measured material groups, the key ratio for stoneware was set to 1, as its measured average density (1.48 g/ml) was the closest to the total average (1.51 g/ml). From there, the other key ratios were calculated by the following equation:

$$ \text{Selected material group key ratio}=\frac{\text{Stoneware density}}{\text{Selected material group density}} $$

The results are presented in Table 3 where the corrective key ratio has been applied to negate the effect of material density by multiplying the sum weight of found material by the selected key ratio.

Table 3 The calculated key ratios for different material groups. The key ratio for stoneware was selected as 1, therefore the par value and DCV of stoneware are identical. The key ratios for clay pipes and plate glass were calculated from the median to take into account the skewness of the data

$$ \text{Density corrected weight of material } x=\text{material weight }\left(x,\ \text{par value}\right)\times \text{key ratio }\left(x\right) $$
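The two equations can be combined into a small helper, sketched below. The function names are hypothetical, and the iron density of 7.87 is a standard handbook figure used here as an assumption; the key ratios published in Table 3 may differ slightly through rounding. Any density unit works as long as the same unit is used for both materials, because the key ratio is dimensionless.

```python
# Reference density: the measured stoneware mean of 1.48 reported in the text.
STONEWARE_DENSITY = 1.48


def key_ratio(material_density, reference_density=STONEWARE_DENSITY):
    """Key ratio = reference (stoneware) density / material density."""
    if material_density <= 0:
        raise ValueError("density must be positive")
    return reference_density / material_density


def density_corrected_weight(par_weight, material_density,
                             reference_density=STONEWARE_DENSITY):
    """DCV = par weight multiplied by the key ratio of the material."""
    return par_weight * key_ratio(material_density, reference_density)


# Assumed handbook density for iron, in the same unit as the stoneware value.
IRON_DENSITY = 7.87

print(key_ratio(STONEWARE_DENSITY))                 # the reference group stays at 1
print(round(key_ratio(IRON_DENSITY), 3))            # iron is scaled down strongly
print(round(density_corrected_weight(54.0, IRON_DENSITY), 1))  # 54 kg of iron as DCV
```

Because the ratios depend only on the densities, the same helper can be applied retrospectively to weights taken from older find catalogues.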

Once again, the statistical tests showed that density has an effect on the results for iron, copper, lead, schist, flint, clay pipes, and redware. Depending on which material groups are compared to each other, the corrective key ratio needs to be applied. Ideally, the conversion is made for all material groups to minimize the effect of the conversion and presented as, for example: weight (g, DCV [density corrected value]). The usage of the key ratio correction is not necessary when only comparing artifact groups that have no statistical difference in densities. In these cases, real weight can still be used as a comparative figure.

Case-Study: Analyzing the Import Volume to Visby, 1300–1800

To further illustrate the benefit of using a key ratio for density correction, I have analyzed the volume of imports to Visby between the years 1300 and 1800. Visby’s commercial history is one of the few cases in European economic history where a large town, in a time of relative economic growth, declined gradually (Haase and Ström 2004: 26). Visby was an old Viking-age outpost originating from the eleventh century. It is located on the island of Gotland in the Baltic Sea, near the Swedish coastline. Its early commercial success relied on its central location in the Baltic Sea as a commercial gateway for acquiring Novgorodian goods, as well as on its position along the route to Stockholm (Dollinger 1971: 7, 67; Westholm 1999: 526; Zerpe 2006: 567). This led German as well as many other merchants to migrate to Visby, making the town prosper. From early on the German merchants in the town had full Hanseatic privileges, meaning that Visby’s merchants were entitled to exemptions from paying import tolls in other ports of Europe and to other privileges (Palais 1959: 855; Wubs-Mrozewicz 2013: 6).

However, the decline began when a series of events confronted the town. First, the Danish king Valdemar IV conquered the island of Gotland, which had been a Swedish territory, in 1361, and at the end of the century the Victual Brothers captured the island (Dollinger 1971: 79; Harreld 2015: 76, 141; Satora 2012: 76; Zimmern 1889: 54–55). It was also a major blow to the town’s commercial life when the unpopular king Eric of Pomerania constructed and settled in Visborg castle in 1411 and remained there for 12 years; during this time the town’s commerce halted due to unfairly high taxes (Westholm 1999: 526).

However, the unfortunate events that faced Visby were not the only factor that eventually led Visby to lose its Hanseatic status in 1476 (Westholm 1999: 526). Lübeck, Danzig, and other Hanseatic towns along the German and Polish coastline were growing faster than Visby. In addition and in general, the geographical balance of trade had shifted in the Baltic Sea which made Gotland more obsolete (Zerpe 2006: 567). It was not until the eighteenth century that Gotland began to prosper again (Friedman and Figg 2017: 585).

The reason Visby is a good example for this analysis is not, however, its decline, but the fact that almost all goods and materials were imported, excluding limestone and sandstone, which are found abundantly in Gotland (Friedman and Figg 2017: 584; Zerpe 2006: 568–69). For example, iron, whether as slag iron or finished goods, was imported from mainland Sweden (Zerpe 2006: 574). This allows us to analyze almost all the archaeologically preserved material to show how imports developed during the late Middle Ages and early modern eras. When studying the development of volumes of imports, the use of a key ratio becomes beneficial.

In this analysis I have focused on the material from Kv. Säcken 7 – a plot which was originally excavated in 1974–75 (Elmshorn 2014). The excavation report and find catalogue are published as open access and are available in the Samla database (http://samla.raa.se). It is one of the many large-scale excavations organized in Visby in the twentieth century and is located in the old town quarters. There has been a house, a garden, and a large well on the plot from the eleventh century onwards, although neither the function nor the owner of the house is known (Elmshorn 2014). There is a possibility that the house might have been a place for a market.

Overall, a little over 164 kg of material identified as imported was excavated and documented. For example, 63 kg of this was redware, 54 kg iron, and 13 kg stoneware by par volumes (Fig. 4, Table 4). If we look at the material composition of the imports, ~38% of the imports (that are archaeologically preservable) were redware, ~33% were iron, and ~8.5% were stoneware.

Fig. 4

Kv. Säcken 7 material composition as a diagram before and after the use of key ratio correction

Table 4 The finds from Kv. Säcken 7 excavations in Visby categorized by the material by par volumes as well as the density corrected volume

However, because the material density affects the values, these values give us limited information about which product category was the most imported. After applying the density correction, it is revealed that the volume of iron imported to Visby, in particular, is not very significant. Moreover, by looking at the density corrected volumes, it is revealed that ~54% of the imports were actually redware. The proportion of iron in Visby’s imports is only ~8%, which is a little less than that of stoneware (11.4%). Consequently, the whole quantitative picture of the imports to Visby changes when the effect of material density is eliminated.

In Table 4, I have also calculated the change in proportion of material composition for each material group. This reveals that the proportion of redware increases by almost 16 percentage points and the proportion of iron decreases by almost 25 percentage points. Although most of the changes are less dramatic, the calculations also reveal that some other material categories become more important: the share of bricks increases by almost 5 percentage points and the share of stoneware by almost 3 percentage points. The changes are large enough to affect the results if these material categories are analyzed with statistical models that rely on material composition as a whole, for example cluster analysis or principal component analysis.
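The change in composition can be recomputed from the shares quoted in the text. The sketch below uses only the approximate, rounded percentages given above for three groups (the full figures are in Table 4), so its results are approximate as well; the function names are introduced here for illustration.

```python
def composition(weights):
    """Share of each material group in percent of the total weight."""
    total = sum(weights.values())
    return {name: 100.0 * weight / total for name, weight in weights.items()}


def share_change(before, after):
    """Change of each share in percentage points (after minus before)."""
    return {name: round(after[name] - before[name], 1) for name in before}


# Approximate shares quoted in the text (percent of imported material).
par_share = {"redware": 38.0, "iron": 33.0, "stoneware": 8.5}
dcv_share = {"redware": 54.0, "iron": 8.0, "stoneware": 11.4}

for name, change in share_change(par_share, dcv_share).items():
    print(f"{name}: {change:+.1f} percentage points")
```

The `composition` helper is included for completeness: given a dictionary of par weights or density corrected weights per material group, it returns the shares that `share_change` compares.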

Another example demonstrating how DCV helps to evaluate the differences in imports is to compare the change in volumes over time. In Fig. 5, I have made a closer inspection of the volumes of redware and iron imports to Visby over time. Another pair could have been chosen, but this pair shows particularly clearly how the use of DCV clarifies the information on material imports. On the left, the uncorrected volumes by weight can be seen, and on the right the density corrected volumes. From the uncorrected graph, it could be concluded, as in Fig. 4, that redware and iron were imported in similar volumes, which is not the case. By eliminating the material density from the comparison, it can be seen that redware was actually imported more than iron. Furthermore, it is worth noting that, according to the DCV, the redware import was more volatile than the iron import (note that the margin of error has not been accounted for in the volatility calculations). Volatility is a measure of the rate at which a variable increases or decreases over time (Black et al. 2013). If volatility is calculated from par values, it could be concluded that the volatility of imports for both material groups is similar, even though that is not the case (Table 5). Again, this is a false impression created by the material density.
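The text does not state which formula was used for the volatility figures in Table 5. As a hedged illustration only, the sketch below assumes volatility is measured as the standard deviation of the absolute period-to-period changes; under that assumption, multiplying a series by its key ratio scales its volatility by the same factor, which is how a dense material such as iron can look as volatile as redware in par values but not in DCV. The series and the key ratio of 0.19 are hypothetical.

```python
from statistics import pstdev


def volatility(series):
    """Standard deviation of absolute period-to-period changes (assumed measure)."""
    changes = [later - earlier for earlier, later in zip(series, series[1:])]
    return pstdev(changes)


def apply_key_ratio(series, ratio):
    """Convert a series of par weights to density corrected values."""
    return [value * ratio for value in series]


# Hypothetical par weights (kg) per period; not data from the case study.
iron_par = [12.0, 9.0, 14.0, 8.0, 11.0]
iron_dcv = apply_key_ratio(iron_par, 0.19)  # 0.19 is an illustrative key ratio

print(volatility(iron_par), volatility(iron_dcv))
```

A relative measure, such as the standard deviation of percentage changes, would be unaffected by the key ratio, so the choice of measure matters when comparing volatility across material groups.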

Fig. 5

The development of redware and iron imports to Visby from 1300 to 1800. Comparisons of non-corrected and density-corrected volumes. For density corrected volumes, a linear regression is added

Table 5 Volatility of par volumes and density corrected volumes of import

Another interesting aspect can be seen in Fig. 5. The import of redware declines quite heavily, whereas the import of iron seems to be quite stable, which indicates that the usage of pottery was declining but the use of iron was not. Between 1300 and 1800, Visby lost approximately one third of its population (Bairoch 1991: 156), which explains the decline in the consumption of pottery. It is possible that the import of iron remained stable during the decline because it was used for weapons and other utensils. If it is assumed that during this period the population also became less wealthy, imported pottery was perhaps the first to be replaced, to some extent, with goods made from cheaper material, perhaps wood.

The DCV does not affect the relationship within a single material (similar to logarithmic or exponential conversion in statistics). The trend shown by the linear regression line is the same for corrected and non-corrected values; therefore, the linear regression is drawn only on the corrected graph.

Although we know intuitively that iron is a denser material (which has been the argument against using the density correction method presented in this paper), we still need a variable that is quantified and intercomparable. When working with find catalogues, this leaves two options: the weight or the number of sherds, of which the second is an unreliable measure, as explained earlier. However, as can be seen from Fig. 5, the information about the differences in imports is only visible if the material density has also been accounted for.

The data sample for this analysis is not comprehensive and therefore no final conclusion can be made about the nature of the imports to Visby and how they change over time. It still does, nevertheless, give an interesting insight on how the import business of different commercial goods seems to have developed. This analysis focused mainly on the intercomparability of two material categories as the differences between other material groups were smaller and therefore not as clearly visible. Additionally, the purpose of this analysis was to prove the benefit of the method presented in this paper. It does not mean that other interesting differences could not be detected.

Discussion

The challenge of historical quantitative studies has always been how to reliably take random effect variables into account. In some cases, randomizing the sample is the answer; sometimes this can be done by utilizing advanced statistical models such as the generalized linear mixed model (GLMM), which is a regression model that is able to take random components into account (Agresti 2015: 2); and in other cases an effective way is to isolate the phenomenon the researcher wants to measure from the real world. One example of the latter is to use indices, indicators, and key ratios, which eliminate the variable that causes the random or unwanted effect.

One well-known variable is the population and the use of per capita values. Nations and towns, modern or historical, come in various shapes and sizes. The volume of consumption is highly dependent on the population: a growing population leads to increased trade as the traders endeavor to meet all the needs of the population. This can also be seen in the archaeological source material. The larger the population of the town, the richer the archaeological context. However, a study that compares towns with populations of 50,000 and 1,000 by the par volumes of material is ineffective and needs to be corrected by using per capita values, so that the population does not affect the results. In modern economics, per capita values are often used, for example, to compare nations by their GDP (Gross Domestic Product) (Le Gallo and Ertur 2003; Lluís Carrion-i-Silvestre et al. 2005).

The same standardization has to be made for material densities. Although the effect of material density is not as notable as the effect of population, this paper shows that the differences are large enough to affect the analyses. As in all the sciences, archaeological analyses are increasingly focusing on more accurate analyses and finer details concerning the general macroscale observations previous generations of scholars have established in the past. If random effects like material density are not accounted for, we risk having a distorted (albeit slight) view of history.

The aim of this study was to show that by removing the effect of material density from the analysis, more precise questions can be asked, especially about trade. It should be noted, however, that the use of key ratio/DCV is justified in individual cases only, and researchers should always consider the benefits. This paper is not a recommendation to use the method automatically when presenting volumes of archaeological material. For cases where there is a risk that the material density may affect the results of the analysis, this paper provides the conversion key ratios, as well as the densities of different materials usually found in archaeological contexts.

Although the method presented here is an index-like ratio, which is less intuitive for some, the additional benefit of this method is that density corrected values can be compared to non-corrected values within the margins of error, something that is not possible with other convertive methods.

Overall, as has been shown, quantitative studies of imports through par volumes are problematic. Intercomparability in particular is an aspect that has been neglected so far. The case study shows that using DCV, when analyzing the volumes of imports to a certain town and how popular different goods have been, opens possibilities for new discussion. Although there were also other differences besides the pair redware-iron, these differences were too small to be visually examined in this case study. However, the possibility that these differences may affect statistical analyses remains. This is highly dependent on what kind of analyses and what kind of study material the researcher is using, and therefore it is not possible to evaluate reliably how small differences will have an effect.

Conclusion

This study concludes that there is a difference in material densities, both in artifacts made from solid or mixed materials, and that the difference is significant enough that it might have an effect on quantitative analyses. More interestingly, this study shows that there was a significant difference between redware and clay pipe material densities.

As this paper reveals, there are significant differences between materials in an archaeological context, and therefore it is recommended in the future to take into account the density of the material in the analyses. By using the density derived key ratios presented in this paper this conversion is easily utilized. Moreover, it especially gives additional accuracy to statistical analyses, by providing more precise results in cases where the volumes of material or artifact categories are compared. For example, in a case of evaluating the volume of imports to a particular town, presented in this study, the use of DCV proved its benefit. The denser objects were no longer over-emphasized in the data, giving a more accurate picture and helping to measure the actual volume of trade.

However, it is clear that the number of cases where the use of this method will bring benefit to the study is limited; most likely they will be related to highly specific quantitative perspectives and research questions, possibly in the fields of economic history and archaeology. As regards a wider interest, this study does provide the material densities of different medieval and modern era artifact groups made of heterogeneous mixtures; these have not been previously measured and published.