1 Introduction

Thermal conductivity is an important material property for many applications and processing procedures. There are various methods for its determination, such as the guarded hot plate apparatus [1] or modulated temperature differential scanning calorimetry [2], which differ in terms of experimental effort and complexity of the required apparatus. Hakvoort, van Reijen and Aartsen [3] proposed a method using a conventional Differential Scanning Calorimetry (DSC) setup, which was further refined by Riesen [4] and Monkiewitsch [5] and finally implemented in a DIN EN ISO standard [6]. It is suitable for materials with a thermal conductivity of up to 1 W m−1 K−1. Its most important advantages are the broad availability of DSC setups in many laboratories, since DSC is a widespread measuring method for other thermal properties, the comparatively short measurement time, and the fact that the thermal conductivity is measured directly rather than calculated from the specific heat capacity \({c}_{p}\). Nevertheless, \({c}_{p}\) and other thermal properties can still be measured alongside, so that derived quantities such as the thermal effusivity can easily be calculated. In addition, the material consumption is low and the specimens have a simple geometry [4,5,6,7].

For any use case, the accuracy of the material parameters used as input for data evaluation or computation is of utmost importance. The thermal conductivity is often part of combined quantities such as the thermal effusivity \(e\), a key parameter in infrared thermography (IRT) as a nondestructive testing method [8] (a comprehensive introduction to IRT in general can be found in [9]), in temperature control during the processing of composite materials [10], and in numerical simulation of thermal processes [11]. It is given by

$$e= \sqrt{\lambda \rho {c}_{p}}$$
(1)

where \(\lambda\) is the thermal conductivity, \(\rho\) the density and \({c}_{p}\) the specific heat capacity.
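As a small worked example of Eq. (1), the following sketch computes the thermal effusivity from typical literature-style values for PMMA, the material used later in this study; the numerical values are assumptions for illustration only and are not results of this work.

```python
import math

# Illustrative literature-style values for PMMA (assumptions, not measured here)
lam = 0.19      # thermal conductivity in W m^-1 K^-1
rho = 1190.0    # density in kg m^-3
c_p = 1470.0    # specific heat capacity in J kg^-1 K^-1

# Eq. (1): thermal effusivity e = sqrt(lambda * rho * c_p)
e = math.sqrt(lam * rho * c_p)
print(f"thermal effusivity e = {e:.0f} W s^0.5 m^-2 K^-1")  # roughly 577
```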

The main objective of this paper is to provide a statistically valid estimate of the achievable accuracy of the method proposed by Hakvoort, depending on the specimen dimensions and surface roughness, as neither the publications [3,4,5] nor the DIN EN ISO standard [6] provide reliable information on this. The second objective is to describe a procedure that reduces the experimental effort for repeated measurements without sacrificing accuracy and that can also be adapted to other experimental techniques.

Therefore, we investigate the influence of a reduced number of samples on the measurement uncertainty, using statistical tools to broaden the information gained from a measurement series. Our considerations are based on the law of large numbers (LLN) and on techniques used in statistical methods such as Monte Carlo or Markov chain Monte Carlo (MCMC) algorithms [12]. With this we examine how the maximum percentage error depends on the number of samples. Furthermore, we define an “inflection point” as a compromise between expenditure of time and accuracy with regard to automation. Additionally, we show the non-negligible influence of the real contact area on the testing method according to Hakvoort.

2 Materials and methods

2.1 Measurement of thermal conductivity using DSC

The specimen is positioned between the heating plate of the DSC sensor and a crucible filled with an auxiliary material with a defined melting temperature, while on the reference sensor of the DSC an identical crucible with the same auxiliary material is placed directly on the heating plate. The auxiliary material keeps the temperature at the top of the specimen constant during melting and thus creates a defined temperature gradient within the specimen. The melting temperature of this material also determines the temperature at which the thermal conductivity is measured. The sensors continuously measure the heat flux into the specimen and the reference as well as the temperatures of both. Melting causes a change in the heat flux without any change in temperature. Compared to the reference, the melting on the specimen side is delayed due to the thermal resistance of the specimen, which is indicated as a peak in the heat flux-temperature diagram. The thermal conductivity \(\lambda\) can be calculated directly from the slopes of the heat-flux curves over temperature for the specimen \({S}_{p}\) and the reference \({S}_{r}\) according to

$$\lambda = \frac{h}{A*(\frac{1}{{S}_{r}}-\frac{1}{{S}_{p}})}$$
(2)

where \(h\) represents the height and \(A=\pi *{r}^{2}\) is the contact area of a circular specimen with radius \(r\). However, this only applies to specimens with a perfectly smooth surface. In reality, the surface is rough and is in contact with the sensor only at the roughness peaks. The heat flux is therefore limited to the real area of contact, since the air acts as an insulator at the points where the surfaces do not touch each other. For this reason, we considered the real contact area according to [13]. The real area of contact is given by

$${A}_{real}= \pi *\eta *A*\beta *\sigma *{F}_{1}(h)$$
(3)

where \(\eta\) is the peak density, \(A\) the nominal area, \(\beta\) the peak radius, \(\sigma\) the standard deviation of the height distribution, and \({F}_{1}(h)\) the distribution function of the height profile. These parameters were determined according to [14] and used for the calculation. Since they depend on the height profile, which in turn depends on the surface roughness \({R}_{a}\), the real contact area also depends on \({R}_{a}\), among other factors.
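The following sketch illustrates how Eqs. (2) and (3) enter the evaluation. It is not the evaluation software used in this work: all numerical inputs are placeholders, and instead of evaluating the roughness model with invented parameters, the example reuses an area ratio of the order reported later in Sect. 3.

```python
import math

def thermal_conductivity(h, A, S_r, S_p):
    """Eq. (2): thermal conductivity from specimen height h, contact area A and
    the heat-flux/temperature slopes of the reference (S_r) and specimen (S_p)."""
    return h / (A * (1.0 / S_r - 1.0 / S_p))

def real_contact_area(A_nom, eta, beta, sigma, F1):
    """Eq. (3): real contact area from peak density eta, peak radius beta, standard
    deviation sigma of the height distribution and the distribution-function value F1."""
    return math.pi * eta * A_nom * beta * sigma * F1

# --- placeholder inputs, for illustration only ---
h = 2.0e-3                   # specimen height in m
r = 2.5e-3                   # specimen radius in m (assumed, not from the paper)
A_nom = math.pi * r**2       # nominal contact area, A = pi * r^2
S_r, S_p = 1.5e-3, 2.0e-3    # slopes in W/K, inserted into Eq. (2) exactly as written

lam_nominal = thermal_conductivity(h, A_nom, S_r, S_p)

# Instead of evaluating Eq. (3) with made-up roughness parameters, reuse an area
# ratio c(Ra) = A_real / A_nom of the magnitude reported later in Sect. 3 (~0.77).
A_real = 0.77 * A_nom
lam_real = thermal_conductivity(h, A_real, S_r, S_p)

print(f"lambda (nominal area): {lam_nominal:.3f} W/(m K)")
print(f"lambda (real area)   : {lam_real:.3f} W/(m K)")
```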

2.2 Specimens used

Since the experiments for this investigation were performed before the publication of the DIN standard, our setup is based on the work of [4] and [5] and differs from the DIN standard in a few details (see Table 1). Most important are the specimen height, the surface treatment and the conditioning; the first two were varied in our study in order to investigate their influence on the result. The specimen height ranges from 1.0 to 5.0 mm and the surfaces were finished with abrasive papers of grit sizes from P600 to P4000, depending on the series of measurements. All specimens were made from PMMA. The raw material and the specimens were kept in a fully air-conditioned laboratory room to keep the variations in temperature and humidity as small as possible.

Table 1 Comparison of the different implementations of the thermal conductivity measurement with DSC

2.3 Experimental setup

All measurements of the thermal conductivity have been performed using a DSC821e from Mettler Toledo, Germany. Gallium was used as the auxiliary material. The heating rate was set to \(0.5\mathrm{ K }{\mathrm{min}}^{-1}\) around the melting temperature of gallium, from \(26\) to \(38 ^\circ \mathrm{C}\). The flow rate of the inert purge gas was \(30 \mathrm{ml }{\mathrm{min}}^{-1}\). The surface roughness was measured using the FRT MicroProf 100 from FormFactor, Germany. For this purpose, 6 line measurements were made per specimen, distributed over the surface of the specimen, three in the 0° and three in the 90° direction. Each line measurement had a length of 2 mm and consisted of roughly 1000 points.

2.4 Data evaluation

As the most frequently used distributions, such as the Gaussian normal distribution or Student's t-distribution, could not describe our data, we found that our experimental data for the thermal conductivity can be described sufficiently well with the likewise widely used beta distribution with the probability density

$$f\left(x;p,q\right)= \frac{\Gamma (p+q)}{\Gamma (p)\Gamma (q)}{x}^{p-1}{\left(1-x\right)}^{q-1}$$
(4)

which can be adapted over a wide range with the two shape parameters \(p\) and \(q\), also to asymmetrically distributed data sets. \(\Gamma\) denotes the gamma function. The p–q-values can then be used to calculate the expected value \(\mu\) and the associated standard deviation \(\sigma\) [12]

$$\mu =p/(p+q)$$
(5)
$$\sigma =\sqrt{\frac{p*q}{{((p+q)}^{2})*(p+q+1)}}$$
(6)
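As an illustration of Eqs. (5) and (6), the following sketch fits a standard beta distribution to placeholder data rescaled to the unit interval and then evaluates the expected value and standard deviation from the resulting p–q values. It uses scipy.stats.beta.fit for brevity rather than the curve_fit approach used in Sect. 3; data and seed are assumptions.

```python
import numpy as np
from scipy import stats

def beta_mean_std(p, q):
    """Eqs. (5) and (6): expected value and standard deviation of a standard
    beta distribution with shape parameters p and q."""
    mu = p / (p + q)
    sigma = np.sqrt(p * q / ((p + q) ** 2 * (p + q + 1)))
    return mu, sigma

# Illustrative data: thermal conductivities rescaled to the unit interval,
# e.g. (lam - lam_min) / (lam_max - lam_min); the numbers are placeholders.
rng = np.random.default_rng(0)
data = rng.beta(4.0, 6.0, size=20)

# Fit a standard beta distribution (loc and scale fixed to 0 and 1).
p, q, _, _ = stats.beta.fit(data, floc=0, fscale=1)

mu, sigma = beta_mean_std(p, q)
print(f"p = {p:.2f}, q = {q:.2f}, mu = {mu:.3f}, sigma = {sigma:.3f}")
```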

To extend the theoretical p–q-values with our measurement data, we calculated adjusted p–q-values for the standard beta distribution. These can only be used if a measurement series is available for the considerations. This combines the p–q-values from our theoretical considerations with the results of the measurement for each iteration. For this, we defined a “hit criterion”: if a measured value lies within the range mean ± standard deviation, \(\mu \pm \sigma\), it counts as a hit. The adjusted p–q-values can then be calculated from the number of hits \(t\) and the number of samples \(n\) [15]:

$${p}_{adjusted}=p+t$$
(7)
$${q}_{adjusted}=q+(n-t)$$
(8)
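A minimal sketch of the hit-criterion update of Eqs. (7) and (8); the p–q values, \(\mu\), \(\sigma\) and the measurement series are placeholders.

```python
import numpy as np

def adjusted_pq(p, q, measurements, mu, sigma):
    """Eqs. (7) and (8): update the fitted p-q values with a measurement series.
    A measurement counts as a hit if it lies within mu +/- sigma."""
    measurements = np.asarray(measurements)
    hits = np.sum(np.abs(measurements - mu) <= sigma)  # number of hits t
    n = measurements.size                              # number of samples n
    return p + hits, q + (n - hits)

# Illustrative use with placeholder values
p, q = 4.2, 6.1
mu, sigma = 0.41, 0.14
series = [0.38, 0.45, 0.52, 0.30, 0.61, 0.44, 0.40, 0.55, 0.47, 0.36]
p_adj, q_adj = adjusted_pq(p, q, series, mu, sigma)
print(p_adj, q_adj)   # 9 of 10 values are hits here
```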

Instead of the standard deviation \(\sigma\) we calculated the 99% confidence interval (\(CI\))

$${CI}_{99}={t}_{99}*\frac{\sigma }{\sqrt{n}}$$
(9)

which additionally contains information about the entirety of the measurements and thus also about future measurements. \(n\) is the number of samples, and the factor \({t}_{99}\), describing the confidence level, was taken from the Student's t table, because we consider the confidence interval of the arithmetic mean with unknown variance [15].
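A short sketch of Eq. (9); the Student factor \({t}_{99}\) is obtained here from scipy.stats.t as the two-sided 99 % quantile with n − 1 degrees of freedom, which corresponds to the tabulated value referred to above. The numbers are placeholders.

```python
import numpy as np
from scipy import stats

def ci99(sigma, n):
    """Eq. (9): half-width of the two-sided 99 % confidence interval of the mean
    for n samples with sample standard deviation sigma and unknown variance."""
    t99 = stats.t.ppf(0.995, df=n - 1)   # Student factor, as from the t table
    return t99 * sigma / np.sqrt(n)

# Example with placeholder values for sigma and n
print(ci99(sigma=0.012, n=10))  # about 0.0123 for t99 = 3.25
```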

2.5 Hough transform clustering

If the result of an iterative process converges, it can be assumed that, for a sufficient number of iterations, the results approach a straight line. This straight line can be found using the Hough Transform (HT). This method is widely used in image processing algorithms, for example to detect straight lines and more complex patterns [16]. The basic idea is that any straight line passing through a point (\(x,y\)) can be described by the perpendicular distance (radius) \(r\) from the origin to the straight line and its angle \(\theta\):

$$r=x\cdot \mathrm{cos}\theta +y\cdot \mathrm{sin}\theta$$
(10)

We used the Hough Transform as a clustering method. All data points that might belong to a straight line are selected from a given set using a given threshold. The accuracy is controlled by the choice of \(\Delta \theta\) and \(\Delta r\). In case more than one straight line or cluster is identified, we took the one with the smallest angle to the abscissa that necessarily includes all of the last iteration values. To increase the number of data points available for clustering, the data can be modelled with an analytic function, if possible, or extended with a regression method, as in the sketch below.
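Since the implementation of the clustering is not specified in the text, the following is only a minimal sketch of a dense Hough accumulator applied to a synthetic, converging data set; n_theta and n_r play the role of \(\Delta \theta\) and \(\Delta r\), and the threshold and all data are assumptions.

```python
import numpy as np

def hough_lines(points, n_theta=180, n_r=200, threshold=5):
    """Simple Hough transform used as a clustering tool: every point (x, y) votes
    for all (theta, r) pairs with r = x*cos(theta) + y*sin(theta), Eq. (10);
    accumulator cells with at least `threshold` votes are returned as candidates."""
    pts = np.asarray(points, dtype=float)
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    r_max = np.hypot(pts[:, 0], pts[:, 1]).max()
    r_bins = np.linspace(-r_max, r_max, n_r)
    acc = np.zeros((n_theta, n_r), dtype=int)
    for x, y in pts:
        r = x * np.cos(thetas) + y * np.sin(thetas)
        idx = np.clip(np.digitize(r, r_bins) - 1, 0, n_r - 1)
        acc[np.arange(n_theta), idx] += 1
    hits = np.argwhere(acc >= threshold)
    return [(thetas[i], r_bins[j], acc[i, j]) for i, j in hits]

# Illustrative data: a converging curve whose tail approaches a flat line at 0.1
x = np.arange(1, 21, dtype=float)
y = np.exp(-0.5 * x) + 0.1
lines = hough_lines(np.column_stack([x, y]), threshold=8)

# Pick the detected line with the smallest angle to the abscissa, i.e. the cell
# whose normal angle theta is closest to 90 degrees in the parametrisation (10).
best = min(lines, key=lambda tr: abs(tr[0] - np.pi / 2))
print(best)
```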

2.6 Methodology for uncertainty quantification and sample number determination

As a starting point, an experiment with a randomly chosen, constant (\(h,{R}_{a}\))-pair was performed for the first iteration step. This random experiment consists of the measurement of the thermal conductivity of \(N\) samples. The first assumption we made in our error control and effort reduction algorithm was that the mean value of the obtained set \(M\) of \(N\) elements corresponds to the expected value, according to the law of large numbers [12]. This means the set \(M\) represents all possible measurement values. From this set \(M\), all possible combinations of measured values were drawn for all \(n=1,\dots ,N\), without repetition and without considering the order. For all these combinations, the corresponding mean value with the associated standard deviation was calculated. This resulted in a new set \({{M}_{\overline{\lambda} (n)}}\), which contains all mean values that can occur in a single measurement series of \(n\) samples. Then we checked the distribution of the set \({{M}_{\overline{\lambda} (n)}}\) and calculated the 99% confidence interval suitable for this distribution. The maximum percentage distance \({CI}_{\%}\) of this interval to the corresponding expected value was calculated and plotted. This plot allowed us to directly find the point from which the percentage distance stays below a threshold value required for a given task. Additionally, to find the best compromise between effort and accuracy, we searched for an “inflection point” of the converging \({CI}_{\%}\) curve, which marks the point from which the progression of \({CI}_{\%}\) becomes nearly linear while only small changes occur. For these purposes, the Hough Transform (see Sect. 2.5) was applied to the \({CI}_{\%}=f(n)\) data set, which was normalized to its maximum value and additionally extended/fitted with the parametric regression model (11) for more accuracy, where \({c}_{1}\) to \({c}_{6}\) are the parameters which define the fit.

$${CI}_{\%}={c}_{1}{e}^{-{c}_{2}n}+ {c}_{3}{e}^{-{c}_{4}n}+{c}_{5}{e}^{-{c}_{6}n}$$
(11)
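The sketch below illustrates, under simplifying assumptions, the subsampling procedure and the regression model of Eq. (11): all combinations of a placeholder measurement series are drawn for each subsample size, a confidence-interval-based percentage distance is computed, and the normalized curve is fitted with the triple exponential. A Student-t interval over the subsample means is used here as a stand-in for the distribution fit described above, and all names and numbers are illustrative.

```python
import numpy as np
from itertools import combinations
from scipy import stats, optimize

def ci_percent_curve(measurements, confidence=0.99):
    """For each subsample size n, form all combinations of the N measured values
    (without repetition, order ignored), compute their means and return the
    maximum percentage distance of a confidence interval from the expected value.
    Simplification: a Student-t interval over the set of subsample means is used
    instead of fitting a distribution to that set."""
    data = np.asarray(measurements, dtype=float)
    N = data.size
    mu_all = data.mean()                 # expected value per the LLN assumption
    sizes = np.arange(2, N)              # n = N has only a single combination
    ci_pct = []
    for n in sizes:
        means = np.array([np.mean(c) for c in combinations(data, n)])
        t_fac = stats.t.ppf(0.5 + confidence / 2, df=n - 1)
        half_width = t_fac * means.std(ddof=1) / np.sqrt(n)
        ci_pct.append(100.0 * (abs(means.mean() - mu_all) + half_width) / mu_all)
    return sizes, np.array(ci_pct)

def triple_exp(n, c1, c2, c3, c4, c5, c6):
    """Eq. (11): parametric regression model for CI_% as a function of n."""
    return c1 * np.exp(-c2 * n) + c3 * np.exp(-c4 * n) + c5 * np.exp(-c6 * n)

# Placeholder measurement series (N = 12 keeps the number of combinations small)
rng = np.random.default_rng(1)
lam = rng.normal(0.19, 0.01, size=12)

n, ci_pct = ci_percent_curve(lam)
ci_norm = ci_pct / ci_pct.max()          # normalization to the maximum value

popt, _ = optimize.curve_fit(triple_exp, n, ci_norm,
                             p0=[0.5, 0.5, 0.3, 0.1, 0.2, 0.01],
                             bounds=(0.0, np.inf), maxfev=20000)
```

The normalized data and the fitted curve can then be handed to the Hough transform clustering of Sect. 2.5 to locate the inflection point.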

This inflection point makes it possible to select the minimum or optimal number of samples \({N}_{opt}\) with regard to the corresponding error. Our hypothesis was based on the assumption that the effect of roughness on the measurement result is dominant. Therefore, we extended the \({CI}_{\%}\) for the optimal number of samples by introducing an uncertainty factor \(s\). This allowed us to improve the result without considering the full complexity of the real contact area. Comparing the error propagation, or the quotient of the standard deviation to the nominal thermal conductivity, for the real and the nominal contact area, assuming \({A}_{real}=c({R}_{a})*{A}_{nom}\) (see formula (3)) with \(c({R}_{a})\le 1\), allowed us, under a rough assumption, to set the uncertainty factor as follows:

$$s= \frac{1}{{c}^{2}({R}_{a})}$$
(12)

According to the hypothesis, the percentage distance of the confidence interval extended in this way, \({CI}_{limit,\%}\), is an upper bound of the error for all (\(h,{R}_{a}\))-pairs measured with the optimal number of samples and evaluated with the nominal surface area (13).

$${\sigma }_{i}(h,{R}_{a})\le {CI}_{limit,\%}$$
(13)
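A brief sketch of Eqs. (12) and (13), mirroring the comparison made later in Sect. 3; the numerical values are the rounded ones reported there and serve only as an example.

```python
def uncertainty_factor(c_ra):
    """Eq. (12): uncertainty factor from the ratio c(Ra) = A_real / A_nom <= 1."""
    return 1.0 / c_ra**2

def within_limit(sigma_percent, ci_nopt_percent, c_ra):
    """Eq. (13): check whether a relative error stays below the extended bound
    CI_limit,% = s * CI_Nopt,%."""
    ci_limit = uncertainty_factor(c_ra) * ci_nopt_percent
    return sigma_percent <= ci_limit, ci_limit

# Example with the rounded values reported in Sect. 3
ok, ci_limit = within_limit(sigma_percent=5.86, ci_nopt_percent=3.68, c_ra=0.77)
print(ok, round(ci_limit, 2))  # True, 6.21; the paper reports 6.18 % based on
                               # the unrounded mean area ratio
```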

To verify our hypothesis, the next step was to vary the parameters of interest: specimen height and surface roughness. For these measurements we used the previously determined optimal number of samples \({N}_{opt}\). From the results of the measurements an error plane can be constructed, which allowed us to select the (\(h,{R}_{a}\))-pair with the largest error, i.e. the worst case, for the second iteration step with the same number of samples \(N\) as in the first iteration.

3 Results

In the first iteration step we measured 20 specimens with a height of 2.0 mm, which is the upper limit of the DIN standard. The surface was finished with P2500 grit paper and the corresponding \({R}_{a}\) values were determined. The results were evaluated according to the methodology described in Sect. 2. Using the scipy.optimize.curve_fit() function in Python, we fitted a standard beta distribution to obtain the required p–q-values and determined the quality of the fit via the \({r}^{2}\)-score. With these p–q-values, the 99% confidence interval and the corresponding maximum percentage distance \({CI}_{\%}\) were calculated. In each case, the calculation was performed for both the p–q-values and the adjusted p–q-values, but they differ only slightly, so we used the adjusted p–q-values for plotting for a better illustration. The results of the Hough Transform (see Sect. 2.5) are shown in Fig. 1 (left).
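The exact fitting procedure is not given in detail; the following sketch merely illustrates one way such a curve_fit-based beta fit with an \({r}^{2}\)-score could look, fitting the beta density to a normalized histogram of placeholder data. Bin count, seed and data are assumptions.

```python
import numpy as np
from scipy import stats, optimize

def beta_pdf(x, p, q):
    """Standard beta probability density with shape parameters p and q."""
    return stats.beta.pdf(x, p, q)

# Placeholder data rescaled to (0, 1); in the study these would be the 20
# measured thermal conductivities of the first iteration.
rng = np.random.default_rng(2)
sample = rng.beta(5.0, 3.0, size=20)

# Fit the beta density to a normalized histogram of the data with curve_fit.
hist, edges = np.histogram(sample, bins=8, range=(0, 1), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
(p, q), _ = optimize.curve_fit(beta_pdf, centers, hist,
                               p0=[2.0, 2.0], bounds=(0.1, 50.0))

# Quality of the fit via the r^2 score.
residuals = hist - beta_pdf(centers, p, q)
r2 = 1.0 - np.sum(residuals**2) / np.sum((hist - hist.mean())**2)
print(f"p = {p:.2f}, q = {q:.2f}, r^2 = {r2:.3f}")
```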

Fig. 1

Distance of the confidence interval in % for 1st iteration (left) and 2nd iteration (right). Blue dots—experimental data; blue dashed line—extended normalized data; red dashed line—clustering results belonging to data; black dashed line—extended clustering results

Displayed are the experimental data (blue dots), the extended and normalized results (blue dashed line) and the corresponding clustering results (red dashed line). We chose to normalize the data to obtain plots that are more comparable. The vertical red dashed line indicates the optimal number of samples, which is rounded up to \({N}_{opt}=10\). The corresponding percentage confidence interval is \({CI}_{{N}_{opt},I,\%}=3.68 \%\). Based on the mean value of the determined quotient of the real to the nominal area, \(\overline{c }({R}_{a})=0.77\), the uncertainty factor is calculated to be \(s=1.68\) and thus the limit increases to \({CI}_{limit,\%}=6.18\%\).

For the error field we carried out 9 measurement sequences as a compromise between experimental effort and thoroughness, since our goal was to create a general overview of the influence of the parameters specimen height and surface roughness. The variation of the height and the surface finish of the specimens was as described in Table 1, and the number of samples was the previously determined \({N}_{opt}=10\). The mean value and associated standard deviation were calculated for each of these measurement sequences and then plotted as a 3-dimensional isosurface (Fig. 2). We chose the point with the largest absolute standard deviation; the very small difference to the relative error can be explained by rounding differences. This point was at \(h=1.0 \mathrm{mm}\) and \({R}_{a}=0.4 \mu m\), which corresponds to a finish with P1200 grit abrasive paper. As the second iteration step, we measured and evaluated 20 specimens with this (\(h,{R}_{a}\))-pair, identically to the first iteration. The plot of the maximum percentage distance of the 99% confidence interval is shown in Fig. 1 (right). Because we normalized the corresponding maximum percentage distance \({CI}_{\%}\) to the maximum value, the plots for the first and second iteration start at the same maximum value and differ only slightly thereafter. From these plots, the same \({N}_{opt}=10\) can be found, which is again corroborated using the Hough Transform technique. The maximum nominal percentage distance at \({N}_{opt}\) is \({CI}_{{N}_{opt},II,\%}=5.86 \%\) and is thus below \({CI}_{limit,\%}=6.18\%\).

Fig. 2

Error fields green dots—1st iteration; red dots—max. distance of \(\sigma\) to \(\mu\); grey dots and dotted lines—measured parameters \(\mu +\sigma\) (left): grey plane—mean values; coloured plane—\(\mu +\sigma\); \({\sigma }_{\%}\) (right): coloured plane—\({\sigma }_{\%}\), grey plane—\({CI}_{{N}_{opt},I,\%}\); green plane—\({CI}_{limit,\%}\)

4 Discussion/conclusion

The results shown do not contradict our hypothesis. However, we made some assumptions which should be addressed in further investigations: (1) the number of samples for the first iteration and the number of (\(h,{R}_{a}\))-pairs for the error field, which should be higher to obtain a more precise plot; (2) the height of the specimens, because it directly determines the lateral area, which influences the heat loss; (3) the base geometry of the specimens, as only cylindrical specimens have been investigated; (4) the use of the standard beta distribution, as one could check further distributions such as λ-PDFs, which might describe the data better [17], or adapt and apply non-probabilistic uncertainty methodologies to our task [18]. To see whether our hypothesis holds, other materials should also be investigated. A first preliminary measurement series with PVC revealed the same results, which could be used for further examinations.

Furthermore, we showed that the consideration of the real contact area has a considerable influence on the quality of the measurement. To compensate for this, we have shown that introducing an uncertainty factor based on the area ratio between nominal and real contact area is a good approximation for giving an upper limit of the overall error of the measurement. To gain a deeper and more accurate understanding of this influence, it might be useful to improve the characterization of the surface roughness or to consider a more complex model of the real contact area.

For the future, we encourage anybody who is willing to investigate this topic further. One possibility to gain as much data as possible without excessive time effort could be a round robin test in which many laboratories interested in this model join forces to refine this work. With a large enough database, it would be possible to determine the specific \({N}_{opt}\) and the corresponding maximum error for a measurement series without any preliminary measurements.