1 Introduction

Pollen monitoring sites were first established by those who most needed the information such measurements could provide: medical doctors diagnosing and treating allergy sufferers, researchers studying gene flow or ecosystem changes, and those identifying invasive species such as ragweed. Over time, the number of stations has grown and entire observation networks have been developed (Buters et al., 2018). However, the large majority of current sites and networks are still based on manual measurement techniques from the 1950s, for which standards were developed only relatively recently (EN16868, 2020). This method uses a volumetric sampler, originally developed by Hirst (1952), and manual analysis of samples under a light microscope. There are guidelines for: (i) sampler technical parameters, (ii) choice of sampler location, (iii) preparation of collected samples, (iv) microscopic analysis methods, and (v) calculation and statistical analysis of uncertainty based on the type of data. The method is labour-intensive and time-consuming and requires specialised skills; because of the sampling duration and manual processing of collected samples, data are obtained with a delay of up to 9 days. Because only a limited proportion of the sample collected is analysed, these measurements also suffer from relatively high uncertainty, around 30% for daily averages. Furthermore, the uncertainty is also affected by the skill of the counter as well as the atmospheric concentration of pollen (Adamov et al., 2021; and references therein).

Automatic monitoring methods have become available over the past few years, making it possible to measure pollen and fungal spores in real time and at much higher temporal resolutions (Crouzy et al., 2016; Sauliene et al., 2019; Oteros et al., 2019; Sauvageat et al., 2020). Several measurement techniques exist, ranging from impactors with digital microscopes to airflow cytometers that use fluorescence spectroscopy and scattered light to identify particles (Huffman et al., 2020). Whatever the method used, the higher degree of automation means that these techniques lend themselves more easily to standardisation, whether for calibration, the measurement itself, or the development of particle identification algorithms. At the same time, the number of different measurement techniques already available and the fast pace at which new methods are developed pose a challenge to developing such standards. Nevertheless, as an automatic pollen and fungal spore monitoring network is established across Europe under the umbrella of the EUMETNET AutoPollen programme and the number of new observation sites is constantly growing across the world, it is essential that all aspects of the measurement chain are standardised to ensure comparable and high-quality data and information can be provided to end-users.

This paper outlines a set of guidelines and best practices that can be applied to automatic observations of airborne pollen and fungal spores where data are supplied at high temporal resolution (up to hourly) in real time. The information provided can serve as a basis for the future development of an official European standard that can be applied across the growing EUMETNET AutoPollen monitoring network and potentially beyond to other regions of the globe.

2 Instruments and measurements

A companion paper in this special issue provides an in-depth overview of the instruments and methods available to measure airborne concentrations of pollen and other biological particles (Buters et al., 2022). While the range of different measurement techniques complicates the development of standards to some extent, the guidelines outlined in this section provide a general framework applicable to all automatic pollen and fungal spore monitors.

2.1 Sampling flow rate and sampling resolution

Airborne pollen and fungal spores are usually recorded as a concentration, i.e. the number of particles per unit volume. It is thus essential that automatic instruments have accurate flowmeters integrated into the measurement system so that the flow rates can be precisely recorded. Primary standards have been developed which allow flow rates to be controlled and measured with an expanded uncertainty (95% confidence level) of < 0.5%, making high accuracy observations possible under laboratory conditions (Niederhauser & Barbe, 2002; Sillanpää et al., 2006). Calibration procedures for different types of flowmeters are described in the ISO 14511:2019 and ISO 10790:2015 standards, and it is also possible to calibrate them under simulated environmental conditions in the laboratory. It is important to note that under strongly varying atmospheric conditions, for example with very low pressures or temperatures, the flow may vary considerably. This needs to be taken into account should instruments be calibrated in the field rather than in the laboratory. Accurate time keeping is likewise important for calculating the total volume sampled at a given sampling rate. Modern processing speeds make this straightforward, however, and provide adequate accuracy.

Recommendation: Flow rates should be continuously measured and recorded within the device to an accuracy of ± 2% with the use of calibrated flowmeters/controllers. Any deviation or systematic shift should generate alerts that allow appropriate intervention and data should be flagged accordingly. In case of doubt, in-field verification of the flow rate with a calibrated transfer standard, such as a bubble flowmeter, is recommended.

The required observational accuracy essentially dictates the volume that needs to be processed by the measurement device and the sampling resolution thus possible. In a very general form, the analysed air volume V is:

$$V=\alpha F\tau$$

where α is the fraction of air that is analysed, F is the air flow rate at the inlet entry point [m3/s], and τ is the observation time interval [s].

Taking the example of daily observations with a Hirst-type trap, with flow F = 10 L/min, τ = 1 day, and α, the fraction of the slide area counted, equal to 10% for pollen (5% for fungal spores) following Galán et al. (2014), the total volume analysed is VHirst_day = 1.44 m3 (0.72 m3 for spores).
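The Hirst example can be reproduced with a few lines of Python. This is only a minimal sketch of V = αFτ; the function name `analysed_volume_m3` and the choice of L/min as the input unit are illustrative assumptions, not part of any standard.

```python
def analysed_volume_m3(alpha: float, flow_l_per_min: float, interval_s: float) -> float:
    """Analysed air volume V = alpha * F * tau, returned in cubic metres."""
    flow_m3_per_s = flow_l_per_min / 1000.0 / 60.0  # convert L/min to m3/s
    return alpha * flow_m3_per_s * interval_s


DAY_S = 24 * 60 * 60  # observation interval of one day, in seconds

# Hirst-type trap at 10 L/min; 10% of the slide counted for pollen, 5% for spores
v_pollen = analysed_volume_m3(alpha=0.10, flow_l_per_min=10.0, interval_s=DAY_S)
v_spores = analysed_volume_m3(alpha=0.05, flow_l_per_min=10.0, interval_s=DAY_S)

print(f"{v_pollen:.2f} m3 (pollen), {v_spores:.2f} m3 (spores)")
# prints "1.44 m3 (pollen), 0.72 m3 (spores)"
```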

A second relation connects the required observational accuracy and detection limit with the volume of air analysed. This can be expressed by the “shot noise” formalism, which describes the features of observations of rarely occurring events. In the case of airborne pollen or fungal spores, the rare event can be considered as a specific pollen particle being observed. From a statistical point of view, the detection of such particles can be modelled as a Poisson process, where the standard deviation is equal to the square root of the average number of particles counted, N. Employing the law of large numbers, the signal-to-noise ratio R is:

$$R=\frac{N}{\sqrt{N}}=\sqrt{N}$$

The inverse of R is a convenient measure of relative uncertainty γ. For instance, if a 10% error is a user requirement, the relative uncertainty should be 0.1, which leads to R = 10 and N = 100 – the number of particles that need to be captured in an observation to ensure that the shot-noise uncertainty is below the required limit.

Continuing the example of daily Hirst observations, we define the detection limit, DL, as the concentration that can be observed with 50% uncertainty. With the uncertainty γ being 50% or ½, R = 2 and thus N = 4, so the concentration CDL_day = 4 / 1.44 = 2.8 particles/m3 is the detection limit for daily observations with 50% uncertainty accepted. For two-hourly measurements the analysed volume is 12 times smaller, so CDL_2hr = 2.8 · 12 = 33 particles/m3.
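The detection-limit arithmetic can be checked numerically. The sketch below assumes only the shot-noise relation used in this section (relative uncertainty γ = 1/√N, so N = 1/γ²); the function names are hypothetical.

```python
def required_counts(gamma: float) -> float:
    """Number of counted particles N for which the shot-noise uncertainty equals gamma."""
    return (1.0 / gamma) ** 2


def detection_limit(gamma: float, analysed_volume_m3: float) -> float:
    """Concentration (particles/m3) observable with relative uncertainty gamma."""
    return required_counts(gamma) / analysed_volume_m3


V_HIRST_DAY = 1.44  # m3 analysed per day for pollen (Hirst example)

c_dl_day = detection_limit(gamma=0.5, analysed_volume_m3=V_HIRST_DAY)
c_dl_2hr = detection_limit(gamma=0.5, analysed_volume_m3=V_HIRST_DAY / 12)

print(f"daily: {c_dl_day:.1f} particles/m3, two-hourly: {c_dl_2hr:.0f} particles/m3")
# prints "daily: 2.8 particles/m3, two-hourly: 33 particles/m3"
```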

From the above considerations, one can thus obtain the minimal flow rate Fmin if the maximum uncertainty γmax = 1/Rmin required for a particular averaging period τ and the relevant concentration threshold Cthr are given by a user, and the fraction α of the analysed air flow is a device-internal feature:

$${F}_{\mathrm{min}}=\frac{1}{\alpha \tau {\gamma }_{\mathrm{max}}^{2}{ C}_{\mathrm{thr}}}=\frac{{R}_{\mathrm{min}}^{2}}{\alpha \tau { C}_{\mathrm{thr}}}$$
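As a short intermediate step, this expression follows from combining the analysed volume with the shot-noise relation, assuming that the number of particles counted at a concentration C is N = CV:

$$N=CV=C\alpha F\tau ,\qquad R=\sqrt{N}\;\Rightarrow \;{R}^{2}=C\alpha F\tau \;\Rightarrow \;F=\frac{{R}^{2}}{\alpha \tau C}$$

Setting C = Cthr and R = Rmin = 1/γmax gives the minimum flow rate above.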

Recommendation: The volume of air sampled needs to be sufficient to measure all relevant concentrations for a given temporal resolution and for an accepted level of uncertainty. For instance, to measure a concentration of 10 particles/m3 with an uncertainty of 10% at hourly resolution and assuming 50% of the inflow analysed, the minimum flow required is 333 L/min.
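The figure quoted in this recommendation can be reproduced from the expression for Fmin. The following is a minimal sketch; the function name and the conversion to L/min are illustrative choices.

```python
def min_flow_l_per_min(gamma_max: float, c_thr_per_m3: float,
                       alpha: float, interval_s: float) -> float:
    """Minimum inlet flow F_min = R_min^2 / (alpha * tau * C_thr), in L/min."""
    r_min_squared = (1.0 / gamma_max) ** 2
    flow_m3_per_s = r_min_squared / (alpha * interval_s * c_thr_per_m3)
    return flow_m3_per_s * 1000.0 * 60.0  # convert m3/s to L/min


# 10 particles/m3 with 10% uncertainty, hourly resolution, 50% of the inflow analysed
f_min = min_flow_l_per_min(gamma_max=0.10, c_thr_per_m3=10.0,
                           alpha=0.5, interval_s=3600.0)
print(f"{f_min:.0f} L/min")  # prints "333 L/min"
```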

Recommendation: One should distinguish the detection limit of the device, which can afford a wide margin of error (e.g. 50%), from the reported concentration value, for which the uncertainty must be sufficiently low. In the example of the preceding recommendation, a flow of 333 L/min would lead to a detection limit of 0.6 particles/m3 observed with 50% uncertainty.

2.2 Counting efficiency

Counting efficiency is a measure of how accurately a particle sampler counts compared to a reference instrument. Essentially, it is the detected particle concentration divided by the “true” concentration (as determined by the reference). Many things can affect counting efficiency during the lifetime of an automatic monitor, ranging from misalignment of the light source to oversaturation of the detector, electronic issues with a camera or detector, or deposition of particles in different parts of the instrument.

Automatic pollen monitors based on airflow cytometry should be tested at regular intervals against an Electrostatic Classifier or a reference Optical Particle Counter (OPC), which should be traceably calibrated. Calibration curves of the counting efficiency versus particle size can thus be produced and these should be reported for each instrument (e.g. Lieberherr et al., 2021). The corresponding scaling factors should then be used to calculate final concentrations. Currently, calibration methods based on reference OPCs are in place for particle diameters up to about 20 µm. Projects are underway to extend the range to larger particle sizes, although such techniques are likely to take several years to be standardised and applied routinely across monitoring networks. For instruments that use impaction and image recognition, alternative methods will have to be developed (Zuberbier et al., 2017). As yet, it is unknown if and how measurements might drift in time. This remains to be tested.

Recommendation: Instruments should be calibrated, ideally by an accredited organisation, at the end of the production process and then at regular intervals with a transfer standard (a calibration instrument that can be taken out of the lab) once in the field. Should issues be identified, the instrument should be sent to a laboratory for more detailed tests.

Recommendation: Counting efficiencies with a stated measurement uncertainty should be reported for each size channel. Counting efficiency should also be verified at least every 3 years.
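To illustrate how per-channel counting efficiencies might be applied, the sketch below divides the raw counts in each size channel by the corresponding efficiency before converting to a concentration. It assumes that the scaling factor is simply the inverse of the counting efficiency; the function name and the example numbers are hypothetical and do not come from any calibration.

```python
from typing import List, Sequence


def corrected_concentrations(raw_counts: Sequence[float],
                             counting_efficiency: Sequence[float],
                             analysed_volume_m3: float) -> List[float]:
    """Per-channel concentration (particles/m3) corrected for counting efficiency."""
    if len(raw_counts) != len(counting_efficiency):
        raise ValueError("one counting efficiency is needed per size channel")
    corrected = []
    for counts, efficiency in zip(raw_counts, counting_efficiency):
        if efficiency <= 0:
            raise ValueError("counting efficiency must be positive")
        # scaling factor assumed to be 1 / efficiency
        corrected.append(counts / efficiency / analysed_volume_m3)
    return corrected


# Hypothetical example: three size channels, 2 m3 of air analysed
concentrations = corrected_concentrations(
    raw_counts=[12, 30, 8],
    counting_efficiency=[0.6, 0.9, 1.0],
    analysed_volume_m3=2.0,
)
print([round(c, 1) for c in concentrations])
```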

2.3 Particle size range, sizing accuracy, and size resolution

There should always be a clear statement of which particle size the instrument reports. Aerodynamic, optical, geometric, and gravimetric sizes differ substantially and, apart from the geometric size, are determined by the particle's shape, the optical properties of its surface, and its density. Depending on how the particle size is measured, the corresponding definition should be used and, if necessary, conversion to other size-type variables applied.

The particle size range, sizing accuracy, and size resolution of airflow cytometers can be assessed using size-certified polymer microspheres or similar standards, which are commercially available in a broad size range (from a few nm up to at least 200 µm). The refractive index of the widely used polystyrene microspheres is approximately 1.59 at a wavelength of 589 nm (sodium D line), which is very close to the refractive index values of pollen exine (typically ~1.53–1.54 at 532 nm) (Charrière et al., 2006; Kim et al., 2018; Park et al., 2018). The density of polystyrene spheres is 1.05 g/cm3, which also simulates reasonably well the density of fresh pollen at environmentally relevant relative humidity (van Hout & Katz, 2004). It is nevertheless important to keep in mind that there is a lot of natural variability in the morphology and chemical structure of pollen grains. For example, Ambrosia pollen have air chambers in the exine (Payne, 1963), which make this pollen about 20% lighter (density of 0.8 g/cm3; Prank et al., 2013) than reference polystyrene microspheres.

Recommendation: Instruments should be able to measure particle size to within ± 10% for the size ranges that manufacturers claim to be within the device scope. The mean particle size, the uncertainty on that mean, as well as the standard size deviation should be incorporated into the error associated with any calibration curve. A procedure for calculating the sizing accuracy is described in Section 3.4 of the ISO 21501-1:2009 standard.

Recommendation: The lower size limit for the particle size should be defined by convention to be the smallest diameter with which the counting efficiency is 0.5 ± 0.15 (50% ± 15%; lower size limit of the measuring range) according to ISO 21501-1:2009.

Size resolution refers to the ability of an instrument or sensor to differentiate between particles of different sizes, i.e., the smallest detectable difference between particle sizes with an acceptable significance. For optically based instruments, the variance around the average particle size is generally associated with discrepancies in the pulse heights produced by the sensor when measuring identically sized particles. For example, even if two identically sized particles flowed through the sensing volume in sequence, the electrical output produced by the sensor may not always be identical in terms of pulse height. In general, particle counters are known to have approximately 15% variance around the mean (ISO 21501-4: 2007). Deviations can be due to non-uniformities from the light source or simply from system noise that may affect pulse height. Recording of parameters such as light source voltage (Table 1) is required to ensure reproducibility.

Table 1 List of metadata that should be reported

The particle size distribution that can be measured is a function of the instrument’s resolution, with lower size resolutions resulting in broader size distributions being measured. Size resolution should be high enough to ensure that particles in the pollen size range can be adequately distinguished; the natural variability among biological particles of the same type is a separate aspect that needs to be considered in its own right. Calibrating instruments with standard particles (as described above) is one way to assess this issue. This is the subject of ongoing research, and it is hoped that in the coming years it will be possible to test size resolution across the full size range relevant to pollen and fungal spores.

Recommendation: Instruments should, at the very least, be able to measure particles in the size range of most airborne pollen, namely from 8 to 60 microns (keeping in mind that pollen and fungal spores range in size from 1 to 200 microns).

Recommendation: An evaluation of the size resolution should be provided by device manufacturers as part of the standard information material supplied. A procedure for calculating the size resolution based on measurements with monodisperse size-certified polymer spheres is described in Section 7.3 of the ISO standard 21501-4:2018.

With regard to instruments based on imaging techniques, the sensor pixel resolution is of particular significance when it comes to the instrument’s size resolution and this should be well described (Huffman et al., 2019). Pixel resolution, which is related to the sensor resolution, is the main factor determining the size resolution of such instruments. Furthermore, for those devices that produce colour images, increasing the number of colours recognised can likewise improve detection capacity. The image quality of the optical system and the possibility of stacking, i.e. merging several images from the same or different focus levels, also play a role.

Ideally, all automatic pollen and fungal spore monitors should be calibrated with a known number concentration of a particular taxon of interest. This is not yet possible and even when technology and methods have been developed to do so, it is important to keep in mind that biological particles have an inherent variability which needs to be taken into account. Either the size of each particle tested needs to be measured very accurately during the calibration tests or standardised polystyrene particles need to continue to be used to fully characterise the sizing accuracy and resolution of instruments.

2.4 Fluorescence signals

Fluorescence is another parameter that should be calibrated for instruments that use such detection techniques. Fluorescent material utilised in the calibration should match one or more of the excitation and emission wavelengths of the instrument in question, be stable, repeatable (i.e. different batches return similar results), be easily prepared, and safe to use (Robinson et al., 2017). The fluorescence threshold should be determined and discussed in relation to any results, particularly to ensure that instruments do not suffer from false positives or false negatives. Possible drifts in the fluorescence threshold should also be considered on a regular basis since lasers can fluctuate as they age. It is important to note that obtaining absolute fluorescence values is challenging since often there is a lower signal-to-noise ratio for this parameter. This can, to some extent, be addressed through normalising the spectrum obtained and comparing this with the reference observations.

Recommendation: Instruments based on fluorescence should be able to provide the location of fluorescence peaks within the correct bin compared to reference values (e.g. Könemann et al., 2019), with an uncertainty of ± 25 nm tolerated.

Furthermore, it is important that the automatic measurements are continuous throughout the pollen and/or fungal spore season to ensure complete coverage of all periods of interest. The lifespan of all instrument components should thus allow for uninterrupted functioning for at least one such period. The laser lifespan is determined by its technical specifications but also depends on the environment in which it is used, with shorter lifespans in areas with higher particle loads.

2.5 False count rate

The false count rate is the number of particles that an instrument counts in a flow of pure air (i.e. air that contains no particles). This value should be determined for each instrument and can relatively easily be assessed by installing a particle filter with high efficiency (HEPA or ULPA filter) over the instrument inlet and running a test for a minimum of 5 min. Any particle events measured by the instrument during this test should thus be considered false and could be due to a number of sources internal or external to the system. These can range from contamination (deposits of pollen/fungal spores from previous sampling) to electronic noise or stray radiation (for light-scattering/fluorescence instruments).

A high signal-to-noise ratio is also necessary. A ratio above 2–3 has been used in certain disciplines, with some considering noise as anything that falls below the mean background/blank value plus 3 times the standard deviation (Huffman et al., 2019). Low signal-to-noise ratios can also produce resolution degradation resulting in artefacts being detected as particles.

Recommendation: Instruments should regularly (at least every 6 months) be tested for false count rates. An adequate filter (e.g. a HEPA filter) should be attached to the instrument inlet and run for at least 5 min. A procedure for evaluating the false count rate is described in Annex C of the ISO 21501-4:2018 standard.
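A filter test of this kind could be evaluated along the following lines. This is a sketch under stated assumptions: the false count rate is expressed here per cubic metre of filtered air, and the noise threshold uses the "mean blank plus 3 standard deviations" convention mentioned above. The function names and example values are hypothetical, and the procedure in Annex C of ISO 21501-4:2018 remains the reference.

```python
from statistics import mean, stdev
from typing import Sequence


def false_count_rate(false_counts: int, flow_l_per_min: float,
                     duration_min: float) -> float:
    """False counts per cubic metre of filtered air sampled."""
    if duration_min < 5:
        raise ValueError("the filter test should run for at least 5 min")
    volume_m3 = flow_l_per_min * duration_min / 1000.0
    return false_counts / volume_m3


def noise_threshold(blank_signals: Sequence[float], k: float = 3.0) -> float:
    """Signal level below which a reading is treated as noise: mean + k * std."""
    return mean(blank_signals) + k * stdev(blank_signals)


# Hypothetical test: 3 events in 10 min at 40 L/min through a HEPA filter
rate = false_count_rate(false_counts=3, flow_l_per_min=40.0, duration_min=10.0)
threshold = noise_threshold([1.0, 2.0, 3.0])
print(f"{rate:.1f} false counts/m3, noise threshold {threshold:.1f}")
```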

2.6 Maximum detection limit

Automatic monitors need to be able to detect pollen and fungal spores at levels below the thresholds at which allergy sufferers start, on average, showing symptoms or at which there is a risk of crop infection. These thresholds are usually low, on the order of a few grains or spores per cubic metre. On the other end of the spectrum, the maximum concentration that a particular instrument can measure also needs to be understood to ensure that measurement saturation does not occur (or is, at least, taken into account). This may be of particular importance in an urban setting with, for example, high levels of non-biological particulate matter or pollen from ornamental plants, where such particles could potentially inhibit the detection of target pollen. In rural regions, where high concentrations of a single pollen species are found, e.g. Olea in the Mediterranean region or Betula in Northern Europe, a similar problem may also occur.

Recommendation: Maximum detection limits should be provided by device manufacturers as part of the standard information material. In the case of light-scattering instruments, the maximum particle number concentration is defined as the concentration at which coincidence particle loss is lower than or equal to 10% (ISO 21501-1). Device manufacturers should calculate coincidence loss according to Section 4.3 of the ISO 21501-1:2009 standard.

2.7 Physical constraints

Depending on their size and on their aerodynamic and surface properties, pollen or fungal spores can be deposited inside an instrument. This may pose a problem not only because the actual pollen concentration can be underestimated, which particularly at low concentrations may push the instrument below its detection limit, but also because it may result in a “memory effect”, with pollen becoming detached much later and being detected when it is not in fact present in the atmosphere. Instruments need to be engineered to avoid this as far as possible.

Instruments need to have completely airtight measurement systems to ensure that the airflow measured and sampled is precisely the same at the entry and exit. They also need to be adapted for outdoor conditions, either entirely waterproof themselves or suitable to be placed in waterproof housing with an adapted inlet. Furthermore, they need to function appropriately under a range of environmental conditions, from very warm temperatures to below freezing and from very dry to very humid conditions.

Recommendation: Measurement systems should be completely airtight. The instrument should be waterproof or suitable for being placed in waterproof housing with an adapted inlet.

Recommendation: Measurement systems should be able to perform accurately in the temperature range from at least −20 to +50 °C and across the full spectrum of atmospheric humidity from 0 to 100%. Manufacturers should also specify under what environmental conditions instruments have been tested (minimum/maximum temperatures, humidity, and wind speeds).

The losses due to turbulence within the system should be minimised during instrument design or at least accounted for thereafter. Additionally, for impaction instruments, an understanding of the impaction efficiencies for particles over the size range of interest is needed, including any errors associated with any bounce back from the collection substrate. Finally, laser efficiency decay or other instrument parameters that can drift in time should be measured at regular intervals so that these drifts can be identified and corrected. This is also of relevance for the length of time for which an automatic monitor can run without repair or major maintenance.

Recommendation: The inlets and instrument tubing of each instrument need to be well understood, with the Stokes and Reynolds numbers for each of these components calculated and reported for the relevant particle size ranges measured. The frequency of servicing/cleaning/maintenance of instruments should also clearly be defined for different environments (particularly for regions with high aerosol loads).
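As an illustration of the quantities named in this recommendation, the sketch below computes the Reynolds number of the flow in a circular tube and the Stokes number of a spherical particle in that flow. Several assumptions are made and should be adapted to the actual instrument: air properties are fixed at roughly 20 °C, the tube diameter is used as the characteristic length in both numbers (impactor conventions often use the nozzle radius instead), and the slip correction is set to 1, which is reasonable only for particles well above 1 µm. All names and example values are hypothetical.

```python
import math

AIR_DENSITY = 1.204      # kg/m3, air at about 20 degC (assumed)
AIR_VISCOSITY = 1.81e-5  # Pa s, dynamic viscosity at about 20 degC (assumed)


def mean_velocity(flow_l_per_min: float, tube_diameter_m: float) -> float:
    """Mean air velocity (m/s) in a circular tube."""
    flow_m3_per_s = flow_l_per_min / 1000.0 / 60.0
    area_m2 = math.pi * tube_diameter_m ** 2 / 4.0
    return flow_m3_per_s / area_m2


def reynolds_number(flow_l_per_min: float, tube_diameter_m: float) -> float:
    """Re = rho_air * U * D / mu for the flow in the tube."""
    velocity = mean_velocity(flow_l_per_min, tube_diameter_m)
    return AIR_DENSITY * velocity * tube_diameter_m / AIR_VISCOSITY


def stokes_number(particle_diameter_um: float, particle_density_kg_m3: float,
                  flow_l_per_min: float, tube_diameter_m: float,
                  slip_correction: float = 1.0) -> float:
    """Stk = tau * U / D, with relaxation time tau = rho_p * d^2 * Cc / (18 * mu)."""
    d_p = particle_diameter_um * 1e-6  # convert micrometres to metres
    relaxation_time = (particle_density_kg_m3 * d_p ** 2 * slip_correction
                       / (18.0 * AIR_VISCOSITY))
    return relaxation_time * mean_velocity(flow_l_per_min, tube_diameter_m) / tube_diameter_m


# Hypothetical inlet: 40 L/min through a 20 mm tube, 30 um particle of density 1050 kg/m3
re = reynolds_number(flow_l_per_min=40.0, tube_diameter_m=0.02)
stk = stokes_number(30.0, 1050.0, flow_l_per_min=40.0, tube_diameter_m=0.02)
print(f"Re = {re:.0f}, Stk = {stk:.2f}")
```

Reporting both numbers over the full size range of interest, rather than for a single diameter, gives a clearer picture of where inertial losses in the inlet and tubing are likely.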

2.8 Device comparability

An essential aspect for instrument manufacturers and particularly for monitoring networks is device comparability. It is important that devices of the same model or type produce statistically similar results to ensure compatibility and consistency. Standardisation methods for networks, based, for example, on procedures used in meteorology, are currently being developed for automatic pollen and fungal spore monitoring networks. Standardised laboratory calibrations should first be performed at the end of the production process, ideally by an accredited organisation. Transfer standards should then be used at regular intervals for instruments in the field to ensure comparability of results obtained across a network. Thorough laboratory calibrations should then be performed to sort out any issues identified using the transfer standards.

Recommendation: Instruments across a measurement network should be regularly calibrated with a transfer standard. Should issues be identified, the instrument should be brought to a laboratory for further tests and maintenance.

2.9 Operating time and temporal data recall

The operating time refers to the amount of time a particular instrument measures, while the temporal data recall is the fraction of data obtained from what theoretically should be available. Specific acceptance thresholds for the temporal recall are set by data users and it should be a mandatory feature of all instruments that all downtimes and incompletely observed time intervals are reported.

As a default, averaging should only be carried out over complete datasets. For instance, if an instrument delivers minute-level particle counts, hourly counts can be computed as an average of minute values only if all 60 values have been measured. This is because intermittent failures might indicate a systematic problem under certain conditions, such as saturation at high particle concentrations. As a result, missing values can correlate with particle concentrations or certain environmental conditions, thus biasing the distribution functions, mean values, peaks, etc. However, a temporal recall of less than 100% is often unavoidable, and in many fields averages are accepted as representative as long as two thirds of the data for each time interval are available.

Recommendation: Monitoring instruments should record their downtime and incomplete observation periods with the same frequency as the highest internal temporal resolution.

Recommendation: The default requirement to calculate average values is to have 100% of all data for that particular time period, although it is acceptable to calculate values when more than two thirds of the data for each time interval are available. However, all basic-level data should be reported and stored because some users can safely use the datasets with incomplete time series.
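The averaging rule in this recommendation can be sketched as follows. The function name and return structure are hypothetical; missing values are represented by `None`, and the "more than two thirds" criterion is evaluated with integer arithmetic to avoid floating-point edge cases.

```python
from typing import Optional, Sequence, Tuple


def interval_mean(values: Sequence[Optional[float]],
                  expected: int) -> Tuple[Optional[float], float, bool]:
    """Return (mean, temporal recall, complete flag) for one averaging interval.

    The mean is None unless more than two thirds of the expected values exist.
    """
    valid = [v for v in values if v is not None]
    recall = len(valid) / expected
    complete = len(valid) == expected
    if 3 * len(valid) > 2 * expected:  # strictly more than two thirds available
        return sum(valid) / len(valid), recall, complete
    return None, recall, complete


# Minute-level values aggregated to one hour (60 values expected)
full = interval_mean([10.0] * 60, expected=60)
partial = interval_mean([10.0] * 45 + [None] * 15, expected=60)
sparse = interval_mean([10.0] * 40 + [None] * 20, expected=60)

print(full)     # (10.0, 1.0, True)
print(partial)  # (10.0, 0.75, False)
print(sparse[0], sparse[2])  # None False
```

Returning the recall and the completeness flag alongside the mean lets data users apply their own, possibly stricter, acceptance threshold.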

2.10 Metadata

Table 1 outlines a minimum list of metadata to report with each concentration data point. These metadata are essential to accurately calculate the final pollen or fungal spore concentrations, to reanalyse or reconstruct timeseries at a later date, and to assess the quality and conditions of the measurement.

Considering the de facto standards currently used for operational atmospheric monitoring, the decision was made to use the EBAS-NASA-AMES air quality file format. These text files contain a header with metadata as well as a data section. This allows a certain flexibility in terms of the file content, which is particularly important for bioaerosol observations. Through the Copernicus Atmosphere Monitoring Service (CAMS) project CAMS_23, EUMETNET AutoPollen has developed links with the EBAS database and portal (http://ebas.nilu.no), which was started by the European Monitoring and Evaluation Programme (EMEP) more than 20 years ago and is operated by the Norwegian Institute for Air Research (NILU). EBAS has sustainable funding from EMEP and the portal is used by a wide range of stakeholders related to atmospheric monitoring, including the WMO Global Atmosphere Watch programme and the Arctic Monitoring and Assessment Programme, as well as by a variety of European research projects. The database contains both real-time and validated data and the portal has established access control procedures that allow selective restrictions on data distribution policies.

Recommendation: All raw data should be stored in case future changes in algorithms require a reanalysis of an entire timeseries to ensure homogeneity.

Recommendation: All data being submitted to the EBAS database should be reported following the EUMETNET AutoPollen data format and protocol (see the supplementary material).

3 Identification algorithms and their development

The main difference between manual pollen and fungal spore identification methods and automatic ones is that computer algorithms are used to identify particles rather than humans with a microscope. These algorithms are, to date, largely based on various machine learning methods, whether they be applied to images or other data signals (Crouzy et al., 2016; Sauliene et al., 2019; Oteros et al., 2019; Sauvageat et al., 2020; Daunys et al., 2021; Schaefer et al., 2021). A range of different techniques exist to develop training datasets as well as to evaluate the end results, which are further described below.

3.1 Datasets for training identification algorithms

Supervised algorithms used to identify pollen and fungal spores require datasets upon which they can be trained. Training datasets are produced in two main ways: either by exposing the measurement device to aerosolised pollen or fungal spore particles of a known taxon or, for image-based devices, by manual selection from a timeseries of environmental observations. Pollen and fungal spores can either be collected fresh from the respective plant, fungus, or substrate, or purchased from commercial suppliers, in which case they are dried and may be coated with various chemical substances for storage, which may in turn affect the physical or optical properties of the particles.

Two commercially available devices exist to aerosolise pollen particles, the Swisens Atomizer and the Pollen Dispersal Unit (PDU; Zuberbier et al., 2017). To date, neither has been tested under laboratory conditions to estimate the number of particles they produce and whether this is consistent in time. The PDU has also only been tested with dry pollen.

No matter what method is used, it is important that the sample is as clean as possible (i.e. ideally only unbroken particles of a particular pollen or fungal spore taxon that arrive one at a time) and that the flow used to disperse them is as steady as possible. It is unclear at present how much influence the relative humidity has on airborne pollen and fungal spores, and, in particular, whether this affects algorithm performance. Basic tests performed outdoors at near-zero temperature with 100% relative humidity showed quite dramatic differences from laboratory conditions, but limitations of the experiment do not allow for far-reaching conclusions. Likewise, little is known about operation in very dry or hot and humid conditions. This issue is discussed in more detail in Sect. 3.3.

When developing identification algorithms, it is important that a large enough number of events, or data points, are used to train the model for each taxon. At least 5′000 events per taxon should be used, but it may be helpful to use as many as 10′000 depending on how difficult the species, genus, or family is to identify. This also depends on the other confounding particles present in the sample. It should be noted that these numbers refer to events with a signal of sufficient quality to enable identification. Furthermore, it is important that training datasets are made up of samples from a number of different days or plants. This is to ensure that the algorithm is trained on particles under a variety of different conditions (e.g. different relative humidity, etc.) and to reflect biological variability.

Recommendation: For supervised learning algorithms, data from at least 5′000 particles with a signal of sufficient quality should be obtained for a single type of pollen or fungal spore. Ideally, these data points should be obtained from particles in different environmental conditions and from different individual sources (to reflect biological variability). A similar number of data points for each taxon should be used when training an algorithm.
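As an illustration of this recommendation, the following minimal Python sketch checks a list of training labels against the 5′000-event minimum and flags strong imbalances between taxa. The function name `check_training_counts` and the tolerance `max_ratio` are assumptions made here for illustration; the text does not define a numerical threshold for what counts as "a similar number" of data points.

```python
from collections import Counter

MIN_EVENTS = 5000  # minimum number of good-quality events per taxon (from the text)


def check_training_counts(labels, min_events=MIN_EVENTS, max_ratio=2.0):
    """Summarise how many training events are available per taxon.

    labels     : iterable with one taxon name per usable training event
    max_ratio  : hypothetical tolerance; the dataset is called balanced when
                 the largest class is at most max_ratio times the smallest
    """
    counts = Counter(labels)
    # Taxa that do not reach the recommended minimum number of events
    too_small = sorted(t for t, n in counts.items() if n < min_events)
    # Simple balance check between the largest and smallest class
    balanced = bool(counts) and max(counts.values()) <= max_ratio * min(counts.values())
    return counts, too_small, balanced


# Illustrative (made-up) label list
labels = ["Betula"] * 6000 + ["Alnus"] * 5200 + ["Corylus"] * 1800
counts, too_small, balanced = check_training_counts(labels)
print(too_small)  # ['Corylus']
print(balanced)   # False
```

Such a check only counts events; whether each event has a signal of sufficient quality, and whether the events stem from different days and sources, has to be established separately.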

It is important to note that supervised identification algorithms have been developed to identify only what they have been trained to identify. Often there is no category for particles that are identified as pollen or a fungal spore but are of an unknown type. This may be a useful category to include in algorithms to avoid the algorithm being forced to allocate a label to a particle, even if generally the certainty associated with such a label will be low. Likewise, identification algorithms should be trained on datasets that include pollen or fungal spores from a range of different atmospheric conditions since this is known to affect their physical and possibly even chemical properties (see Sect. 3.3). Finally, it is important that all algorithms function using only the measurements and do not make use of pollen calendars or other techniques that specify when particular taxa may or may not be detected. This ensures that neophytes or any unusual or extreme transport events are not ignored.
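One possible way to implement such an "unknown" category is a reject option: if the highest class probability returned by the classifier falls below a threshold, the particle is labelled as unknown instead of being forced into a trained class. The sketch below assumes the classifier outputs one probability per taxon; the function name `assign_label` and the threshold of 0.8 are hypothetical choices, not values taken from any particular instrument or algorithm.

```python
def assign_label(probabilities, threshold=0.8, unknown="unknown"):
    """Return the most probable taxon, or `unknown` if the classifier is unsure.

    probabilities : dict mapping taxon name -> class probability
    threshold     : hypothetical minimum probability required to accept a label
    """
    taxon, p = max(probabilities.items(), key=lambda kv: kv[1])
    return taxon if p >= threshold else unknown


print(assign_label({"Betula": 0.95, "Alnus": 0.05}))  # Betula
print(assign_label({"Betula": 0.55, "Alnus": 0.45}))  # unknown
```

Note that the decision uses only the measurement-derived probabilities, in line with the requirement that algorithms do not rely on pollen calendars or similar prior information.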

3.2 Assessing algorithm performance

There are many ways to assess the performance of the identification algorithms applied for pollen and fungal spore monitoring. Essentially two aspects need to be evaluated, the counting efficiency and the identification. The former is an issue related to the measurement technique itself and is covered in Sect. 2. The latter is related to how well the identification algorithm performs. Here, two relatively simple examples are presented that can be used to easily compare results from various algorithms, whether they be applied to the same instrument or from different instruments.

3.2.1 Recall, precision, and the F-score

For each taxon that an algorithm identifies, it is important to determine the number of true positives (TP; particles of the taxon of interest that are correctly labelled as such), false positives (FP; other particles falsely labelled as the taxon of interest), false negatives (FN; particles of the taxon of interest that are not labelled as such), and true negatives (TN; other particles that are correctly not labelled as the taxon of interest). Three indices can be derived from these values:

The recall is the fraction of the sampled particles of the taxon of interest that are correctly identified:

$$\text{Recall} = \frac{\text{TP}}{\text{TP} + \text{FN}}$$

The precision is the proportion of particles labelled as the taxon of interest that are correctly classified:

$$\text{Precision} = \frac{\text{TP}}{\text{TP} + \text{FP}}$$

When assessing the quality of recognition, one can thus calculate an overall score which takes into account both the precision and recall, the F-score:

$$\text{F-score} = \frac{2 \times \text{Recall} \times \text{Precision}}{\text{Recall} + \text{Precision}}$$

It is important that the F-score is taken into account since if the precision is optimised the recall is often reduced, or vice versa. The maximum possible F-score is 1.0 and we recommend aiming for values above 0.9. A further aspect to take into account is that a similar number of events for each taxon should be used when calculating precision, recall, and F-score values for different algorithms.

Recommendation: F-score values should be above 0.9, where possible, for all pollen or fungal spore taxa identified by a particular algorithm.
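The three indices can be computed directly from the TP, FP, and FN counts of a given taxon. The following is a minimal Python sketch; the function names and the example counts are illustrative only, and the zero-division guards are a convention chosen here, not part of the definitions.

```python
def recall(tp, fn):
    """Fraction of particles of the taxon of interest that were found: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(tp, fp):
    """Fraction of particles labelled as the taxon that truly belong to it: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def f_score(tp, fp, fn):
    """Harmonic mean of recall and precision: 2 * R * P / (R + P)."""
    r, p = recall(tp, fn), precision(tp, fp)
    return 2 * r * p / (r + p) if (r + p) else 0.0


# Made-up counts for one taxon
tp, fp, fn = 90, 30, 10
print(round(recall(tp, fn), 3))       # 0.9
print(round(precision(tp, fp), 3))    # 0.75
print(round(f_score(tp, fp, fn), 3))  # 0.818
```

In this made-up example the recall is high but the F-score stays below the recommended 0.9 because of the lower precision, which illustrates why neither index should be optimised in isolation.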

3.2.2 Confusion matrices

Confusion matrices are commonly used to determine how well a particular algorithm can identify all the taxa of interest simultaneously. They also provide information about which taxa tend to be confused with one another, e.g. Alnus identified as Betula, hence the name confusion matrix. These tables are a simple output of any algorithm training, in which the number of correctly and incorrectly labelled particles of each type is recorded. To ensure a fair comparison of matrices between algorithms, it is essential that the same number of taxa is included in each algorithm, ideally also exactly the same taxa. This is because the larger the number of taxa identified, the more challenging the identification, and thus the more likely it is that the results will be poorer than those of an algorithm identifying a smaller number of taxa.

Recommendation: The list of particles identified by a particular algorithm should be included in the description, i.e., it is important to know for what the algorithm was trained. Confusion matrices can provide useful information about how a particular algorithm performs and, if the same list and number of taxa are identified, also to compare different algorithms.
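To make the structure of a confusion matrix concrete, the sketch below builds one from lists of true and predicted labels and extracts the TP, FP, and FN counts for a single taxon. It assumes the convention that rows are true taxa and columns are predicted taxa (other conventions exist); the function names and the example labels are illustrative only.

```python
def confusion_matrix(true_labels, predicted_labels, taxa):
    """Build a matrix where entry [i][j] counts particles of true taxon i labelled as taxon j."""
    index = {t: i for i, t in enumerate(taxa)}
    matrix = [[0] * len(taxa) for _ in taxa]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


def per_taxon_counts(matrix, k):
    """Return (TP, FP, FN) for the taxon at row/column k."""
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                  # rest of the row: missed particles
    fp = sum(row[k] for row in matrix) - tp   # rest of the column: false alarms
    return tp, fp, fn


# Made-up example with two taxa
taxa = ["Alnus", "Betula"]
true_labels = ["Alnus", "Alnus", "Alnus", "Betula", "Betula"]
predicted   = ["Alnus", "Betula", "Alnus", "Betula", "Betula"]
matrix = confusion_matrix(true_labels, predicted, taxa)
print(matrix)                       # [[2, 1], [0, 2]]
print(per_taxon_counts(matrix, 0))  # (2, 0, 1) for Alnus
print(per_taxon_counts(matrix, 1))  # (2, 1, 0) for Betula
```

The off-diagonal entry in the first row shows one Alnus particle labelled as Betula, which is exactly the kind of confusion the matrix is meant to reveal.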

3.3 Evaluating identification in real environmental conditions

It is essential that devices are also tested and evaluated under real environmental conditions (ideally several different environments) to ensure that the instruments and the classification algorithms perform well outside of controlled laboratories. Atmospheric aerosols are composed of a huge range of particle types, many of which are present in concentrations much higher than the pollen or fungal spores of interest. Identification algorithms need to be able to deal with such “interference” and ensure that performance is not detrimentally affected. Furthermore, different meteorological conditions can also affect pollen and fungal spore particles, e.g. humidity can impact pollen size and shape, and it is important that this does not significantly affect instrument performance. This needs to be taken into account both when producing training data for the classification algorithms and when evaluating instruments in outdoor conditions. For example, if an identification algorithm was trained only on dry pollen but the instrument is placed in very humid atmospheric conditions, it is likely that the system will not perform as well. This is only a problem for instruments based on airflow cytometry. Devices where the particles are collected on a substrate and rehydrated, so that the pollen grains revert to their characteristic shape, are not affected by this issue (Oteros et al., 2019). However, contamination by small particles sticking to the surface of the pollen grains or fungal spores in a dirty atmosphere remains equally important for all instruments.

Performance assessment should be carried out through a statistical comparison with standard manual measurements (EN16868:2020), keeping in mind the uncertainties from which this traditional method suffers (e.g. Adamov et al., 2021). Since pollen data do not usually follow normal distributions, it is important that nonparametric statistical methods are used. Timeseries including peaks, as well as the start and end of the pollen season, need to be compared, ideally over more than two pollen seasons, particularly for masting trees such as birch or beech.

Recommendation: To accurately assess algorithms under environmental conditions and highlight seasonal factors, at least two years of data should be available from co-located manual and automatic monitors.
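One commonly used nonparametric measure for comparing co-located manual and automatic timeseries is the Spearman rank correlation, which is sketched below using only the Python standard library. The choice of Spearman's coefficient is an assumption made here for illustration (the text only requires that nonparametric methods be used), and the daily concentration values are made up.

```python
def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = mean_rank
        i = j + 1
    return r


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (vx * vy) ** 0.5


# Made-up daily mean concentrations (grains per cubic metre)
manual    = [0, 12, 150, 40, 5]
automatic = [2, 10, 90, 55, 4]
print(spearman(manual, automatic))  # 1.0 (identical ordering of days)
```

A rank correlation only captures whether both methods order the days in the same way; differences in absolute concentration, peak height, and season start and end dates need to be assessed separately.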

Ultimately, it would be useful to have one standard algorithm used for each instrument type. This would ensure consistency across all instruments of one particular type but would require that the algorithms are developed to be optimal in a wide variety of environments. This, in turn, would involve using training data from a number of areas for each pollen or fungal spore taxon and, as already mentioned, from different atmospheric conditions. This would, ideally, present a more complete picture of ambient environments across different regions. To date, most algorithms have been developed only to meet specific regional conditions and thus cannot easily be applied in very different conditions without being retrained on local pollen and fungal spore types. Once such “standard” algorithms have been chosen, a mechanism will also need to be developed to evaluate any new, potentially better, algorithm and to decide whether it should supersede the older version. This remains a goal that will need to be further researched and established as technologies and methods are refined and improved.

4 Recommendation: Raw data should be archived in the long term to allow reanalysis with improved algorithms and comparisons across different regions.

4.1 Estimating uncertainty

There are two main sources of uncertainty for each concentration value that is produced by a particular measurement system, namely the uncertainty related to the measurement and that stemming from the identification algorithm. When estimating uncertainty of the automatic method one can follow similar methods to those applied as part of the standard for manual pollen and fungal spore observations (EN16868:2020), but adapted for automatic technologies.

A number of factors influence the measurement uncertainty, for example instrument maximum detection limits (saturation at high concentrations), imperfection or lack of calibration techniques for measuring particles larger than 20 microns, aggregation of particles (pollen or fungal spores that may be stuck together or other debris stuck to them), and particle deposition in the instrument and later release of these particles (the so-called “memory effect”), which could result in correct identification but at the wrong time of the day or season.

For both manual and automatic methods, one needs to consider flow variability in the uncertainty estimate (Oteros et al., 2017). While for the manual method the adhesive medium affects sampling efficiency (Comtois and Mandrioli 1997; Galan and Dominguez, 1997), for automatic measurements the limiting factor can be the efficiency in recording signals (e.g. image clarity or fluorescence) of sufficient quality. Automation limits intra-observer variation during identification and counting, but the reproducibility, accuracy, sensitivity, and specificity of measurements, as defined in point 7.3 of the manual standard (EN16868:2019), remain the major sources of uncertainty for automatic methods.

The second major source of uncertainty is related to the identification algorithms, in particular when there are large differences between the datasets used to train the algorithm and the real atmospheric composition that an instrument is exposed to. In addition, it matters which part of the training dataset is selected for validation and which procedures are used for cross-validation. Each dataset used to train an algorithm is generally split into two parts, with the training part usually comprising approximately 80% of the dataset and the validation part the remaining 20%. Ideally, an algorithm should be trained on datasets that stem from more than just one source and time period. This ensures the algorithm is not specific to particular environmental conditions or a certain biological specimen. The validation portion of the dataset is used to choose the optimal model configuration and to check that the algorithm is not over-trained (i.e. too specific). However, there is always the possibility of obtaining sub-optimal results due to poor feature engineering, the removal of abnormal values from the training dataset, and imbalances between the dataset sizes of the different pollen or fungal spore taxa that are used for training. A number of other characteristics (e.g. data shuffling and the “initial seed” value used in training, which fixes the randomisation so that predicted results are reproducible) may also be additional sources of uncertainty.
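The split described above can be sketched as follows. This minimal example performs an approximately 80/20 split separately within each taxon (a stratified split, which is an assumption made here to keep class proportions equal in both parts) and uses a fixed seed so that the shuffle is reproducible. The function name, the default seed, and the example data are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(events, labels, validation_fraction=0.2, seed=42):
    """Split (event, label) pairs into training and validation parts per taxon.

    seed fixes the shuffle so that the same split is obtained on every run.
    """
    rng = random.Random(seed)
    by_taxon = defaultdict(list)
    for event, label in zip(events, labels):
        by_taxon[label].append(event)

    train, validation = [], []
    for taxon in sorted(by_taxon):  # sorted for a deterministic processing order
        items = by_taxon[taxon][:]
        rng.shuffle(items)
        n_val = round(len(items) * validation_fraction)
        validation += [(e, taxon) for e in items[:n_val]]
        train += [(e, taxon) for e in items[n_val:]]
    return train, validation


# Made-up example: 100 event identifiers, two taxa with 50 events each
events = list(range(100))
labels = ["Betula"] * 50 + ["Alnus"] * 50
train, validation = stratified_split(events, labels)
print(len(train), len(validation))  # 80 20
```

This sketch stratifies by taxon only. Ensuring that both parts also cover several sources and time periods, as recommended above, would require grouping the events by sampling day or specimen before splitting.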

Recommendation: Uncertainty estimates should take into account uncertainty from both the measurements (e.g. signal saturation) and the algorithms (e.g. from imbalances between training datasets used). All sources of uncertainty should be reported together with the final cumulative uncertainty estimate.
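The text does not prescribe how the individual sources should be combined into a cumulative estimate. One widely used convention, assumed here purely for illustration, is to combine independent relative uncertainties in quadrature (root sum of squares). The component names and values in the sketch below are made up and do not describe any real instrument.

```python
import math


def combined_relative_uncertainty(components):
    """Combine independent relative uncertainties in quadrature.

    components : dict mapping source name -> relative uncertainty (e.g. 0.03 = 3%)
    Assumption: the sources are independent; correlated sources need covariance terms.
    """
    return math.sqrt(sum(u ** 2 for u in components.values()))


# Made-up relative uncertainties for one concentration value
components = {
    "flow variability": 0.03,           # measurement-related
    "identification algorithm": 0.04,   # algorithm-related
}
total = combined_relative_uncertainty(components)
for name, u in components.items():
    print(f"{name}: {u:.1%}")
print(f"combined: {total:.1%}")
```

Reporting the individual components alongside the combined value, as in the printout, follows the recommendation that all sources of uncertainty be reported together with the cumulative estimate.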

5 Site selection criteria

A similar set of criteria to manual monitoring needs to be applied when choosing an adequate site to place an automatic instrument. This section outlines a number of factors which should be taken into consideration.

5.1 Representativity

A typical variable quantifying representativity is the correlation radius, if a circular symmetry or, more generally, a 2D structure function can be assumed around the site (Siljamo et al., 2008). The acceptable level of representativity of a particular site is, however, specific for each application. In terms of pollen and fungal spore monitoring the representativity of a station is dependent on several aspects, including the local distribution of particle sources, biogeographical zone, local climate, regional topography, and surrounding land-cover types. Representativeness is a function of temporal averaging: for shorter averaging intervals it is usually much lower. Previous research has shown that measurement sites are influenced by a surrounding area with the radius depending on these variables as well as the aerodynamic morphology of the pollen or fungal spore of interest. For example, Oteros et al. (2015) showed that for olive pollen from plantations on the Iberian Peninsula the distance was at least 25 km for daily averages. The actual area influencing a site should be fully assessed, for example, with modelling studies to understand the station footprint (e.g. Sofiev et al., 2013, 2015; the footprint being the area comprising all sources affecting the particular parameter measured at a particular site) and/or with data from land or forestry services.

5.2 Historical/current pollen or fungal spore monitoring site

If a current pollen or fungal spore monitoring site meets the majority of the requirements set out here, then automatic monitors should also be installed at the same site. This is particularly of relevance to ensure continuity of long timeseries. In this regard, a long enough period of overlapping measurements should be allowed to ensure that homogenisation of the record is possible. This should be at the very least two pollen seasons, preferably 5 years (Galan et al., 2017).

5.3 Proximity to local particle sources

The site should not be located directly next to major particle sources (stationary or mobile, biological or other) such as heating chimneys, major roads, waste burning plants, and shopping centres. Ideally, the site should also be more than 100 m away from any large pollen sources, such as forests. Where any sources are present, a map of the station surroundings indicating their location should be provided.

5.4 Surrounding land-use and vegetation

The area surrounding the monitoring site should be well-mapped in terms of two aspects: vegetation and land-use. This should take a similar form as described in Saar and Meltsov (2011), with three levels of complexity including the immediate surroundings (within 100 m of the site), adjacent surroundings (within 1 km), and distant surroundings (within 10 km). Maps of the surrounding plant communities as well as the estimated coverage of wind-pollinated plants and water surfaces should be provided for both the immediate and adjacent areas (up to 1 km distance from the site). For more distant surroundings, a more generic land-use cover map can be used to describe the diversity of land coverage up to 10 km from the site. These maps should be updated at least every five years to take into account any land-cover or vegetation changes that may occur over time.
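The three mapping levels can be expressed as simple distance classes. The following sketch assigns a mapped feature to one of them based on its distance from the site; the function name and the returned level names are chosen here for illustration, while the distance limits are those given in the text.

```python
def surroundings_level(distance_m):
    """Return the mapping level for a feature at `distance_m` metres from the site."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    if distance_m <= 100:
        return "immediate"   # detailed vegetation map required
    if distance_m <= 1_000:
        return "adjacent"    # detailed vegetation map required
    if distance_m <= 10_000:
        return "distant"     # generic land-use cover map is sufficient
    return None              # outside the area to be mapped


for d in (50, 400, 7_500, 25_000):
    print(d, surroundings_level(d))
```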

When choosing a site location, future possible land-use changes in the area around the station also need to be considered. For example, sites should be avoided where planned urban development may interfere with measurements. It is a good idea to choose a place in accordance with the zoning plan of the site and to consult with the local government in terms of site choice.

5.5 Microscale location

In typical monitoring networks, the pollen/fungal spore sampler should be placed on an easily accessible, flat, horizontal surface, ideally on the roof of a building. The instrument should be located at least 2 m away from the edge of the building and the inlet elevated above the roof (by at least 1.5 m) to avoid the effects of turbulence (Galán et al., 2014). The required height of the roof depends on the surrounding obstacles (buildings, trees, etc.) and topography, but should ideally be 10 m above the surrounding ground (Rojo et al., 2019). The sampler should be well away from all possible obstructions, such as trees, other buildings, walls, and solar panels. Ideally, each such obstacle should be at a distance from the site of at least four times its height and, at a minimum, at least two times its height. Where possible, surrounding trees and other vegetation should be well maintained so that they affect the sampling site as little as possible. While not currently practical given the size of most automatic instruments, another option is to install the device on a mast, away from all obstacles, as is done, for example, for wind measurements. In all cases, the device should be securely attached to withstand wind storms or other bad weather conditions and should also be protected against lightning strikes.
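The numerical siting criteria in this section lend themselves to a simple automated check, sketched below. The function name, the input format, and the wording of the messages are hypothetical; the thresholds (2 m from the roof edge, inlet 1.5 m above the roof, obstacles at two to four times their height) are those stated in the text, and obstacle heights are assumed to be given as the text states them, without correcting for the height of the sampler itself.

```python
def check_microscale(edge_distance_m, inlet_height_m, obstacles):
    """Return a list of siting issues (empty if all criteria are met).

    edge_distance_m : distance of the instrument from the roof edge
    inlet_height_m  : height of the inlet above the roof surface
    obstacles       : list of (name, height_m, distance_m) tuples
    """
    issues = []
    if edge_distance_m < 2.0:
        issues.append("instrument is less than 2 m from the roof edge")
    if inlet_height_m < 1.5:
        issues.append("inlet is less than 1.5 m above the roof")
    for name, height_m, distance_m in obstacles:
        if distance_m < 2 * height_m:
            issues.append(f"{name}: closer than the minimum of 2x its height")
        elif distance_m < 4 * height_m:
            issues.append(f"{name}: closer than the recommended 4x its height")
    return issues


# Made-up site description
issues = check_microscale(
    edge_distance_m=1.0,
    inlet_height_m=1.0,
    obstacles=[("wall", 5.0, 8.0), ("tree", 10.0, 30.0)],
)
for issue in issues:
    print(issue)
```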

5.6 Accessibility and available infrastructure

Ideally, sites should be located on land where long-term monitoring can be ensured, i.e. that there will be little chance of having to relocate. Each site needs to be permanently and easily accessible for any personnel who may have to carry out any maintenance or other inspections. Furthermore, the site should also provide adequate facilities such as electricity (important to ensure an uninterrupted power supply), Internet access (for data transfer), and security measures against interference or theft as well as to ensure operator safety. Depending on the location, it may be useful to secure the device, e.g. with fencing.

For analysis purposes, it is very useful to have a weather station at each measurement site. The station should at a minimum measure temperature, relative humidity, wind speed and direction, as well as precipitation. An alternative is to use data from a nearby station, particularly if that station is a WMO one and is within a few hundred metres.

5.7 Station logbook

Each site should maintain a station logbook to record and document station operations, particularly any events or observations that may affect the measurements or procedures (see also Sect. 2.10). This information may be especially important at a later stage for data validation and/or reanalyses.

6 Conclusions

This paper has set out a number of guidelines and best practices that can be applied to automatic observations of pollen and fungal spores at high temporal resolution (up to hourly). Although the field is very dynamic and new developments are continuously being made, it is vital that all aspects of the information chain are standardised, from the initial measurements through to the development of particle identification algorithms and their evaluation. As a more long-term goal, a European standard for automatic monitors should be developed. Activities in this direction have already begun, with a CEN technical specification currently in preparation. Furthermore, device certification should, in future, become the norm for all automatic pollen and fungal spore monitors, just as is the case for air quality monitors. This would greatly facilitate potential synergies between bioaerosol and air quality measurements, potentially allowing pollen and fungal spore monitors to be used for both purposes.

To ensure quality across the growing European network of automatic pollen and fungal spore monitoring sites, it is important that the guidelines described here are consistently applied. The EUMETNET AutoPollen Programme is working towards developing a quality seal for sites and networks that will contribute towards this goal, particularly during the interim while no official European standard for automatic pollen and fungal spore monitoring exists. Additionally, the community will need to focus efforts on better characterising measurement systems as a whole, for example by testing them under different wind speeds, increasing the particle size range over which reference laboratory calibrations can be made, or better understanding air inlets and their impacts on sampling efficiency. Such endeavours will further help to improve measurement techniques and reduce overall uncertainty across the European automatic monitoring network as well as other sites and networks globally.