Introduction

One of the most pressing questions of our time is how to stop the widely documented loss of biodiversity (Hallmann et al. 2017; Cowie et al. 2022). To tackle this problem effectively, we need information on the occurrence and abundance of species, on how both change over time, and on the drivers behind these changes. Only with such information can we protect species effectively. In recent decades, Citizen Science projects, i.e. the collection of data by volunteers, have proven to be an excellent way of generating such occurrence and abundance data as a basis for further analyses (Schmeller et al. 2009; Chandler et al. 2017). This is particularly true for popular groups such as birds or butterflies, especially if many of their common species are easy to identify.

Along with the growing popularity of Citizen Science (CS) projects, the technical possibilities for recording, processing and analysing the resulting data are constantly increasing. While 15‒20 years ago data were mostly recorded in the traditional way, by noting observations with paper and pencil, increasingly modern recording tools have emerged since then. They range from websites with corresponding data entry sheets and simple recording apps to advanced apps with integrated AI-based image recognition, species identification and geolocation of the individuals. Progress in this field is fast and the resulting possibilities are inspiring (Wäldchen et al. 2018; Høye et al. 2021; Mäder et al. 2021). Data management has evolved alongside: in recent decades, for example, records have moved from index cards via Excel spreadsheets to powerful databases.

At the same time, however, there is a completely different development in the field of Citizen Science, which at first glance is independent of the technological development of data recording. The older generation, i.e. the many people with good knowledge of species, some of whom have even specialised in specific species groups, is so far not being adequately replaced by the next generation. For example, the median age of participants of “Butterfly Monitoring Germany” is 62, with only few young people participating (Kühn et al. 2022a). The ability to identify animal and plant species in the field, or even to re-identify them with the help of identification books, has declined sharply within the younger generation; it has even been claimed that taxonomists are nearing extinction alongside the species they identify (Miller 2005; Hooykaas et al. 2019; Gerl et al. 2021; European Commission, Directorate-General for Environment 2022). This applies to professional as well as amateur species experts, both of whom have profound knowledge in special fields. Many of these “old-fashioned” species experts have not changed their data collection approaches for decades and hardly use modern technologies. This development goes along with the frequent opinion that “old-fashioned” data collection with pen and paper in the field is obsolete. As a result, it is increasingly difficult to make observations available to science when they were not collected by individuals using a mobile phone, laptop or computer.

It is often argued that Citizen Science projects should make a particular effort to attract young people, because at present many such projects mainly attract older people (Füchslin et al. 2019; Aristeidou et al. 2021; Kühn et al. 2022a). There may be several reasons for this. On the one hand, people at the beginning of their professional career and/or with small children have less time for volunteer activities, while retired people have more time available to participate in CS projects. On the other hand, “old-fashioned” data collection is not as attractive to young people as data collection via an app on a mobile phone, which may also allow species to be identified automatically via image recognition while providing a lot of additional information at the same time. It remains to be seen how successfully these modern methods of data collection attract more young people as well as less experienced groups.

A CS project in which both types of data recorders (i.e. traditional and modern ones) are active is Butterfly Monitoring Germany (Tagfalter-Monitoring Deutschland, TMD; www.tagfalter-monitoring.de). In this project, volunteer butterfly enthusiasts count butterflies several times a year and over many years along defined transect routes (Kühn et al. 2008). The project started in 2005, and participants have three options to submit their data. They can send their data to the central coordination (i) on paper by ordinary mail or (ii) by email as attachments in many formats (Excel files, pdf files, photos of completed data entry forms). These options are often used by “old-fashioned” recorders who prefer to count butterflies in the field with paper and pen. Such data are entered manually into the online data entry tool at the headquarters of TMD. (iii) The most commonly chosen option is that recorders enter their data themselves via the same online data entry tool that the central coordination uses (“online recorders”). The additional work of manually entering data collected using the “old-fashioned” method (sent by letter or email) is not only time-consuming and cost-intensive but also prone to transcription errors. Consequently, the project coordinators discuss every year whether this form of data entry is worth the effort. The advantage, however, is that once “old-fashioned” data have been manually transferred to the central database, all data are available in a standardised format.

Therefore, we sent a questionnaire to all recorders who do not enter their data themselves via the online data entry tool, asking them why they do not use this method. We also asked which hurdles make online data entry problematic. In addition, we asked what would need to be done (or implemented) to convince recorders who use the “old-fashioned” method to enter their data online.

In this publication, we address the following questions related to this topic.

Is the quality of data collected by “old-fashioned” data recorders the same as, better than or worse than that of data collected by “online recorders”? Specifically, we would like to know:

  1. Do “old-fashioned” recorders generate data that we cannot get via online data entry?

  2. Is “old-fashioned” data collection outdated and can we do without it in favour of online data collection on computers or mobile phones?

  3. Can we derive recommendations for the development of online data collection tools that are used by as many participants in Citizen Science projects as possible?

Methods

Background

Butterfly Monitoring Germany is a Citizen Science project in which volunteers participate all over Germany. Anyone interested in counting butterflies can join, and transect walkers freely choose the site of their butterfly transect. Most transect walkers decide to walk a transect near their home (Kühn et al. 2008). The counting methodology is highly standardised, and counts are performed each year from April to September, ideally once a week (Kühn et al. 2014). Transect walkers enter their results for each transect walk on recording sheets, noting the butterfly species they saw and the number of individuals per species (Fig. 1). Later, these results can be entered online via the project's data entry tool. Alternatively, the recording sheets can be sent to the project's central coordination via email or letter, and the data are entered by project staff into a central database. The data are then available in a uniformly standardised format.

Fig. 1 Recording sheet from an “old-fashioned” transect walker, filled in with a pen on paper and sent to the coordination of Butterfly Monitoring Germany via letter

For this paper, we used data from Butterfly Monitoring Germany from the year 2021. In that year, 593 transects were walked by 367 active transect walkers, many of whom counted butterflies on more than one transect. In total, 151 butterfly and burnet species and 77 (mostly day-active) moth species were counted. We used Reinhardt and Bolz (2011) to identify red-listed species, considering all species listed in categories 1‒3 (i.e. threatened with extinction, highly endangered and endangered) as red-listed and all others as not endangered.

Comparison of data collected by different groups of data entry

To answer the above-mentioned questions, we compare the data of “old-fashioned recorders”, who do not enter their data online themselves, with those of the “online recorders” who enter their data themselves via our online data entry tool. The “old-fashioned recorders” are divided into those who send us their data via email and those who send us their data via letter. We assume that the participants sending their data by email would theoretically also have the possibility to enter data online themselves. The participants sending their data by letter will most likely not be able to enter data online themselves in the future.

The groups are abbreviated as follows: group “online” = Data entry online, group “email” = Data submission via email, group “letter” = Data submission via letter.

Questionnaire “old fashioned” data entry

In the context of this project, a survey consisting of three multiple-choice questions was developed (full questionnaire as supplementary material S4 [English] and S5 [German]). Here, we analyse the answers to questions 2 and 3, as these are relevant to the content of this paper. A total of 93 questionnaires were sent to all transect walkers who submitted their data for the year 2021 via letter or email (i.e. who did not enter their data themselves). They were asked to answer the questions and to return the questionnaire using the enclosed stamped envelope.

Statistical analyses

All statistical analyses were performed in R (R Core Team 2022) v. 4.2.2 with RStudio (RStudio Team 2023) v. 2023.09.1+494 as user interface.

We compared recorder activity, the number of individuals and the species richness of all butterflies and moths (jointly) and of those on the Red List (Reinhardt and Bolz 2011; butterflies only), respectively, using analysis of variance based on a glm with quasipoisson distribution to account for the, in some cases substantial, overdispersion. In case of significant differences (at α = 0.05), we performed Tukey’s all-pair comparisons post-hoc test using the function glht() in the package multcomp (Hothorn et al. 2008; Bretz et al. 2010).
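
As a minimal illustration of why a quasipoisson family is useful here, the following sketch estimates the Pearson dispersion statistic for counts grouped by type of data entry. It is written in Python rather than the R used for the actual analyses, and the function name pearson_dispersion and the example counts are hypothetical. It relies on the fact that in a Poisson model with a single grouping factor the fitted values equal the group means; values well above 1 would point to overdispersion.

```python
def pearson_dispersion(counts, groups):
    """Pearson chi-square divided by residual degrees of freedom
    for a Poisson model with one grouping factor (fitted value = group mean)."""
    by_group = {}
    for y, g in zip(counts, groups):
        by_group.setdefault(g, []).append(y)

    chi2 = 0.0
    for values in by_group.values():
        mu = sum(values) / len(values)  # fitted mean for this group
        if mu > 0:  # a group of all zeros contributes nothing
            chi2 += sum((y - mu) ** 2 / mu for y in values)

    residual_df = len(counts) - len(by_group)
    return chi2 / residual_df


# Made-up numbers of walks per volunteer, for illustration only
walks = [10, 30, 20, 5, 35, 20]
groups = ["online", "online", "online", "email", "email", "email"]

phi = pearson_dispersion(walks, groups)
print(phi)  # 8.125
```

With these made-up counts the statistic is far above 1, which is the situation in which a quasipoisson model yields wider standard errors than a plain Poisson model.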

We identified associations between species and recorder groups using the R package indicspecies v. 1.7.13 (Cáceres and Legendre 2009), using a multilevel pattern analysis (function multipatt()). Here, three metrics are reported: A, B, and stat. A is the mean abundance of the species in the target site group divided by the sum of the mean abundance values over all groups. B is the relative frequency of occurrence of the species inside the target site group. Stat refers to the statistic of association (a special form of correlation coefficient; for details see Cáceres and Legendre (2009)). Proportions of transects with these indicator species present were then compared in a mixed-effects ANOVA using aov() with species identity as a random factor. These two analyses were restricted to butterflies only.
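
The definitions of A and B above can be sketched in a few lines. The example below is in Python rather than R, uses made-up counts, and the function name indval_components is hypothetical. It also assumes that stat is the square root of A × B, as in the indicator value statistic; this assumption is not stated in the text. Unlike multipatt(), the sketch only considers single groups, not combinations of groups, and performs no permutation test.

```python
from math import sqrt


def indval_components(abundances, groups, target):
    """Return (A, B, stat) for one species and one target group.

    A: mean abundance in the target group divided by the sum of the
       group means over all groups.
    B: share of target-group transects on which the species occurs.
    stat: sqrt(A * B) (assumed form of the association statistic).
    """
    by_group = {}
    for value, group in zip(abundances, groups):
        by_group.setdefault(group, []).append(value)

    means = {g: sum(v) / len(v) for g, v in by_group.items()}
    total = sum(means.values())
    a = means[target] / total if total > 0 else 0.0

    target_values = by_group[target]
    b = sum(1 for v in target_values if v > 0) / len(target_values)
    return a, b, sqrt(a * b)


# Made-up counts of one species on six transects, for illustration only
counts = [0, 2, 0, 8, 6, 0]
groups = ["online", "online", "online", "email", "email", "letter"]

a, b, stat = indval_components(counts, groups, "email")
print(round(a, 3), b, round(stat, 3))
```

In this toy example the species is both concentrated in the “email” group (high A) and present on every “email” transect (B = 1), so it would be a strong candidate indicator for that group.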

All test statistics are provided as supplementary material in supplement S1 and S2.

Results

In 2021, butterflies were counted on 593 transects. The vast majority of these transect data were entered online; a little more than 15% of the data had to be entered manually by the project coordination (Fig. 2).

Fig. 2 Type of data entry in the data repository of Butterfly Monitoring Germany for the recording year 2021

To measure the effort invested in butterfly counting by transect walker groups, we first compared the number of transect surveys conducted per person per year in each group (Fig. 3). There were significant differences (p = 0.007) among the different groups of recorders. The post-hoc test yielded significantly more walks for the group “email” compared to “online” and marginally more walks than in group “letter”. The latter two did not differ significantly (see supplementary material S1 and S2 for a full overview of all relevant test statistics).

Fig. 3 Activity of volunteers (number of walks per volunteer) for the 2021 monitoring season. Significantly different groups (p < 0.05) are marked by different letters; marginally significantly different groups (here p = 0.08) are indicated by different capitalisation of the same letter

As a next step, we compared the number of butterfly species recorded in each group in 2021 (Fig. 4). The number of butterfly species recorded per transect did not differ significantly among groups (p = 0.19). Further, we compared the number of butterfly individuals recorded per year in each group in 2021 (Fig. 5), but again found no significant differences (p = 0.76).

Fig. 4 Number of butterfly species recorded per transect per group of data entry in 2021

Fig. 5 Number of butterfly individuals recorded per transect per group of data entry in 2021 (note the log scale on the y axis)

To assess differences in the value and ecological importance of the data, we compared the number of red-listed butterfly species (threat status 1‒3) recorded per year in each group (Fig. 6). We found only marginally significant differences among groups (p = 0.051), and the Tukey post-hoc test did not show any significant pairwise differences. Further, we compared the number of individuals of red-listed butterfly species recorded per year in each group (Fig. 7). The overall test yielded no significant differences (p = 0.41).

Fig. 6 Number of red-listed species recorded per transect per group of data entry (note the log scale on the y axis; therefore 1 needed to be added to species richness)

Fig. 7 Number of individuals of all nationally red-listed butterfly species per transect for the entire year 2021 per group of data entry (note the log scale on the y axis; therefore 1 needed to be added to species richness)

Restricting the analyses to those transects with red-listed species present (to avoid zero-inflation) resulted in significant differences (p = 0.005), with group “email” recording significantly more species than the two other groups, while group “online” records marginally more species than group “email” (Fig. 8). The differences are even more pronounced for abundances (p < 0.001) (Fig. 9), but here the numbers submitted by group “letter” are significantly lower than those of the two others, while the number of online records was just marginally lower than the number submitted via email.

Fig. 8 Number of red-listed butterfly species per transect per group of data entry in 2021, restricted to those transects with red-listed species occurrences (note the log scale on the y axis). Significantly different groups (p < 0.05) are marked by different letters; marginal significance (here p = 0.09) is marked by capitalisation of the same letter

Fig. 9 Number of individuals of red-listed butterfly species per transect per group of data entry in 2021, restricted to those transects with red-listed species occurrences (note the log scale on the y axis). Significantly different groups (p < 0.05) are marked by different letters; marginal significance (here p = 0.09) is marked by capitalisation of the same letter

Indicator species analysis

To assess differences in the value and ecological importance of the data at species level, we looked for butterfly species that “indicate” one group of data entry, i.e. are significantly more frequent/abundant in one group than in any other. In total, 25 species were associated with a specific group of data entry. The species complex Thymelicus lineola/sylvestris is associated with the group “online”, and a total of 24 species are associated with the group “email” (Table 1). Looking at these “indicator species” in detail (see also supplementary material S3), 14 of them are red-listed species (seven vulnerable, two endangered and five critically endangered). The frequency of “indicator species” differed significantly among groups (p < 0.0001) and was higher in transects of the group “email” than in the other two (Fig. 10).

Table 1 List of species significantly associated with one of the two groups “online” and “email”. A is the mean abundance of the species in the target site group divided by the sum of the mean abundance values over all groups. B is the relative frequency of occurrence of the species inside the target site group. Stat refers to the statistic of association (a special form of correlation coefficient; for details see Cáceres and Legendre (2009))

Fig. 10 Proportion of indicator species per group of data entry in 2021

Questionnaire

Of 93 questionnaires sent to the transect walkers of the groups “email” and “letter”, 50 were returned (i.e. a response rate of 54%). The main reasons transect walkers did not use the online data entry were that data entry takes too long and is too cumbersome, but also technical reasons such as bad or no internet connection and lack of technical capabilities were stated (Fig. 11). Almost half of transect walkers would not use the online data entry under any circumstances, but about a quarter would use the online data entry if it was faster, and a fifth if it was easier (Fig. 12).

Fig. 11 Answers to a questionnaire sent out to the project participants returning their records via email or letter in 2021

Fig. 12 Answers to a questionnaire sent out to the project participants returning their records via email or letter in 2021

Discussion

Transect walkers not entering their data online invested on average more time in the field and performed more transect walks. However, this did not result in a higher number of butterfly species or individuals recorded (nor of those on the Red List). When the analyses were restricted to transects with red-listed species present, transect walkers submitting their data via email recorded significantly higher numbers of red-listed butterfly species and individuals than the other groups of transect walkers. Most importantly, 24 species were significantly more frequent on transects of recorders submitting their data by email.

In general, data collected by “old-fashioned” recorders were of comparable but not better quantity than those collected by “online recorders”. Taking species identity (at taxon level) into account, however, revealed a qualitative difference. For example, the Alcon Blue Phengaris alcon (critically endangered in Germany) was counted on only one transect, by a transect walker submitting data via email; the Large Heath Coenonympha tullia (critically endangered) was counted on only two transects, one by a transect walker entering data online and one by a transect walker submitting data via email. Data on endangered species (i.e. Red List categories 1‒3) are generally underrepresented in “Butterfly Monitoring Germany” because recording these species usually requires more effort: transect walkers have to visit specific habitats that are often not in the immediate vicinity of their place of residence. The list of species more frequently recorded by transect walkers of the “email” group showed that they visited rare and endangered habitats hosting numerous such species, such as bogs (with Coenonympha tullia and Minois dryas) or wet meadows (with Phengaris alcon and Ph. teleius). In addition, the list reported by the “old-fashioned” recorders also included species that are difficult to distinguish from closely related taxa (e.g. Colias alfacariensis, Plebejus argyrognomon, Zygaena viciae). In-depth taxonomic knowledge is necessary to identify these species correctly. In summary, many of the transect walkers submitting their data via email are volunteers with very good, and in some cases specialised, species knowledge who specifically select their transects in species-rich habitats.

One main output of “Butterfly Monitoring Germany” is the population trends of butterfly species, which are analysed and reported every year (Kühn et al. 2022b). For analysing these trends, having sufficient data on the respective species is crucial. Especially for rare species (which in most cases are also endangered), each individual data set is important for trend calculations. Many studies focus on widespread species because data for rarer species are missing or insufficient (Staats and Regan 2014; Plummer et al. 2023). In our study, for example, Phengaris nausithous, a species listed as “vulnerable”, was counted on 14 transects by transect walkers entering their data online and on three transects by transect walkers submitting their data via email. However, a trend could not be calculated without these three transects of “old-fashioned” transect walkers (Fig. 13). For most of the “indicator” species that are also on the Red List, we cannot calculate species trends because data are still insufficient, underlining the high importance of every record of these taxa.

Fig. 13 Dusky Large Blue (Phengaris nausithous), photo: Josef Settele. Trend of Phengaris nausithous in Germany, source: “Butterfly Monitoring Germany”

Consequently, data collected by transect walkers submitting their data via email are of high, if not crucial, importance for “Butterfly Monitoring Germany”. They include data on species that are rare and/or endangered or difficult to identify. Without these data, we would have considerably less information about these special species. Therefore, these data should be entered into the respective database under all circumstances, even if this involves a lot of extra work. Unfortunately, the additional effort is considerable in the case of the email submitters: they have already entered their data in some electronic form (often an Excel sheet), and these data then have to be entered again manually into the online tool by someone who did not perform the transect walks. This is rather time-consuming and prone to errors, as a member of the “Butterfly Monitoring Germany” staff has to enter 20 to 40 recording forms in a short time, which requires a lot of attention.

(Modern) technology/exchange with other databases

An important question for “Butterfly Monitoring Germany” is how to minimise this extra work without losing important data. In our questionnaire, many of the “old-fashioned” transect walkers stated that they did not use the online data entry because it is too time-consuming and/or too cumbersome. For this group, a simple and easy-to-use online entry tool offering additional benefits would be highly beneficial. Most of the highly skilled transect walkers with very good knowledge of species also work in various other projects. In our questionnaire, some transect walkers stated that they had already entered their data into a different database and were therefore not willing to duplicate the work by entering them in our online tool again. This group might be convinced to change their habits if they were offered a good and easy-to-use online data entry tool with some additional benefits, such as AI-driven image recognition or the possibility to carry out their own statistical evaluations.

Another important point for transect walkers might be the possibility to enter opportunistic data in addition to the highly standardised TMD transect data. This would allow participants to record single sightings of interesting species outside of butterfly monitoring. In Germany, ObsIdentify (Schermer and Hogeweg 2018) and iNaturalist (Nugent 2018) are the most popular tools for capturing, archiving and managing one's own data. In addition, databases open to everyone exist for recording data at the state level (e.g. Bavaria, Brandenburg). For these tools, good interfaces are important because transect walkers would like to be able to export and import their data from one tool to another.

The different forms of data recording and the increased data quality delivered by “old-fashioned” transect walkers come with a trade-off: staff in the head office need to spend considerable effort entering the data manually into the central database. Still, this results in harmonised data ready for analysis, so that scientists using the data in their studies do not need to worry about the different ways of recording and receive high-quality data. This once again emphasises the importance of transferring the data from the various data collection groups to a common database. Even if this means additional organisational effort, it pays off in the long term and can ultimately improve nature conservation.

Different people—different ways to collect data

In our questionnaire, 44% of the “old-fashioned” transect walkers stated that they will not enter their data online under any circumstances. This astonishingly high percentage underlines that some participants will always reject such technologies. Some of these recorders used the comment section in our questionnaire to explain why they do not and will not use online data entry tools. They explained that they do not want to spend their time at the computer but would rather go out and enjoy nature, or that they generally have a negative attitude towards computer technology. In this context, it is crucial to keep in mind that Citizen Science projects exist only partly for the sake of science; volunteers should also benefit by being part of a community and having access to information about (butterfly) biodiversity. Therefore, people who are not willing to use advanced online tools should also be given the possibility to be part of such projects, even if their integration is labour-intensive for these projects.

Species knowledge

Our analyses underline the importance of capturing the data of “old-fashioned” transect walkers submitting their records via email, even if this means a significant additional effort. Their records filled important gaps and contributed to a substantially better data situation, especially for rare and/or endangered species. On the other hand, data collected by “old-fashioned” transect walkers submitting their records via letter did not contribute more information about these species and did not fill knowledge gaps.

Nevertheless, involving all types of transect walkers is important for “Butterfly Monitoring Germany” on a completely different level. Numerous studies show that knowledge about species and the ability to recognise species are declining, especially among children and young people (Frobel and Schlumprecht 2016; Sturm et al. 2020; Gerl et al. 2021). To be able to continue citizen science projects such as “Butterfly Monitoring Germany” in the long run, it is important to get young people interested in this subject and to train them. The young can learn from the old, and only when the knowledge available in the older generation is passed on can we hope for a next generation that will continue such projects. This concerns not only the continuation of “Butterfly Monitoring Germany” but also the actions deriving from it. Previous research showed that Citizen Science projects can contribute to educating the public about biodiversity (Peter et al. 2021) and that species literacy is an important factor for further action on conserving biodiversity (Hooykaas et al. 2019, 2022).

Outlook and conclusion

From a pure data quality perspective, data of transect walkers submitting their records via letter do not provide additional information (i.e. do not increase data quality). In contrast, if we excluded the data of transect walkers submitting their data via email, we would miss important information about specific butterfly species, with potential downstream impacts on trend analyses. For this group of transect walkers, it is worth trying to design a data entry tool that suits their needs to such an extent that using the tool piques their interest. The main points in this case are that data entry has to be as easy as possible and that datasets can be uploaded, downloaded, imported and exported in different formats. However, if scientists make use of data collected by citizen scientists, they should always keep in mind that these types of projects should benefit not only science but also the people investing their time and effort. If this is considered, people are more likely to participate long-term in a project and thus help to collect long-term data important for the conservation of biodiversity (Kühn et al. 2022a).

Finally, people are different and have different interests and motivations to join a Citizen Science project. Many people joining a CS project in the field of biodiversity have an affinity for nature with a correspondingly low interest in technology. There will therefore always be a certain number of people who will not use online data entry tools, even if these are specially designed for their needs. We thus advocate a pledge to “leave no one behind”, as stated in the agenda 2030 for Citizen Science (Montanari et al. 2021). Every participant should have the possibility to submit data in the way that suits them best and should be assured that the data will be entered into the database and analysed.