Impact assessment of citizen science: state of the art and guiding principles for a consolidated approach

Over the past decade, citizen science has experienced growth and popularity as a scientific practice and as a new form of stakeholder engagement and public participation in science or in the generation of new knowledge. One of the key requirements for realising the potential of citizen science is evidence and demonstration of its impact and value. Yet the actual changes resulting from citizen science interventions are often assumed, ignored or speculated about. Based on a systematic review of 77 publications, combined with empirical insights from 10 past and ongoing projects in the field of citizen science, this paper presents guidelines for a consolidated Citizen Science Impact Assessment framework to help overcome the dispersion of approaches in assessing citizen science impacts. This comprehensive framework enhances the ease and consistency with which impacts can be captured, as well as the comparability of evolving results across projects. Our review is framed according to five distinct, yet interlinked, impact domains (society, economy, environment, science and technology, and governance). Existing citizen science impact assessment approaches cover the five impact domains unevenly, and only a small number provide concrete indicator-level conceptualisations. The analysis of the results generates a number of salient insights which we combine in a set of guiding principles for a consolidated impact assessment framework for citizen science initiatives. These guiding principles pertain to the purpose of citizen science impact assessments, the conceptualisation of data collection methods and information sources, the distinction between relative and absolute impact, the comparison of impact assessment results across citizen science projects, and the incremental refinement of the organising framework over time.


Introduction
Over the past decade, citizen science has experienced growth and popularity, both as a scientific practice and as an emerging form of stakeholder engagement and public participation in the generation of scientific knowledge, due to, among other things, the pervasive diffusion of information and communication technologies (Silvertown 2009; Bonney et al. 2009a, 2014). Its popularity as a novel form of stakeholder engagement and public participation in science stems from the increased realisation of its potential for jointly identifying and addressing common challenges of the twenty-first century. Moreover, progress in addressing the challenges articulated in the UN Sustainable Development Goals (SDGs) can be monitored using citizen science for around 33% of the indicators of the SDG framework (Fraisl et al. 2020). Beyond the common notion of public participation in data collection for scientific purposes, a range of phenomena, activities and practices fall under the umbrella term 'citizen science' (ECSA 2020). We consider citizen science as a multifaceted phenomenon, consisting of collaborative data and knowledge generation among citizens, scientists and, in some cases, decision makers, for a range of purposes. It spans different dimensions (thematic, geographical, temporal, socio-political, scientific, technological and economic) which together influence the nature, remit, value and impact of any given citizen science initiative.
While the aspirations of citizen science are running high and the efforts to capture and report outputs, outcomes and impacts of citizen science are increasing, the actual changes resulting from citizen science interventions are often assumed, ignored or speculated about. Outputs refer to direct products of a citizen science initiative, while outcomes and impacts refer to short-term and long-term changes resulting from citizen science initiatives, respectively. There is no blueprint for impact assessment of citizen science initiatives (Friedman 2008), because the diversity of citizen science practices (e.g., various aims and thematic foci) and the differing purposes of impact assessment (e.g., improving citizen science implementation, or reporting to funders) do not easily allow for a single methodology or approach to fit all. Moreover, limited resources (funds and expertise) and mismatches between the timing of impact assessments and the manifestation of impacts quite often hinder a thorough assessment of the impacts of citizen science projects.
Previous literature review efforts have aimed to conceptualise, discuss and generate new insights on the impacts of citizen science. Jagosh et al. (2011) conducted a review of the participatory research literature which demonstrated that the diversity of research topics, intervention designs and degrees of stakeholder involvement in the co-governance of research, together with the complexity of outcomes, renders it difficult to evaluate such projects. Indeed, via a systematic review of 273 papers and 25 Community-Based Participatory Research (CBPR) projects, Sandoval et al. (2012) concluded that impacts and outcomes attributable to CBPR are often not (well) documented. Groulx et al. (2017) reviewed 145 studies to identify learning outcomes in citizen science projects relating to climate change and concluded that, despite initial discussions about such learning outcomes, evidence of them is not well documented. Based on an extensive literature review of 135 peer-reviewed publications, Fazey et al. (2014) concluded that evaluation of knowledge exchange is often an afterthought in interdisciplinary and multi-stakeholder environmental change research. Building on the literature from different fields of research, Hassenforder et al. (2016) and Gharesifard et al. (2019b) concluded that the identification of contextual variables is both important and challenging, and proposed conceptual frameworks that can help monitor and evaluate participatory processes and outcomes of citizen science. Following a structured review of citizen science project websites (327 in total), Phillips et al. (2012, 2014, 2018) highlighted that, as the field of citizen science continues to grow, it is important to reflect on its impact, and on the type of questions that are being asked by practitioners and researchers for capturing impacts of citizen science initiatives.
Moreover, existing review efforts are not limited to the review of scientific publications and insights from projects. For example, Granner et al. (2010) reviewed 2681 articles from 1764 newspapers and identified media content analysis as beneficial for evaluating citizen science initiatives.
Despite their diversity and expansiveness, existing literature reviews on the topic have had very specific thematic or methodological foci and, therefore, may have limited application for the wider field of citizen science. For example, the review by Sandoval et al. (2012) was conducted with a focus on CBPR partnerships and participation in health research and a pre-defined model of impact assessment (i.e., the CBPR Conceptual Logic Model). Other review efforts contain a bias towards bodies of literature from specific fields, such as the literature reviewed by Groulx et al. (2017), which includes publications in multi- and interdisciplinary journals, but only a limited number of publications from the social sciences. In addition, a limitation of these previous review efforts is their focus on a specific impact domain, or a limited number of impact domains, i.e., areas of change. Examples include Jagosh et al. (2011) and Hassenforder et al. (2016), which focus on governance impacts of participatory research projects; Phillips et al. (2012, 2014, 2018), which focus only on societal impacts; and Fazey et al. (2014), which is even more specific and only discusses knowledge exchange outcomes in the context of multi-actor and interdisciplinary environmental change research.
Collectively, the field of impact assessment within the 'science of citizen science' has made significant advances over the past two decades. However, if ongoing and future projects ignore the strengths and weaknesses of and lessons learned from previous impact assessment efforts, they run the risk of wasting resources, "reinventing the wheel" or maintaining the flaws and gaps of past impact assessment approaches. Impact assessment of previous citizen science projects, despite its limitations, offers various insights that can inform future impact assessment efforts by researchers and practitioners. This paper offers a consolidation of these insights into a coherent framework that can address and navigate the complexity of measuring the impacts of diverse citizen science initiatives.
The purpose of this research is therefore to generate guidelines for a consolidated Citizen Science Impact Assessment Framework (CSIAF) to enhance the ease and consistency with which impacts can be captured, as well as the comparability of evolving results across projects. We do so by combining a systematic literature review with empirical insights from ten past and ongoing projects in the field of citizen science. Specifically, in line with our view of citizen science as a multi-dimensional phenomenon, we frame our review according to five distinct, yet interlinked, impact domains:
• Society: Impact on society and individuals as well as collective (societal) values, understanding, actions and wellbeing (including relationships).
• Economy: Impact on the production and exchange of goods and services among economic agents; on entrepreneurial activity; economic benefits derived from data, e.g., for the public good or for the benefit of private sector actors.
• Environment: Impact on the bio-chemical-physical environment, e.g., on the quality or quantity of specific natural resources or ecosystems.
• Science and technology: Impact on the scientific process (method) as well as research more broadly; on the scientific system (institutions; science policy; incentive structures), scientific paradigms and resulting technological artefacts (e.g., sensors, apps, platforms) and standards.
• Governance: Impact on the processes and institutions through which decisions are made, both informal and formal (e.g., public policy), and on relationships/partnerships, as well as the governance of data generated.
While the three interlinked domains of sustainable development (environment, society and economy) are well known and accepted, the context of citizen science warrants the focus on two additional domains, namely science and technology, and governance. The science and technology domain is considered due to citizen science's alignment with, and use of, the scientific process and the resulting (potential) implications for the scientific system, scientific paradigms and technological artefacts. An additional governance domain is considered owing to the links of citizen science processes and results to monitoring, (environmental) management and (public) decision-making processes. These impact domains arguably cut across many if not all of the Sustainable Development Goals (SDGs). Moreover, considering impacts in different domains is helpful for 'unpacking' them, drawing attention to and enabling analysis of distinctly different types of impacts, e.g., those to the physical environment [environment] as compared to those to institutional settings [governance]. Nevertheless, impacts in the different domains can be closely connected and may occur in sequence, and even depend on one another, rather than in parallel. For example, Wehn et al. (2020b) showed that case-specific changes in society (e.g., sense of place) and governance (e.g., improved support for participation in decision-making) are required before envisaged changes in the environment can be attained (e.g., improved air quality).
This paper is structured as follows: in the materials and methods section, we present the steps taken in the systematic literature and project review to select relevant papers and practices to capture insights. We present and discuss the resulting insights in the results and discussion section, and combine these into guiding principles for a citizen science impact assessment framework. In the conclusion section, we conclude the paper with reflections on future research and the limitations of our research.

Material and methods
The analysis of the state of the art in citizen science impact assessment approaches described in this paper is built on two main sources of information: (1) a systematic review of relevant academic literature about impact assessment in the field of citizen science and participatory research and (2) small-scale empirical research into current impact assessment practices in citizen science projects. Sections 2.1 and 2.2 provide details about the steps taken for the systematic literature search and review, and Sect. 2.3 elaborates the methodology for collecting the empirical evidence.

Selection of relevant literature
The process of selecting relevant literature for this systematic review was iterative, based on the steps suggested by Moher et al. (2009) in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) approach, as illustrated in Fig. 1.
The purpose of this systematic literature search was to identify publications that propose or discuss impact assessment methods or approaches for evaluating citizen science initiatives, as well as publications that identify different impact domains of citizen science.
The starting point of this review was a list of 21 publications that were already known to the authors, mainly from a previous review of impact assessment methods in the Measuring Impact of Citizen Science (MICS) project (Wehn et al. 2020a). To complement this list, a further literature search was conducted in the Web of Science (WoS) and Wiley Online Library using relevant keywords. Keywords were compiled that referred to two distinct aspects of the literature: (1) the concept of citizen science and (2) impact assessment. Previous research has identified overlapping terms that refer to the concept of citizen science (Conrad and Hilchey 2011; Gharesifard et al. 2019a, b; Newman et al. 2011; Wehn and Almomani 2019; Whitelaw et al. 2003).
Building on these efforts, a set of keywords that refer to the concept of citizen science or closely related fields were identified (see Table 1). Similarly, a set of keywords was identified for the second aspect of the search, which relates to impact assessment terminology. The Boolean operators "AND" and "OR" were used to combine the search terms and the asterisk wildcard (*) was used to include different variations of each term.
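The combination of the two keyword groups described above can be sketched as follows. This is a minimal illustration only: the function name `build_query` and the example keywords are assumptions made for this sketch, not the actual terms of Table 1, and each database has its own field syntax that is not modelled here.

```python
def build_query(concept_terms, impact_terms):
    """Join terms with OR inside each group and AND between the two groups."""
    def group(terms):
        # Quote multi-word phrases so that they are matched as a unit.
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        return "(" + " OR ".join(quoted) + ")"

    return group(concept_terms) + " AND " + group(impact_terms)


# Hypothetical keywords for illustration; the real lists are given in Table 1.
concept_terms = ["citizen science", "participatory monitoring"]
# The trailing asterisk is the wildcard that captures variations of a term.
impact_terms = ["impact*", "outcome*", "evaluat*"]

query = build_query(concept_terms, impact_terms)
print(query)
```

Under these assumptions, a record matches only if it contains at least one term from each group, which mirrors the two aspects of the search (the concept of citizen science and impact assessment).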
The literature search in the WoS was conducted on 3 April 2020 by searching the 'Topic' of literature in the core collection of WoS, which includes titles, keywords and abstracts. This systematic search resulted in 8299 records. The literature search in Wiley was conducted on 21 April 2020. Searching the 'Topic' of literature is not possible in Wiley; therefore, the literature in this repository was filtered based on the appearance of keywords in the abstracts. This search resulted in 1176 records. In addition to the literature mentioned above, 12 publications were identified by the authors via backward and forward snowballing (Wohlin 2014). This resulted in an initial list of 9508 publications, which were then screened for relevance. The eligibility criteria for inclusion were in line with the purpose of the review and included (1) relevance for the field of citizen science and (2) focus on the topic of impact assessment. After removing duplicates and screening the topic, abstract and keywords, 92 publications were selected for full-text review. Next, the full texts of the publications in the shortlist were browsed to determine their relevance for inclusion in our full-text review, based on the subject matter addressed in the papers. During this process, 15 publications were discarded, resulting in a final list of 77 publications that are included in our synthesis. There were different reasons for the exclusion of records, for example a focus on technical details of citizen science initiatives instead of their impact, e.g., Brown et al. (2016), or a discussion of very specific impacts of citizen science (e.g., the impact of community-based research on a specific health-related problem, as discussed in Corrigan and Shapiro (2010) or Naylor et al. (2002)).
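The record counts reported above can be checked with a few lines of arithmetic. The sketch below uses only the figures stated in the text; the variable names are illustrative, and the number of records removed before full-text review is derived here rather than reported in the paper.

```python
# Records identified, by source (figures as reported in the text).
known_to_authors = 21
web_of_science = 8299
wiley = 1176
snowballing = 12

identified = known_to_authors + web_of_science + wiley + snowballing

# Screening and full-text stages.
selected_for_full_text = 92
discarded_at_full_text = 15
included = selected_for_full_text - discarded_at_full_text

# Derived: records removed as duplicates or screened out on topic,
# abstract and keywords (not stated explicitly in the paper).
removed_before_full_text = identified - selected_for_full_text

print(identified, included, removed_before_full_text)
```

The totals agree with the reported initial list of 9508 publications and the final synthesis of 77 publications.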

Review process
The full-text reviews were conducted in three phases to quality control the review process.
Phase 1 involved the setup of the approach, initial paper reviews and collation of information. Each co-author was assigned specific publications to read and review. During the review of each publication, the following information was recorded in a summary table:
• Scope and purpose of assessment: Whether the publication proposes formative evaluation, summative evaluation or a comprehensive/holistic approach for capturing impacts (i.e., analysis of context, process and (evolving) impacts).
• Conceptual relevance: Insights of the publication at the thematic or indicator level.
• Thematic content: Coverage of specific themes per domain (e.g., in the society domain: learning outcomes at individual or societal levels).
• Participatory evaluation: Whether the method involves citizen scientists, not only in sharing their perceptions or collecting data on evolving impacts, but also in devising relevant impact assessment indicators for their citizen science initiatives.
• Strengths and weaknesses: The strengths and weaknesses of the approach for capturing impacts presented by the paper.
In this phase, each co-author reviewed between six and ten publications. Marked versions of the reviewed publications (with highlighted sections related to the above bullet points) were saved for future analysis.
Phase 2 of the full-text review consisted of an internal peer-review process. During this phase, each author peer-reviewed between six and ten publications that had already been reviewed by others in phase 1. The peer-reviewers had access to the marked versions of the publications (see phase 1). This step worked as a quality control mechanism to ensure that the reviews were thorough and that essential aspects or insights of the reviewed approaches had not been missed, and to reduce subjective judgments about the reviewed impact assessment approaches and methodologies.
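One way to picture a row of the summary table is as a small record type. The sketch below is an assumption made for illustration: the class name `ReviewRecord`, the field names and the example entry are hypothetical and do not reproduce the authors' actual template or any reviewed publication.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# The five impact domains used to frame the review.
DOMAINS = ("society", "economy", "environment",
           "science and technology", "governance")


@dataclass
class ReviewRecord:
    """One row of a (hypothetical) full-text review summary table."""
    publication: str
    scope_and_purpose: str          # "formative", "summative" or "holistic"
    conceptual_level: str           # "thematic" or "indicator"
    themes_by_domain: Dict[str, List[str]] = field(default_factory=dict)
    participatory_evaluation: bool = False
    strengths: List[str] = field(default_factory=list)
    weaknesses: List[str] = field(default_factory=list)

    def domains_covered(self) -> List[str]:
        # Keep the canonical domain order and skip domains without themes.
        return [d for d in DOMAINS if self.themes_by_domain.get(d)]


# Placeholder entry, not taken from the reviewed literature.
record = ReviewRecord(
    publication="Example et al. (placeholder)",
    scope_and_purpose="summative",
    conceptual_level="indicator",
    themes_by_domain={"society": ["individual learning outcomes"]},
    participatory_evaluation=True,
)
print(record.domains_covered())
```

Keeping the domain coverage as a derived value, rather than a separate field, avoids the two getting out of step when a record is revised during peer review.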
In phase 3, the peer review results were cross-checked by the lead authors (the first and second author) who have expertise in both social and natural sciences, and any discrepancies between the initial and first peer review results were resolved via discussion among the lead authors.
The three-phased review ensured an unbiased and complete review of the 77 publications, therefore allowing a comprehensive review and discussion of the current state of citizen science impact assessments.

Empirical research into current impact assessment practices of citizen science projects
The second source of information for this study consists of empirical evidence obtained via dedicated, semi-structured interviews with the coordinators of ten citizen science projects (see Table 2). The projects were selected via convenience sampling, i.e., drawing on projects known to the authors. The interviews were carried out with project coordinators who were already closely connected to the MICS project or were members of the MICS consortium, including a member of the MICS advisory board, two project coordinators of a citizen science project from the MICS UK case study, and the coordinators of four projects led by members of staff at Earthwatch. None of the interviewed project coordinators from Earthwatch were involved in the MICS project. In addition, all the coordinators of the 'SwafS' projects which were active in January 2020 were also invited to interview and, of the ten coordinators approached, four agreed to be interviewed.
Specifically, 11 interviews (1 interview per project except for Outfall Safari, which had 2 interviews) were held in the first quarter of 2020 to elicit the projects' current citizen science impact assessment approaches. Nine interviews were conducted by the co-authors from Earthwatch and the two interviews with Outfall Safari were conducted by the co-authors from the River Restoration Centre. Of the 11 interviews, 4 were conducted face-to-face, 5 online and 2 via telephone. All responses to the questions were captured in the form of notes taken during the interviews. The list of questions asked during the interviews is provided in the supplementary material. The results of these interviews were analysed using the MAXQDA software. The coding of the interview transcripts served to identify, per project, the purpose of impact assessment activities and the methods and approaches currently used (including participatory approaches) that represent the 'scope and purpose of the assessment', impact domains of interest and impact indicators that relate to the 'conceptual relevance' and 'thematic content' of the approaches, as well as challenges encountered in assessing the impacts of their citizen science activities that can be linked to the 'strengths and weaknesses' of each approach. This qualitative research was undertaken to provide the study with empirical evidence of current practice in the impact assessment of citizen science projects. As a complementary method to the systematic literature review, the qualitative research was undertaken from an analyticist situatedness perspective: the authors explicitly positioned themselves as peers of the interviewed citizen science practitioners and used the interviews to engage with them in a discussion about impact assessment in citizen science projects. The use of convenience sampling was a valid methodological approach for generating findings that provide indicative anecdotal evidence.
While these findings have limited generalizability, they serve to illustrate the range of methods currently used for citizen science impact assessments as well as limitations of current practice. A limitation of the research is that a systematic sampling approach may have resulted in identifying additional impact domains. The validity of the generation and interpretation of the results was ensured through joint coding and analysis by the co-authors. Specifically, two teams from the co-authors coded the interviews and interpreted the results. Each team peer-reviewed the coding and interpretation of the other group. Then the two teams cross-checked the results of the peer-reviews and resolved discrepancies (regarding the coding of project proposal justification and education indicators; none regarding interpretation) via joint discussions with both teams.

Results and discussion
In the subsections below, we present the results of the systematic literature review and the findings from the empirical research into current impact assessment practices of citizen science projects. These are followed by a discussion of the combined insights which we present as a set of guiding principles for a consolidated CSIAF.

Results of the systematic literature review
Each of the reviewed publications considers one or more of the five impact domains, namely Society, Economy, Environment, Science and Technology, and Governance (see Fig. 2). The only exceptions were two publications which focus on generic impact assessment approaches instead of specific impact domains (Jacobs et al. 2010; Reed et al. 2018). These two publications are not included in the subsequent domain-specific analysis. A detailed overview of the relevance of the reviewed publications per domain is presented in a table in the supplementary material. The majority of the reviewed approaches focus on measuring impacts in 1 or 2 domains (32 and 19 publications, respectively); only 2 out of the 77 reviewed publications referred to all 5 domains (Gharesifard et al. 2019a, b).
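The kind of tally behind these counts can be sketched as follows. The coverage data below is invented for illustration (placeholder publication labels P1 to P5); the actual per-publication coverage is in the supplementary material and is not reproduced here.

```python
from collections import Counter

DOMAINS = ("society", "economy", "environment",
           "science and technology", "governance")

# Hypothetical coverage: publication label -> set of impact domains addressed.
coverage = {
    "P1": {"society"},
    "P2": {"society", "governance"},
    "P3": set(DOMAINS),                 # covers all five domains
    "P4": {"environment", "society"},
    "P5": set(),                        # generic approach, no specific domain
}

# Generic publications are left out of the domain-specific analysis.
domain_specific = {pub: doms for pub, doms in coverage.items() if doms}

# How many publications cover 1, 2, ... 5 domains.
by_breadth = Counter(len(doms) for doms in domain_specific.values())

# How many publications address each domain.
by_domain = Counter(dom for doms in domain_specific.values() for dom in doms)

print(dict(by_breadth))
print(by_domain.most_common())
```

The first tally corresponds to statements about how many domains a publication covers, and the second to statements about how intensively each domain is addressed.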
As is evident from Fig. 3, the reviewed literature addresses the five impact domains at distinctly different levels of intensity, with the largest number of publications (n = 65) in the society impact domain and the lowest in the economy domain (n = 12).
The review also captured whether a publication focused on measuring impacts at different levels of abstraction, namely thematic level insights or concrete indicators. Insights at the thematic level here refer to the identification of different themes (or areas of application) within each domain. For example, Ballard et al. (2017) discuss science-related outcomes in biodiversity research and Cook et al. (2017) focus on science-related outcomes in the theme of participatory health research, but neither of them provides indicators for measuring these. In contrast, Jordan et al. (2012) provide specific indicators for measuring science-related results of citizen science projects within the theme of ecological monitoring, for example, short or longer term changes in understanding of natural systems or the number of peer-reviewed publications. As illustrated in Fig. 4, except for the two generic publications (see Fig. 2), all other publications in each domain provide insights at the thematic level. In contrast, a far smaller number of publications in the same domain offer insights at the indicator level.
The largest share of the reviewed publications did not include evidence and supporting material of measured baseline situations, outcomes and/or impacts (e.g., as supplementary material). [...] (2014), Wehn et al. (2019b, 2020a).
In the society domain, there is a general distinction in the reviewed literature between (1) individual and collective level outcomes and (2) changes in knowledge, attitude and behaviour. One key theme relates to (individual and social) learning outcomes. Other salient themes relate to changes in relationships and partnerships among societal actors, community dynamics (including capacity, wellbeing and livelihoods) and changes in the understanding of and attitudes towards science, which provide cross-cutting links to the science domain. In the society domain, 31 publications provided specific indicators (Fig. 4). Examples include:
• Indicators of community participation (Butterfoss 2006, p. 331) [...]
The themes and indicators in the science and technology domain focus on largely quantifiable outputs of the scientific process (e.g., data, publications and citations). Some approaches (Kieslinger et al. 2017; Chandler et al. 2017) capture changes to the scientific process via public participation and community engagement, changes in community-academia relations and enhancements of the scientific knowledge base. Sixteen publications contributing to the science and technology domain provide indicators (Fig. 4). For example, Kieslinger et al. (2018; pp. 88-92) [...]
The themes in the environmental domain focus on the status of environmental resources, e.g., resulting from conservation efforts, ecosystem functions, services and resilience, as well as impacts of environmental status on human health and livelihoods (cutting across to the society domain) and outcomes for agricultural productivity (cutting across to the economy domain).
Indicators were identified in ten of the publications relating to the environment domain, such as:
• "improved conservation action leading to better ecosystem function, ecosystem services and resilience" (Pocock et al. 2018; p. 278)
• "enhanced natural habitats and ecosystem services" (Chandler et al. 2017; p. 172).
The themes in the economy domain cover demand and supply aspects of citizen science, including the generation of economic entrepreneurial activities. While the total number of contributions in this domain is already small (n = 12), only six of these publications actually provide concrete indicators. Indicators on the demand side include:
• "number of jobs created" ([...] p. 308)
• "added value of citizen science data", "change in company growth" and "international trade and investment" (Wehn et al. 2017; p. 36).
The contributions in the governance domain cover a wide range of themes, including the policy cycle, as well as actual changes in policy, multi-level interactions among actors and their power dynamics, communication, relationships and trust. Most contributions highlight relevant themes and only ten publications provide specific indicators. For example:
• "contributions to management plans and policy" (Chandler et al. 2017; p. 172)
• "stakeholder interactions in decision-making processes (e.g., data provision, expressing preferences, deliberation and negotiation, etc.)" (Wehn et al. 2017; p. 34)
• "change in the level of authority and power of each stakeholder" (Wehn et al. 2017; p. 35)
Along with the definition of indicators, the reviewed literature describes guidelines on how to collect evidence of impact in each domain. The analysis of the methodological approaches used or referred to reveals that a mixed methods approach (qualitative and quantitative) is by far the most commonly proposed approach for capturing impacts of citizen science in the different domains (discussed in more than 70% of the publications reviewed; Fig. 5). The highest percentages of quantitative impact assessment approaches were recorded in the science and technology and the society domains (Fig. 5); these were the domains with the highest number of papers with specific indicators (Fig. 4). This could be because these two impact domains are frequently assessed in citizen science projects. However, overall, there is a low percentage (< 8%) of quantitative methods used in all five domains (Fig. 5); this could be because of the difficulties with quantifying the impacts of citizen science.
The methods used include (and often combine) observations, (semi)structured interviews, questionnaire-based surveys, generating data from document analysis via checklists, gathering data from a variety of stakeholders (including non-participants) to capture the diversity of views about the baseline situation (even in retrospect) and evolving outcomes and impacts at multiple times throughout the project.
The review of 77 impact assessment publications highlights that currently there are no standardised guidelines for assessing citizen science impact, and there is an imbalance in the domains in which citizen science impact is assessed (only 2 out of 77 publications reviewed covered all impact domains). Therefore, there is a need to build on the insights from existing impact assessments and develop a guiding framework that is able to address and navigate the complexity of measuring the impacts of citizen science across all five impact domains.

Empirical evidence of current impact assessment practices
The results of the empirical enquiry among citizen science project coordinators are summarised in Table 3. The Code System column presents the identified insights from the qualitative analysis of the interviews. These insights are categorised in five groups, namely: purpose of impact assessment, method of impact assessment, impact indicators, impact domains and challenges of impact assessment. The Coded Segments column shows the number of times that the coded insights appeared in all 11 interviews, while the [...]

Note: 'participatory evaluation' refers to situations whereby citizens were involved beyond sharing their perceptions or collecting data, by e.g. devising relevant impact assessment indicators. Such incidents were coded in the category 'impact indicators' and sub-code 'citizens involved'.
(e.g., learning); helping promote the citizen science initiative; accounting or reporting (e.g., to funders or financial accountants); or even for improving project activities and the attainment of envisaged results and impacts via adaptive management (project evaluation and improvement). Accounting/reporting was the dominant reason (coded ten times across eight of the interviews) for measuring impact in the different citizen science projects (Table 3).
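The difference between the number of coded segments and the number of interviews in which a code occurs (e.g., "coded ten times across eight of the interviews") can be illustrated with a short tally. The segments below are invented for illustration and the function name `summarise_codes` is an assumption; the sketch does not reproduce the MAXQDA output or the actual interview data.

```python
from collections import defaultdict

# Hypothetical coded segments: (interview label, code assigned to the segment).
segments = [
    ("I1", "accounting/reporting"),
    ("I1", "accounting/reporting"),   # same code twice in one interview
    ("I2", "accounting/reporting"),
    ("I2", "learning"),
    ("I3", "promotion"),
]


def summarise_codes(segments):
    """Return {code: (number of coded segments, number of interviews)}."""
    segment_counts = defaultdict(int)
    interviews = defaultdict(set)
    for interview, code in segments:
        segment_counts[code] += 1
        interviews[code].add(interview)
    return {code: (segment_counts[code], len(interviews[code]))
            for code in segment_counts}


summary = summarise_codes(segments)
print(summary["accounting/reporting"])
```

A code can therefore have more coded segments than interviews whenever an interviewee returns to the same point more than once.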
The interview results indicate that a range of methods is used for collecting evidence of impacts. These methods differ in the timing of their application across project stages (e.g., ex-ante impact assessment before either the start of the project or the hands-on citizen science activities on the ground), in how impacts are structured and captured (e.g., narrative impact stories vs structured surveys or interviews with a range of stakeholders) and in the focus of analysis (e.g., actors' perspectives, or the usage of citizen science tools). Surveys, interviews and feedback forms were the most commonly mentioned form of impact assessment, mentioned 12 times across nine of the interviews (Table 3).
The impact indicators mentioned by the interviewed citizen science practitioners reflect some blurring of definitions or distinctions of terminology, e.g., referring to number of data points collected (arguably these are outputs, not impacts). Nevertheless, the responses indicate the broad range of impact indicators in use, which include not only cognitive changes in awareness of the topic that is the focus of a citizen science initiative, but also changes in attitudes, actions and policy.
Notably, the five impact domains were confirmed as relevant, albeit to differing degrees, by the respective respondents. No additional domains were suggested. Similar to the 77 publications reviewed, the impact domains of science and technology and of society had the highest coding and were mentioned in more than 45% of the interviews with practitioners. Finally, a number of challenges for undertaking impact assessments of their citizen science projects were identified: the well-known misalignment between the timing of funded project activities and the (longer term) manifestation of envisaged (and observable) impacts; difficulties associated with collecting data about impacts; project priorities limiting the attention given to impact assessment activities; a lack of competencies among project partners to undertake sound impact assessment; and the unavailability of resources.

Discussion
The analysis of the results presented in Sect. 3.1 (especially the strengths, weaknesses and lessons learned from the application of citizen science impact assessment approaches), together with the empirical evidence from citizen science projects presented in Sect. 3.2, generates a number of salient insights, which we combine here into six guiding principles for a consolidated Citizen Science Impact Assessment Framework (CSIAF). Specifically, these guiding principles refer to the purpose of assessing impact in the context of citizen science, the conceptualisation of data collection methods and information sources for impact assessment, the distinction between relative impact versus absolute impact, the comparison of impact assessment results across citizen science projects, and the incremental enhancement of the organising framework over time. Below, we list the six principles to inform a consolidated CSIAF which, we hope, can serve citizen science practitioners (e.g., project coordinators, community managers) and impact researchers alike.
Putting these principles into practice to compose a consolidated CSIAF will involve the careful comparison, alignment and (if appropriate) combination of relevant indicators per domain and theme, along with the selection of data collection methods to capture evidence of (emerging) impacts. The framework will be implemented as an online resource and tool via a dedicated effort of the MICS project and rolled out to citizen science initiatives in Europe and globally during 2021.

Principle 1: Acknowledging a variety of purposes of citizen science impact assessment
The reasons for assessing the impact of citizen science projects range from impact reporting to learning for improved (future) implementation and even ex-ante impact assessment to substantiate proposals and grant applications and to capture baselines. Thus, the CSIAF needs to be able to accommodate a range of reasons, purposes and timings of impact assessment within citizen science projects. This requires projects to consider both process-related and results-related indicators (Haywood and Besley 2013; Ravn et al. 2016; Wehn et al. 2020c). Benchmarks and feedback on the extent to which, and how, envisaged results are and can be achieved are also recommended and can feed into the adaptive management of projects. At the moment, although some of the 77 reviewed publications highlight the role of evaluation in adaptive project management (e.g., Kieslinger et al. 2017; Wehn et al. 2017, 2020a), most do not provide explicit examples of projects that have changed or adjusted their strategies based on assessing impacts during the lifetime of the project.

Principle 2: Non-linear conceptualisation of impact journeys to overcome impact silos
The intervention logic (also known as the results chain or logical framework approach) is behind many impact assessments of public interventions and, in particular, of research activities, namely the MoRRI framework (Monitoring Responsible Research & Innovation, RRI) (Ravn et al. 2016), as well as evaluations of citizen science efforts (e.g., DITOs Consortium 2016). The definitional system of the logical framework in terms of outputs, outcomes and impacts provides useful distinctions between the different results that emerge before eventual impact is achieved. Nevertheless, its inherently linear conceptualisation and generic definitions are limiting, offering too little guidance on the changes related to citizen science. This can result, among other things, in 'impact silos', i.e., a lack of awareness of other relevant types of impacts.
Moreover, evidence from citizen science impact assessments has shown that impact journeys 'zigzag' across multiple domains, i.e., there are dependencies in terms of the sequence of distinct outcomes, such as social and institutional changes before the realisation of environmental improvements (Wehn et al. 2020b; Wood et al. 2020; Pólvora and Nascimento 2017).
A comprehensive CSIAF therefore needs to provide relevant impact domains as well as sufficient flexibility in the selection of relevant impact domains and respective outcomes. Our systematic review of existing citizen science impact assessment efforts confirmed the domains of society, economy, environment, governance, and science & technology.
Citizen science practitioners need to be able to plan and trace impact pathways in and across (a subset of) these domains. To do so, sound distinctions between outputs, outcomes and impacts in each domain are essential (Friedman 2008; Bonney et al. 2009b; Koontz and Thomas 2012), and causal relations between intermediary outcomes and impacts within a given domain, and between outcomes in different domains, must be identifiable and traceable. Moreover, citizen science is already contributing to monitoring five SDG indicators and could contribute to a further 76 indicators, together amounting to 33% of SDG indicators (Fraisl et al. 2020), providing not only data but also a means for stimulating citizen action and informing and/or changing policy for SDG implementation. Therefore, it needs to be possible to select, and adjust over time, which SDGs the citizen science project intends to monitor and actually contributes to, as a project may pivot towards a different or additional goal.
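To illustrate what identifiable and traceable causal relations across domains could look like in practice, the following is a minimal sketch in Python. It is not part of the CSIAF itself: the result names, the causal links and the function names are all hypothetical and chosen only for illustration. Each result is tagged with a level (output, outcome or impact) and one of the five impact domains, and the sketch follows the assumed causal links from an output to any eventual impact.

```python
from collections import defaultdict

# Hypothetical results, for illustration only (not taken from the reviewed projects).
# Each result maps to (level, impact domain).
RESULTS = {
    "water_quality_dataset": ("output", "science and technology"),
    "residents_more_aware": ("outcome", "society"),
    "council_adopts_monitoring_rule": ("outcome", "governance"),
    "river_pollution_reduced": ("impact", "environment"),
}

# Assumed causal links (cause -> effect), as a project team might record them.
LINKS = [
    ("water_quality_dataset", "residents_more_aware"),
    ("residents_more_aware", "council_adopts_monitoring_rule"),
    ("council_adopts_monitoring_rule", "river_pollution_reduced"),
]


def build_graph(links):
    """Turn a list of (cause, effect) pairs into an adjacency list."""
    graph = defaultdict(list)
    for cause, effect in links:
        graph[cause].append(effect)
    return graph


def trace_pathways(graph, start, results):
    """Return every path from `start` to a result whose level is 'impact'."""
    paths = []

    def walk(node, path):
        path = path + [node]
        if results[node][0] == "impact":
            paths.append(path)
        for nxt in graph.get(node, []):
            if nxt not in path:  # guard against cycles in the recorded links
                walk(nxt, path)

    walk(start, [])
    return paths


def domains_crossed(path, results):
    """List the domains along a path, in order, without consecutive repeats."""
    domains = []
    for node in path:
        domain = results[node][1]
        if not domains or domains[-1] != domain:
            domains.append(domain)
    return domains


graph = build_graph(LINKS)
for path in trace_pathways(graph, "water_quality_dataset", RESULTS):
    print(" -> ".join(path))
    print("domains:", domains_crossed(path, RESULTS))
```

In this hypothetical example the single pathway crosses four domains before reaching an environmental impact, which is the kind of 'zigzag' that a purely linear results chain within one domain would not make visible.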

Principle 3: Adopting comprehensive impact assessment data collection methods and information sources
Reliable impact assessment of citizen science projects involves a range of data collection methods and sources and ideally draws not only on participants (i.e., citizen scientists) but also on other relevant stakeholders and beneficiaries (Wehn et al. 2017; Guldberg et al. 2019) who can provide evidence of a range of (evolving) impacts. Some recent citizen science and citizen observatory projects have attempted more comprehensive reviews (e.g., Woods et al. 2019; Wehn et al. 2017, 2019b, 2020b). For example, Wehn et al. (2017) proposed and repeatedly applied (Wehn et al. 2020a) a results-based approach that was complemented with relevant theoretical concepts 5 and carefully designed data collection instruments and selected methods 6 to capture the particular social, institutional and economic changes linked to the implementation of six citizen observatories that ultimately aim for improvements in the environment. This combination of project monitoring, validation and impact assessment provided a comprehensive feedback tool to inform improvements to the final citizen observatories and innovate specific aspects of the initiatives and technological tools (apps, online platforms). The way in which project partners, stakeholders and beneficiaries provide evidence needs to allow and guide them within a wide range of suitable methods of impact assessment data collection, but without being prescriptive (Phillips et al. 2018), to "…standardise good practice in evaluation rather than use standard evaluation methods and indicators" (p. 143) without consideration for the validity of methods to cover the wide range of citizen science practices and impacts (Reed et al. 2018). Such guidance towards good practice needs to encourage the provision of evidence of impacts whenever possible, including, for example, in supplementary material of papers reporting on citizen science impacts.
5 E.g., community resilience (Norris et al. 2008), participation paradigms, power dynamics among stakeholders and existing institutions (Fung 2006; Wehn et al. 2015) and economic demand and supply indicators (European Commission 2015).
6 Appropriate methods for collecting the respective data consisted of interviews, surveys, social media analysis, content and analytics from the citizen observatory online platforms, observation, focus groups and the use of secondary data sources (e.g., official statistics).
Moreover, data collection for impact assessment of citizen science activities under the CSIAF should allow its users (i.e., citizen science practitioners and impact researchers) to 'practice what we preach' by involving citizen scientists in the collection of evidence about impacts as they emerge over time, gathering measurements not only of 'scientific' indicators but also of community-defined successes (Hermans et al. 2011; Haywood 2015; Graef et al. 2018; Constant and Roberts 2017; Tricket and Beehler 2017; Arora et al. 2015; Jacobs et al. 2010) such as Community Level Indicators (Coulson et al. 2018; Woods et al. 2019). Citizen science projects have different types and levels of resources (financial resources, time, networks and qualified staff) at their disposal, which can affect the extent of their impact assessment efforts and hence the type and range of evidence that they can capture. The CSIAF should therefore provide sufficient and appropriate guidance, as well as links to relevant resources, so that it can be applied in both a 'light-touch' and a more comprehensive manner.

Principle 4: Moving beyond absolute impact
The limitations of sticking to absolute and fixed measures of impact (typically quantified) are becoming increasingly evident, including in the field of citizen science. For example, Cox et al. (2015) acknowledge the bias caused by quantitatively comparing the impacts of longer running projects against those of projects that have been running for only a short period of time. Sound impact assessment needs to measure impact relative to the context and the goals and objectives of citizen science projects (Reed et al. 2018; Gharesifard et al. 2019b). The CSIAF needs to provide the means to enter and measure progress against project-specific objectives and to take context into account, including the geographical context, the socio-economic setting and available resources (e.g., time, finances and staff), and to provide comparisons with a different citizen science project, a non-citizen science project, or the absence of a project.
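The difference between absolute and relative measures can be made concrete with a small numerical sketch in Python. The two formulas below are illustrative assumptions rather than measures prescribed by the CSIAF or by the cited authors: the first expresses progress as the share of the distance between a project's own baseline and its own target, the second divides the absolute change by the project's running time. The project figures are hypothetical.

```python
def relative_progress(baseline, current, target):
    """Share of the distance from baseline to target covered so far.

    Works for targets above or below the baseline (e.g., reducing pollution).
    """
    if target == baseline:
        raise ValueError("target must differ from baseline")
    return (current - baseline) / (target - baseline)


def change_per_month(baseline, current, months_running):
    """Absolute change divided by project duration in months."""
    if months_running <= 0:
        raise ValueError("months_running must be positive")
    return (current - baseline) / months_running


# Hypothetical projects counting, say, active volunteers.
projects = [
    {"name": "long_running", "baseline": 100, "current": 400, "target": 1000, "months": 60},
    {"name": "young", "baseline": 0, "current": 60, "target": 80, "months": 6},
]

for p in projects:
    absolute = p["current"] - p["baseline"]
    relative = relative_progress(p["baseline"], p["current"], p["target"])
    monthly = change_per_month(p["baseline"], p["current"], p["months"])
    print(f'{p["name"]}: absolute={absolute}, relative={relative:.2f}, per_month={monthly:.1f}')
```

With these made-up figures, the long-running project shows the larger absolute change (300 versus 60), whereas the younger project has covered more of the distance to its own objective (0.75 versus 0.33) and changes faster per month, which is the kind of bias in purely absolute comparisons that the paragraph above describes.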

Principle 5: Fostering comparison of impact assessment results across citizen science projects
As we argued from the outset, the diversity of citizen science projects in terms of thematic issues addressed, stakeholders involved, and extent and type of impact assessment undertaken makes it challenging to compare results across projects (Cargo and Mercer 2008; Hassenforder et al. 2016; DITOs Consortium 2016; Kieslinger et al. 2017; Wiggins et al. 2018), or to relate them to other frameworks such as the Sustainable Development Goals (Fraisl et al. 2020). Similar to current efforts to build in interoperability across data systems and platforms of citizen science projects (Bowser 2017; Masó and Fritz 2019; Masó and Wehn 2020), cross-comparison of impacts and data impacts would be a beneficial development for citizen science. A comprehensive CSIAF can enable comparability of impact assessment results that are based on different methods and information sources by using consistent overarching categories of definitions (Reed et al. 2018; Gresle et al. 2019). This could be done, for example, by capturing impact assessment results from different projects via a single online tool (e.g., questionnaire) (Gresle et al. 2019) based on the CSIAF and, during the visualisation of individual and compared results, by distinguishing validity levels (e.g., via a color scheme) according to the range of underlying data sources. This can serve to generate both project-specific and aggregated results.
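One way to picture the idea of distinguishing validity levels by the range of underlying data sources is the following Python sketch. The grouping of sources into types, the thresholds and the colors are assumptions made purely for illustration; the text above does not specify how validity levels would be defined.

```python
# Hypothetical grouping of data sources into broad types (an assumption for this sketch).
SOURCE_TYPES = {
    "participant_survey": "self-reported",
    "stakeholder_interview": "self-reported",
    "observation": "observed",
    "platform_analytics": "observed",
    "official_statistics": "secondary",
}


def validity_color(sources):
    """Map the range of distinct source types behind a result to a color.

    The thresholds are illustrative only: the more independent types of
    evidence support a reported impact, the 'greener' the label.
    """
    types = {SOURCE_TYPES[s] for s in sources}
    if len(types) >= 3:
        return "green"
    if len(types) == 2:
        return "amber"
    if len(types) == 1:
        return "red"
    return "grey"  # no evidence recorded


# Hypothetical reported impacts from two projects.
reported = {
    "project_a": ["participant_survey", "stakeholder_interview"],
    "project_b": ["participant_survey", "platform_analytics", "official_statistics"],
}

for name, sources in reported.items():
    print(name, validity_color(sources))
```

Here two self-reported sources still count as a single type, so a result backed by a survey and an interview would be flagged more cautiously than one that combines self-reported, observed and secondary evidence.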

Principle 6: Cumulative enhancement of the framework over time
The collective advancement of impact assessment theory and practice in the field of citizen science relies on reflection and cumulative additions, based on insights across projects and methods. To remain relevant over time and serve the citizen science community, the impact assessment needs to be built on collective and cumulatively evolving intelligence, based on additional inputs and definitions by researchers and practitioners as well as more structured reflection and quality control (peer review) to check whether appropriate items, definitions and methods are being used.
A tiered level of indicators (similar to the SDG Tier 1, 2 and 3 system of indicators 7) may be used to indicate the maturity level or peer review status of new indicators that are under review. A similar system may need to be set up and maintained for curation of the CSIAF. Communities of Practice (CoPs) such as the WeObserve CoPs, and related fora such as Working Groups of the European Citizen Science Association 8, can offer the continuity and space for practitioners to reflect on, discuss and refine CSIAFs. For example, the WeObserve project 9 launched four Communities of Practice as a key mechanism for consolidating the knowledge within as well as beyond the WeObserve consortium. These CoPs serve as a vehicle for sharing information and creating new knowledge on selected key thematic topics related to citizen science and include one CoP dedicated to capturing the impact and value of citizen science. These fora have contributed to strengthening the knowledge base about citizen science in general and on citizen science impact assessment in particular.
7 Tier 1 and 2: indicator is well conceptualized and has an internationally agreed-upon methodology vs. Tier 3: internationally established standards and methodologies are not yet available; however, standards and methodologies are under development.
8 ECSA Working Groups cover strategic work of the association by means of organising ECSA members around specific topics. There is no ECSA WG dedicated to impact assessment but relevant WGs touching upon impact assessment in citizen science include, among others, the empowerment, inclusiveness and equity WG; policy, strategy, governance and partnerships WG; and sharing best practice and building capacity WG.
9 weobserve.eu, H2020 (2017-2021).

Conclusions
This paper has presented a systematic review of impact assessment methods for citizen science, the resulting insights of which provide guidance for a consolidated citizen science impact assessment framework. The ambition of such a consolidated framework is to overcome the dispersion of approaches and gaps in assessing the diversity of impacts that citizen science projects can generate.
The insights generated by this study have been combined into six guiding principles for a consolidated citizen science impact assessment framework, namely (1) acknowledging that there are a variety of purposes for citizen science impact assessment; (2) conceptualising impact journeys as non-linear to overcome impact silos; (3) adopting comprehensive impact assessment data collection methods and information sources (qualitative as well as quantitative); (4) moving beyond absolute impact to include relative impact; (5) fostering comparison of impact assessment results across citizen science projects; and (6) cumulatively enhancing the framework over time.
This study has shown that a key characteristic of such a framework is not only its conceptual grounding in the latest insights, but also its flexibility in terms of the purpose for which citizen science projects undertake impact assessment activities and the resources (means) that they have at their disposal to capture evidence of emerging impacts. Providing flexibility for both aspects will maximise the usability of the proposed consolidated CSIAF, and therefore the impact that the CSIAF itself will have among the community of citizen science practitioners.
The publications and interview data reviewed in this study stem from diverse scientific fields and epistemological approaches, incorporating distinct perspectives and framings not only of impact assessment, but also of citizen science. This diversity goes hand in hand with the use of varied and comprehensive data collection methods to capture evidence of (emerging) impacts. A key step in the compilation of the framework must therefore be the careful comparison, alignment and (if appropriate) combination of relevant indicators per domain and theme. Also, many citizen science projects may have difficulty generating an empirically based baseline (ex-ante) of the initial knowledge, understanding, attitudes and behaviour of key stakeholders, and especially of the citizen scientists whom they aim to involve. The framework, therefore, needs to provide guidance on how to approximate such a baseline, e.g., by drawing on comparisons between participants and non-participants using existing data sources (government reports) as well as innovative data sources (e.g., social media) and analytical techniques (social media mining), and by integrating estimates from past projects. The latter will become increasingly feasible with the implementation of the CSIAF as an online resource and tool by the MICS project, which will make reference data from past projects available.
This paper has contributed to current efforts in the citizen science community to enhance the ease and consistency with which impacts of projects, large or small, can be captured, as well as the comparability of evolving results across initiatives. Achieving the full potential of citizen science, in whatever form it is practiced, requires, among other factors, evidence and demonstration of its outputs, outcomes and impact to highlight its potential for bringing about change and engagement.
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.