A Theoretical Framework for the Evaluation of Massive Digital Participation Systems in Urban Planning

Urban development that strives to meet democratic ideals and the needs of all stakeholders must incorporate public participation. Contemporary participation processes may employ digital tools that open new possibilities regarding the range of participants and the intensity of participation. In particular, they can uniquely allow for large and diverse groups of participants to be involved in collaborative design processes. Evaluating such processes is important because it allows for the justification of the necessary costs and efforts, as well as for continuous improvement. Using the phases specified in the minimal viable process of the U_CODE project as an example, this paper aims to describe criteria for the evaluation of participation processes and to propose several possible methods for their assessment. While the majority of these criteria resemble criteria traditionally used to assess public participation in general, this paper proposes an additional criterion, as well as ways of applying all of the criteria to digital participation methods. In addition, the criteria and methods described in this paper may be used not only for evaluative purposes during or after a digital participation process but also as guidelines during the planning stages of participation processes. Hence, considering these criteria at the inception stage may help to assess the value of the process early on, to avoid mistakes and to enhance the democratic value of the participation process.


Background
In the face of expanding human populations and an ongoing trend towards more urbanisation, the living conditions in cities are crucial for residential satisfaction and, thereby, crucial for life satisfaction for a large share of the world population. Hence, it seems justified for the public to directly participate in the design of urban areas in order to thoroughly incorporate their needs in design decisions. This argument can be justified not only from a pragmatic angle but also from the perspective of democratic theory: core democratic principles like popular sovereignty (Thomassen 2007) require that citizens be intensively involved in decisions regarding issues that will affect them. Clearly, the design of the urban habitat, which is the immediate living environment of the majority of today's human population, is certainly such an issue.

Public Participation in Urban Planning
Established formal procedures involving the public in urban planning commonly feature some kind of notification or public consultation (Rodrigo and Amo 2006), which is often limited. Plans and models are exhibited in public buildings during certain opening hours, with individuals and stakeholder organisations being allowed to put their objections or suggestions on record. These procedures are not a good fit for modern societies characterised by direct and immediate communication and interaction, but also by constant social acceleration (Rosa 2013). Furthermore, they leave a number of promising technological innovations unexploited. The developments in question most notably comprise advancements in communicating architectural designs via virtual or augmented reality applications, allowing for a more immersive experience than traditional plans or models. With the omnipresence of smartphones and other digital handheld devices, citizens are enabled not only to interact via multiple additional, Internet-based forms of communication but also to take advantage of in situ visualisations using augmented reality applications.
Conventional, established methods in urban planning designed to profoundly involve the public consist of procedures where interested participants meet in person, e.g. in 'planning cells' or town hall meetings (Rowe and Frewer 2005; Nanz and Fritsche 2012; U_CODE n.d.). These procedures have been very successful in smaller communities, where the participants know and trust each other. However, these methods are harder to use in large cities where people are often not familiar with each other. Yet it is in larger cities that citizen participation in urban planning would be most relevant, because the majority of the world's population live in such environments.
While traditional face-to-face settings in participatory planning processes are well-tried and widely investigated (e.g. Brown and Chin 2013; Bryson et al. 2013; Horelli 2003), the use of digital tools for urban participation processes is a more recent approach and is less comprehensively analysed. Different tools have been developed for some of the steps involved in such a process, among them the Quick Urban Analysis Kit (qua-kit; Mueller et al. 2018) for crowd-based creation, or several methods based on public participation geographic information systems (PPGIS) like MapNat (UFZ n.d.) or Maptionnaire (Kahila-Tani et al. 2016). Dedicated approaches and use-case descriptions exist for the creation and evaluation of such tools, e.g. for PPGIS (Jankowski and Nyerges 2003; Kahila-Tani 2015; Lu et al. 2018). However, general design guidelines and evaluation criteria for digital participation processes as a whole, and for digital tools employed during different stages within such processes, are currently lacking. It is, therefore, hard to estimate the appropriateness, the impact and the quality of these new models and tools. This paper aims to fill this gap. It is intended as a straightforward introduction that makes the existing social science knowledge on the evaluation of citizen participation easily approachable to stakeholders with different backgrounds, e.g. urban planners or architects.

The U_CODE Process as Framework for Digital Participation and Collaboration for Urban Design
The following section provides a condensed overview of digital participation and collaboration in the context of urban design, by describing one exemplary citizen participation process that heavily relies on digital tools.
Within the EU-funded Horizon2020 project U_CODE (Urban Collective Design Environment), a novel co-design process was proposed to simultaneously improve upon the shortcomings of traditional methods of public participation and make use of modern tools for massive communication (Stelzle et al. 2017). Going beyond consultation and decision-making featured in traditional participation processes, U_CODE puts special emphasis on collaborative ideation (also called 'co-creation') involving large numbers of participants through the use of digital media and tools. The term co-design refers to the collaborative creation (co-creation) of design alternatives for a given design task. Trying to involve citizens in the design of their environments means seriously putting them in control, corresponding to the highest step on the 'ladder of participation' (Arnstein 1969).
Besides developing an array of new instruments and technologies, special focus within the project was put on the design of a comprehensive procedure which, on the one hand, allows incorporating digital public participation and, on the other hand, includes all necessary steps of an ordinary design and development process in urban planning (the so-called minimal viable process; see Fig. 1; Stelzle et al. 2017). Building upon standard German planning procedures (as stipulated by building regulations), it describes several points during the process at which public participation can take place, informed equally by experiences stemming from own practice and by participation literature (e.g. Sanders and Stappers 2008).
The U_CODE process comprises a number of distinct stages (initiating, co-briefing, co-design, professional design, integrating and decision-making), for each of which digital tools can be used, e.g. to convey information on the planned project; to allow ways for participants to contribute their own ideas, needs and wants; or to rank different designs. The project aims at replacing traditional face-to-face participation and collaboration with a digital co-design procedure and co-design tools that, on the one hand, cover all the technical demands that are necessary for a professional and sound design process but, on the other hand, are accessible and understandable also for non-professional co-designers, i.e. the citizen participants. The process chain developed in U_CODE covers a number of tools, e.g. for the analysis of sentiments and discourses, smartphone/tablet co-design games, virtual reality co-design experiences and digital voting systems.
Each stakeholder group (be it project initiators, professional designers, citizen participants or authorities) raises different thematic, intellectual and cognitive demands that must be met by the respective tools. To do so, the process employs tools which are (a) design-oriented and synthetic (resembling visualisation and mapping approaches traditionally used in urban planning and architectural design; Champlin et al. 2019) and (b) of analytic character (i.e. relating to approaches used in mathematics, statistics or social sciences).
One key research interest within the U_CODE project is to examine influencing factors and best practices in order to create and empirically assess a digital toolset which enables digital-mediated participation in urban planning. Possible criteria for the design, evaluation and tools of such a process shall be described in the present paper.
Fig. 1 The minimal viable process proposed in the U_CODE project (yellow: roles of persons involved in the process, red: synthetic tools, blue: analytic tools/processes, green: process steps involving external parties, black: outputs; for details, see Stelzle et al. 2017)

Evaluation as a Scientific Approach: Objects, Criteria and Methods
Evaluation generally refers to the process of systematically assigning a value to a certain investigated object, e.g. a process, an object or living circumstances (for a general introduction, see Rossi et al. 2004; for an introduction to evaluation for public participation methods, see Rowe and Frewer 2000). It is often motivated by the desire to improve the object in question, but it may also serve as a means to legitimise public investments, by measuring the benefits from political decisions or programmes (effectiveness) or relating their outcomes to their costs (efficiency). For example, in a post-occupancy evaluation (POE) of a building (e.g. a hospital or public library), evaluators establish how well the building works. For that, one must analyse how (and how well) the different parts of the building serve the purposes they were designed for (Preiser 1995; Preiser et al. 1988). This can have immediate consequences (identifying actions which can be taken to remedy acute problems), medium-term consequences (identifying problems to be avoided with the construction of the next similar building) or long-term consequences (improving building standards). Evaluations may use objective measurements for easily quantifiable criteria or any of the standard social science methods (e.g. interviews or questionnaires), but may also utilise special procedures befitting the evaluated object (e.g. walkthrough interviews, a method particular to POEs).
For the evaluation of digital participation processes or the use of digital tools within traditional processes, new methods may have to be developed in order to adequately capture the relevant data necessary to assess the quality of a digital process. Previous work has been undertaken to formulate evaluation strategies targeting GIS-supported public participation in planning (Jankowski et al. 2019) or web-based public participation (Stern et al. 2009), both contrasting these approaches to conventional, offline participation processes. Depending on the purpose of the tools used in a specific digital participation process (e.g. co-creation tools), evaluation methods may also comprise the analysis of usage statistics of the employed web platforms or eye-tracking methods for usability analyses of the digital tools.
To produce valid results, evaluations need to be conducted in accordance with high methodological standards, ideally using experimental settings that allow for the identification of causal effects. While rigorous study designs like randomised controlled trials (randomised allocation of participants into trial and control groups; Bothwell et al. 2016) may not often be achievable in evaluations of social processes, any evaluation should strive to come as close to this goal as possible.
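The randomised allocation mentioned above can be illustrated with a minimal sketch. The function name `allocate`, the participant identifiers and the fixed seed are hypothetical choices made for illustration only; a real evaluation would additionally have to consider stratification, consent and attrition.

```python
import random

def allocate(participant_ids, seed=None):
    """Randomly split participants into a trial and a control group of (near) equal size."""
    rng = random.Random(seed)        # a fixed seed makes the allocation reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)            # every ordering is equally likely
    half = len(shuffled) // 2
    return {"trial": shuffled[:half], "control": shuffled[half:]}

# Hypothetical example: ten participants identified by the numbers 0 to 9
groups = allocate(range(10), seed=1)
```

Because group membership depends on chance alone, systematic differences between the two groups are unlikely, which is what permits a causal interpretation of differences in outcomes.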
In general, for an evaluation to measure the attainment of specific goals, these goals must be explicable, concrete and known to the evaluators. Such goals may comprise certain object attributes or process outcomes and will be operationalised in the form of evaluation criteria (see below).

Steps in the Evaluation of Participation Processes
For the evaluation of participation processes, Rowe and Frewer (2004) suggest an evaluation agenda comprising three basic steps:
- Define effectiveness: The goals of the participation process must be known in order to assess them. In this regard, it is crucial to be conscious of the perspective of the evaluation (because different actors within the process may have different definitions of process success, it must be specified for whom the evaluation is conducted) and to decide whether to focus on the outcome and/or quality of the process (Chess and Purcell 1999; Rowe and Frewer 2000). As a main result of this step, specific evaluation criteria must be determined which can be measured in the next step.
- Operationalise the definition: In this step, methods must be chosen which allow assessing the extent to which the goals of the participation process have been met. Such methods may comprise questionnaires or interviews, but also the collection of rates or patterns of certain behaviours. If the necessary methods do not exist, they must be developed by the evaluators, according to quality criteria of social science instruments (i.e. validity, reliability, objectivity and utility; e.g. Rossi et al. 2004).
- Conduct the evaluation and interpret results: Once the evaluation criteria have been defined and corresponding methods have been chosen or developed, the actual evaluation is conducted, meaning that data is collected and statistically analysed and conclusions are drawn from it. The results are often presented in the form of an evaluation report.
Regarding the evaluation of public participation processes, goals are not only defined by the quality of the specific outcomes of the process (or of process steps). Moreover, there are a number of normative criteria, relating to the process itself. They are derived from democratic ideals (e.g. equality and equity, fairness, transparency, etc.) which should always be met when involving the public in decision-making.
This paper puts an emphasis on these general criteria because they are valid for all phases and tools used in any participation process. More concrete criteria can and should be derived to assess specific process tools or procedures. While the present paper provides examples of such possible specific criteria for tools used within the U_CODE process, they may have to be adapted to be used for other processes. Where process designers and software engineers experience difficulties in developing specific design and evaluation criteria applicable to their tools and procedures, they are advised to collaborate with evaluation experts versed in democratic theory, evaluation methods and product usability (e.g. political scientists, psychologists or sociologists).

Evaluation Criteria for the Process Quality of Massive Digital Participation
The criteria described in the current and the next section are based on the Core Principles for Public Engagement (NCDD 2009), and they can be used to evaluate the quality of the participation process as a whole. However, they are also applicable for each process step and for every tool used within the participation process.

Representativeness
The individuals involved in the participation process (the "participants") shall be representative of the population affected by the project or design (the "public"). This is crucial to ensure the democratic legitimacy of the whole process.
Hence, a process facilitator should monitor the make-up of the participants regarding certain socio-demographic characteristics. Variables to be considered may include gender, age, education, socio-economic status or the place of residence.
The proportions of these variables among the participants should be similar to those within the public, ideally also regarding combinations of multiple variables. For example, if a public consists of 500 women aged above 50 years and 500 men aged below 50 years, a participant group of five women aged below 50 years and five men aged above 50 years may seem statistically representative when gender balance and mean age are examined in isolation, but not when the combination of the two variables is examined. This may be of particular relevance for certain demographic variables. For instance, trying to represent certain minorities in a participant group according to their proportion within the total population may result in biases in terms of educational status (Boulianne 2018). While the selection of demographic variables and possible combinations to be separately examined may vary for different participation processes, they must be chosen on firm grounds, ideally based on previous empirical evidence. Hence, it seems sensible to consult sociologists, political scientists and/or psychologists at this stage if they are not part of the project team already.
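The example of the 500 women and 500 men can be worked through in a few lines of code. The sketch below is illustrative only: the age variable is simplified to two categories, and the total variation distance is used as one possible measure of the gap between two distributions (an assumption made here, not a measure prescribed by the cited literature). The helper names `proportions` and `total_variation` are hypothetical.

```python
from collections import Counter

def proportions(records, keys):
    """Share of each combination of the given attributes among the records."""
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    total = sum(counts.values())
    return {combo: n / total for combo, n in counts.items()}

def total_variation(p, q):
    """Total variation distance: 0 means identical distributions, 1 means no overlap."""
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))

# The public and the participant group from the example in the text
public = ([{"gender": "f", "age": "over50"}] * 500
          + [{"gender": "m", "age": "under50"}] * 500)
participants = ([{"gender": "f", "age": "under50"}] * 5
                + [{"gender": "m", "age": "over50"}] * 5)

gender_gap = total_variation(proportions(public, ["gender"]),
                             proportions(participants, ["gender"]))
age_gap = total_variation(proportions(public, ["age"]),
                          proportions(participants, ["age"]))
joint_gap = total_variation(proportions(public, ["gender", "age"]),
                            proportions(participants, ["gender", "age"]))
```

Here `gender_gap` and `age_gap` are both 0.0, whereas `joint_gap` is 1.0: each variable looks perfectly representative on its own, but not a single participant belongs to a gender-age combination that actually exists in the public.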
A participation process facilitator may employ specific strategies to achieve a representative group of participants. For example, to invite the public to the process, one may try to maximise the number of different communication channels typically used by different socio-demographic groups. Alternatively, a random sample of the public can be drawn and then be invited to participate in the process. However, due to self-selection biases, such recruiting strategies may not be sufficient for the participants to be truly representative of the public. To correct this problem, the socio-demographic composition of the participants should be continuously monitored and invitation efforts be intensified for the hitherto underrepresented socio-demographic groups.
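The continuous monitoring described above could, for instance, be supported by a simple check that compares the current participant shares with the known shares in the public. The following is a minimal sketch under stated assumptions: the age groups, the figures and the tolerance of five percentage points are invented for illustration, and `underrepresented` is a hypothetical helper name.

```python
def underrepresented(public_shares, participant_counts, tolerance=0.05):
    """Return the groups whose participant share falls short of the public share
    by more than `tolerance`, together with the size of the shortfall."""
    total = sum(participant_counts.values())
    flagged = {}
    for group, target in public_shares.items():
        share = participant_counts.get(group, 0) / total if total else 0.0
        if target - share > tolerance:
            flagged[group] = round(target - share, 3)
    return flagged

# Hypothetical figures: shares in the public and head counts among participants
public_shares = {"18-29": 0.20, "30-49": 0.35, "50-64": 0.25, "65+": 0.20}
participant_counts = {"18-29": 40, "30-49": 40, "50-64": 15, "65+": 5}

flagged = underrepresented(public_shares, participant_counts)
```

In this invented case the two older age groups would be flagged, indicating where invitation efforts might be intensified. In line with the preceding paragraph, the same check could also be applied to combinations of variables rather than to single variables.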
Despite such efforts, even if the composition of participants is representative of the public, this will not guarantee the output from the participants (e.g. the number of utterances) to be correspondingly representative of the ideas existing in the public. Hence, it may be appropriate to over-represent certain groups of participants, for example, people who usually talk less and/or with a quieter voice (Schlozman and Brady 1995), especially if they are of particular relevance for the specific project. These groups may comprise women, immigrants, less educated persons, children or social minorities. As a rule of thumb for Western societies, this may include any group besides white cisgender heterosexual men.

Inclusiveness
Inclusiveness refers to the opportunity and ability of all participants to equally contribute to the process. Although in theory, each participant may be equally able to voice their opinion, in practice, certain groups of participants are more likely to do so than others (Schlozman and Brady 1995). In this regard, gender, age and education are important factors to consider. Although trying to over-represent certain participant groups (see above) may be a means to reduce this bias, the effects of this strategy can be limited. This means that the different abilities and communication habits of different participants must also be taken into consideration when designing the processes themselves. Besides that, powers that different participants may hold over one another outside the participation process as well as differences regarding the process-relevant information and expertise individually available may become problematic issues (Forester 1982). During face-to-face participation settings, facilitators can try to balance the participants' contributions by deliberately cutting off certain participants and actively inviting others to contribute, based on the frequency with which different participant groups have contributed already.
Using digital tools to mediate communication in participation processes has advantages and disadvantages in this regard, where some barriers to communication are reduced (e.g. less educated participants are not in direct contact with more educated participants by whom they may feel intimidated), but other problems may arise. First of all, the technical and informational literacy of participants is a critical factor, meaning certain demographic groups are less proficient in using digital media than others (e.g. some older participants). Secondly, some groups may be less motivated by the prospect of participating in a process aiming at different abstract possibilities for the future, without experiencing instant results. While this of course is a fundamental problem equally relevant for traditional participation processes, the additional layer of abstraction and social isolation imposed by digital media may take away some of the gratification of achieving something together in a group.
Thirdly, the use of the digital media may exclude participants with disabilities. While digital tools may be unproblematic or even beneficial regarding some disabilities such as problems with hearing, special attention must be paid to certain other disabilities, e.g. visual deficiencies, somatosensory problems or cognitive deficits. For the design of user interfaces of electronic tools, the corresponding international standards on accessibility (ISO/TC 159 Ergonomics 2008) may be instructive. In addition, in every communication during the process, it is important to use plain, clear language accessible to all participants, to accommodate for different cognitive abilities of a diverse public. This can be achieved by using short and logical sentences each containing one idea only, formulated in active voice and avoiding technical jargon (Directorate-General for Translation 2016).

Internal Transparency
For participation processes to be successful and legitimate, they must be transparent. This means that at any time during the process, it must be obvious to everyone where the process is headed, what the next step is and why it is being taken. Hence, the information necessary to understand the process must be up to date, easily obtainable and comprehensible to anyone.
In traditional participation processes, process designers are responsible for the provision of this information, and trained facilitators play a crucial role in establishing transparency in face-to-face settings. In digitally facilitated participation processes, internal transparency is also a design challenge for the software engineers. The electronic tools and media used must clearly inform on the overall process, the rationale behind the current step and the concrete task to be completed with the current digital tool. It is crucial that all information is provided in a timely and comprehensive manner and is easily accessible to everyone (see inclusiveness).

Facilitation of Deliberation
Public participation aims at generating collective, broadly accepted decisions, ideally found via a consensus of all participants. In particular, when the interests of different stakeholders are initially very disparate, the process must allow for the participants to understand one another in order to take each other's perspective. The mutually respectful, fact-bound exchange of ideas, bearing the potential of changing one's mind, is called deliberation (Roberts 2004). Any participation process must aim at facilitating deliberation, by establishing the general atmosphere, the specific process steps and the respective tools. Higher-quality deliberation will increase the likelihood of finding consensual agreements. Following the notion of agonistic pluralism (e.g. Mouffe 1999), one may argue that a consensus need not necessarily be found. Nonetheless, a high-quality deliberative process will make it more likely for a decision brought about by a majority vote to be accepted by the outvoted participants.
For participation processes relying on digital media and digital tools, this criterion can be particularly challenging, because digitising such a process aims at reducing the need for professional facilitators who, in face-to-face participation settings, will actively try to improve the deliberation quality. Hence, any digital tool used during deliberation processes must allow for each participant to make ideas and arguments heard and for the other participants to improve their comprehension of the respective idea or argument through further inquiry.
In face-to-face participation settings, the group communication will be facilitated by neutral communication experts, ensuring that only the relevant topics are being focused on, that procedures are followed, that the relations and differences between the participants are not hindering the process and that all participants use the provided tools to their full potential (Pelzer et al. 2015). Digitally mediated participation processes must find ways to incorporate high-quality deliberation without external moderators. For example, a process designer may opt to organise a social system among the participants, e.g. by enabling the participants to assume the role of an impartial moderator who may have additional rights (e.g. banning 'trolls' from the discussion, ending fruitless discussions, requiring participants to explain their proposal better in order for it to be discussed any further or initiating decisions on specific questions or on general rules of conduct). Having a clear set of rules for respectful communication and a subset of participants with the right to enforce these rules may lead to a more civilised atmosphere and, finally, to a higher level of deliberation. The moderators can, for example, be democratically elected by all participants. In the spirit of gamification, a process may alternatively design this role to be earned via a certain number of contributions which have been evaluated as being 'constructive' or 'useful' by the other participants in previous steps.
Digital tools automatically monitoring the quality of the deliberation (e.g. by sentiment analysis and opinion mining; Liu 2012) may be used to identify the most controversial issues within the process, where the appointment of impartial moderators would be most effective.
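How such monitoring might flag controversial issues can be sketched as follows. The sketch assumes that each contribution has already been assigned a sentiment score between -1 (very negative) and 1 (very positive) by some sentiment analysis tool, and it uses the spread of these scores per topic as a rough proxy for controversy. Both the proxy and the example data are assumptions made for illustration; they are not part of the U_CODE toolset, and `rank_by_controversy` is a hypothetical name.

```python
from statistics import pstdev

def rank_by_controversy(scores_by_topic, min_contributions=3):
    """Rank topics by the spread (population standard deviation) of sentiment scores.
    Topics with too few contributions are skipped because their spread is unreliable."""
    ranked = [(topic, pstdev(scores))
              for topic, scores in scores_by_topic.items()
              if len(scores) >= min_contributions]
    return sorted(ranked, key=lambda item: item[1], reverse=True)

# Hypothetical sentiment scores per discussion topic
scores_by_topic = {
    "parking": [0.9, -0.8, 0.7, -0.9],   # strongly divided opinions
    "playground": [0.6, 0.7, 0.5],       # broad agreement
    "bus stop": [0.1],                   # too few contributions to judge
}

ranked = rank_by_controversy(scores_by_topic)
most_controversial = ranked[0][0]
```

A topic with strongly diverging scores, such as the invented "parking" discussion, would appear at the top of the ranking and could be prioritised when assigning moderators.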

Evaluation Criteria for Massive Digital Participation Process Outcomes
Regarding the outcomes of public participation, one may distinguish between (1) the effects the process has on political or technical decision-making (regarding the impact not only on specific decisions but also on decision-makers and the general public) and (2) the effects the process has on the participants. Although the former may be separately evaluated, for the sake of brevity, they are pooled for the purposes of the present paper and summarised as 'external transparency'.

External Transparency as a Prerequisite for Effects on the Public and on Political Decisions
A major challenge faced when evaluating any public participation process lies in verifying and quantifying a causal effect of the process on public sentiment and specific political decisions. For one, there may be a substantial time lag between the process and its consequence(s). The participation process may well have an impact on a political decision; however, this decision may only be taken after the evaluation of the participation process. As a second challenge, in complex systems like the societal interrelationships underlying and preceding political decisions, accounting for causal effects of any single factor on that decision is difficult. Where a number of different factors are influencing political decisions simultaneously and in interaction, it will be hard to establish the single effect of a participation process, especially when it is conducted through a process of consultation. A third evaluation challenge concerns generalisability: The actual output quality of different tools can, of course, be the subject of an evaluation, but specific criteria will strongly depend on the tool and task in question, complicating comparisons between different tools or participation processes.
Therefore, in order to evaluate the external outcomes of a process or tool, an evaluation may rather rely on the prerequisites for these outcomes to come about, namely the external transparency of the process or tool. For public participation to achieve external transparency, actions must be taken to continuously inform the public about what happens within the process, and why. In other words, the results of the process must be made transparent, as well as the process itself. In massive electronic participation processes, where the distinction between participants and the (non-participating) public is less clear-cut because they are being conducted completely in an open fashion, all of the actions to achieve internal transparency will simultaneously be beneficial for the external transparency of the process (see above).

Effects on the Participants
Participation processes not only yield results relevant to political decisions (e.g. in relation to urban planning projects) but should also have effects on the subjects partaking in the process. For one, this refers to the understanding of the subject matter at hand, which should ideally have increased during the participation process. Such effects can come about through being engaged with the subject matter within the participation process, but also through learning from the other participants and the process of collaborative knowledge construction, which are important motivations for future participation (Bandura 1977).
Altogether, the participation process should have built up participation willingness and participation competencies, i.e. leaving its participants motivated and empowered to be valuable participants in future participation processes. The latter specifically comprises deliberation skills, i.e. the ability to engage in an open, rational exchange of ideas, where every discussant strives to make their own view understood by the others, while being willing not only to understand the views of the other participants but also to adopt them if they are better. Regarding the motivational aspects, a central indicator to be measured would be the satisfaction with the current participation process.

Possible Operationalisations of the Evaluation Criteria Using the Example of the U_CODE Process Phases
In addition to the previously outlined evaluation criteria that apply to participation processes in general, we suggest novel criteria that incorporate tools and media used in digital participation processes: First and foremost, the whole process and every tool used therein must respect the privacy of the participants and be compliant with applicable laws and regulations concerning data security and privacy. This may be a challenging task because it means balancing two conflicting requirements. On the one hand, transparency is crucial within the process. It is necessary to attribute contributions to specific participants and to allow feedback and collaboration, in particular when online and offline tools are mixed. On the other hand, anonymity may be desired (and legally required) for external communication and long-term documentation.
Besides that, the tools used in the process must of course function flawlessly. One may derive additional evaluation criteria from these technical requirements. This may for example concern technical robustness, operational safety, speed and cost-effectiveness of the hardware and software.
Massive digital participation systems comprise a sequence of distinct phases with different intermediate goals. In the following subsections, we outline possible operationalisations of the criteria for these phases, using the U_CODE process (see Fig. 1) as a paradigmatic example (see Table 1 for an overview of the possible operationalisations).

All Phases
Regarding most of the evaluation criteria described above, possible operationalisations would be very similar in all phases (although they should of course be measured separately for each phase). In particular, this pertains to representativeness, inclusiveness, external transparency, the quality of the digital tools and the effects on the participants. We address the following criteria first: During each process step, the participants should be representative of the people affected by the design. The data necessary for statistically analysing the demographic characteristics of the participants may be readily available within the process (e.g. through a registration process or via associated social network accounts), or it may have to be actively collected. However, the evaluation should cover not only the make-up of the participants themselves but also the weights of the output from each demographic subgroup in each step. In order to achieve this, the contributions of each individual (utterances, design proposals, inquiries on other participants' ideas) must be counted and be related to the demographic characteristics.
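As an illustration of this counting step, the following minimal Python sketch compares each demographic subgroup's share among the participants and among the contributions with its share in the affected population. The subgroup labels, the reference shares and the record layout are hypothetical and serve only as an example; an actual evaluation would take them from registration data and official statistics.

```python
from collections import Counter


def subgroup_shares(participants, contributions, population_shares):
    """Compare subgroup shares in the population, among participants and in the output.

    participants: dict mapping participant id -> subgroup label (hypothetical layout)
    contributions: list of participant ids, one entry per contribution
    population_shares: dict mapping subgroup label -> share in the affected population
    """
    participant_counts = Counter(participants.values())
    contribution_counts = Counter(participants[pid] for pid in contributions)
    n_participants = len(participants)
    n_contributions = len(contributions)

    result = {}
    for group, pop_share in population_shares.items():
        result[group] = {
            "population": pop_share,
            "participants": (participant_counts[group] / n_participants
                             if n_participants else 0.0),
            "contributions": (contribution_counts[group] / n_contributions
                              if n_contributions else 0.0),
        }
    return result


# Hypothetical example data for one process step
participants = {"p1": "under_30", "p2": "under_30", "p3": "30_to_59", "p4": "60_plus"}
contributions = ["p1", "p1", "p1", "p2", "p3", "p4"]
population_shares = {"under_30": 0.3, "30_to_59": 0.4, "60_plus": 0.3}

shares = subgroup_shares(participants, contributions, population_shares)
for group, row in shares.items():
    print(group, row)
```

In this invented example, a subgroup that makes up 30% of the affected population accounts for half of the participants and two thirds of the contributions, which is the kind of imbalance in output weights that the evaluation should make visible for each step.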
For the evaluation of inclusiveness, the provision of information on the procedure and the editing and presentation of the subject matter should be assessed. It must also be determined whether the process designers accommodate the different communication habits and abilities of special participant groups. For an evaluation, this may mean checking whether plain, clear language is used and technical jargon is avoided (as outlined in a number of style guides; e.g. Directorate-General for Translation 2016). Also, the digital tools employed in the process should be easily usable and take into consideration different physical or cognitive disabilities on the part of the participants. In this regard, evaluators may refer to pertinent technical guidelines for measurable operationalisations, e.g. the international standards on accessibility (ISO/TC 159 Ergonomics 2008; esp. the list of requirements in Annex B and the checklist in Annex C.1).
To evaluate external transparency, evaluators may investigate to what extent, and in what quality, the information collected and presented during the process is compiled into documentation that allows external parties to easily comprehend each step and each result at any time during or after the process. The process facilitators should not only compile this output after the process but also constantly feed it out of the process, via automated channels or in an edited format. While the accessibility and comprehensibility of the output may be evaluated via judgements of communication experts, verifying whether it has indeed been accurately understood by the public should involve the public itself, e.g. by interviewing diverse members of the public about their understanding of the process output. The evaluation may also focus on those persons within the process who are responsible for public relations. Evaluators may check not only whether they are constantly informing the media, relevant stakeholder organisations and political decision-makers but also whether they are available to competently answer inquiries from the public.
The quality of the digital tools is also important for all process phases, concerning both basic technological functioning in general and data protection requirements in particular. For an evaluation of the former, one may, for example, analyse the number and frequency of automatically logged software crashes (relative to the number of active users) or user judgements regarding the rendering speed of virtual reality applications. Further parameters of interest in this regard are the amount and quality of the processed data. This includes questions such as the completeness of data sets, appropriateness of the data formats or the availability of metadata. Regarding aspects of data protection and privacy, an evaluator may consider seeking expert judgements from the responsible data safety officers.
Table 1 (excerpt): evaluation criteria and possible operationalisations
External transparency: Availability, comprehensiveness and comprehensibility of the information regarding the overall process, the current task, the aim of the current phase, its role within the overall process and the results that have already been achieved; quantity and quality of the output fed to the public via automated feeds or dedicated public relations specialists
Effects on the participants: Accurate understanding of the project objectives; satisfaction with the process; subjective assessments of deliberation quality and likelihood to participate in future participation processes; actual and perceived knowledge gains regarding the process' subject matter
Quality of the digital tools: Fulfilment of basic technical requirements (e.g. technical robustness, operational safety, speed and cost-effectiveness of the hardware and software); employment of a data protection strategy which adequately balances the needs for process transparency and participant privacy
The effects on the participants could be measured by administering questionnaires to the participants at the beginning and end of the process, with questions measuring the factual knowledge the participants have of the subject matter and/or self-assessments regarding the perceived level of understanding. Comparing the two measurements, the evaluators can determine the extent to which the process has led to improvements in understanding the subject matter. The participants' motivation to participate in future processes could be measured by asking questions relating to the satisfaction with the process, the subjective assessments of deliberation quality and likelihood to participate in future participation processes. In the case of digitally mediated public participation, the process satisfaction must also cover the technological aspects of the process, i.e. the satisfaction with the used tools and a willingness to use them again in the future.
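The comparison of questionnaire scores collected at the beginning and at the end of the process might be sketched as follows. The participant identifiers, the score scale and the function name are hypothetical; the sketch is purely descriptive and only pairs the two measurements for participants who answered both questionnaires.

```python
from statistics import mean


def knowledge_gains(pre_scores, post_scores):
    """Return the score change per participant who answered both questionnaires.

    pre_scores, post_scores: dicts mapping participant id -> knowledge score
    (hypothetical layout; any consistent numeric scale would do).
    """
    answered_both = sorted(set(pre_scores) & set(post_scores))
    return {pid: post_scores[pid] - pre_scores[pid] for pid in answered_both}


# Hypothetical scores; "p4" joined later and has no first measurement
pre = {"p1": 4, "p2": 6, "p3": 5}
post = {"p1": 7, "p2": 6, "p3": 8, "p4": 9}

gains = knowledge_gains(pre, post)
mean_gain = mean(gains.values())
share_improved = sum(gain > 0 for gain in gains.values()) / len(gains)

print("gains per participant:", gains)
print("mean gain:", mean_gain)
print("share of participants who improved:", share_improved)
```

Such a comparison describes the change within the process only. Whether the change can be attributed to the process is a separate question that depends on the availability of a suitable control.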

Phase 1: Process Initiation
Starting from a general idea, an 'initial brief' will be created which roughly outlines the envisioned project.
The initial brief may include information on the project scope, relevant stakeholder groups and general objectives. It will be co-created by the project initiator and the super moderator; hence, a critical factor for internal transparency will be the quality of the communication between these two actors. It may be possible to assess the communication quality by analysing documents created in this phase (e.g. e-mails) or by interviewing the actors.
In the previous section, we generally outlined possible operationalisations for the assessment of the effects on the participants. Because the process initiation phase does not directly involve any participants, only a selection of these operationalisations apply to it. In the process initiation phase, effects on the participants will primarily concern the degree to which the public accurately understands the project's objectives. This may be ascertained using interviews or questionnaires.

Phases 2 and 3: Co-briefing and Co-design
In the co-briefing phase, the initial brief will be enriched by requirements regarding the project contributed by process participants, using digital brainstorming and idea-harvesting tools. In the co-design phase, digital co-creation tools will be used to create a professional design brief, consisting of (low-level) design proposals. Both phases are similar regarding internal transparency and facilitation of deliberation: To ensure internal transparency, it is crucial that instructions for the task(s) to be carried out in the respective phases are readily available and that they are correctly understood by the participants. Also, it must be clear what the current task aims at and which role it has in the overall process. Since co-briefing and co-design may be novel tasks for many participants, special emphasis may have to be put on explaining their aims, the differences between the two tasks and their roles within the overall process. An evaluation may employ questionnaires at the end of the respective phase, asking the participants for subjective assessments regarding these issues. Also, to identify potential for improvement concerning the availability and comprehensibility of the information provided in these phases, digital support systems may be analysed (if present). By counting the number of questions, and particularly the frequency of questions relating to very similar aspects, evaluators may identify potential weaknesses.
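The analysis of support questions could, for example, be sketched as below. The sketch assumes that each question logged by a digital support system has already been assigned a topic label (by support staff or by a text-mining step); the labels, the example questions and the threshold are hypothetical.

```python
from collections import Counter


def frequent_topics(questions, min_share=0.2):
    """Return topics whose share of all support questions is at least min_share.

    questions: list of (topic_label, question_text) tuples (hypothetical layout)
    """
    counts = Counter(topic for topic, _text in questions)
    total = sum(counts.values())
    return {
        topic: count / total
        for topic, count in counts.most_common()
        if count / total >= min_share
    }


# Hypothetical support log from the co-briefing phase
questions = [
    ("task_aim", "What is the co-briefing supposed to produce?"),
    ("task_aim", "How does this differ from the co-design task?"),
    ("task_aim", "What happens with my requirements afterwards?"),
    ("login", "I cannot sign in with my account."),
    ("voting", "When does the voting start?"),
]

weak_spots = frequent_topics(questions, min_share=0.4)
print(weak_spots)
```

A topic that attracts a large share of the questions, here the aim of the task, would point to information that is missing or hard to understand in the instructions for that phase.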
To facilitate deliberation, the tools employed in the process must enable the participants to effortlessly express their own ideas and to easily understand other participants' ideas in order to build upon them. This is crucial for the collaborative quality of co-briefing and co-designing. The perceived ease of expressing, understanding and building upon ideas, as well as the participants' readiness to adopt and engage with other participants' ideas, may be assessed using questionnaires administered to the participants at the end of the respective phase. Also, evaluators may analyse documents or inquire with the process designers regarding possible strategies used to facilitate deliberation. To evaluate the actual deliberation quality, evaluators may develop methods specific to each tool, e.g. focusing on the number of ideas created or built upon, the number of contributions to discussions or the number of incivilities reported to moderators during the process. Furthermore, if monitoring tools like sentiment analysis and opinion mining (Liu 2012) are used within the process, they may also be employed to automatically and continuously gauge the deliberation quality, e.g. by assessing the civility of the interaction between the participants.
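Tool-specific indicators of this kind might be derived from logged contributions as in the following sketch. The record layout (an author and, optionally, the author of the idea that is built upon) and the choice of indicators are assumptions made for illustration, not features of any particular tool.

```python
def deliberation_indicators(contributions, incivility_reports):
    """Compute simple count-based indicators of deliberation quality.

    contributions: list of dicts with the keys 'author' and 'builds_on', where
        'builds_on' is the author of the idea that is built upon, or None
        (hypothetical layout)
    incivility_reports: number of incivilities reported to moderators
    """
    total = len(contributions)
    if total == 0:
        return {"contributions": 0, "share_building_on_others": 0.0,
                "reports_per_100_contributions": 0.0}
    # Only count contributions that build on another participant's idea
    building_on_others = sum(
        1 for c in contributions
        if c["builds_on"] is not None and c["builds_on"] != c["author"]
    )
    return {
        "contributions": total,
        "share_building_on_others": building_on_others / total,
        "reports_per_100_contributions": 100 * incivility_reports / total,
    }


# Hypothetical log of one co-design session
log = [
    {"author": "p1", "builds_on": None},
    {"author": "p2", "builds_on": "p1"},
    {"author": "p3", "builds_on": "p2"},
    {"author": "p1", "builds_on": "p1"},  # refining one's own idea is not counted
]

indicators = deliberation_indicators(log, incivility_reports=1)
print(indicators)
```

Relating the counts to the total number of contributions keeps the indicators comparable between phases and tools with very different numbers of participants.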

Phase 4: Professional Design
The previous two phases resulted in a co-designed project brief and low-level design proposals, which together make up a brief for the professional design phase. In this phase, professionals create design proposals, possibly in the format of a conventional design competition. While following their established work procedures, the design firms may communicate their ongoing work to the public and receive feedback, for example via sentiment analysis.
To warrant internal transparency of this phase within the overall process, the design tasks carried out by the professionals should be thoroughly communicated. For example, it may be explained to the public how the professional design process works, i.e. which inputs are used by the design professionals (and which are not used) and which intermediary steps are taken within the process. The public in general, and the process participants in particular, may also need to be instructed on how to interpret the output of the process, i.e. the professional design proposals. Since most laypeople are not accustomed to reading architectural plans and understanding handcrafted or virtual architectural models, dedicated strategies for architecture communication may be necessary to facilitate the next steps (integration and voting). For the evaluation, questionnaires may be used via which the participants are asked to judge how well they understand the professional design process and the resulting design proposals.
In order to facilitate deliberation, the process designers may provide feedback tools which allow for rational discussions of the intermediary products of this phase. Such feedback platforms may be moderated to ensure respectful and fruitful discussion, e.g. by citizen moderators (for details regarding this idea, see Section 'Facilitation of deliberation'). For the evaluation, the participants may be asked via questionnaires about their perceptions regarding the deliberative qualities of the platform and the quality of its moderation. Also, if sentiment analyses are conducted within this phase to gather feedback on the designs, their results may, on the one hand, be used to gauge the deliberation quality, e.g. by assessing the civility of the interaction between the participants. On the other hand, the evaluation may try to establish how well the sentiment analysis itself was conducted and how much it contributed towards the aim of objectifying the process, i.e. which consequences were drawn from it.

Phase 5: Integration
In the integration phase, the co-design brief, the design proposals, the professional designs and information from analyses of the public sentiment are integrated, using a gallery tool which allows for discussion and voting. A final design proposal which enjoys broad participant support constitutes the output of this phase.
Regarding the evaluation of internal transparency and facilitation of deliberation, this phase largely calls for the same operationalisations as in phases 2 and 3 (co-briefing and co-design).
Regarding internal transparency, it is additionally important in the integration phase that the voting process is transparent, i.e. it must be clear which differences exist between the alternatives to be chosen from and how the voting process works. The evaluation may rely on expert judgements regarding these issues, but also use questionnaires to assess the participants' perception of the transparency.
For the facilitation of deliberation, the gallery tool will need to allow for discussion and voting. The tool itself, as well as the quality of possible moderation efforts therein, can be evaluated using the same approach as the evaluation of the feedback tool in phase 4 (see above).

Phase 6: Voting
In this last phase, the final design proposal will be approved by the project initiator and later be handed over to the authorities, in order for them to continue the legal process leading to the implementation of the design proposal.
In this phase, the evaluation may focus on the quality of communication between the involved parties and the transparency of the decision-making. Similar to the evaluation of internal transparency in phase 1, evaluators may consider assessing the communication quality by analysing the documents created in this phase (e.g. e-mails, resulting planning documents) or by interviewing the actors. Beyond that, one may argue that a participation process will only have been successful if formal decision-making accepts and implements its outcome. Hence, if the timing of the evaluation allows, the actual implementation of the voting results should also be evaluated.

Discussion
Massive digital participation systems may be an important means of striving towards attaining democratic ideals in urban planning because they allow for substantial parts of the public to take part in shaping their living environment according to their wants and needs. Since such forms of public participation are still novel, they need to be evaluated in order to appropriately employ them, justify their use and further improve them. Previous studies comparing specific digital and traditional participation processes (e.g. Jankowski et al. 2019; Stern et al. 2009) indicated that digital processes can be at least equally effective. The present paper tries to add to this line of research by proposing a set of general evaluation criteria building upon evaluations of traditional processes, taking into consideration the specific aspects relevant to the heavy use of digital tools and proposing possible ways of operationalising these criteria for the evaluation of digital tools.
The evaluation of digital participation processes bears both similarities with and differences to the evaluation of conventional participation processes. Similarities concern the general evaluation criteria which can be derived from theories of democracy. Novelties pertain to the fundamental differences in the process itself, i.e. face-to-face interaction playing a subordinate role and instead the use of digital tools and digital media being at the core of the process. Taking these differences into consideration, the criteria and methods used in the evaluation of the participation process will change, making some parts of the evaluation less labour-intensive, e.g. when contributions from the participants do not have to be counted manually because the data created using digital tools can more easily be analysed. In other regards, however, the evaluation may require more effort, especially because the number of participants involved can be larger by orders of magnitude, which makes data collection and data analysis more complex and requires efficient data analysis tools and procedures. Data collection methods must be chosen carefully, trying to balance the resources (e.g. regarding time, money and participant effort) associated with certain methods (e.g. questionnaires, interviews) with the potential value gained from using them over less expensive ways of collecting data (e.g. server logs, text mining). Participant motivation during a process and its evaluation may be maintained and increased using gamification strategies, but possibly at the cost of changing the source of motivation to participate (e.g. Thiel and Fröhlich 2017).
Questions to be focused on in future evaluations of massive digital participation processes should relate to these differences, i.e. it may be asked whether the traditional evaluation criteria need to be adapted to account for the use of digital media, or which new criteria must be formulated to fairly assess the technical requirements that need to be met (e.g. technical robustness, operational safety, speed and cost-effectiveness of the hardware and software). Before the evaluation criteria and their operationalisations suggested in the present paper are applied in any actual evaluation process, they should be critically reviewed for possible additions, omissions or alterations and adapted to the particularities of the respective participation process, especially the particular selection of digital tools.
From a methodological standpoint, the integration of a control group is an essential part of any rigorous evaluation in which the effectiveness of a participation process is to be verified. This is necessary to control external variables (e.g. external effects brought about by changes in society) that may also influence the outcomes of the participation process. Ideally, this control would consist of studying the same urban planning task being conducted in the same city at the same time but without the participation system the evaluation is concerned with. This is, of course, unrealistic in practice. In fact, it may be very hard to find a suitable control at all, even in the form of a similar urban planning task in the same city or the same kind of planning task in a similar city. If the differences in the digital participation process to be evaluated are very small, the conclusions to be drawn from the evaluation are much weaker. Therefore, several participation processes should be studied so that a large number of processes can even out the influence of possible confounding factors.