Evaluating universal design of built environments: an empirical study of stakeholder practice and perceptions

Universal design aims to reduce environmental barriers and enhance usability of buildings for all people, particularly those with disabilities. There are known challenges relating to the evaluation of universal design and evidence supporting this concept is limited. This study aimed to gather information on current practice and what stakeholders perceive as important to universal design evaluation. A mixed methods approach was employed, and data were collected via online survey (n = 157) and semi-structured interviews (n = 37). Participants included industry professionals, policy makers, government officials, academics, and people with disabilities. Just over one-third of participants stated that they had experience of evaluating universal design in public built environments. Checklists were most commonly used, yet participants expressed concern with their suitability for this purpose. Almost all participants perceived evaluation of universal design as important, citing its value to advocacy, professional development and strengthening the evidence base of universal design. Findings from this study highlight a tension between a desire for efficiency and consistency, as offered by a checklist approach, and the adoption of a holistic and multidisciplinary method of evaluation that encompasses the complexity of universal design application.


Introduction
Universal design (UD) is an approach where designers go beyond minimum standards and legislation to design a building that is usable by all people (Hamraie, 2017). As a means of reducing discrimination and enhancing social participation, it is called for internationally by the United Nations Convention on the Rights of Persons with Disabilities (CRPD) (United Nations, 2007) and, in Australia, by a range of national, state and local policy directives. Despite growing demand, there are no agreed upon models or methods to evaluate how UD is applied or what outcomes are achieved. This study aimed to gather information from a range of stakeholders on current UD evaluation practice, and thus inform suggestions for improving this practice.

Background
The Principles of Universal Design (Connell et al., 1997) were developed to guide practitioners on what should be considered during UD application. However, in emphasising outcomes relating to usability for all people, these guidelines have been critiqued as being too subjective (Lid, 2013), broad, and dynamic to measure easily (O'Shea et al., 2016). In contrast to accessibility, which refers to environmental parameters that influence human function, are quantifiable, and are mandated via legislation (Erkiliç, 2011; Ostroff, 2011), the concept of usability denotes a 'fit' between the abilities of a person and their environment that supports activity performance (Iwarsson & Stahl, 2003). In more recent work, Steinfeld and Maisel (2012) argue that a broader focus on outcomes relating to social inclusion and equity would better meet current political, social, and economic environments and, in response, these authors propose a new definition of UD that highlights a broader range of outcomes: "a process that enables and empowers a diverse population by improving human performance, health and wellness, and social participation" (p. 29).

Evaluation methodology
Evaluation of building processes and outcomes provides valuable information to not only optimise the design and function of a new building but also to inform future designs. However, evaluating public buildings is inherently complex and can involve assessing diverse performance outcomes, including environmental (e.g. energy/water consumption), physical and space (e.g. structural stability/ergonomics), psychosocial (e.g. thermal/acoustic comfort), and socioeconomic (e.g. security/maintenance cost) (Vásquez-Hernández & Restrepo Álvarez, 2017). Architectural projects typically proceed through eight temporal stages: Strategic Definition; Preparation and Brief; Concept Design; Developed Design; Technical Design; Construction; Handover and Close-out; and In Use (Royal Institute of British Architects (RIBA), 2013). Performance evaluation may occur during design or construction phases using methodologies such as simulation, and post-occupancy using processes that seek to gather feedback on actual performance (Preiser, 2001a; Vásquez-Hernández & Restrepo Álvarez, 2017). Although there is an array of tools and systems by which specific building or construction processes and outcomes can be evaluated, there is no consistently used method or tool to evaluate how usable a building is from the perspective of its end-users (Mosca & Capolongo, 2018; Preiser, 2001b).

Evaluation of universal design
There are known barriers to evaluating the application of UD to public built environments. Firstly, earlier research suggests that stakeholder understanding can be limited and that the concept of UD is commonly considered synonymous with accessible design (Bamzar, 2019; De Cauwer et al., 2009; Larkin et al., 2015; Sørensen & Ryhl, 2017; Van der Linden et al., 2016; Welch & Jones, 2001). Inconsistent use and interpretation of terminology poses a challenge for stakeholders in understanding what outcomes can be anticipated and evaluated when compared to accessible design, which can be assessed and measured through code-compliance. Secondly, despite the fact that Mace (1985) originally called for UD to be applied from the beginning of the design process, and that research indicates support for this from practitioners (Hitch et al., 2012) and academics (Afacan & Erbug, 2009; Maisel, 2006), it is not known when evaluation can effectively occur. Finally, where the intention is to design a public building that is usable by all people, there is no clear guidance on whose perspectives should be included in evaluation. Involvement of user-experts, such as people with disabilities, has long been called for (Ostroff, 1997; Ringaert, 2001) as people who do not have such experience may be unable to effectively predict the experience of people who do (Barnes, 2016; Bickenbach et al., 1999; Oliver, 1990). Unintentionally, designers may overlook aspects of design that significantly impact usability (Heylighen et al., 2017; Pritchard, 2014). Although valuable knowledge and insight can be gained from people whose participation is regularly restricted by environmental barriers (Australian Government Department of Social Services, 2019; Heylighen et al., 2017; Lid, 2014), there are limited published examples of people with disabilities being involved in the evaluation of UD as applied to public buildings.
Additionally, sample and project sizes in these studies are relatively small and it is not known how effective these methods are in providing perspectives on usability for public buildings (Siddall et al., 2011).
Despite challenges noted, efforts have been made to develop methodology and tools to evaluate UD in public built environments. In their review, O'Shea et al. (2016) classify these in four ways: 'checklist-driven'; 'holistic'; 'value-driven'; and 'invisible'.
Checklist-driven evaluations are commonly based on the Principles of Universal Design (Connell et al., 1997) and, in the form of access audits, frequently reference local regulatory systems where building features are checked against code-compliance (Access Institute, 2017). Examples of checklist-based UD evaluations have been published as case studies (Afacan & Erbug, 2009; Guimarães, 2001, 2016; Kim & Chang, 2018) but criticisms include the arduous length of tools, reliance on code-compliance, and a focus on potential environmental demands rather than actual demands (Mosca & Capolongo, 2018; O'Shea et al., 2016; Sanford, 2009).
Holistic evaluations aim to measure both potential and actual demands of a building (O'Shea et al., 2016). Sanford's (2012) Universal Design Assessment Protocol (UDAP) provides an example of this approach. By systematically analysing abilities required to complete a single task, such as operating a door handle, the UDAP aimed to provide a means to evaluate the usability of environments. However, the inherent complexity of this tool was noted to limit its practicability (Sanford, 2012). Froyen's (2012) methodological pattern-based approach to UD offers a second example where the author attempts to evaluate and collate patterns of interactions between people and their environments. Work by Preiser (2001b), and more recently by Cassi et al. (2021) and Mosca & Capolongo (2018), has emphasised the value of post-occupancy evaluation (POE) as a means of evaluating environmental usability, in particular interactions between people and environments. However, as noted by O'Shea et al. (2016), a significant challenge presented by holistic, post-occupancy evaluations is the demand on time and resources.
Value-driven evaluations focus on broader outcomes, such as equity and social participation (O'Shea et al., 2016). Notably, this form of evaluation emphasises a single aspect of UD and offers a focused perspective on outcomes. The final type of evaluation method outlined by O'Shea et al. (2016) is the invisible evaluation conducted by designers who, throughout the design process, make constant, informal evaluations based on their training, skills, and values. O'Shea et al. (2016) conclude that simple checklist-driven evaluations are likely to be insufficient to evaluate the complexity of UD and that holistic, post-occupancy methods that measure actual demands of environments on people's performance are required. These authors and others (Afacan & Erbug, 2009) support using a combination of methods. Contextualisation of evaluation criteria to local, state, or national regulations, and ongoing data collection over time from a range of users, including people with disabilities, are also recommended (Afacan & Erbug, 2009; O'Shea et al., 2016).
Acknowledging these challenges to evaluating UD and the lack of support for one approach or methodology, insight into current practice is called for. It is not known if there is demand for UD evaluation in practice and, if so, what methods and tools are employed. It is also not known how stakeholders perceive UD should be evaluated. This study aimed to address these gaps in knowledge by asking: what is current practice in the evaluation of UD and what do stakeholders perceive as important in UD evaluation? A mixed methods approach was taken to explore perspectives and experiences of stakeholders via online survey and in-depth interview. Findings highlight a tension between the desire for efficiency and consistency offered by a checklist approach, and the adoption of a holistic, multidisciplinary method of evaluation.

Methods
This study employed a mixed methods design. This provided opportunity to triangulate data and to expand upon early research findings with more in-depth data collection techniques (Creswell & Plano Clark, 2011; Greene et al., 1989). As an exploratory study that aimed to understand participants' experiences and situated knowledge, a descriptive approach was selected for quantitative data analysis (Punch, 2014), and a content analysis approach was selected for qualitative data (Berg, 2001).

Study context
This study received ethics approval from Deakin University Faculty of Health Ethics Committee, Australia (HEAG-H 99_2017).

Data collection
Data were collected via online survey and in-depth interviews. These methods aimed to gather a comprehensive range of data, validate findings and triangulate data (Creswell & Plano Clark, 2011). In this study, a survey provided opportunity to collect structured, largely quantitative data while the interviews enabled greater exploration and triangulation of topics with key stakeholders.

Survey
A survey was selected as an effective, efficient and ethical way to collect data from a large number of participants (Sue & Ritter, 2012). Survey development was based upon expert knowledge within the research team and existing literature. The survey was hosted on the Qualtrics © platform (Qualtrics, 2017). A total of 24 questions were structured into three sections on UD: knowledge; application; and evaluation. This paper focuses on methodology and findings pertaining to UD evaluation. A project overview has been published elsewhere (Watchorn et al., 2018).
Survey participants were asked about experiences and recommendations relating to UD evaluation in the built environment. Questions asked if participants had experience in evaluating UD and, if so, to describe what was done. Participants were also asked, 'Who was involved in evaluating UD?' and, 'During what stage of the design process was UD evaluated?'. Participants were invited to respond to 18 closed questions, e.g., multiple choice, and to provide textual data to six open questions. Participants were asked if they had used specific evaluation tools and to rank these in order of preference. To explore perceptions on UD evaluation, participants were asked to rate its importance on a 10-point Likert scale (1 = not important at all, 10 = extremely important) and to share perceptions on when UD should be evaluated and who should evaluate it. Demographic data (age, gender, occupation, country of residence, and experience of disability) were collected to provide contextual information regarding respondents and an invitation to participate in a follow-up interview was included. The full survey is available as Appendix A.

Interviews
Semi-structured interviews were used to gather detailed information, giving researchers the opportunity to ask set questions and participants the freedom to elaborate on points or issues of their choice (Liamputtong, 2020). All interviews were conducted over a six-week period in 2017 after the survey closed. A short questionnaire was used to collect demographic details (age, gender, occupation, country of residence, and experience of disability). The interview structure was developed simultaneously with survey data collection and was informed by emerging findings. Questions were open and explored participants' experiences and attitudes toward UD application and evaluation. Examples of questions include: 'What do you perceive as the actual outcomes of applying UD?'; 'Are you aware of any tools or standards to evaluate the application of UD in the built environment?'; and, 'In your opinion, what would a good tool look like to evaluate the application of UD?'. The full interview schedule is included as Appendix B.

Recruitment
Participants were recruited via purposive, convenience, and snowball sampling. Key informants and professional groups with experience in or knowledge of UD in built environments were nominated by the project team. As the researchers were in Australia, emphasis was placed on recruiting participants through Australian networks, but international participants were not excluded. An invitation to participate was disseminated online via professional websites, email, Twitter and Facebook, peak industry bodies, policy makers, user-expert groups, such as Disabled People's Organisations (DPOs), and to expert academics. The survey began with a plain language statement and sought informed consent prior to commencement and response submission. The survey was available for four weeks in 2017 and contact details were removed prior to analysis.
Survey participants who volunteered for a follow-up interview were contacted and emailed a plain language statement and consent form. Interviews were conducted face-to-face, over the phone or via VoIP, by a researcher with whom the participant had no pre-existing relationship. Informed consent was sought when arranging an interview time and formalised in writing. All survey participants who indicated interest in being interviewed were contacted and, upon nearing the limit of project resources, the sample was stratified to ensure inclusion of a diverse range of stakeholders. Others were invited to respond to interview questions in written format. The inclusion of people with disabilities was prioritised in recognition of the valuable insight that can be gained from those who commonly face barriers in the built environment (Heylighen et al., 2017; Ostroff, 1997).
To be eligible for inclusion, participants required a recent role (within the last 5 years) in: environmental design, planning and/or policy; and/or advocacy, implementation and/or evaluation of UD of built environments. Participants also needed to be aged over 18 years, able to provide informed consent, and able to communicate easily in English.

Survey
The survey was completed by 157 respondents. Most were working in Australia (n = 130; 82.8%), with the remainder largely in North America and Europe. Older age groups were represented at higher rates and gender distribution was relatively even. More than half (n = 89; 56.7%) reported experience of disability, either from personal experience or experience of a family member or close friend. Thirteen participants (8.3%) had secondary experience, such as via employment. Twenty respondents (12.7%) identified as a disability advocate or representative. Academics and access consultants were represented in largest numbers. See Table 1 for details.

Interview
Interviews were completed with 32 participants (n = 3 face-to-face; n = 24 telephone; n = 5 VoIP). Additionally, five participants provided written responses. Of the 37 participants, most were working in Australia (n = 31; 83.8%) and had experience of disability via personal experience or that of a family member or close friend (n = 23; 62.2%), and the largest age group was those aged over 55 years (n = 15; 40.5%). Five participants (13.5%) identified as a disability advocate or representative and the greatest number identified as academics (n = 8; 21.6%). See Table 1 for details.

Data analysis
As a mixed methods study, quantitative and qualitative data were merged during data analysis (Creswell & Plano Clark, 2011). Descriptive quantitative methods were used to analyse numerical survey data. Brief complementary data provided as additional comments, and as responses to the question 'What was done to evaluate UD?' were analysed using quantitative content analysis. Raw textual data were analysed for meaning and organised into similar categories by two to three researchers and then translated into frequencies and percentages (Maier, 2018). For example, text provided as "compliance with building rules" and "compared it against current requirements" were both coded as "Review against minimum standards". To enhance reliability, any ambiguity in the interpreted meaning was discussed between researchers and consensus made on categorisation (Krippendorff, 2006). Qualitative content analysis was employed to analyse data collected via interview and longer forms of textual data provided by survey participants as 'further comments'. All interviews were audio-recorded and transcribed verbatim within six weeks of completion. Member checking was employed where participants were invited to review their transcript, check that it reflected their intended meaning and amend as needed.
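The tallying step of the quantitative content analysis described above can be illustrated with a minimal sketch. It assumes the category for each response has already been assigned manually by the researchers; the code only converts those assignments into frequencies and percentages. The first two response/category pairs are the example given in the text; the remaining responses, and the names `coded_responses` and `category_frequencies`, are hypothetical and included only for illustration.

```python
from collections import Counter

# Manually assigned codes: response text -> category.
# The first two pairs are the example from the text; the last two
# responses are hypothetical and exist only to make the sketch runnable.
coded_responses = {
    "compliance with building rules": "Review against minimum standards",
    "compared it against current requirements": "Review against minimum standards",
    "walked through the site with building users": "User participation",  # hypothetical
    "reviewed the drawings and reports": "Documentation audit",  # hypothetical
}


def category_frequencies(coded):
    """Return {category: (count, percent of all coded responses)}, most frequent first."""
    counts = Counter(coded.values())
    total = len(coded)
    return {cat: (n, round(100 * n / total, 1)) for cat, n in counts.most_common()}


frequencies = category_frequencies(coded_responses)
for category, (n, pct) in frequencies.items():
    print(f"{category}: n = {n}; {pct}%")
```

The interpretive work (deciding which category a response belongs to, and resolving ambiguity by consensus) remains a human task; the sketch covers only the translation into frequencies and percentages.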
Each transcript was read in its entirety to enable the researchers to become immersed in the data and conduct a detailed analysis of that case (Creswell, 2017). Two forms of qualitative content analysis were employed in this study, in response to the respective characteristics of the survey and interview data. Survey data were subjected to manifest analysis, which closely reflected the comments from participants by using the exact wording and concentrating on obvious themes (Berg, 2001). In contrast, interview transcripts were evaluated using a latent approach, which included analysis of both the explicit content and the interpreted meaning of text (Berg, 2001).
Regardless of their source, qualitative data were analysed using the methodology described by Bengtsson (2016). Two members of the research team independently decontextualised the data by breaking them into inductively identified meaning units (or codes). Once identified, codes were allowed to evolve as the analysis revealed new relationships and understandings of the key themes in the data. Once the codes were finalised, an iterative process of recontextualisation began with a review of the source texts to ensure all codes relevant to the study question had been captured. The independently identified codes were then condensed to reduce repetition and themes were identified via discussion between the two researchers, before being peer reviewed by the broader research team. All codes were aligned to a single categorical theme (Krippendorff, 2004), and some of the less frequently occurring (or prevalent) codes were omitted as they were not representative of the data as a whole. Please see Appendix C for an excerpt of the process of theme identification with examples from both survey and interview data.
The final step of Bengtsson's (2016) method is the formulation of the final analysis, which occurred in this study as part of the integrated analysis of all data. Quantitative and qualitative data were analysed sequentially, and then merged into a combined set of findings to address the research questions (Creswell & Plano Clark, 2011). This process took the form of comparing the datasets for instances where there was agreement and disagreement. Findings therefore report data in its integrated form, drawing upon results from both sources.

Trustworthiness of study design
Examining the trustworthiness of mixed methods studies is crucial to ensuring quality of this form of research. The Rosalind Franklin Qualitative Research Appraisal Instrument (RF-QRA) (Henderson & Rheault, 2004) provided the basis for selecting three key strategies to enhance trustworthiness in this study design. Firstly, the research team's professional and academic expertise in UD enhanced the credibility of this study, as did triangulation of data. Secondly, while the study did not intend to produce broadly generalisable results, a detailed description of the sample and context supports readers' analysis of relevance to practice settings. The detailed description of methodology, member checking, and independent coding of qualitative data support study dependability. Finally, the research team communicated regularly and kept a comprehensive audit trail, which enabled frequent reflection and critical appraisal and supports confirmability of findings.

Results
Findings are presented separately here as they respond to the questions: 'How is UD of built environments currently evaluated?' (i.e., Who is involved in UD evaluation? When is UD evaluated? How is UD evaluated?); and 'What are stakeholders' perceptions on how UD should be evaluated?' (i.e., Who should evaluate UD? When should UD be evaluated? How should UD be evaluated?).

Stakeholder practice in universal design evaluation
Just over one-third of survey respondents (n = 57; 36.3%) stated that they had been asked to provide evidence of having applied UD or had participated in UD evaluation. Over half of interview participants (n = 21; 56.8%) responded affirmatively to this question.
Analysis of interview data revealed four broad themes of people involved in UD evaluation. These were: 'building users'; 'building construction stakeholders'; 'multiple stakeholders'; and 'government'. Building users were most frequently identified and highlighted involvement of community members broadly, but also multiple instances where people with disabilities had participated in evaluation. Building construction stakeholders included architects, builders and project managers, urban planners, and access consultants. Findings highlighted frequent involvement of multiple stakeholders and that UD evaluation is a multidisciplinary process. Government representatives, particularly at the level of local government, were identified as a key stakeholder.

When is universal design evaluated?
Survey participants indicated that UD is being most frequently evaluated when the building is In Use (n = 32; 56.1%) and during Concept Design (n = 29; 50.9%). See Table 3 for details. Analysis of qualitative data revealed two strong themes of UD evaluation taking place during design stages and post-building completion.

How is universal design evaluated?
Survey participants were asked to describe what was done to evaluate UD. Brief textual responses were provided by 57 participants. Content analysis of these data revealed a diverse range of descriptors that identified specific tools or methods, each reported by just 1-2 participants. Beyond this, the three most used descriptors were: 'documentation audit'; 'user participation'; and 'review of legislation compliance'. Further details are presented in Table 4. Totals are greater than 100% as participants could select more than one option.
Almost three-quarters (n = 42; 73.7%) of those who had participated in UD evaluation had used specific tools or methods. Checklists were most frequently identified (n = 33; 78.6%), followed by access audits (n = 31; 73.8%). 'Other' was selected by 29 participants (69.0%) and a range of specific tools noted, e.g., the Environmental Audit Tool (Fleming, 2011) and the innovative solutions for Universal Design (isUD) (Center for Inclusive Design and Environmental Access, 2017). When asked to rank preference of tools used, checklists and access audits were most preferred. For details see Table 5.
Five major themes emerged from interview data outlining how UD is evaluated: 'use of existing resources'; 'user participation'; 'checklists'; 'use of technology'; and 'documentation'.
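Because respondents could select more than one tool, the reported percentages are calculated against the number of respondents rather than the number of selections, which is why they sum to more than 100%. The following minimal sketch reproduces that arithmetic from the counts reported above; the function and variable names are illustrative assumptions, not part of the study's analysis.

```python
def percent(n, denominator):
    """Percentage of respondents, rounded to one decimal place."""
    return round(100 * n / denominator, 1)


# Counts as reported in the text.
evaluators = 57   # respondents with experience of UD evaluation
tool_users = 42   # of those, respondents who had used specific tools or methods
selections = {"Checklists": 33, "Access audits": 31, "Other": 29}

share_using_tools = percent(tool_users, evaluators)
per_tool = {tool: percent(n, tool_users) for tool, n in selections.items()}
total = round(sum(per_tool.values()), 1)

print(f"Used specific tools: {share_using_tools}% of evaluators")
for tool, pct in per_tool.items():
    print(f"{tool}: {pct}% of tool users")
print(f"Sum of percentages: {total}% (multi-select, so above 100%)")
```

Each percentage uses the 42 tool users as its denominator, so a respondent who selected both checklists and access audits is counted once in each row.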
Use of existing resources included the following subthemes: 'policies and guidelines'; 'formal tools', e.g. the isUD (Center for Inclusive Design and Environmental Access, 2017), Environmental Audit Tool (Fleming, 2011), and Residential Environment Impact Survey (Fisher et al., 2014); 'compliance with legislation', such as building codes and accessibility standards; 'use of examples', both ideal and ineffectual practice; and 'use of universal design theory'. For instance: "It's based on the Principles, and we're creating our own set of standards … we're looking at the Principles and thinking about how to translate those into standards all the new construction on our campus has to consider". [Interviewee 15, Occupational Therapist].
User participation included surveys, questionnaires, and interviews with community members. Examples were provided of engaging with people who have disabilities using methods such as building walkthroughs. As described by Interviewee 28 [Safety Professional]: "Generally, it's more about the process that you get there, the buildings that have been the most successful are the ones that have learned from past mistakes and the ones that engage the people that are actually using that building or will be using that building. And meaningfully engaged, not just sort of go through a token process."
Checklists included a range of tools, e.g. audit checklists, access appraisals and, as described by several participants, a 'tick-box' approach is common. Use of technology highlighted tools used to simulate environments, such as 3D modelling / isometric drawings and virtual reality "… headsets to give people spatial awareness of something on a screen and they can walk around inside…" [Interviewee 7, Safety Engineer], while documentation included review of drawings and reports, development of certification documents, e.g., access statements, and reporting complaints to address issues of concern.

Stakeholder perceptions on universal design evaluation
Overall, 95.5% (n = 150) of participants rated evaluation of UD in the built environment as being 'very important' [7 or more out of 10, where 10/10 is extremely important]. Almost half (n = 77; 49.0%) rated evaluation as 10 out of 10 'Extremely Important'. Qualitative analysis revealed three major themes describing why stakeholders perceived UD evaluation as important: 'to advocate for UD'; 'to strengthen the evidence base of UD'; and 'professional development'. In their discussion on advocacy, participants highlighted that evaluation provides opportunity to increase public awareness and to promote UD. Participants also highlighted a need for greater evidence. For instance, "the biggest need in universal design is this evaluation and providing the evidence to document its impact" [Interviewee 12, Academic] and "a lot of the time there's not a lot of those sort of statistics out there. So, it is quite hard to justify besides just saying that you know like it will be of benefit financially and for your customer base or your employees, yeah without those sort of numbers." [Interviewee 31, Urban Planner]. Finally, evaluation was considered a means of professional development where stakeholders could use evaluative data to aid decision making, reflect on and share experiences, both positive and negative: "…when you meet someone … where the space hasn't worked for them and has had a negative experience, then that's a useful learning experience, in a way it's almost like we need a database of times when the environments stuffed up … like case studies of where it failed as well as where it worked, and a sort of tool for us to be, to reflect on things we do, our practice." [Interviewee 14, Academic].
Analysis of qualitative data regarding who should be involved in UD evaluation revealed three themes: 'building users'; 'building construction stakeholders'; and 'multiple stakeholders'. The importance of including building users in evaluation was emphasised: "Evaluating universal design requires knowledge in many areas (anthropometrics, cognition, health, etc.). Should not be done by a single person (e.g., architect), but by a board of people knowledgeable in the building environment, universal design, and of course representative users with varied ranges of disabilities."

When should universal design be evaluated?
Almost all survey participants (n = 149; 94.9%) stated that UD should be evaluated during the Concept Design stage. Evaluating during the In Use stage (n = 86; 54.8%), during the Developed Design stage (n = 78; 49.7%), Construction stage (n = 76; 48.4%), and at Handover and Close-Out (n = 74; 47.1%) were also recommended. Further details are presented in Table 3. Qualitative data strongly emphasised 'planning stages', 'design stages' and 'throughout all project stages'. As stated by Interviewee 4 [Policy Officer], "it's not something separate - it needs to be woven through all of the processes".
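As an illustrative aside, the counts reported in the text allow a rough comparison between when UD is currently evaluated and when participants believe it should be. The sketch below computes the difference in percentage points for the two stages reported in both places. Note the assumption built in: the denominators differ (57 respondents with evaluation experience versus all 157 respondents), so the comparison is indicative only, and the variable names are hypothetical.

```python
# Counts as reported in the text; denominators differ between the two questions.
EVALUATORS = 57        # respondents describing current practice
ALL_RESPONDENTS = 157  # respondents giving recommendations

current_practice = {"Concept Design": 29, "In Use": 32}
recommended = {"Concept Design": 149, "In Use": 86}

# Gap in percentage points: recommended share minus current-practice share.
gaps = {}
for stage in current_practice:
    practice_pct = 100 * current_practice[stage] / EVALUATORS
    recommended_pct = 100 * recommended[stage] / ALL_RESPONDENTS
    gaps[stage] = round(recommended_pct - practice_pct, 1)

for stage, gap in gaps.items():
    print(f"{stage}: {gap:+} percentage points (recommended minus current practice)")
```

On these figures the recommended share for Concept Design is well above the reported practice share, while the two shares for the In Use stage are close to each other.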

How should universal design be evaluated?
Analysis of qualitative data revealed four major themes describing how participants perceived UD should be evaluated. These were: 'data collection methods'; 'systemic requirements'; 'tool development'; and 'tool features'.
Recommended data collection methods included subthemes of 'user participation', 'use of technology', and 'multiple methods are needed'. As participatory forms of evaluation, participants emphasised including building users, particularly those with disabilities, in conversations, focus groups, interviews and/or simulation. As stated by Interviewee 15 [Occupational Therapist], "…always include a variety of users, and getting user feedback, I think professionals spend a lot of time with good intentions on designing things without the users in the mix, and then miss the mark." Use of technology was recommended, e.g., 3D drawing, virtual reality, and online simulation. The subtheme of 'multiple methods are needed' explicitly acknowledged the need for both quantitative and qualitative data. As stated by Interviewee 21 [Academic]: "We couldn't just say one tool, but the tool could be composed in different phases. And each phase could integrate different matters, one could be checklist, and the other could be a face-to-face interview, the other focus group. I think that the ideal solution would be, or could be achieved if users participated, I totally encourage using all the techniques."
Acknowledging that design and construction of public buildings is influenced greatly by systemic requirements, such as building legislation, codes, and guidelines, participants called for a legislative approach to UD evaluation: "There should be a single legislated source of compliance" [Survey Participant 130, Architect]; "There should be penalties for non-compliance" [Survey Participant 115, Landscape Architect]; and "until it's embedded in legislative requirements any application beyond minimum standards will provide challenges" [Survey Participant 17, Access Consultant]. This was not agreed upon by all participants. As stated strongly by Survey Participant 19 [Infrastructure Officer]: "It needs to made clear that universal design is a philosophy based on a set of principles rather than a set of measurables that can be found in a set of building standards."; and "I don't think it should be regulated. I think the benefit in it is that it's a voluntary thing but should definitely be strongly encouraged".
Method of tool development highlighted that existing resources, such as Liveable Housing Guidelines (Liveable Housing Australia, 2017), were considered by some to be a starting point for developing a tool to evaluate UD. Co-design and ensuring processes were in place to receive ongoing feedback from users were perceived as valuable.
Tool features of 'user-friendliness', 'inclusion of examples', and 'checklists' were perceived as important. Features such as being easy to apply to different projects, using clear and consistent language, being suitable for people with diverse abilities and being timeefficient were emphasised. Establishing a database of 'best practice' and 'fails' was deemed as a useful way to reflect on practice and share evidence. Participants discussed the use of checklists, but emphasised risks with a 'tick-box approach'. For instance: "Have seen some checklists used in the past, but my reservation with this is that it becomes too rigid and concerned with ticking the boxes" [Interviewee 1, Access Consultant]. Overall, participants expressed concern for simplifying UD to a rigid checklist and expressed a need for a flexible tool. As stated by Interviewee 23 [Architect]: I think it's got to be iterative -it's got to be something that you don't just do once and go, right, done that. It's something that has to be flexible so that at the starting stages you're basically just flagging that these sorts of things are important and then working down through to the very end where you're thinking very specific things about whether somebody can actually get through a door.

Barriers to universal design evaluation
Qualitative data highlighted perceived barriers to UD evaluation. Three major themes emerged: 'no appropriate evaluation tools available'; 'evaluation is not prioritised', and 'UD is difficult to evaluate'.
Participants acknowledged that existing tools were limited in their suitability, that stakeholders frequently did not prioritise evaluation, and that they received limited requests for evaluation. For instance, "we very rarely get requests to do the sort of evaluation on access type projects. … outside of the academic sector, there doesn't seem to be a lot of funding for this type of research or specific funding" [Interviewee 5, Traffic Engineer]. Participants also highlighted a lack of stakeholder knowledge in this field.
Several participants discussed challenges inherent in evaluating UD. Participants acknowledged that stakeholders hold varied interpretations of UD and how it should be applied and that UD may not meet the needs of some people. Participants also noted that the concept is associated with diverse outcomes that are difficult to quantify and may in fact be 'invisible'. For instance, Survey Participant 1 [Other] stated that, "there is no such thing as a universally-designed building, or at least not something that is static or fixed in the way that is implied by universal design principles." Similarly, Survey Participant 24 [Access Consultant] stated that "universal design should be so entrenched into the built environment that nobody really notices".

Discussion
Descriptions of current practice suggest that UD application is being evaluated and that methodology varies. A key finding from this study is that checklist-based tools or access audits are most used and preferred over other methods. Although checklists provide an efficient form of evaluation, and a means by which quantifiable outcomes can reliably be measured (Vásquez-Hernández & Restrepo Álvarez, 2017), strong concerns were raised by participants in this study about the suitability of checklists to evaluate UD. These concerns reflect limitations noted by others (Afacan & Erbug, 2009; O'Shea et al., 2016) of only assessing potential demands of an environment and reflect broader risks of reducing UD evaluation to a 'tick-box approach'. The way checklists are reported to be designed and interpreted also suggests that some practitioners are misinterpreting the concept of UD as synonymous with accessibility, which supports earlier findings on this topic (De Cauwer et al., 2009; Larkin et al., 2015; Sørensen & Ryhl, 2017; Van der Linden et al., 2016; Welch & Jones, 2001). UD can be viewed as a "paradigm that aims at a holistic approach" (D'souza, 2004, p. 3) that is applied wherever possible and practical during design processes. It is flexible and adaptable to overcoming challenges; in a sense, it is iterative and able to evolve. In these terms, check-listing could be seen to be anathema to, and a perversion of, UD.
UD was not developed as a prescriptive method of design, but rather to offer designers broad guidelines to apply during design processes (Connell et al., 1997). Reflecting this intent and the fundamentally holistic nature of the Principles of Universal Design (Connell et al., 1997), findings from this study suggest that evaluation should not occur at one point in time but rather should occur throughout design stages. This supports Preiser's (2001a) cyclical approach to evaluation and Afacan and Erbug's (2009) call for an iterative process of evaluation. Additionally, in this study, notable differences were apparent between the timing of current and recommended practices. Although the reasons for these discrepancies were not explicated, further exploration of methodologies or tools suitable for evaluation during design and construction stages is called for as these may be different to those suitable for evaluating actual performance and usability post-occupancy.
Findings from this study support the use of multiple methods of evaluation and call for tools to aid this process. Participants expressed a desire for a tool that is efficient, user-friendly, usable by people with diverse abilities, and that includes practice examples. Practitioners are commonly developing bespoke tools or using tools not specifically designed to evaluate UD. It can be expected that this diversity in methodology creates inconsistencies in the type and quality of data being collected. These findings suggest a need to strengthen the rigour with which these data are collected in order to gain broader insights into how UD is applied and what outcomes are gained. Such data would contribute toward an identified need to increase the empiricism of UD (Iwarsson & Stahl, 2003; Keates & Clarkson, 2001).
Dependent upon project type and scale, several stakeholders are involved in architectural design processes. These include clients, designers, construction personnel, regulatory bodies, users, and the broader public (Jenkins, 2009; Preiser, 2001a). With its origins in both design and rehabilitation, expertise in UD spans both disciplinary fields (Hamraie, 2017; Watchorn et al., 2019). Reflecting this, participants in this study called for multiple stakeholders to be involved in evaluation, a finding supported by others (Afacan & Erbug, 2009; Young et al., 2019). Access consultants were reported to most commonly be involved in the evaluation of UD and were recommended as integral to future work in this field. In Australia, access consultants, frequently occupational therapists, commonly provide expert advice on accessibility and usability (Association of Consultants in Access Australia (ACAA), 2019). This study finding may reflect participant demographics but may also suggest a call for expertise in this field of practice. More broadly, findings suggest that academics and policy officers are frequently involved in UD evaluation but are less preferred than other, perhaps more practice-based stakeholders. To inform future practice, exploration is warranted to better understand the roles and expertise that various stakeholders bring to UD application and evaluation.
In a commentary on the development of UD theory, Lid (2013) highlights usability as the stated outcome of UD in its original definition and in the CRPD (United Nations, 2007). In this study, participants commonly referenced the Principles of Universal Design (Connell et al., 1997) in their development of bespoke tools and emphasised usability as a key anticipated outcome. However, the concept and experience of usability are subjective and, as acknowledged by Afacan and Erbug (2009), challenging to evaluate. Lid (2013) emphasises that usability must be measured from the individual perspective and that greater understanding of how people interact with their environments is needed in order to understand and evaluate outcomes relating to UD and usability. In this study, participation of building users, particularly people with disabilities, was perceived as valuable and strongly recommended. Further research is needed to understand processes and outcomes that support participation of people with disabilities in design processes.

Limitations and future research
Limitations of this study include the use of self-report. The subjective nature of findings, such as participants' knowledge of evaluation tools, and the potential for recall bias is acknowledged. It is also noted that descriptive analysis of quantitative data limits application beyond this study. Further, although some participants were working overseas, most were based in Australia. Due to diversity in socio-political and cultural contexts, international applicability of findings may be limited. It is acknowledged that, when asked about the timing of evaluation and tools used, participants were not provided with definitions. This may have resulted in variations in how participants interpreted terminology. Similarly, it is acknowledged that a high number of survey participants selected 'other' in response to survey questions. This suggests that terminology used in this multidisciplinary field of practice is varied and warrants further exploration with an intent to enhance inter-disciplinary communication and collaboration. Finally, statistical analysis was limited by the sample not being balanced by occupational groups and, although building owners are a key stakeholder in design, they were not actively recruited in this study. Findings gained from this exploratory study provide initial insight into a complex area of design practice and highlight a need for further research to better understand stakeholder needs and preferences and inform development of evaluation methodologies and tools.

Conclusion
Findings from this study indicate that evaluation of UD in public built environments is being called for and carried out in practice. Evaluation is perceived as a means to advocate for greater application of UD, inform future design processes, and strengthen the empirical base of UD. This call for evidence reflects a belief that UD improves performance and participation at individual and population levels. There is, however, tension between stakeholder desire for a checklist-based approach to evaluating UD, and more holistic and iterative methodologies that can be applied across all design stages. Although perceived as efficient and user-friendly, checklists are limited in their capacity to effectively evaluate the complexity of outcomes that can be gained through UD application. Further research is needed to better understand what these outcomes are and how they can be effectively evaluated. Findings from this study offer early insight into perceptions of current practitioners and can serve to inform evaluation methodology and tool development that better addresses the complexity and creativity of UD application and outcomes gained.
Given the multidisciplinary nature of design practice and the inherent diversity in the needs of end-users, it is imperative that multiple perspectives are sought during UD evaluation. Yet it is not known what methods, tools, and approaches best meet the needs of different stakeholders, in particular those who are not design professionals. Research is currently underway to better understand how people with disabilities can effectively participate in design processes and what factors serve as barriers and facilitators to participation.