1 Introduction

Usability is a basic attribute of software quality. There is still no clear and generally accepted definition of usability; its complex nature is hard to capture in a single definition. The current ISO 9241-210 definition of usability refers to “the extent to which a system, product or service can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use” [1].

User eXperience (UX) goes beyond the three generally accepted dimensions of usability: effectiveness, efficiency and satisfaction. The ISO 9241-210 standard defines UX as a “person’s perceptions and responses resulting from the use and/or anticipated use of a product, system or service” [1].

The UX concept is very popular nowadays. There is a trend to move from usability to UX; even the former “Usability Professionals Association” (UPA) renamed itself the “User Experience Professionals Association” (UXPA). Most authors consider UX an extension of the usability concept; others still use the terms usability and UX interchangeably [2].

Measuring effectiveness, efficiency and satisfaction is not the only way of evaluating usability. Two major conceptions of usability have been pointed out: (1) summative, focused on metrics, “measurement-based usability”, and (2) formative, focused on the detection of usability problems and associated design solutions, “diagnostic usability” [3].

Evaluating UX is more challenging than evaluating usability. If usability is a subset of UX, then usability evaluation methods are also able to evaluate some UX aspects. But how can we evaluate other UX aspects? Almost 90 UX evaluation methods are described at www.allaboutux.org [4].

Usability and UX in virtual museums is one of our current research topics. We identified a set of users’ needs, we developed a set of specific usability heuristics, and we proposed a methodology to assess UX in virtual museums. The results are yet to be published. As case studies we used a well-known virtual museum, Google Cultural Institute [5], and also a local (Chilean) one, the Pre-Columbian Museum [6]. We conducted several experiments, mainly with graduate and undergraduate students in Tourism and Computer Science.

The paper presents a UX study of the Pre-Columbian Museum. Section 2 reviews the concepts of usability, UX and their evaluation. Section 3 refers to usability and UX evaluation of virtual museums. Section 4 describes a co-discovery experiment on the Pre-Columbian Museum. Section 5 highlights conclusions and future work.

2 Usability and UX

A well-known usability definition was proposed by the ISO 9241 standard back in 1998 [7]. The ISO 9241 standard was updated in 2010 [1]. Yet a new revision started shortly after, in 2011 [8]. This shows once again the evolving nature of the usability concept.

Literature refers to usability dimensions as “attributes”, “factors” or “goals”. Several aspects are recurrent in all definitions, as well as in ISO standards: effectiveness, efficiency, satisfaction, and the context of use. As Bevan, Carter and Harker highlight, the current ISO 9241 approach directly relates usability to user and business requirements: effectiveness means success in achieving goals, efficiency means not wasting time, and satisfaction means willingness to use the system. Three main lessons have been learned since 1998: (1) the importance of understanding UX, (2) the “measurement-based” usability approach is not enough, and (3) the need to explain how to take account of negative outcomes that could arise from inadequate usability [8].

Like usability, UX is not limited to software systems; it also applies to products and services. The ISO 9241-210 standard considers that UX “includes all the users’ emotions, beliefs, preferences, perceptions, physical and psychological responses, behaviors and accomplishments that occur before, during and after use” [1]. The “User Experience White Paper” aims to “bring clarity to the UX concept” [9]. It highlights the multidisciplinary nature of UX, which has led to several definitions of and perspectives on UX, each approaching the concept from a different point of view: from a psychological to a business perspective, and from quality centric to value centric. Rather than intending to give a unique UX definition, the document mentions the wide collection of definitions available at www.allaboutux.org.

Usability evaluation methods are basically classified as: (1) empirical usability testing, based on users’ participation [10], and (2) inspection methods, based on experts’ judgment [11].

Evaluating UX is more challenging and arguably overwhelming for newcomers. Almost 90 UX evaluation methods are described at www.allaboutux.org, and the list will probably continue to grow [4]. If we consider usability as a subset of UX, then usability evaluation methods are also able to evaluate some UX aspects. But how can we evaluate other UX aspects?

3 Evaluating the Usability and UX in Virtual Museums

3.1 Virtual Museums

The International Council of Museums (ICOM) defines a museum as “a non-profit, permanent institution in the service of society and its development, open to the public, which acquires, conserves, researches, communicates and exhibits the tangible and intangible heritage of humanity and its environment for the purposes of education, study and enjoyment” [12]. Several characteristics established by the ICOM definition are strengthened by a virtual museum; probably the most important one is public access.

Virtual museums evolved from digital media in the pre-internet era, to on-line museums and immersive digital museums. As MacDonald points out, the virtual museum experience delivered through museum websites is a critical concern for museum professionals [13]. Sylaiou, Liarokapis, Kotsakis and Patias indicate that digitizing museums’ collections should have a double purpose: (1) to preserve the cultural heritage, but also (2) to make the information content accessible to the wider public in an “attractive” manner [14].

A basic feature of a virtual museum is the on-line collection. However, it seems to also be among the least popular features of a museum website [15, 16]. Museum experts usually attribute this low popularity to a lack of interest, but a poor UX may also decrease users’ interest.

Usability and UX in virtual museums is one of our current research topics. We identified a set of users’ needs [17], we developed a set of specific usability heuristics [18], and we proposed a methodology to assess UX in virtual museums [19]. The work we have done is only available in Spanish, locally, at Pontificia Universidad Católica de Valparaíso, Chile. We used as case studies Google Cultural Institute [5] and the Pre-Columbian Museum [6].

We conducted several experiments, mainly with graduate and undergraduate students of programs from two areas:

  • Tourism - students from Universidad de Playa Ancha, Valparaíso, Chile,

  • Computer Science - students from Pontificia Universidad Católica de Valparaíso, Valparaíso, Chile.

3.2 Heuristic Evaluation: Evaluators’ Perception

We conducted several heuristic evaluations of Google Cultural Institute, based on Nielsen’s heuristics [20]. The evaluators were 33 undergraduate and 15 graduate Computer Science students. Afterwards, all of them participated in a survey. We developed a standard questionnaire that assesses evaluators’ perception of a set of usability heuristics along 4 dimensions: D1 - Utility, D2 - Clarity, D3 - Ease of use, and D4 - Necessity of additional checklist. All dimensions were evaluated using a 5-point Likert scale.

Results were analyzed in a previous work [21]. Few correlations were found among the four surveyed dimensions.

In the case of graduate students, there are no significant differences between the evaluators with and without previous experience, except for dimension D4 - Necessity of additional checklist.

In the case of undergraduate students the perception is rather different:

  • there are no significant differences between the evaluators with and without previous experience for dimensions D1 - Utility and D4 - Necessity of additional checklist;

  • there are significant differences between the two groups of evaluators for dimensions D2 - Clarity and D3 - Ease of use.

Nielsen’s heuristics are not perceived as one would expect, even when evaluators have previous experience in their use. The study offered relevant information particularly for the development of a new set of usability heuristics for virtual museums [18]. We definitely recommend the use of specific heuristics instead of generic heuristics, when evaluating the usability of virtual museums.

3.3 UX Evaluation

Evaluating the UX in virtual museums is challenging. There are certainly UX aspects beyond usability’s dimensions. Specific domain characteristics have to be considered when selecting an evaluation method. “Traditional” usability evaluation methods may be used, but also specific UX methods.

We analyzed the pros and cons of all methods proposed by Allaboutux.org [4]. We tested several methods: co-discovery, thinking aloud, controlled observations, emocards, semi-structured experience interviews, valence method, heuristic evaluation, card sorting, and formal experiments. We also conducted a communicability evaluation experiment. We proposed and partially validated a (preliminary) methodology to assess UX in virtual museums [19].

Applying a single evaluation method offers a limited perspective and results. If time and resources are available, several methods should be used: quantitative and qualitative, inspections and user tests, usability and UX oriented methods.

4 The Pre-Columbian Museum: A Co-discovery Experiment

Co-discovery is a user testing method also referred to as “co-discovery learning” or “constructive interaction”. As it offers valuable insights into users’ thinking, it is also a suitable UX method. Two users explore a (software) product together, freely discussing it, while performing specific tasks.

The evaluation protocol is quite similar to the thinking aloud method, in which a single user expresses his/her thoughts while performing the specified tasks. However, when two users are working together, it is more natural for them to comment on what they are doing than in a single-user scenario.

4.1 Methodology

The experiment took place in the Usability Laboratory of the School of Informatics Engineering at Pontificia Universidad Católica de Valparaíso, Chile, in October 2016. It was conducted by two UX experts. One has an MSc in Computer Science; the other is an architect, currently studying Psychology. They both have a Diploma in UX. The Pre-Columbian Museum website was evaluated, based on a pre-defined set of tasks.

The participants worked in pairs, performing the tasks without interruptions or distractions. During the experiment the participants were observed by the evaluators through polarized glass that allows vision in a single direction. Cameras were placed in each room to record the comments made by each pair and their facial expressions during the experiment; their computer screens were recorded as well.

First, each participant signed a confidentiality agreement which indicated the conditions of the test. Afterwards, they were shown the website to be evaluated and were informed about the different stages of the test.

In the first stage each user had to complete a preliminary questionnaire that collected general data: sex, age, level of education, and experience with portals similar to the product being evaluated.

In a second stage a list of tasks to accomplish was provided to each pair of participants: searching for elements in different sections of the website, selecting a news item, and playing an audio recording. Participants were asked to comment aloud on the website, the fulfillment of the requested tasks, and any other significant elements.

After the tasks were completed, a third stage consisted of a perception questionnaire that had to be completed individually. It aimed to measure the ease of task completion, accessibility of the sought information, orientation within the site, ease of navigation, effectiveness of results, level of satisfaction with the portal, and intention to reuse the site. In addition, three open questions were asked, related to navigation difficulties and to the most and the least preferred elements of the virtual museum.

4.2 The Pre-test Questionnaire

The preliminary questionnaire consisted of 5 questions whose objective was to identify broadly the profile of the user and his/her previous experience visiting virtual museums. Eight users participated, 5 males and 3 females, aged 23 to 37 years. They were all graduate students in a Master’s program in Computer Science.

Half of the users (4) reported no experience in visiting virtual museums, while the other half had already visited a similar site. Three students had visited the Google Cultural Institute web site, and another one had visited the Metropolitan Museum of New York web site; they had a general notion of virtual museums, but they indicated that they (almost) never visit this type of portal.

4.3 The Test

The first task consisted of finding elements made of different materials within one of the site’s object collections. All participants (100%) found the elements, but they came close to the time limit (5 min) and showed some delays in distinguishing elements by material. When returning to the main menu (an activity that was part of the task), the users were confused by the different icons and signage offering that function; only 25% of the users succeeded in returning, and they did not do so within the pre-established time limit. Therefore, no pair of users was able to perform the whole task properly and within the pre-assigned time period.

The second task consisted of selecting and printing a news item. Although all users completed it, only 50% managed to do it within the pre-established time (3 min). Common problems were lack of orientation within the site and delays in finding the appropriate section and the printing function.

The third task required participants to play an audio recording. It was executed by 100% of the participants within the assigned time period (5 min). However, there were difficulties in playing the audio at the first attempt; some of the users tried to download it, or to configure the audio options. Only 25% of the users returned to the main menu, which was one of the task requirements.

The fourth task asked participants to identify a freely chosen element for each of the exhibition views of the virtual museum. Although it was completed by all participants, 25% of them exceeded the assigned time limit (5 min), experiencing problems in finding the required destination. One of the pairs of users even had to use the website’s search option, instead of following the indicated sequence of steps. Some users confused the names of different options and sections (identical names with different destinations, and difficulty in distinguishing between “rooms” and “views”). The task was fully accomplished by 75% of the users.

The above results highlight two problems: the failure in the subtask of returning to the main menu, caused either by a lack of attention when reading the instructions or by a lack of guidance within the portal, and the exceeding of the pre-established time limits. Only 25% of the users managed to return to the main menu, as required in the first and third tasks; all users complied with the rest of the instructions.
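As a reading aid, the percentages reported in this section can be converted into participant counts. The sketch below is only illustrative: it assumes that every percentage is taken over all 8 participants, the outcome labels and the helper name to_counts are hypothetical, and the 75% figure for finishing task 4 in time is derived here as the complement of the reported 25% who exceeded the limit.

```python
# Assumption: every share in Section 4.3 refers to all 8 participants (4 pairs).
PARTICIPANTS = 8

# Shares as reported in the text; the labels are hypothetical shorthand.
reported = {
    "task 1: found the elements": 1.00,
    "task 1: returned to main menu": 0.25,
    "task 2: completed": 1.00,
    "task 2: within 3 min": 0.50,
    "task 3: played audio within 5 min": 1.00,
    "task 3: returned to main menu": 0.25,
    "task 4: completed": 1.00,
    "task 4: within 5 min": 0.75,  # derived: 1 - 0.25 who exceeded the limit
}


def to_counts(shares, n):
    """Convert shares of n participants into whole-participant counts."""
    counts = {}
    for outcome, share in shares.items():
        users = share * n
        if users != int(users):
            # A fractional participant would mean the share is not over n users.
            raise ValueError(f"{outcome}: {share} of {n} is not a whole number")
        counts[outcome] = int(users)
    return counts


counts = to_counts(reported, PARTICIPANTS)
for outcome, users in counts.items():
    print(f"{outcome}: {users}/{PARTICIPANTS} users")
```

Under this assumption, each reported 25% corresponds to 2 of the 8 participants.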

The failures and delays that markedly reduced the success rate were not so much due to the difficulty of the tasks themselves. They were related to the lack of orientation within the site, the trouble in finding sections and functions, the confusion between similar signage, and the difficulty of staying on a given navigation route, caused by the system diverting users to other tabs and links. However, most of the subtasks (finding elements, printing news and playing audio) were performed by all participants. In some cases, routes other than those indicated in the task specification were followed; this was not considered a failure in the execution of the task, because the evaluation emphasized intuitive navigation that responds to the user’s needs, rather than obedience to instructions.

4.4 The Post-test Questionnaire

The post-test questionnaire aimed to analyze the perceived level of difficulty in accomplishing the tasks and the satisfaction with the evaluated site. It included 8 questions based on a 5-point Likert scale, and 3 open questions.

Regarding the difficulty in completing the tasks, half of the participants (4 users) considered that it was difficult to complete them, while the remaining half rated it as neutral (2 users) or easy (2 users). Regarding the difficulty in finding the required information, more than half of the participants perceived it as difficult (5 users), while the rest rated it as neutral (2 users) or easy (1 user). As for the level of orientation within the portal, most participants reported feeling little (4 users) or very little oriented (2 users), while the rest (2 users) considered it neutral. With respect to the difficulty of navigating through the collections of the virtual museum, more than half of the users considered it difficult (5 users), while the rest saw it as easy (1 user) or very easy (2 users).
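The response counts above can be summarized as the share of unfavorable answers per questionnaire item. The following is a minimal sketch, not the analysis procedure used in the study: the counts are those reported in the text, while the short item names, the grouping of labels into an "unfavorable" set, and the function name unfavorable_share are assumptions introduced for illustration.

```python
# Response counts as reported in the text (8 participants per item).
# Item names and the "unfavorable" grouping are assumptions of this sketch.
responses = {
    "task completion": {"difficult": 4, "neutral": 2, "easy": 2},
    "finding information": {"difficult": 5, "neutral": 2, "easy": 1},
    "orientation": {"very little oriented": 2, "little oriented": 4, "neutral": 2},
    "navigation": {"difficult": 5, "easy": 1, "very easy": 2},
}

UNFAVORABLE = {"difficult", "little oriented", "very little oriented"}


def unfavorable_share(counts):
    """Return the fraction of answers whose label is in UNFAVORABLE."""
    total = sum(counts.values())
    negative = sum(n for label, n in counts.items() if label in UNFAVORABLE)
    return negative / total


shares = {item: unfavorable_share(counts) for item, counts in responses.items()}
for item, share in shares.items():
    print(f"{item}: {share:.1%} unfavorable")
```

Grouping the two lowest orientation ratings together is a design choice of this sketch; it makes the four items comparable on a single unfavorable/other split.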

Thus, throughout the evaluation, it was evident that participants encountered several difficulties and a lack of orientation during task execution, information identification and navigation in general. Most of the users who had previously visited other virtual museums tended to assign a high level of difficulty to the above mentioned aspects. This suggests that, compared to other similar websites, the Pre-Columbian portal presents severe usability problems, mainly related to a disorganized structure and difficult navigation.

The search effectiveness ratings ranged from effective (3 users) through neutral (2 users) to ineffective (3 users). It was the item that showed the most variety, possibly because it involved personal judgements of a more subjective nature.

Regarding the level of satisfaction with the information available on the site, more than half of the participants rated it as not very satisfactory (4 users) or totally unsatisfactory (1 user); only 3 users considered it satisfactory. Overall satisfaction with the use of the portal was low for half of the participants (4 users), and neutral (2 users) or satisfactory (2 users) for the remaining half. Finally, more than half of the users (5 users) stated that they would not use the portal again, while the rest (3 users) were neutral in this respect. A low level of satisfaction with the Pre-Columbian virtual museum prevails, mainly among the users with previous experience of similar sites, probably because they had higher expectations of this type of portal.

Among the aspects that made navigation difficult, the participants emphasized the lack of orientation due to the existence of multiple menu types, and the difficulty of perceiving, distinguishing and locating some sections and options. They also mentioned that being diverted to new tabs generates confusion about the navigation route that was intended to be followed. In addition, they underlined problems in understanding the information structure and, in some cases, its utility. They also indicated some difficulties in using the audio playback function.

These same aspects were highlighted as less preferred by participants: lack of orientation through the website, lack of structure, excessive menus and annoying new windows, poor visibility of some functions, difficulties to find certain types of information (objects’ description), and some useless data. In addition, more specific problems were mentioned, such as troubles in the audio’s playback, existence of sites and images without associated information, redundant information, inconsistency in translation from Mapuche (a local ethnic language) to Spanish and vice versa, and lack of minimalism.

As favorite elements of the portal, users highlighted the pertinence, variety and quality of textual, visual and auditory information, the graphic and chromatic aesthetics of the interface. Other appreciated aspects were the possibilities of interactive tours through the exhibition halls (being perceived as didactic) and the availability of a menu in English.

4.5 Remarks on Users’ Emotions

Throughout the evaluation, the pairs of users expressed different emotions in response to the various elements and situations they encountered during their navigation through the Pre-Columbian portal. First of all, there were general feelings of confusion about the logic and structure of the site, about the multitude of similar options that were difficult to tell apart, and about the usefulness of the information. Together with the lack of adequate feedback regarding URL changes and the location within the site, this produced a prevailing sense of disorientation in the majority of the students. All the above observations were later confirmed through the perception questionnaire. The confusion was accompanied by frustration due to the lack of efficiency and correct operation of each option, followed by feelings of resignation, expressed by some pairs of users after repeated unsuccessful attempts to return to the main menu.

Secondarily, some participants showed anxiety over the difficulty of finding the required information, or a certain level of distraction caused by some elements of the site (images, photographs), which may eventually have affected their attention to the instructions. Anxiety patterns were notable in one of the pairs of users: an asymmetric relationship dynamic emerged when one of the members adopted a dominant attitude towards his partner, “taking control” and directing the other. In spite of his limited facial expression, possibly due to concentrating on performing well as the person “in charge”, this user expressed his anxiety to complete the requested tasks quickly.

Finally, some users frequently reacted with surprise as they discovered new information and functions while navigating through the website. This type of emotional response was observed in particular in pairs with symmetrical relationship dynamics, where a climate of cooperation and sociability gave the members the confidence to express their opinions freely.

5 Conclusions

Usability and especially UX are well established concepts, but still under review. There are well known and widely used usability evaluation methods, but UX evaluation is still a challenging task. There is an overwhelming number of UX methods, and the list is still growing.

Virtual museums evolved from digital media in the pre-internet era, to on-line museums and immersive digital museums. Virtual museums’ UX is a critical concern, and virtual museums’ UX evaluation is a relevant topic.

Even if heuristic evaluation is a usability related method, it also offers valuable UX related information. Usability issues detected through a heuristic evaluation are sources of potential poor UX. General usability heuristics, such as Nielsen’s, are hard to apply when evaluating virtual museums. We suggest the use of specific heuristics instead.

UX aspects beyond usability’s dimensions should be evaluated using specific methods. Co-discovery offers valuable insights into users’ thinking. It offers a firsthand look at users’ perceptions, reactions, responses and emotions. The co-discovery experiment on the Pre-Columbian Museum highlighted severe usability issues. More importantly, it provided valuable information on users’ perceptions, thoughts and emotions.

As future work, we will extend our research to other case studies. The set of specific usability heuristics that we developed and the methodology of evaluating UX in virtual museums that we proposed require further validation.