Towards a Novel User Satisfaction Modelling for Museum Visit Recommender Systems

Modern recommender systems technology appeared in Cultural Heritage applications relatively recently, at the dawn of the 21st century. There is already a significant body of relevant work in the literature, driven primarily by large-scale research and development projects. Various approaches have been adopted from recommender systems technology, including collaborative filtering, content-based, knowledge-based and hybrid systems. In most of the approaches devoted to museum guidance, which is the focus of this paper, the museum has been assumed to be a form of gallery and the visitor has been treated primarily as a user in search of engagement and enjoyment. Free museum roaming was the main form of visit considered and targeted, while educational factors and storytelling aspects have been markedly overlooked. This paper presents a new framework for user satisfaction modelling that quantifies user satisfaction based on a weighted combination of various probabilistic factors estimated during a museum visit. The goal is to provide a model of user satisfaction that can be used by museum recommenders to guide either free-roaming visits or guided-tour scenarios for visitors of various motivations and backgrounds.


Introduction
Recommender technologies have been widely adopted in various fields of application since the late 20th century. Recommender systems are, typically, systems that exploit some form of knowledge about a group of users and their preferences over a list of items, in order to provide recommendations to known or new users about unseen items that might be of interest. A recommendation can take the form of a suggestion for any kind of interaction and engagement, depending on the context, including, for example, buying options, music selections, movie suggestions, or even path following. Generally, recommender systems can be considered a form of narrow artificial intelligence application, adapted to perform efficiently in particular cases. The goal of recommender systems is to create meaningful and personalised recommendations, which is an effective solution to the pervasive and persistent problem of information overload. This personalisation takes any form that complies with a particular user group and a context (such as educational, recreational, location-based, time-dependent) [1,2,7,35].
Recommender systems draw on theoretical background from cognitive science, approximation theory, information retrieval, forecasting, management, consumer modelling and more [1]. As computational systems that are able to provide contextually valid recommendations, they are studied as machine learning applications in information technology, and they are already being massively applied in online advertising and general item recommendation [20]. Based on the specific goal of creating valid recommendations, these systems focus on creating associations among users and items for a pre-defined motive. If all potential users are modelled as a Gaussian sample, the most obvious strategy of a naive recommender system would be to create recommendations based purely on popularity; this means ranking the items in decreasing order of (statistically estimated) popularity and suggesting the top-ranked items to the active user (the term for the current system user). Although this user model is simple, the results of such a recommender are usually valid, and thus any new recommender technology has to be compared against a popularity-based recommender that serves as the baseline.
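The popularity-based baseline described above can be sketched in a few lines. This is a minimal illustration and not a method from the paper: it assumes that popularity is estimated as the mean observed rating of each item, that missing ratings are stored as NaN, and the function name `popularity_baseline` is hypothetical.

```python
import numpy as np

def popularity_baseline(ratings, active_user, top_n=3):
    """Suggest the most popular items the active user has not rated yet.

    ratings: 2-D array (users x items) with np.nan for missing ratings.
    Popularity is assumed here to be the mean observed rating per item.
    """
    observed = ~np.isnan(ratings)
    counts = observed.sum(axis=0)
    sums = np.where(observed, ratings, 0.0).sum(axis=0)
    # Items without any rating get -inf so that they rank last.
    mean = np.where(counts > 0, sums / np.maximum(counts, 1), -np.inf)
    # Decreasing order of popularity; a stable sort keeps ties in index order.
    order = np.argsort(-mean, kind="stable")
    unseen = [int(i) for i in order if not observed[active_user, i]]
    return unseen[:top_n]

nan = np.nan
example_ratings = np.array([
    [5.0, nan, 3.0, nan],
    [4.0, 2.0, nan, nan],
    [nan, 1.0, 4.0, 5.0],
])
suggestions = popularity_baseline(example_ratings, active_user=0, top_n=2)
```

Every user receives the same ranking, filtered only by what that user has already rated, which is why the baseline needs no personal profile and remains usable for new users.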
Recommender technology is heavily based on existing data that might include user demographics and user assumptions, item features, ratings of items by the users and contextual information. In practice the situation is far from ideal, and the data are largely 'incomplete': the user data and model assumptions are incomplete and inaccurate, the item ratings are significantly scarce, and contextual data are not usually considered as having temporal dynamics and biases. Ratings are of particular interest for a specific type of recommender system, those that employ collaborative filtering, which is a method of predicting item ratings for users that have not rated these items yet. The item ratings are mathematically formulated as a ratings matrix, which is expected to be significantly sparse. Another challenge for recommender systems is the so-called cold-start problem, which relates to the difficulty of creating valid recommendations for new users (or for users who neither rate items nor express any preferences). Typically this problem is tackled by exploiting user model assumptions, following the activity of the user, and using any available user (demographic) data. Of course, recommenders also need to deal with fraud and attacks, which are forms of manipulation of the system to generate biased recommendations [3,13,29,34,35,46,47].
There is a vast bibliography regarding recommender systems in various domains and settings that has appeared since the 1990s [1,2,5,12,18,19,24,25,26,28,30,40,41]. Equally massive is the bibliography of recommender systems in cultural heritage and particularly museum recommenders, although there have been recommender systems focusing primarily on wider applications of cultural tourism. In the following section a brief account of those works is presented, focusing on the user modelling aspects and assumptions, and in the subsequent sections of the paper a new framework for user satisfaction modelling is presented that is able to provide quantitative measures for effective museum recommenders. Throughout the paper the terms 'user' and 'visitor', and also 'item' and 'exhibit', are used interchangeably, but fundamentally correspond to analogies between a generalised setting and the domain of museum visits and exhibition tours. The terms 'user' and 'item' in the wider sense are used to denote a more general notion or application.

User Satisfaction in Museum Recommenders
In 1999 the nomadic guide Hippie [38] was developed within project HIPS, as an electronic guide capable of adaptive guided exhibition presentation, which exploited visitor location data by tracking visitors with IR sensing devices. User modelling was a mixture of explicit visitor preferences, user interactions with items and localisation. An interesting feature of Hippie was the support for communication among visitors, which enhanced the social dimension of a visit.
In 2002 Sotto Voce was developed [6]. Sotto Voce was an audio guide for PDA devices that focused largely on the social aspects of a visit. The users were supposed to be socially active entities that draw satisfaction from the social interactions among them, so this work resulted in a guide that supported a mediated sharing of audio content. The approach was named eavesdropping, since it was a form of overhearing what other visitors at close distance were listening to. In this work the recommender was based on localisation and social interaction.
Some years later, project PEACH presented another approach and a mobile system for an engaging experience in a museum visit [42]. Cinematic techniques were primarily employed as means for young visitor engagement, and user localisation was also implemented for better user modelling and context building. A technique for dynamic image sequence generation and sound synchronisation was the engine for the content personalisation. Interestingly, temporal dynamics were considered in the user modelling in this case, as the content personalisation was based on a user profile that evolved during the visit.
In 2005, semantics and context-awareness were used in a set of museum guide applications for PDA devices presented in [16]. Their aim was to adapt to visitor models that included visitor behaviours, which were initialised using typical demographics and preferences. Sensing technologies were employed to provide localisation context, based on [36]. This was a case of a content-based recommender that assumed a semantics-based and proximity-based visitor satisfaction model.
In 2006 the ARCHIE mobile guide system [32] proposed a mobile guide that focused largely on social awareness. Social interaction was of paramount importance in the user modelling assumptions of this work, drawing on studies like [17]. ARCHIE supported the building of an evolving visitor profile, and relied on mechanisms similar to those in PEACH to personalise the content, along with visitor localisation. In 2007 a system was proposed that extracted semantic similarities from textual descriptions for accurate user modelling and item matching [21]. In this work proximity, text-based metadata, and popularity were combined to support the recommendation process. Interestingly, any exhibit was symbolised as a word and any path (or tour) was symbolised as a sentence. The system was able to combine the visitor activity, collaborative data, proximity and textual semantics and similarities into a naive Bayes approach to predict the most probable next exhibit that might be of interest to the active visitor. Nevertheless, and despite the insightful design and formulation of this purely machine learning approach, the reported results did not outperform a naive popularity-based recommender, and special heuristics were employed to improve the performance of the new system. In this work, the user is a dynamically evolving entity which is influenced by the semantics and proximities at each point of interest.
In 2007 a first version of the museum recommender approaches within the framework of project CHIP appeared, the CHIP interactive tour guide, which was a content-based personalisation framework [39] that focused on effective user model learning and the recommendation of web content. The next year another approach within the same project resulted in recommendations based on semantics [50], in which a concrete ontology was created to support the new content-based recommender, which was also coupled with localisation data. In 2009 an updated version was presented in an annual student research competition, focusing on the mobile implementation for museum visits [43]. Finally, in 2010 the most advanced version of the system was presented, which included routing functionalities based on connectivity graphs [49].
In 2008, within project CHAT, a content-based recommender system was developed that was able to learn user profiles from static and user-generated content [8]. The system relied on a folksonomy that integrated static and user-generated content in the form of tags. The visitor profile learning was based on probabilistic models, and the recommender was based on naive Bayes text categorisation. The same year, another interesting work focused on a newly studied aspect, that of recommending based on visitor lifestyles [31]. Although technically the method was based on typical collaborative filtering, the novel contribution was the introduction of lifestyle factors for a different user modelling approach.
In 2012 a personalised guide was presented that focused on the educational aspect of museums and on tackling information overload [23]. The activity of visitors is monitored to create and update personalisation rules online, formulating a rule-based recommender. A case-based approach was used to switch between collaborative and content-based filtering for either collective or individual patterns that were augmented with localisation. In this work user satisfaction factors, as defined in [37] and upgraded by the authors, were used to assess the system performance. During the same year, a recommender that relied on a semantic network of museum exhibits was presented [33], which generated recommendations based on item relations, user preferences, and the limited timeframe of a visit. The item relations captured the influences on any exhibit by other exhibits, and the recommender resulted in an estimate of the probability that the active user would appreciate the exhibit in question.
In 2015, personalised museum tours on smart mobile devices were presented in a system that combined content-based and collaborative filtering [9]. The system exploited relevance and contextual information to provide accurate context-aware recommendations, and adopted an ontology-based approach using CIDOC-CRM. In this system users are grouped by their demographics and the recommender is a complex hybrid system. The same year, within the AMMICO project, a museum recommender was presented that focused on enhanced audio guidance in museum tours [27]. Once more, the user satisfaction modelling was based on the assumption that linear predefined narratives with no interactions are uninteresting. As in any content-based recommender, the method creates neighbourhoods for users and items based on a modified similarity measure, adopting definitions of local communities from [15] and of communities from [11], and also taking into account a disjointed active visitor [10].
In 2016, another system appeared that was motivated by the presupposition that linear narratives in museums are uninteresting; it was a collaborative filtering-based system focusing on increasing both individual and group visitor satisfaction [44]. Technically, matrix factorisation was adopted for the collaborative filtering implementation, and the localisation context was integrated by describing the museum with a directed acyclic graph. Influenced by the work in [45], the method quantified the individual or group satisfaction by estimating the accumulated user satisfaction over viewed items. The same year a hybrid recommender was presented in the framework of the eHERITAGE project, adopting a mash-up approach that integrated an intelligent virtual assistant, Google Street View and recommender technology for virtual museum tours [48].
In 2017, within project meSch, another hybrid recommender appeared that focused on free-roaming museum visits and localisation [22]. What is interesting in this work is the integration of online and on-site user activity in the user modelling approach. Users are described in an eighteen-dimensional feature space. Logistic regression and a deep neural network were used to learn the recommendation model, the former targeting the understanding of the contribution of each feature in the recommendation, and the latter targeting the cold-start problem.
The same year, an association rule-based approach was proposed within project M5SAR [14]. This was a hybrid method for museum visit recommendation, capable of supporting multiple visitors and multiple museums and sites, based on the Apriori algorithm [4] for rules learning.

A Novel User Satisfaction Modelling Framework
Clearly, in most of the reviewed works on museum recommenders, the conceptualisation of a museum as a gallery that provides linear narratives seems to prevail. This view draws a frame on what users expect to gain by museum visits and how their engagement should be quantified or predicted and their satisfaction could be modelled. The free-roaming visitor model also seems to prevail, completing the gallery-like picture. Social engagement and participatory factors have been exploited by some of the works, but the museum role has been somehow diminished to an academic repository.
On the technical side, collaborative filtering approaches model user satisfaction in terms of item ratings and latent factors that, unfortunately, lack explainability. This approach relies on existing ratings to predict missing ratings and, consequently, to predict a user's satisfaction in interactions with unvisited items. On the other hand, content-based approaches rely on grouping users and items by similarities and on matching a user to a user group and to items, so as to predict the user's satisfaction from the interaction with unvisited items. Hybrid approaches try to combine the best of both worlds by employing domain knowledge, context awareness, assumptions and ratings to improve the prediction of items that would potentially enhance a user's satisfaction through their interaction. In the most complex cases, all those aspects are complemented with temporal dynamics to capture time-based changes in user behaviour, and possible biases in how users rate the items or in how specific items enhance user satisfaction.
This paper presents a novel approach to modelling user satisfaction based on the modelling of user dissatisfaction, or better, the probability that a user will be dissatisfied by an item during the interaction with a sequence of items. The formulated framework takes into account four basic factors, that is
- temporal dynamics in user behaviour;
- biases in an item's proximity neighbourhood;
- biases in an item's content-based neighbourhood;
- influences by obstacles, obtrusions and access difficulties.
Quantification of these factors in the novel framework is based on
- the usage of the available visit time by a user and the patterns in the way time is spent on each item, in relation to a 'required' (or desired) time for an effective appreciation of an item;
- item 'reputation' expressed by both the involved parties, the users and the stakeholder, which means that a museum exhibit can have ratings by users (which express the popularity of this item) and also ratings by the museum itself, the latter denoting a degree of significance of an exhibit by the stakeholder (how featured the item is considered by the museum);
- proximity-based attractions, corresponding to highly rated or featured items in the close neighbourhood of an item;
- content-based attractions, corresponding to highly rated or featured items that are thematically or semantically related to an item;
- proximity-based obtrusions due to crowded locations, possible scheduled or unscheduled stakeholder interventions and any other possible access difficulty.
These user satisfaction modelling assumptions and factors are, in essence, an attempt towards a mathematical formulation of the most influential stimuli for divergence during a visit, which is abstractly presented in Fig. 1. In this conceptualisation of a typical museum visit, which can be either a free-roaming visit or a guided tour, the visit is depicted as a set of points of interest, physical and semantic distances and groupings, and local obtrusions. Each item is depicted by a circle with a size proportional to the item's average popularity; the shading of each circle depicts how significant (featured) the item is considered by the stakeholder; thus dark large circles correspond to highly popular and featured items, whereas light small circles correspond to less popular and featured items. The straight arrowhead lines show the tour path, depicted with a length that corresponds to a physical distance. In addition, closely grouped circles (items) correspond to physical neighbourhoods of items and obtrusions, the latter represented by shaded squares (with a size proportional to the level of local obtrusion). A physical neighbourhood is depicted as an area enclosed by a dotted grey ellipse. Finally, content-based similar items in various locations are semantically grouped into content-based neighbourhoods, such as the one depicted with the dashed grey line. These entities are quantified in a straightforward way: (a) there are the average user ratings and also (b) the featured item scores provided by the stakeholder, (c) there are Internet of Everything (IoE) and localised notification approaches to assess obtrusions (or scheduled interventions), (d) the physical distances can naturally form proximity neighbourhoods and (e) content-based similarities among items can form similarity neighbourhoods by exploiting the items dataset and features.
This conceptualisation of a museum visit or tour is the basis for the development of probabilistic features that aim to capture the probability of a user being attracted by items not in the user's initial schedule or not in the list of items suggested by a recommender system. Experimental study in simulated settings of the aforementioned visit conceptualisation, the influencing factors and the notion of focusing on dissatisfaction estimation (rather than satisfaction) resulted in the design of four features, which are presented in the following subsections. The feature that relates to obtrusions (denoted as $p_o \in [0, 1]$) is not further analysed, as it is a pure normalised estimate of any available information relating to obstacles or access issues in the vicinity of an item.

Temporal Dynamics in User Behaviour
Time is a very important factor in human activities and even more important in time-constrained settings like a typical museum visit. In museum visits there is a clock ticking for the user, based on the interactions with items that gradually consume the initially (in many cases predetermined) available time, but with some inherent fuzziness, as this available time can be easily shortened or prolonged according to various factors. On the other hand there is another clock that is ticking according to a prescribed duration of interactions among users and items, stemming from the amount of information attached to each item and the item-to-item distances. By exploiting those two clocks the temporal behaviour of a user can be modelled based on two fairly apparent assumptions:
- a user that skips a provided item description is highly likely to have been dissatisfied by the presentation of an item or by the item itself;
- a user that spends more time than the minimum required in front of an item is more likely to have been fascinated by the item.
The situation can be modelled by adopting a piecewise continuous function $p_t(t_f)$, a composite of an exponential and a quadratic function. Here $p_t$ is the estimate of the probability of user dissatisfaction due to temporal dynamics, $\alpha \in [5, 10]$ is the steepness factor of the exponential part, and $t_f$ is a time factor computed as the ratio of the user's available time to the total 'required' tour time up to the current (or next) point of interest. This function is depicted in Fig. 2, in which the horizontal axis is the time factor $t_f$. The exponential (left side) function quantifies the dissatisfaction probability in cases in which the user skips item presentations. The inverse quadratic (right side) function quantifies the dissatisfaction probability in cases in which the user seems to spend too much time with items, and thus needs to be more 'conservative' than the exponential part.
Apparently, this function estimates a probability, since the mapping values are bounded in the [0, 1] interval, and its inverting influence leads to high values near 1 in cases in which there is significant disagreement between the estimated required times and the times actually spent by the user. Values close to 0 (around the 1 in the horizontal axis) correspond to a user that should be more or less considered 'satisfied', since the two running clocks seem to be in synchronisation. Obviously, a beta or a gamma distribution function could be used to approximate the temporal dynamics behaviour instead of this piecewise continuous function, which, nevertheless, provides better control over the modelled situation.
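One possible shape for such a temporal feature is sketched below. The functional form is an assumption made purely for illustration and is not the paper's equation: the left branch is an exponential decay rescaled to reach exactly 0 at $t_f = 1$, the right branch is an inverse quadratic, and the `spread` parameter (hypothetical) controls how slowly the right branch rises. Only the qualitative properties stated in the text are reproduced: values in $[0, 1]$, a minimum near $t_f = 1$, and high values when the two clocks disagree strongly.

```python
import math

def temporal_dissatisfaction(t_f, alpha=5.0, spread=2.0):
    """Illustrative piecewise dissatisfaction estimate p_t for a time factor t_f.

    Assumed form (not taken from the paper):
      t_f <= 1: rescaled exponential decay, 1 at t_f = 0 and 0 at t_f = 1
      t_f >  1: inverse quadratic, 0 at t_f = 1 and approaching 1 slowly
    """
    if t_f < 0:
        raise ValueError("the time factor must be non-negative")
    if t_f <= 1.0:
        floor = math.exp(-alpha)
        return (math.exp(-alpha * t_f) - floor) / (1.0 - floor)
    excess = (t_f - 1.0) / spread
    return 1.0 - 1.0 / (1.0 + excess * excess)

# Sample the curve at a few time factors.
samples = {t: temporal_dissatisfaction(t) for t in (0.0, 0.5, 1.0, 2.0, 3.0)}
```

With these defaults the estimate is 1 when $t_f = 0$, 0 when the two clocks agree, and 0.5 at $t_f = 3$; a larger `alpha` within $[5, 10]$ concentrates the left-hand penalty closer to $t_f = 0$.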

Proximity Dynamics in User Behaviour
As popular or significant items in the physical proximity of a viewed item are likely to become attractions for the active user, they have the potential to divert the user from a predefined (in any way or concept) course. Thus their influence in their neighbourhood needs to be modelled and quantified as a probability of producing dissatisfaction with the initially intended item interaction. Simply put, the satisfaction from an item may be reduced by a popular or significant item in its immediate neighbourhood. This can be modelled as a function of the distances, the popularities and the featured item scores. Here $p_p$ is the estimate of the probability of user dissatisfaction by item $i$ due to proximity dynamics, computed as a weighted summation of three features $p_f^{(k)}$, $k = 1, 2, 3$, and $\beta$ are heuristic weights for the three features (scaled in the range $[0, 1]$). Essentially, $p_f^{(1)}$ is the weighted summation of the average popularity and featured score of the examined item $i$, $p_f^{(2)}$ is the same quantity normalised by the summation of all the corresponding weighted summations in the proximity, and $p_f^{(3)}$ is the normalisation with respect to the maximum influence in the proximity; $R^{(k)}$ is the average rating of item $k$ and $F^{(k)}$ is the featured item score for item $k$. The weights in $\beta$ and in $\omega$ are either predefined based on domain knowledge and experimental results, or can be learned within a machine learning framework (which is not the focus of this paper).
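The three proximity features can be illustrated with a short sketch that follows the verbal description literally. Several details are assumptions rather than the paper's definitions: ratings and featured scores are taken to be already normalised to $[0, 1]$, $\omega$ is taken to be the pair of weights combining popularity and featured score, the neighbourhood is taken to include item $i$ itself so that both ratios stay within $[0, 1]$, and both weight sets are taken to sum to 1. How distances enter and how the original equations orient these features towards dissatisfaction is not reproduced here.

```python
def proximity_features(i, neighbourhood, R, F, omega=(0.5, 0.5)):
    """Return (p_f1, p_f2, p_f3) for item i, read literally from the text.

    R, F: dicts mapping item id -> average rating / featured score in [0, 1].
    neighbourhood: ids of the items physically close to item i.
    """
    def influence(k):
        # Weighted summation of popularity and featured score (omega assumed).
        return omega[0] * R[k] + omega[1] * F[k]

    members = set(neighbourhood) | {i}   # assumption: item i belongs to its own neighbourhood
    infl = {k: influence(k) for k in members}
    total = sum(infl.values())
    peak = max(infl.values())
    p_f1 = infl[i]
    p_f2 = p_f1 / total if total > 0 else 0.0   # normalised by the neighbourhood sum
    p_f3 = p_f1 / peak if peak > 0 else 0.0     # normalised by the strongest influence
    return p_f1, p_f2, p_f3

def proximity_dissatisfaction(features, beta=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted summation of the three features with heuristic weights beta."""
    return sum(b * f for b, f in zip(beta, features))

R = {0: 0.4, 1: 0.8, 2: 0.6}
F = {0: 0.2, 1: 1.0, 2: 0.4}
features = proximity_features(0, [1, 2], R, F)
p_p = proximity_dissatisfaction(features)
```

Because every feature lies in $[0, 1]$ and the assumed $\beta$ weights sum to 1, the combined value also stays in $[0, 1]$.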

Content-Based Dynamics in User Behaviour
Content-based and semantic similarities among items have already been highlighted and studied in the relevant research, and many works try to take into account that users might be biased towards (or against) specific families or groups of items that are possibly linked in multiple ways. This item linkage can be based on ontological connections, hierarchical relations, or even description similarities. When a recommender is hinted by the repetitive preference of the active user towards similar items (and possibly not nearby or popular items), then it is reasonable to assume that the user draws satisfaction from the specific type or group of items, and the recommender should focus on recommending items based on some learned item clustering. Nevertheless, even under this line of reasoning that promotes similar items, the impact of proximity cannot be totally neglected. A fairly relevant item in the vicinity of the user might be a more probable target than a more relevant item at a distance. Thus the quantification of content-based dynamics in the modelling of user satisfaction has to combine both content similarity and distance. In addition, there is still a strong influence by those items that are considered to be popular, and this influence also has to be accounted for. In the presented framework the content-based user satisfaction model for the current (or next) item $i$ is defined in terms of the following quantities: $p_c$ is the estimate of the probability of user dissatisfaction due to content-based dynamics (influenced by proximities and popularities), $\mathrm{sim}^{(j)}$ is the content-based similarity of the active item $i$ with item $j$ that is in the high similarity (content-based) neighbourhood of $i$, $\mathrm{prox}^{(j)}$ is the normalised inverse distance of item $i$ from item $j$, and $\mathrm{pop}^{(j)}$ is the average popularity of item $j$. The estimate takes into account the strongest influence in the neighbourhood of similar items and results in a value for $p_c \in [0, 1]$.
Again, heuristically predefined or learned weights ($\gamma$) are used in order to emphasise the influence of similarity, proximity or popularity in this composite feature computation.
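A minimal sketch of such a content-based feature is given below. The combination rule is an assumption and not the paper's equation: each similar item $j$ contributes a $\gamma$-weighted sum of its similarity, normalised inverse distance and popularity, the weights are assumed to sum to 1, and the strongest contribution in the similarity neighbourhood is taken as $p_c$. The function name is hypothetical.

```python
def content_dissatisfaction(neighbours, gamma=(0.5, 0.3, 0.2)):
    """Illustrative p_c: strongest weighted attraction among similar items.

    neighbours: iterable of (sim, prox, pop) triples, each value in [0, 1],
                one triple per item j in the similarity neighbourhood of item i.
    gamma: weights for similarity, proximity and popularity (assumed to sum to 1).
    """
    g_sim, g_prox, g_pop = gamma
    scores = [g_sim * sim + g_prox * prox + g_pop * pop
              for sim, prox, pop in neighbours]
    # An empty neighbourhood exerts no attraction.
    return max(scores, default=0.0)

similar_items = [
    (0.9, 0.1, 0.5),   # very similar but far away
    (0.6, 0.9, 0.8),   # fairly similar, nearby and popular
]
p_c = content_dissatisfaction(similar_items)
```

In this example the nearby, fairly similar item dominates the distant, highly similar one, which mirrors the observation that a fairly relevant item in the vicinity can be a more probable target than a more relevant item at a distance.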

The Composite Feature for User Satisfaction Modelling
The four probabilistic factors (or features) defined in this framework can be used to create either a composite (vectorised) four-dimensional feature or a final one-dimensional feature that models user dissatisfaction at any point during a museum visit. The estimation of all four features at each new item creates an estimate of the dissatisfaction for the next probable items, whether they come from a list of items created by collaborative filtering, a list generated by content-based approaches, or a knowledge-based item list, and thus it can be used to propose adjustments to the predictions for the minimisation of the dissatisfaction. As the proposed framework focuses on dissatisfaction rather than on user satisfaction modelling, it assumes a minimax approach, since it can be used for minimising the possible worst-case loss, that is for minimising the maximum possible estimated dissatisfaction, and is inherently a 'defensive' or conservative approach. Formally, the final dissatisfaction model can be expressed as a four-dimensional vector of the features $(p_t, p_p, p_c, p_o)$ or as a composite one-dimensional factor, in which the $\delta^{(k)}$ values are weights imposed on the features that can either be heuristic or learned with an iterative machine learning technique. In order to visualise how this user satisfaction modelling framework performs, a large-scale simulation has been conducted, including 1,000 items and 10,000 users, with about 1% of the item ratings available. Indicatively, Fig. 3 presents a typical representation of all estimated features during a realistic museum tour, along with all proximities, similarities and obstacles at each point of interest, in three dimensions (real-world relative locations).
In this figure, the green circles are the target items, blue circles represent proximity-related items, red circles represent content-based similar items, magenta squares depict the localised obtrusions and grey squares with numbers represent the composite overall dissatisfaction factor estimated at each item. The strong dotted line segments connect the target (green) items and represent the distances from point to point. In addition, light red dashed lines stemming from the target items connect with the similar items in various locations, and the number on each such line corresponds to the strength of the attraction ("1" being the strongest). Finally, as defined in the introduction of the framework, the size of the circles corresponds to the popularity of each item and the saturation of the colours of the circles corresponds to the featured score. For tracking the visit direction, all strong dotted line segments are labelled with the connection they correspond to, i.e. the label "6 → 7" denotes the pathway from the sixth to the seventh item in the tour. Another visualisation of the evolution of the dynamics estimated by the four features and the final composite feature is shown in the graph of Fig. 4. Here the horizontal axis represents the items visited in the sequence, from left to right, and each feature's evolution is shown with a different area plot for comparison. The foremost bar graph is the composite dissatisfaction estimate.
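The composite factor and its minimax use can be sketched as follows. This is an illustration under stated assumptions, not the paper's formulation: the composite is taken to be a $\delta$-weighted sum of the four features with weights summing to 1, and the minimax step is read as choosing, among candidate tours, the one whose worst per-item dissatisfaction is smallest. The function names and the candidate tours are hypothetical.

```python
def composite_dissatisfaction(features, delta=(0.25, 0.25, 0.25, 0.25)):
    """Weighted sum of (p_t, p_p, p_c, p_o); delta is assumed to sum to 1."""
    return sum(d * p for d, p in zip(delta, features))

def minimax_choice(candidate_tours, delta=(0.25, 0.25, 0.25, 0.25)):
    """Pick the tour that minimises the maximum composite dissatisfaction.

    candidate_tours: dict mapping a tour name to a list of four-dimensional
                     feature vectors, one per item in that tour.
    """
    def worst_case(name):
        return max(composite_dissatisfaction(f, delta)
                   for f in candidate_tours[name])
    return min(candidate_tours, key=worst_case)

tours = {
    "A": [(0.1, 0.2, 0.3, 0.0), (0.9, 0.8, 0.7, 0.6)],   # one very risky item
    "B": [(0.4, 0.4, 0.4, 0.4), (0.5, 0.5, 0.5, 0.5)],   # uniformly moderate
}
best = minimax_choice(tours)
```

Tour A contains the single least dissatisfying item, yet tour B is selected because its worst item is milder, which is the 'defensive' behaviour the minimax reading implies.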

Conclusion
Intelligent recommenders have appeared in Cultural Heritage applications during the last decade, with promising results. Cultural tourism and museums in particular have been at the centre of the development of technologies and techniques to tackle information overload and increase user satisfaction, whatever the user's motivation in the interaction with cultural heritage. This paper reviewed the recent history of recommender systems in museums from the point of view of how user satisfaction has been modelled, and presented a novel user satisfaction modelling framework that is able to capture temporal, proximity, content-based and obtrusion dynamics in user behaviour during a museum visit, either in the form of an organised predefined narrative or a typical free-roaming visit.
The modelling framework was designed primarily having in mind what was missed by the previous works, the support of scenario based (storytelling) guided tours and a needed augmentation of the main notions of satisfaction applicable to any case of individual museum visit. The modelling framework has already been tested with a large amount of simulated data within an overall hybrid museum recommender and interesting preliminary results have been reported in terms of how it can be used as a means to support minimax-based strategies (user dissatisfaction minimisation). It has also been tested for another class of applications, that of the museum curators and visit program designers, where it can provide aid to identify the weak and strong points in each tour or individual item interactions so that decisions may be supported for changes and rearrangements.