An online research approach for a dual perspective analysis of brand associations in art museums

The paper develops a research approach that combines digital ethnography with text mining to explore consumers’ perception of a brand and the degree of alignment between brand identity and image. In particular, the paper investigates the alignment between the art museum’s brand identity and the brand image emerging from visitors’ narratives of their experience. The study adopts a mixed methodology based on netnography and text mining techniques. The analysis concerns an art museum’s brand, with the case of the “Opera del Duomo Museum” in Florence. The methodological approach enables a combined investigation of user-generated content in online communities and the company’s online brand communication, contributing to identifying branding actions that can be taken to increase the brand alignment. It also enables the measurement of the degree of alignment between museums and visitors on common brand themes. Specific indicators of alignment are provided. A key point is the replicability of the model in other contexts of analysis in which the content produced by consumers online is relevant and readily available, such as fashion or food.


Introduction
visitors and increasing competition with other museums or cultural enterprises in attracting visitors and tourist fluxes (Gilmore and Rentschler, 2002). In this scenario, brand management is strategic for museums, as demonstrated by cases of success such as the Guggenheim or the British Museum. The study of art museums brands represents the integration of traditional museum management and marketing approaches, and the understanding of museums as brands "opens up new dimensions for understanding the marketing of museums and galleries" (Caldwell, 2000, p. 28).
Studies on art museums have dealt with brand associations, namely the complex of values and images that consumers associate with art museums (Caldwell, 2000;Caldwell and Coshall, 2002;Scott, 2008), and brand identity, namely how the organization defines itself and what it strives for (Aaker, 1996). Museums with a strong identity can better compete, promote exhibitions and perform other tasks connected to their long-term objectives (Pusa and Uusitalo, 2014). Aaker (1996) separates the concept of brand identity from that of brand image, which refers to how consumers perceive a brand as distinct from its competitors and what they associate with a brand. In this regard, the literature on branding highlights the need for an alignment between brand image and brand identity (Venkatraman, 1989). Surprisingly, most studies in this area only investigate consumer-perceived brand image without comparing it with what is expressed by a company to define its identity, while only a few studies investigate both perspectives.
Following this line of reasoning, the paper aims to present an approach that combines digital ethnography (netnography) with text mining to explore consumers' perceptions of a brand and the alignment between the company's brand identity and the brand image defined by consumers. More specifically, the paper focuses on art museums' brand identity and its alignment with the brand image as it emerges from visitors' narratives of their tourism experience. In other words, the paper compares what art museums communicate in terms of brand identity with visitors' perception of art museum brands emerging from their narratives of tourism experience. To this aim, the study develops an online research approach based on qualitative (netnography) and quantitative (text mining) techniques in which textual data are collected in online contexts, such as blogs, forums, wikis and social networks, where users interact to produce and mutually exchange information (Schau et al., 2009;Chan and Li, 2010). Thus, the paper investigates content generated by tourists (from now on, user-generated content, UGC), namely travel-related content created and uploaded by tourists online, mainly in the form of texts, as a relevant means through which tourists express themselves and narrate their tourism experience online. UGC is particularly important for marketers when it is brand-related, since it affects brand image through brand associations and shapes consumer brand perceptions (Zhang and Sarvary, 2014). Studies that use user-generated content for the analysis of a brand are quite common in the marketing domain (Archak et al., 2011;Tirunillai and Tellis, 2012).
UGC is recognized as increasingly influential in tourism management, especially in the form of visualization of visitors' experience through photographs shared in image-based social media (Konijn et al., 2016;Acuti et al., 2018;Milanesi and Guercini, 2020), as well as in the form of reviews of visitors' experience published in thematic websites such as TripAdvisor (O'Connor, 2010). UGC is characterized by credibility, virality and accessibility that render UGC a powerful medium in shaping the online image of a cultural institution or a destination (Mak, 2017).
However, even if museums play a central role in developing tourism, to the best of the authors' knowledge, studies on UGC related to both museums and museum brands are limited. To fill these research gaps, the paper investigates the case of the "Opera del Duomo Museum" in Florence (Italy), one of the main attractions of the city, and proposes a methodology that combines netnography and text mining. By identifying appropriate online communities (Tripadvisor, Yelp and GetYourGuide) where visitors virtually interact, netnography provides narratives (texts) of tourism experience (Westwood, 2004). Once consumer textual data are collected together with company textual data (from the company website), text mining is used to extrapolate information utilizing computer applications. The contribution of the paper is twofold. In addition to proposing a novel methodology for the study of a brand, based on user-generated content and the use of text mining tools, it contributes to the understanding of art museums as brands by investigating the alignment between brand identity and brand image from a dual perspective (museum vs visitors).
The paper is structured as follows: in the next section, we present the theoretical background of the study; in section 3, we introduce the methodology and the research process based on data source identification, data collection, data processing and data analysis. We then present and discuss the main results of the study. The paper concludes with implications, limits and future research agenda.

Brand associations and alignment: Review, research gap and methodological issues
A central issue in marketing research is the study of consumer brand knowledge and brand associations that synthesize consumers' brand perceptions (Aaker, 1991). Consumer brand knowledge is related to the cognitive representation of the brand (Peter and Olson, 2001). Firstly, companies need to develop a clear understanding of consumer brand knowledge to implement marketing activities that will improve brand equity (Keller, 2003). Brand knowledge is defined in terms of "the personal meaning about a brand stored in consumer memory, that is, all descriptive and evaluative brand-related information" (Keller, 2003, p. 596). Thus, consumer brand knowledge is a strategic resource to be controlled and managed by companies over time (Romaniuk and Gaillard, 2007). Keller's (1998) model proposes that brand knowledge is made of brand awareness and brand image. Brand image is further detailed due to its complex nature. Brand image is the result of favourable, strong, unique brand associations held by consumers. Various types of brand associations are depicted in the model by Keller (1998): product-related attributes; non-product related attributes, such as price, user/usage imagery, brand personality, and feelings and experiences; benefits (functional, experiential and symbolic); attitudes. Thus, brand associations compose the brand image as perceptual nodes that consumers associate with the brand and keep in their minds. Companies look for strong, positive and unique brand associations to increase brand equity (Broniarczyk and Gershoff, 2003). The perception of brand uniqueness produces brand differentiation (Pechmann and Ratneshwar, 1991) and has a positive impact on consumer choices and brand performance (Romaniuk and Gaillard, 2007). However, such a positive relationship is not taken for granted, since it implies a high degree of alignment between consumers' and the company's brand association. 
Nandan (2005) describes alignment as a situation in which "the consumer has great understanding of (an agreement with) the brand message" (p. 271). Therefore, studies on branding have highlighted the importance of congruence and brand association matching. In particular, congruence is defined as "the extent to which a brand association shares content and meaning with another brand association.
[…] The congruence among brand associations determines the 'cohesiveness' of the brand image-that is, the extent to which the brand image is characterized by associations or subsets of associations that share meaning" (Keller, 1993, p. 7). There can be brand image incongruity, defined by Sjödin and Törn (2006) as the mismatch between brand communication and existing brand associations. In this regard, studies have been conducted on the fit between brand attitude and information conveyed through brand extension (Grime et al., 2002), as well as the match between brand associations and advertising (Dahlén et al., 2005) or sponsorship (Gwinner and Eaton, 1999). All these studies concern brand association matching that implies the degree of match/mismatch between brand associations defined by the company (company-defined brand associations) and brand associations perceived by consumers (consumer-perceived brand image).
Thus, companies look for an alignment (Venkatraman, 1989) between brand image and brand identity, which includes all the defining attributes that a company seeks to communicate externally (Keller, 2003). Despite companies' intense efforts to communicate effectively, consumer brand perceptions may assume features that differ from those that companies seek to transfer as components of the brand identity. Ross and Harradine (2011) addressed the issue of intended vs actual perceptions of brands and found a substantial misalignment between the perception of a clothing brand by its owners and the perception by consumers. Madhavaram et al. (2005) observed that brand association matching is difficult to achieve because communication to transfer brand identity is complex. However, companies that manage to achieve a high degree of alignment can better reshape the network of brand associations in the consumer's mind than competitors, with positive implications for brand image and brand equity management, positioning, communications and perceptual competition analysis (Till et al., 2011;Crawford Camiciottoli et al., 2014).
The evolution of the technological and sociological context implies new challenges and opportunities for research in this field, pushing towards greater integration and hybridization between qualitative and quantitative research methodologies (Guercini, 2014). The adoption of novel methodologies allows filling a relevant gap in the field of branding research, one that arises in offline brand studies and persists in recent online brand studies. Extant studies that explore the impact of offline communication on brand image limit their investigation to the ensuing consumer-perceived brand associations, without a comparison with the company-defined brand associations (Meenaghan, 1995;Gwinner and Eaton, 1999;Underwood, 2003). No study takes into account a double perspective, not even those, to cite a few, on brand image and store image (Yoo et al., 2000), consumer brand perception following decisions of brand extension (Salinas and Pérez, 2009), and country-of-origin effects (Diamantopoulos et al., 2011). We find the same research gap in online brand studies.
The advent of social media has stressed the multi-vocal nature of the brand, which is related to the participatory, collaborative and socially-linked behaviours of consumers, who serve as creators of brand stories and thus determine brand associations (Gensler et al., 2013). The risk for companies is to lose control over their brand communication because of the multi-vocal nature of brand authorship in online contexts, where the voice of consumers is the most relevant. Consumers experience multiple virtual settings (blogs, forums, social networks) as new market spaces where they spontaneously interact, exchange information, opinions and perceptions (Kozinets, 2002) in the form of user-generated content (UGC) and give voice to their relationship with products and brands (Ramaswamy and Ozcan, 2016). Moreover, social media have a highly pervasive impact on the market since they are digital, visible, ubiquitous, available in real-time and dynamic (Hennig-Thurau et al., 2010). Such characteristics and trends cannot be ignored either by marketing researchers or by brand managers. The latter need to deal with a critical question, namely to "understand how to successfully coordinate consumer and firm generated brand stories" (Gensler et al., 2013, p. 243). Marketing researchers need to develop methods that measure the brand association alignment and tools to deal with UGC. Only a limited number of studies propose a dual perspective. An attempt to combine the voice of consumers with what the company communicates emerges in the studies of Crawford Camiciottoli et al. (2014) and Költringer and Dickinger (2015). Neither provides measurements of brand association alignment, but both demonstrate alternative ways to extract brand identity and brand image through web content mining. Ranfagni et al. (2016) adopt a dual perspective and develop indicators to measure the alignment between the brand personality that consumers perceive and what a company intends to communicate.
Following this line of reasoning, we develop a mixed methodology that includes netnography and text mining techniques and allows a combined investigation of UGC in online communities and the company's online brand communication.

The study of art museum's brand
When it comes to museums, research on the topic has increasingly emphasized the market orientation and the application of marketing management in the museum sector (Caldwell, 2000;Camarero Izquierdo and José Garrido Samaniego, 2007), and specific attention has been given to brand management (Caldwell and Coshall, 2002;Wallace, 2006;Liu et al., 2015). Studies on art museums have dealt with brand associations, namely the complex of values and images that consumers associate with art museums (Caldwell, 2000;Caldwell and Coshall, 2002;Scott, 2008). In this regard, Caldwell and Coshall (2002) consider art museums as the locus of both product and brand associations, intended as visitors' attitudes, impressions, dispositions, or mental constructs. The two authors note that, in the case of art museums, a peculiar characteristic of brand associations is that they are developed by visitors both to the museum as a brand name and to the experience of visiting the collections. Such research on brand associations has been conducted with the repertory grid method that allows detecting the cognitive constructs relevant to the consumer's image of the museum, and what these constructs tell about the associations which visitors have developed about art museums as brands.
Brand associations that visitors develop constitute a set of expectations about a particular museum (Caldwell and Coshall, 2002). Research in this field has also dealt with brand identity. Pusa and Uusitalo (2014) discuss ways to create brand identity in art museums and rely on the four general brand dimensions described by Aaker (1996), namely product, person, symbol and organization, discussed in the context of an art museum. Art museums can be perceived as products, in which collections and exhibitions form their core product, and museum services (e.g., museum shops or educational programs) form the augmented product. The view of museums as persons implies the building of a brand personality. Art museums have a personality that is partly defined by the sum of personalities of exhibited artists (Schroeder, 2005). The museum's manager, curator or founder, especially for smaller organizations, can contribute to building brand personality by projecting their characteristics and artistic taste onto the museum.
The creation of brand identity for museums also involves symbolism. Scott (2008) states that museums provide many symbolic and intangible benefits for communities. Components of the museum brand as symbol include the brand name, the brand inheritance, and the museum building and its architecture. Art museums can also be seen as organizational brands, in the sense that certain organizational characteristics, such as values, culture, norms, people and behaviours, contribute to the creation of brand identity in museums. Art museums that manage to build a strong brand identity can compete, promote exhibitions and achieve long-term objectives (Pusa and Uusitalo, 2014). Even for art museums, the study of brand association matching, and thus of the relation between brand identity and brand image, can be relevant. Each art museum has a distinctive identity, but it may fail to communicate it, with consequences for the brand image based on visitors' perceptions and for how visitors distinguish the museum from its competitors.

Methodology
We propose a methodological approach in which the qualitative technique of digital ethnography is reinforced by the integration of quantitative text mining techniques (Crawford Camiciottoli et al., 2014). More specifically, digital ethnography (or "netnography") may be seen as an instrument to understand "tastes, desires, relevant symbol systems, and decision-making influences of particular consumers and consumer groups" (Kozinets 2002, p. 61). Thus, netnography is a qualitative method used to explore consumer interactions in virtual communities through computer-mediated discourses, rather than data collected from live encounters (Kozinets, 2002;La Rocca et al., 2014).
Text mining (Hearst, 1999;Witten, 2005) is used to extrapolate information from relatively large amounts of electronically stored textual data utilizing computer applications. As with data mining, the distinctive feature of text mining is the capacity to extract new and previously unknown information from textual data, thus offering far more than simple information retrieval (Hearst, 1999). To achieve this, metadata (i.e., data about data) is automatically inserted into text files, often in the form of tags that label items according to specific criteria, such as part-of-speech category, thematic area or semantic domain. This makes it possible to reveal trends and patterns across textual data that could not otherwise be discovered, particularly when dealing with relatively large amounts of text. Examples of text mining are becoming increasingly visible on websites, for example, tag or word clouds that provide a visual summary of the thematic content of web texts, or graphical visualizations of the cycle of news on a given topic over time produced by meme-tracking software.
We selected the case of "Opera del Duomo Museum" in Florence. The museum is one of the main attractions in Florence. The numbers give an idea of what is really behind this monument: 6000 sq. m. of showroom space, over 1000 years of history in Florence, more than 750 works of art on display in 25 rooms on 3 floors. The Museum attracts impressive tourist fluxes with thousands of visitors each year, who participate as active online users reporting their tourism experiences and perceptions of the museum on online platforms. To explore consumers' perception of the "Opera del Duomo Museum" and investigate the alignment between the company's brand identity and the brand image perceived by consumers, we followed methodological steps that are synthesized in Fig. 1 and further detailed in the next sub-sections. More specifically, we followed the netnographic principles for the identification of data sources and data collection, while we used text mining techniques for data processing. Then, we manually built two alignment ratios based on the results of data processing.

Data source identification
The process begins with the identification of data sources to collect online textual data. Intending to compare the art museum's brand identity and the brand image emerging from visitors' narratives of their tourism experience, we selected sources of institutional communication (Capriotti and Kuklinski, 2012) transferred from the museum to customers (visitors), and sources containing UGC in terms of ideas, perceptions and opinions that customers (visitors) have about the experiences the museum offers them, such as services and cultural products. Thus, while on the museum side, we collected data from the company website, on the visitors' side, we collected data from virtual interactive spaces (Kozinets, 2010). The latter were selected based on a high ranking in terms of traffic data from Alexa (a tool that provides web traffic statistics), membership and incoming links (Bardzell et al. 2009). More specifically, we chose UGC from Tripadvisor.com, Yelp.com and GetYourGuide.com. Tripadvisor.com is probably the main travel website for information and reviews. GetYourGuide.com offers more circumscribed information and reviews focused on specific actors of a city's tourism offer. Yelp.com is a generalist website; it contains opinions and perceptions on experiences lived in places of cultural entertainment.

Fig. 1 The methodological steps: netnography and text-mining

Data collection
Company data were manually collected from the above-mentioned data source (the company website) and stored in a text file (5900 words, company dataset). Data collection was facilitated by the tagging and search tools provided on the selected websites. Because of their greater number, visitors' data from the three data sources were collected with DataMiner, a Google Chrome extension that helps researchers crawl online data (Liu and Zhang, 2012). The textual data were downloaded automatically and saved in an Excel file. They were then copied and pasted into a text file (67,157 words, visitors' dataset). For the analysis, only texts in English were collected, covering a timeframe from January 2016 to May 2018.

Data processing and co-occurrence analysis
Data were processed with text mining tools. We started by analysing the company text file using the T-LAB software, an all-in-one set of linguistic, statistical and graphical tools for text analysis. We mapped the co-occurrence relationships between keywords. Before beginning the analysis, the T-LAB software performed a "linguistic normalization" (Salton, 1989) to correct ambiguous words (typing errors, slang terms, abbreviations), carry out cleaning actions (e.g., the elimination of excess blank spaces, apostrophes, and additional spaces after punctuation marks, etc.), and convert multi-words into unitary strings (e.g., "Opera Duomo Museum" became "Opera_Duomo_Museum"). Then, it executed the text "lemmatization" (Karypis et al., 2000): it turned words into entries corresponding to lemmas. A lemma defines a set of words having the same lexical root (or lexeme) and belonging to the same grammatical category (verb, adjective, etc.). Thus, lemmatization acts by transforming, for example, verb forms into the base form ("speaks" and "speaking" become "speak") and plural nouns into the singular form. Word co-occurrences are computed between lemmas within elementary contexts (henceforth EC) (e.g., sentences, fragments, paragraphs). Their analysis allowed us to identify the total number of ECs where a lemma is used together with (co-occurs with) a target lemma (Doddington, 2002). In our analysis, the target lemma was Opera_Duomo_Museum (henceforth ODM).
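As an illustration only, the normalization, lemmatization and EC-level co-occurrence counting described above can be sketched in a few lines of Python. This is not the T-LAB implementation: the lemma table, the multi-word list and the function names are hypothetical, and a real lemmatizer relies on much richer dictionaries.

```python
import re
from collections import Counter

# Hypothetical toy resources; T-LAB uses its own dictionaries.
LEMMAS = {"works": "work", "exhibits": "exhibit", "exhibited": "exhibit", "doors": "door"}
MULTIWORDS = {"opera duomo museum": "opera_duomo_museum"}


def normalize(text):
    """Lowercase, collapse excess whitespace, fuse multi-words into unitary strings."""
    text = re.sub(r"\s+", " ", text.lower()).strip()
    for phrase, token in MULTIWORDS.items():
        text = text.replace(phrase, token)
    return text


def lemmatize(text):
    """Split into alphabetic tokens (underscores kept) and map each to its lemma."""
    tokens = re.findall(r"[^\W\d]+", text)
    return [LEMMAS.get(tok, tok) for tok in tokens]


def cooccurrence_counts(contexts, target):
    """For each lemma, count the ECs in which it appears together with the target lemma."""
    counts = Counter()
    for ec in contexts:
        lemmas = set(lemmatize(normalize(ec)))
        if target in lemmas:
            counts.update(lemmas - {target})
    return counts


# Three made-up elementary contexts (here, sentences).
contexts = [
    "The Opera Duomo Museum exhibits great works.",
    "Works of art in the Opera  Duomo Museum.",
    "The doors are beautiful.",
]
counts = cooccurrence_counts(contexts, "opera_duomo_museum")
```

In this toy corpus the lemma "work" co-occurs with the target in two ECs and "exhibit" in one, while "door" never does, because the third sentence does not mention the target.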
The co-occurrences as basic linguistic tools can be interpreted in different ways. Netzer et al. (2012) propose an interpretation assessing that "the proximity or similarity between several terms [is] based on the frequency of their co-occurrences in the text" (p. 523). However, it is possible to go beyond this simple proximity. Given a certain p value (0.05) and degrees of freedom (df = 1), the value of the chi-square relative to co-occurrences that T-LAB provides also indicates the statistical significance of the relationship between lemmas. Moreover, the set of relations between a target lemma and other lemmas can be designated by an associative semantic network. According to Akama, Miyake et al. (2009), semantic networks are often "used to represent linguistic information in intuitively accessible forms with a graph consisting of vertices that represent words or concepts and edges that represent lexical relationships, such as adjacency, association, or co-occurrence" (p. 4). Co-occurrence analysis can identify key concepts in a language corpus and determine how these concepts are semantically interrelated (Andéhn et al., 2014). The resulting network map can be seen as an assemblage of concepts organized around themes that express a representation of an underlying thought (Trochim, 1989). Thus, by extracting co-occurrences from the company file, we discovered the themes that the ODM deals with in its brand communications. Then, by combining the frequency of each of them with the values of the relative chi-square (> 3.84), it was possible to select the most significant ones. Subsequently, we also performed the same analysis on the visitor text file. According to Netzer et al. (2012), "the information that consumers voluntarily and willingly post on consumer forums and message boards opens a window into their associative and semantic networks, as reflected by co-occurrences of brand references and descriptions of those brands in the written text" (p. 523).
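The threshold of 3.84 is the critical value of the chi-square distribution with one degree of freedom at p = 0.05. A minimal sketch of such a test follows, assuming a standard Pearson chi-square on a 2x2 table of EC counts without continuity correction; the exact computation performed by T-LAB may differ, and the function and argument names are hypothetical.

```python
CRITICAL_DF1_P05 = 3.84  # chi-square critical value for df = 1 at p = 0.05


def chi_square_2x2(both, target_only, lemma_only, neither):
    """Pearson chi-square for a 2x2 table of EC counts.

    both        : ECs containing the target lemma and the candidate lemma
    target_only : ECs containing only the target lemma
    lemma_only  : ECs containing only the candidate lemma
    neither     : ECs containing neither
    """
    n = both + target_only + lemma_only + neither
    row1, row2 = both + target_only, lemma_only + neither
    col1, col2 = both + lemma_only, target_only + neither
    denom = row1 * row2 * col1 * col2
    if denom == 0:
        return 0.0  # degenerate table: no association can be measured
    return n * (both * neither - target_only * lemma_only) ** 2 / denom


def is_significant(both, target_only, lemma_only, neither):
    """True when the co-occurrence exceeds the df = 1, p = 0.05 threshold."""
    return chi_square_2x2(both, target_only, lemma_only, neither) > CRITICAL_DF1_P05
```

With made-up counts, `chi_square_2x2(20, 5, 5, 20)` gives 18.0, well above the threshold, whereas a perfectly balanced table such as `(10, 10, 10, 10)` gives 0.0 and would be discarded.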
From the co-occurrences resulting from visitors' text files there emerged a semantic network identifying the brand themes that visitors deal with as they narrate the experience they had at the museum. The comparison of this network with that derived from the co-occurrence analysis performed on the company file generated an intersection area facilitating the identification of common brand themes. These themes correspond to common lemmas that co-occur with the target lemma ODM both in the communications of the museum and in the narratives of the visitors (Alpha, Beta and Gamma). At a later data processing stage, these common lemmas assumed the role of target lemmas. Moreover, using a co-occurrence analysis on the company file and visitors' file, we identified the ECs in which the common brand lemmas co-occur with other lemmas. For each of them, there emerged two semantic networks expressing associative items that the company and visitors deal with when talking about the related brand theme. The comparison of these semantic networks, shown in Fig. 2, led to another intersection area identifying the second level of common themes as a subcategorization of a first-level brand theme (these are L, M, N for Alpha; A, B, S, T for Beta; D for Gamma).
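The two intersection steps can be read as set intersections. The sketch below uses the placeholder labels of the text (Alpha, Beta, Gamma and the second-level items); the non-shared lemmas X, Y, P, Q, R, U and V, as well as the function names, are hypothetical and serve only to show what falls outside the intersection area.

```python
def common_lemmas(company_lemmas, visitor_lemmas):
    """First level: lemmas that co-occur with the target on both sides."""
    return set(company_lemmas) & set(visitor_lemmas)


def second_level_common(company_net, visitor_net):
    """Second level: for each theme present on both sides, intersect its associated lemmas."""
    shared_themes = company_net.keys() & visitor_net.keys()
    return {theme: company_net[theme] & visitor_net[theme] for theme in shared_themes}


# Lemmas co-occurring with ODM on each side (X and Y are not shared).
company_first = {"Alpha", "Beta", "Gamma", "X"}
visitor_first = {"Alpha", "Beta", "Gamma", "Y"}
first_level = common_lemmas(company_first, visitor_first)

# Associative networks built around each common theme used as target lemma.
company_net = {"Alpha": {"L", "M", "N", "P"}, "Beta": {"A", "B", "S", "T"}, "Gamma": {"D", "Q"}}
visitor_net = {"Alpha": {"L", "M", "N", "R"}, "Beta": {"A", "B", "S", "T", "U"}, "Gamma": {"D", "V"}}
second_level = second_level_common(company_net, visitor_net)
```

Here `first_level` contains Alpha, Beta and Gamma, and `second_level` maps Alpha to {L, M, N}, Beta to {A, B, S, T} and Gamma to {D}, matching the labels used in the text.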

Fig. 2 Co-occurrences and semantic associative networks

Data analysis
From the processed data, we calculated two alignment ratios.

1) Visitor brand alignment (first level)
The first ratio (VBA FL) takes into account the common brand themes related to ODM that are more relevant both for the museum and visitors. Thus, we selected co-occurrences identifying ECs where the relationship between the target lemma (ODM) and the common lemmas is statistically significant (chi-square values higher than 3.84). Then, we proceeded to calculate the ratio. More specifically, raw frequencies of common lemmas were tallied and then normalized per 100 ECs of the visitors' text file. The ratio thus indicates, for every 100 ECs of the visitors' text file, how often the common brand themes occur.

2) Visitor brand alignment (second level)
The second ratio (VBA SL) takes into account the common associative items resulting from each common brand theme (Alpha, Beta and Gamma). Each of them is used as a target lemma. For the analysis, we selected co-occurrences identifying ECs where the relationship between each target lemma and the common lemmas is statistically significant (chi-square values higher than 3.84). These are L and M for Alpha, A and B for Beta, D for Gamma. The ratio (VBA SL_Alpha) for the first common brand theme (Alpha) is calculated as follows: raw frequencies of common lemmas were tallied and normalized per 100 ECs of the visitors' text file. The same measurement is made for the other common brand themes (Beta, VBA SL_Beta and Gamma, VBA SL_Gamma). The sum of the results of the three ratios gives the value of VBA SL. The higher VBA SL, the higher the degree of alignment on the associative themes related to common brand themes. More specifically, the higher the ratio, the more museum and visitors, when they talk about common brand themes (Alpha, Beta and Gamma), engage with common items.
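The two ratios described above can be sketched as follows, reading "normalized per 100 ECs" as 100 times the tallied EC frequency divided by the total number of ECs in the visitors' file. The function names and all counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def ratio_per_100(ec_counts, total_ecs):
    """Tally the EC frequencies of the common lemmas and normalize per 100 ECs."""
    if total_ecs <= 0:
        raise ValueError("total_ecs must be positive")
    return 100.0 * sum(ec_counts.values()) / total_ecs


def vba_second_level(counts_by_theme, total_ecs):
    """Return the per-theme ratios (e.g. VBA SL_Alpha) and their sum (VBA SL)."""
    per_theme = {
        theme: ratio_per_100(counts, total_ecs)
        for theme, counts in counts_by_theme.items()
    }
    return per_theme, sum(per_theme.values())


TOTAL_ECS = 200  # hypothetical size of the visitors' text file, in ECs

# First level: ECs in which each common theme co-occurs with ODM.
first_level_counts = {"Alpha": 30, "Beta": 20, "Gamma": 10}
vba_fl = ratio_per_100(first_level_counts, TOTAL_ECS)

# Second level: ECs in which each significant common item co-occurs with its theme.
second_level_counts = {
    "Alpha": {"L": 12, "M": 8},
    "Beta": {"A": 6, "B": 4},
    "Gamma": {"D": 10},
}
per_theme, vba_sl = vba_second_level(second_level_counts, TOTAL_ECS)
```

With these made-up counts, VBA FL is 30.0, the per-theme second-level ratios are 10.0, 5.0 and 5.0, and VBA SL is their sum, 20.0.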

Results/analysis
We now present the main results of our study on the "Opera del Duomo Museum" (data from the co-occurrence analysis are shown in Table 1). On the museum side, the analysis shows that among the lemmas that co-occur with the target lemma ODM and are statistically significant, the main lemmas are "Cathedral", "Art", "Florence", "Visitors", "Work", "Museum" and "Pietà". On the visitors' side, the analysis shows that among the lemmas that co-occur with the target lemma ODM and are statistically significant, the main lemmas are "Art", "Doors" and "Baptistery". However, if we only concentrate on the most significant common co-occurrences, four themes emerge on which both the museum and visitors speak about ODM. These are "Art", "Exhibit", "Pietà" and "Work" (see Fig. 3). If we consider the frequency of the ECs for each common lemma resulting from visitors' narratives and the total of ECs composing these narratives (1284), we obtain VBA FL = 34.26. The theme on which there is the greatest alignment is "Art" (13.31), followed by "Work" (9.34), "Pietà" (5.84) and "Exhibit" (5.76). Table 2 provides some examples of ECs in which the target lemma is used together with the lemmas "Art", "Exhibit", "Pietà" and "Work".

We now perform a second level of analysis in which we process the company file and the visitors' file, identifying the co-occurrences for the target lemmas "Art", "Exhibit", "Work" and "Pietà". The goal is to understand what the museum and visitors associate with these four lemmas and measure the alignment for each of these. The measurement takes into account the most significant co-occurrences both on the museum side and on the visitors' side. With regard to the theme "Art", the analysis shows that both the museum and consumers talk about "Art" in combination with the items "Cathedral", "Museum", "Time", "Florence" and "Tour", with VBA SL_Art = 25.5.
If we break down this ratio, we discover that "Museum" and "Florence" are the most recurrent associations; the former is present in 171 ECs, the latter in 62 ECs. The share of VBA SL_Art that "Museum" generates is equal to 13.31, while "Florence" accounts for 4.82. For the theme "Work", both the museum and visitors talk about it together with the items "Florence", "Place", "Visit" and "Year", with VBA SL_Work = 8.33. "Florence" contributes more to VBA SL_Art than to VBA SL_Work. Indeed, its share of VBA SL_Work is 2.18. The lemma that mainly composes VBA SL_Work is "Visit", with a share of 3.73. Regarding the theme "Pietà", both the museum and its visitors associate it with "Work" and "Great Museum", with VBA SL_Pietà = 2.80. "Work" contributes to VBA SL_Pietà more than "Great Museum" (1.55 vs 1.24). For the theme of "Exhibit", visitors and the museum deal with it together with the themes of "Baptistery", "Duomo", "Florence", "Museum", "Sculpture" and "Visit". Among these, "Museum" weighs in most heavily to form the value 16.4 of VBA SL_Exhibit, with a share of 5.76. However, it contributes less than it does for the share of VBA SL_Art. Overall, the value of VBA SL is 34.25.
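As a worked check of the normalization, the reported EC counts for "Museum" (171) and "Florence" (62) under the theme "Art" can be divided by the 1284 visitor ECs, and the four first-level theme shares can be summed. The sketch assumes that each share is simply the EC count per 100 visitor ECs; the variable names are hypothetical.

```python
import math

TOTAL_VISITOR_ECS = 1284  # total ECs in the visitors' narratives, as reported

# Shares per 100 ECs for the two most recurrent associations of "Art".
museum_share = 100 * 171 / TOTAL_VISITOR_ECS    # about 13.318
florence_share = 100 * 62 / TOTAL_VISITOR_ECS   # about 4.829

# Cutting to two decimals reproduces the reported 13.31 and 4.82.
museum_truncated = math.floor(museum_share * 100) / 100
florence_truncated = math.floor(florence_share * 100) / 100

# Sum of the reported first-level theme shares (Art, Work, Pietà, Exhibit).
first_level_sum = round(13.31 + 9.34 + 5.84 + 5.76, 2)
```

Under this reading the reported shares match the counts once values are cut to two decimals rather than rounded. The four first-level shares sum to 34.25, which is 0.01 below the reported VBA FL of 34.26, a difference that is compatible with the total having been computed from unrounded shares.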

Table 2 Examples of ECs in which the target lemma is used together with the lemmas "Art", "Exhibit", "Pietà" and "Work"

ECs from the museum text file
"What the Opera_Duomo_Museum says about itself: as part of the Opera, the Opera_Duomo_Museum defines itself as a member of the unique ensemble of art, faith and history".
"Dialogue with contemporary art through works exhibited in the Opera_Duomo_Museum: we know the only show on Venturino Venturi and the work on display at the Opera_Duomo Museum".
"A series of Opera_Duomo_Museum branded products that draw inspirations from the masterpieces and details of the works of art that make up the monumental complex of Piazza_del Duomo".
"Here's what you need to know about his works exhibited in the Opera_Duomo_Museum".
"During the renovation of the rooms on the ground floor of the new Opera_Duomo_Museum, in December, 2012, a particular buried micro-architecture emerged: a sort of well, with a cylindrical section, with a dome-shaped covering, which is now exhibited in Room V: the Sculpture Gallery".
"In theme continuity with the celebration for the centenary of the artist's birth, when in the same Opera_Duomo_Museum, in the Sala della Pietà by Michelangelo, was hosted the marble Pietà by Venturi".
"The particuliar mission of the Opera_Duomo_Museum is to adequately present the works done for these buildings which together constitute what is today called the Great Cathedral Museum".
"Today, the Opera_Duomo_Museum welcomes a fundamental work by Verrocchio: the Beheading of St. John the Baptist for the silver altarpiece of the Baptistery of Florence to which the artist worked from 1477 to 1480".

ECs from the visitors' text file
"This is a small museum located right by the Duomo. It is about three floors, and all the art has a religious theme I am really into art, so I was very impressed by many of the pieces that were on exhibition".
"No long lines to get in perfect summary of Renaissance art. One of the most important museums about Renaissance art. Orginals of the Baptistery doors are also here".
"Nice museum. The Museum is less packed than other known museums in Florence and it is actually very good nicely organized with good exhibits".
"The museum is very well laid out and all the rooms are numbered which helps you to stay on track. The exhibits are phenomenal ranging from sculpures, paintings, Christian artifacts, clothing as well as a silver alter".
"Pietà was probably my favourite bit of sculpture in Florence. And to crown it all, it seems like the amorphous mass of tourists shares my previous assessment of this as a museum you could afford to miss out on".
"The museum also hosts Michelangelo's final piece of work (Pietà) considered by many his testament. The museum is airy, not crowded but definitely worth visiting".
"One of the best museum in Florence with amazing art work".
"The newly renovated museum of works of art from the Florence cathedral more exciting than ever. It contains many of the original works from the Duomo, like the famous Ghiberti_doors".

Findings
Our analytical approach allowed us to carry out a brand alignment analysis by applying two indicators, VBA FL and VBA SL. The greater the value of VBA FL, the more the themes that the museum transmits by communicating its identity are aligned with those narrated by the visitors. The themes the visitors touch upon reflect the image they have of the museum as a brand. VBA FL is thus a synthesis of the match between the brand identity and the brand image of the museum. The greater the value of VBA SL, the more the museum and visitors share the content of the themes they associate with the museum brand. In the case investigated, this value is given by the sum of VBA SL_Art, VBA SL_Work, VBA SL_Pietà and VBA SL_Exhibit. "Art" is the theme where the museum and visitors converge most in terms of content. Both talk of "Art" by associating it with "Cathedral", "Museum", "Time", "Florence" and "Tour". Among these, "Museum" and "Florence" contribute most to the value of VBA SL_Art. This means that the more they are used in the communication of the museum, the more it is possible to strengthen "Art" as an associative brand theme and its relationship with the museum. It follows that VBA FL and VBA SL are not two distinct ratios; they are tied together. The higher the VBA SL, the greater the possibility of strengthening the bond between the Opera del Duomo Museum and the common brand themes.

The methodological approach we propose makes it possible to investigate the perception of the museum brand and to compare it with the communicated identity, contributing to identifying branding actions that can be taken to increase brand alignment. It also enables the measurement of the degree of alignment between museums and visitors among common brand themes. Moreover, the museum can determine whether the content it attributes to common brand themes is similar to that attributed by visitors and can consequently reformulate its brand communication. The strength and immediacy of the link between the target lemma ODM and common brand themes depend on this alignment. The structured analytical process that we propose investigates brand alignment by exploiting information from UGC in online communities, without directly involving the consumer or using complex mathematical and statistical techniques. In this way, it provides managers with a more accessible way to analyse brand perception as it emerges from travel-related content created by tourists online. The application of this method requires close integration between language skills and managerial skills.
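To make the logic of the first-level indicator more concrete, the following Python sketch counts co-occurrences with a target lemma on both sides, keeps the lemmas common to the museum and the visitors, and expresses each common lemma as a percentage of the visitors' ECs. It is a simplified illustration under stated assumptions, not the procedure used in the study: it omits lemmatization and the statistical significance filter, the function names `cooccurrences` and `first_level_alignment` are hypothetical, and the toy ECs are invented.

```python
from collections import Counter

def cooccurrences(ecs, target):
    """Count, for each lemma, the ECs in which it appears together with `target`."""
    counts = Counter()
    for ec in ecs:
        lemmas = set(ec)
        if target in lemmas:
            counts.update(lemmas - {target})
    return counts

def first_level_alignment(museum_ecs, visitor_ecs, target):
    """Return per-theme shares (in % of visitor ECs) for lemmas common to
    both sides, together with their sum."""
    museum = cooccurrences(museum_ecs, target)
    visitors = cooccurrences(visitor_ecs, target)
    common = sorted(set(museum) & set(visitors))
    total = len(visitor_ecs)
    shares = {lemma: 100 * visitors[lemma] / total for lemma in common}
    return shares, sum(shares.values())

# Toy, invented ECs represented as lists of lemmas; not data from the study
museum_ecs = [["ODM", "art", "cathedral"], ["ODM", "work", "pietà"]]
visitor_ecs = [["ODM", "art", "doors"], ["ODM", "art", "work"],
               ["baptistery", "doors"], ["ODM", "baptistery"]]

shares, vba_fl = first_level_alignment(museum_ecs, visitor_ecs, "ODM")
print(shares)  # {'art': 50.0, 'work': 25.0}
print(vba_fl)  # 75.0
```

In this toy example only "art" and "work" co-occur with the target on both sides, so they are the only themes that contribute to the alignment value. A second-level analysis could in principle be sketched by calling the same function with each common theme as the target.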

Conclusions
The paper has developed a research approach that combines digital ethnography with text mining to explore consumers' perceptions of a brand and the degree of alignment between brand identity and image. In particular, the paper has investigated the alignment between the art museum's brand identity and the brand image emerging from tourists' narratives of their experience. The method is innovative in that it uses a mix of techniques that, by extrapolating associative brand themes, describe the brand image and the brand identity associations. Because it can be easily transferred to research domains other than museums, it contributes to brand analysis studies in the marketing literature (Broniarczyk and Gershoff, 2003; Nandan, 2005; Sjödin and Törn, 2006; Ross and Harradine, 2011; Crawford Camiciottoli et al., 2014). From a practical point of view, since it neither directly involves consumers nor requires the use of complex mathematical and statistical techniques (cf. John et al., 2006), it provides managers with a more accessible way to analyse brand associations as perceived by online consumers. Moreover, managers can benefit from our method in analysing brand associations with reference to competitors: they could investigate levels of matching with competitors through cross-brand comparisons between the associative themes used by consumers and competitors to characterize brand associations. Future research directions can be outlined. First, our research approach can be used for comparative analysis among competing museums. This kind of analysis is facilitated by using information that is easily accessible online at relatively low cost. It is possible to investigate whether museum brands as competitors are perceived similarly or differently, as well as the distinguishing elements and the positioning that each of them takes in the mind of the consumer.
Second, a key point is the replicability of the model in other contexts of analysis in which the content produced by consumers online is relevant and readily available, such as fashion or food. Finally, future research may consider the integration of textual data with other types of data emerging from online contexts and created by consumers, such as images and videos, whose collection is facilitated by the ever-increasing importance of visual-based social media, such as Instagram.
However, our approach is not free from limitations. It only considers common brand lemmas and does not explore those that are not in common (in either the first-level or the second-level analysis). Understanding where alignment is missing is important for setting certain branding strategies properly. Furthermore, our approach presents ratios that arise from online data. It might be interesting to integrate them with textual data from sentiment analysis and combine them with performance indicators such as the number of visitors and measurements of the quality of the services provided. Despite these limits, we propose a method that could be innovative and useful for museums. Their managers find themselves operating in increasingly competitive environments and making communication policies aimed at generating a diversification effect in the minds of visitors. The possibility of exploiting online digital data could be advantageous. It makes it possible to be reactive, monitoring the perception of visitors in online contexts and intervening promptly in cases of word-of-mouth that negatively affects the brand image. Considering this, an analytical attitude and market orientation are destined to be two inevitable components of a modern brand management approach in cultural enterprises.
Data availability Not applicable.
Code availability Not applicable.

Conflicts of interest/competing interests Not applicable.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.