UR:SMART – A tool for analyzing social media content

The digital transformation, with its ongoing trend towards electronic business, confronts companies with ever-growing amounts of data which have to be processed, stored and analyzed. Instant access to the “right” information at the time it is needed is crucial and thus, the use of techniques for the handling of large amounts of unstructured data, in particular, becomes a competitive advantage. In this context, one important field of application is digital marketing, because sophisticated data analysis allows companies to gain deeper insights into customer needs and behavior based on their reviews, complaints as well as posts in online forums or social networks. However, existing tools for the automated analysis of social content often focus on one general approach by either prioritizing the analysis of the posts’ semantics or the analysis of pure numbers (e.g., sum of likes or shares). Hence, this design science research project develops the software tool UR:SMART, which supports the analysis of social media data by combining different kinds of analysis methods. This allows deep insights into users’ needs and opinions and therefore prepares the ground for the further interpretation of the voice of the customer. The applicability of UR:SMART is demonstrated at a German financial institution. Furthermore, the usability is evaluated with the help of a SUMI (Software Usability Measurement Inventory) study, which shows the tool’s usefulness to support social media analyses from the users’ perspective.


Introduction and Motivation
The digital transformation, with the rise of new technologies such as "Cyber-Physical-Systems (CPS)", "Virtual Reality", "3-D Printing" and "Auto-ID-Techniques" (Hänisch 2017), and the ongoing trend towards electronic business confront companies with ever-growing amounts of data that have to be processed, stored and analyzed (CapGemini 2020; Fill and Johannsen 2016). Instant access to the "right" information at the time it is needed is crucial and thus the use of techniques for the handling of large amounts of unstructured data, in particular, becomes a competitive advantage (Bali et al. 2017; Fill and Johannsen 2016; Grover et al. 2018; Hwang 2019). This is particularly true for so-called knowledge-intensive business areas where the processing and analysis of big data become highly relevant.
In this context, one important field of application is digital marketing (Chaffey and Ellis-Chadwick 2019) because a sophisticated data analysis allows companies to gain deeper insights into customer needs and behavior (Kitchens et al. 2018; Schwaiger et al. 2017). For this purpose, it is especially necessary to critically scrutinize textual data to get to know customers' opinions and preferences in the most direct and genuine way possible. In a digitalized world, user-generated content, such as guest reviews, complaints, and posts in online forums and social networks, is particularly suitable for this task (Hwang 2019; Pinto and Mansfield 2012; Sigala 2012). Above all, data from social media provide a valuable source since more and more customers prefer to contact a company via social media to make service requests and complaints or to settle transactions (Baumöl et al. 2016; Hanna et al. 2011; Statista 2018a). Furthermore, social media technologies have become a key component of today's social life, with 3.6 billion people, including about 2.7 billion active Facebook users, using some sort of social media (Statista 2020a, 2020b). Those platforms are often used to express opinions honestly, and they cover a wide range of customers as well as non-customers. That way, social media data can be exploited straight away by IT-based data analytics, e.g., for the purpose of market analysis or campaign planning (Hwang 2019; Malthouse et al. 2013; Stieglitz et al. 2014; Trainor et al. 2014).
However, because of the particular characteristics of the underlying data basis as well as the intended analytical methods, this type of data analysis carries certain challenges (cf. Sivarajah et al. 2017; Stieglitz et al. 2014, 2018). With regard to the data basis, it requires the gathering, processing and analysis of extremely large amounts of highly complex textual data, which emerge and change with high velocity (Holland 2020; Idrees et al. 2019). By looking at textual data in particular, one crucial aspect in this context is the handling of textual errors (e.g., misspelling) or ambiguous data (e.g., irony, slang, etc.) and their correct interpretation concerning the content (Idrees et al. 2019; Laboreiro et al. 2010; Naaman et al. 2010; Petz et al. 2013; Stieglitz et al. 2014). Considering analytical methods, there is a noticeable trend towards more complex forms of analyses (Stieglitz et al. 2018). These include, among others, making forecasts, revealing cause and effect relationships, and providing guidance on how to act in specific situations, which requires sophisticated analytical approaches that allow the combination and integration of various data sources, formats and analysis techniques (Hübschle 2017; Hwang 2019; Sivarajah et al. 2017).
In the domain of digital marketing and social media, many analysis and monitoring tools for collecting and processing user data directly from platforms like Facebook or Twitter have emerged in recent years (Batrinca and Treleaven 2015; Guesalaga and Kapelianis 2016; Kohli et al. 2018). Tools such as Brandwatch, Falcon.io, Facelift, Buffer, etc. "offer access to real customers' opinions, complaints and questions, at real time, in a highly scalable way" (Stavrakantonakis et al. 2012, p. 53) and thus have helped to reduce the manual analysis efforts. However, existing tools cannot completely meet the aforementioned challenges as they, for example, only focus on one analytical approach by either using pure quantitative data for statistical purposes (e.g., the tool "Buffer" analyzes the number of fans, likes and shares, amongst others), or analyzing posts based on textual data (e.g., sentiment analysis or text classification). At this point, it is important to mention that popular tools such as "Brandwatch" and "Falcon.io" offer a wide range of analytical capabilities as they provide functionalities such as sentiment analysis, classification and quantitative data analysis. However, tools on the market usually do not support the combined application of these analyses at the same time. Rather, users may perform singular types of analysis (e.g., analyses of sentiments and quantitative data) in a sequential manner. Therefore, the capabilities for conducting more complex analyses that investigate the data from different perspectives at once, based on complementary approaches (e.g., analysis of the number of shares for particular negative sentiments only), are restricted (cf. Schwaiger et al. 2017; Stieglitz et al. 2018).
Therefore, the research at hand provides an answer for the following research question: What should a concept and a tool to support a mixed method approach for social media analysis to facilitate decision-making look like?
For that reason, we developed the social media analysis tool UR:SMART as a prototypical implementation, which combines different analysis techniques (e.g., qualitative and quantitative techniques) as well as various data formats (e.g., structured, unstructured) and, hence, facilitates a more thorough investigation of a given data basis, including social media posts or comments on a company's fan page or website. Additionally, it is capable of addressing a wider range of even more complex issues by allowing the combination of various analysis forms and new types of analyses, such as inquisitive or pre-emptive analytics (cf. Sivarajah et al. 2017; Stieglitz et al. 2014, 2018). In this way, a company using UR:SMART may directly learn, for instance, about the reasons behind a positive or negative customer experience (e.g., customer service, product quality, etc.). Such types of information constitute a substantial knowledge gain and can be utilized in many reasonable ways, for example as a reliable base for decision-making concerning future marketing campaigns.
However, in addition to these obvious benefits for practice, our research also aims to contribute to scientific theories. The use and evaluation of our analysis tool in real-life case studies contribute to knowledge of the underlying social media theory as a result of an improved understanding of the problem (analysis of various kinds of data formats) and solution spaces (scenario-based combination of data analysis techniques). Furthermore, our analysis tool opens new access to the "voice of the customer" (Pande et al. 2014) and therefore provides starting points for the improvement of corresponding methods, e.g., in quality management.
The paper is structured as follows: Sect. 2 covers conceptual basics and related work. In Sect. 3, the research procedure is outlined. Afterwards, the requirements as well as the design of the tool UR:SMART are presented (Sects. 4 and 5). In addition, a case study (Sect. 6) and the results of a usability study by means of the SUMI approach (Kirakowski and Corbett 1993) are described (Sect. 7). The results and implications for the field of research are discussed in Sect. 8. The paper ends with a conclusion and an outlook on future research.

Social Media Analysis
Social media analysis is a vibrant research area and multiple benefits for digital marketing are discussed in literature (e.g., discovery of brand fans, most important social mentions, etc.) (Perakakis et al. 2019; Alalwan et al. 2017). Accordingly, a huge variety of social media analysis approaches have been developed in recent years (e.g., Vashishtha and Susan 2019; Stieglitz et al. 2018; Yue et al. 2019). To narrow the scope, we focus on sentiment analysis, classification and clustering in particular, because their effectiveness has been proven in practice (e.g., Alalwan et al. 2017) and they can be purposefully combined to come to a mixed method approach (e.g., Stieglitz et al. 2018) that enables social media data analysis from different angles, which is the research objective of this study.

Sentiment analysis
Sentiment analysis deals with the analysis of "people's opinions, sentiments, evaluations, appraisals, attitudes, and emotions" (Liu 2012, p. 415) and has been recognized as a highly dynamic research field (Yue et al. 2019; Capatina et al. 2020). Consequently, automated sentiment analysis is a discipline linked to Natural Language Processing, Text Mining, Web Mining and Information Retrieval, for instance (Al-Ghamdi 2021; Liu 2012; Yue et al. 2019). Yue et al. (2019) differentiate research streams on sentiment analysis into a (I) "task-oriented", (II) "granularity-oriented" and (III) "methodology-oriented" perspective. Whereas "task-oriented" efforts focus on the tasks to be conducted for determining the sentiment of social media content (e.g., feature identification, subjectivity detection, etc.), the "granularity-oriented" approaches differ in whether the analysis is performed on a document, a sentence or a single word (Yue et al. 2019; Capatina et al. 2020). Finally, "(semi-)supervised" and "unsupervised" methods are summarized as "methodology-oriented" approaches in this respect (Yue et al. 2019).
Typical works for (I) "task-oriented" research focus on "polarity classification", i.e., analysis of whether a comment is positive, neutral or negative (Yue et al. 2019). In this context, Khan et al. (2016) introduce the so-called "Enhanced Sentiment Analysis and Polarity Classification (eSAP)" framework, for instance, which extracts the sentiment scores of user comments with the help of part-of-speech information. Further approaches to realizing "polarity classification" can be found in Tellez et al. (2017), Raghuwanshi and Pawar (2017) and Nguyen and Shirai (2018), among others. Additionally, "feature" or "aspect-based sentiment" methods allow for a more fine-granular analysis (Yue et al. 2019), whereby those parts of an opinion that refer to listed features (attributes) of a product or service (entity) are investigated (Wojcik and Tuchowski 2014; Zeng et al. 2019; Ojha et al. 2021; Chauhan et al. 2019). Further, the assessment of objectivity and subjectivity in user statements or the usage of scales to judge a positive or negative sentiment more precisely are mentioned in terms of "task-oriented" approaches (Yue et al. 2019).
(II) "Granularity-oriented" approaches with reference to sentiment analysis can be categorized into the three classes of "document-", "sentence-" and "word-based" methods (Yue et al. 2019; Cao et al. 2016): First, document-based approaches aim towards the classification of the sentiment of a whole text corpus, for example newspaper articles (e.g., Pratiwi 2018). The second category comprises sentence-based approaches, which analyze whether a single sentence can be classified as having a positive, negative or neutral sentiment. Sentence-based approaches play a dominant role in the social media discipline because of the shortness of social media posts (Zhao and Rosson 2009).
Five different procedures for sentence-based approaches are discussed in literature, namely dictionaries, corpus-based approaches, syntactic patterns, artificial neural networks and treebanks (Medhat et al. 2014; Yue et al. 2019). When using dictionaries, which annotate opinion-carrying words, the sentiment of each entity (e.g., each word) of a text is classified into a positive or negative class. The sentiment of the whole text is then determined by the sum of the scores of all its entities (Kundi et al. 2014; Turney 2002; Khoo and Johnkhan 2018). Corpus-based approaches determine the sentiment based on a domain-specific "text corpus" (cf. Sinclair 2004) regarding the context of the sentence, which can be recognized by particular adverbs (Liu 2012; Rice and Zorn 2021). Depending on the application field of the social media analysis (e.g., at a financial service company), a domain-specific dictionary, which relates "text corpora" (e.g., "I appreciate the willingness", etc.) to sentiments (e.g., positive, negative statement, etc.), has to be established for that purpose. Examples for the definition of "text corpora" for Twitter posts are shown by Mainka (2012), amongst others.
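The dictionary-based procedure described above can be illustrated with a minimal Python sketch. The lexicon entries, their scores and the function names are hypothetical and serve illustration only; they are not taken from UR:SMART or from any of the cited dictionaries.

```python
import re

# Hypothetical mini-lexicon (assumption): real sentiment dictionaries
# annotate thousands of opinion-carrying words, often with graded scores.
LEXICON = {"good": 1, "great": 1, "friendly": 1, "bad": -1, "slow": -1, "rude": -1}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-zäöüß]+", text.lower())


def sentiment_score(text, lexicon=LEXICON):
    """Sum the lexicon scores of all tokens; unknown words contribute 0."""
    return sum(lexicon.get(token, 0) for token in tokenize(text))


def polarity(text, lexicon=LEXICON):
    """Map the summed score to a positive, negative or neutral label."""
    score = sentiment_score(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

For example, `polarity("Slow and rude support")` sums two negative entries and yields a negative label. Such a plain sum ignores negation and irony, which is one reason why the other procedures listed above exist.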
Treebanks disassemble the sentence into a hierarchical grammatical structure (Sadegh et al. 2012; Turney 2002; Cignarella et al. 2020; Baly et al. 2017). For this purpose, texts are decomposed according to their grammatical structure as exemplarily demonstrated by Augustinus et al. (2017). Then the sentiment of each word is analyzed, whereby recursive negations can be identified straightaway (e.g., "it is not the quality which comes off badly").
Artificial neural networks consist of units (neurons) operating in parallel to classify the sentiment of a sentence (e.g., Onaciu and Marginean 2018; Paliwal et al. 2018). In this respect, a semantic network has to be established (cf. Dengel 2012), whereby the terms represent the nodes that are related by weighted arcs (Sebastiani 2002; Socher et al. 2013). The words that need to be classified (e.g., positive, negative, etc.) traverse the network and the overall sentiment of a sentence is reached by aggregating the sentiments of the words. The network can be trained by adjusting the weights of the arcs (Sebastiani 2002).
(III) "Methodology-oriented" research on sentiment analysis focuses on the creation of supervised, semi-supervised or unsupervised approaches to determine user opinions (cf. Yue et al. 2019). In supervised learning, the data used for training the algorithm is labeled, which means that the dataset is marked with the results to be provided by the algorithm (Salian 2018). For instance, the social media posts used for training purposes have been marked as having a "positive", "negative" or "neutral" sentiment beforehand (e.g., Vilares et al. 2017; Madhoushi et al. 2015). In unsupervised learning, on the other hand, the algorithm searches for structure in the dataset, while the desired outcome has not been defined (Salian 2018). Corresponding approaches have been introduced by Hu et al. (2013) and Cheng et al. (2017), among others. The combined use of labeled and unlabeled data is called semi-supervised learning and is frequently used in cases where the full labeling of training examples is too time-consuming (Salian 2018).

Classification
Classification (regarding topics) describes a widely known supervised analysis technique, which provides the automated mapping of data and uses labeled training data to determine the affiliation towards previously defined categories (Feldman and Sanger 2007; Heyer et al. 2006; Rodrigues and Chiplunkar 2019). Typical approaches in the research field of classification are k-nearest-neighbor (Cover and Hart 1967; Daeli and Adiwijaya 2020), naïve Bayes (NB) or more specifically multinomial naïve Bayes (MNB) (McCallum and Nigam 1998; Tuarob et al. 2014; Abbas et al. 2019; Farisi et al. 2019), the use of seed words (e.g., Yu et al. 2013) and support vector machines (SVM) (Gunn 1998; Samuel 2021; Shahi and Pant 2018). Focusing on the analysis of large data sets with specialized event models such as social media posts, SVM and NB/MNB deliver convincing results (Jin et al. 2013; Kibriya et al. 2004; McCallum and Nigam 1998; Tuarob et al. 2014; Abbas et al. 2019; Farisi et al. 2019). The MNB classifier implements the NB algorithm for multinomially distributed data. This NB variant is often used within the classification of textual data. Especially when handling a large amount of data, according to McCallum and Nigam (1998), MNB achieves better results than NB (Tuarob et al. 2014).
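To make the MNB idea more tangible, the following pure-Python sketch trains a multinomial naïve Bayes classifier on tokenized posts. It is a minimal illustration under stated assumptions: the class name `TinyMultinomialNB`, the smoothing parameter `alpha` (Laplace smoothing) and the training posts are hypothetical and do not represent the implementation used in UR:SMART.

```python
import math
from collections import Counter, defaultdict


class TinyMultinomialNB:
    """Illustrative multinomial naive Bayes for tokenized texts (sketch only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # additive (Laplace) smoothing, an assumption

    def fit(self, documents, labels):
        """Count documents per class and word occurrences per class."""
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(documents, labels):
            self.word_counts[label].update(tokens)
        self.vocabulary = {w for c in self.word_counts.values() for w in c}
        return self

    def log_posterior(self, tokens, label):
        """Unnormalized log P(label | tokens) = log prior + sum of log likelihoods."""
        total_docs = sum(self.class_counts.values())
        log_prob = math.log(self.class_counts[label] / total_docs)
        total_words = sum(self.word_counts[label].values())
        denominator = total_words + self.alpha * len(self.vocabulary)
        for token in tokens:
            if token in self.vocabulary:  # words never seen in training are skipped
                count = self.word_counts[label][token]
                log_prob += math.log((count + self.alpha) / denominator)
        return log_prob

    def predict(self, tokens):
        """Return the class with the highest log posterior."""
        return max(self.class_counts, key=lambda lbl: self.log_posterior(tokens, lbl))


# Hypothetical labeled training posts (already tokenized).
TRAIN_DOCS = [
    ["price", "too", "high"],
    ["fees", "price", "expensive"],
    ["scratch", "damage", "quality"],
    ["condition", "quality", "top"],
]
TRAIN_LABELS = ["price", "price", "product quality", "product quality"]

model = TinyMultinomialNB().fit(TRAIN_DOCS, TRAIN_LABELS)
```

With this toy data, `model.predict(["quality", "condition"])` selects "product quality", because both words occur only in posts of that category.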

Clustering
In contrast to classification, clustering describes unsupervised analysis approaches, which focus on the assembly process of data to achieve automatically defined homogeneous groups by identifying statistical structures and patterns (Dayan 1999; Ahuja and Dubey 2017). Clustering approaches like k-means (MacQueen 1967; Orkphol and Yang 2019), expectation maximization (Dempster et al. 1977; Shelke et al. 2017) and agglomerative hierarchical clustering (Tan et al. 2005; Praveen et al. 2020) renounce a reduction of dimensionality and try to group matching elements of the dataset based on their structure (Feldman and Sanger 2007; Heyer et al. 2006; AL-Sharuee et al. 2018). The resulting clusters are derived directly from the structure of the data themselves (Feldman and Sanger 2007; Heyer et al. 2006). A common distinction between different types of clustering is whether the set of clusters is nested or not. The former case is called hierarchical clustering, the latter partitioning clustering (cf. Tan et al. 2005; Kumar and Kumar 2019). Hierarchical clustering thus means that clusters can also have subclusters (Tan et al. 2005; Kumar and Kumar 2019).
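As an illustration of a partitioning approach, the sketch below implements the basic k-means loop in plain Python. It is a simplified example under assumptions: the two-dimensional feature vectors are hypothetical stand-ins for numeric representations of posts (e.g., term frequencies), and the centroids are initialized with the first k points to keep the result reproducible, whereas k-means is usually initialized randomly.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equally long vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, max_iter=100):
    """Alternate assignment and centroid update until assignments are stable."""
    # Simplifying assumption: the first k points serve as initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed its cluster
        assignment = new_assignment
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = centroid(members)
    return assignment, centroids


# Hypothetical 2-D feature vectors forming two visibly separate groups.
POINTS = [(0.0, 0.0), (5.0, 5.0), (0.1, 0.2), (5.2, 4.9), (0.2, 0.1), (4.8, 5.1)]
assignment, centroids = k_means(POINTS, k=2)
```

No categories are supplied in advance here; the two groups emerge solely from the distances between the vectors, which mirrors the contrast to classification drawn above.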
To demonstrate the principal difference between classification and clustering, we refer to the example of a customer post on the Facebook page of a cooperation partner of ours, namely a European manufacturer of caravans. The post is: "We have bought the product second-hand but it was in a top condition with no scratches or notable damages despite its long usage." In terms of classification, the categories would be predefined (e.g., "product quality", "order", "price" and "complaint"). The post is then analyzed and assigned to one of these categories. A simple approach may use seed words (cf. Yu et al. 2013) for that purpose, which specify each category more precisely, e.g., the seed words "condition", "damage" and "scratch" may characterize the category "product quality". In our example, the post contains these seed words for "product quality" and would most probably be assigned to this category. Given the heterogeneous character of textual data, multiple category allocations, where a post contains seed words of various categories, need to be considered as well. At this point, transformation methods or algorithm adaptation help to resolve such multi-label classification challenges (Tsoumakas and Katakis 2007). Clustering, in contrast, searches for new ways to group data in an ad-hoc manner. Hence, there are no pre-defined clusters or categories, respectively. Rather, the clusters are derived from an analysis of posts with the help of the above-mentioned approaches, and clusters like "caravan", "tent", "camper" or "service" may come up for the caravan manufacturing company. Clustering approaches thus help to identify new topics customers are talking about in the social media channels (e.g., in case there have been changes in the product portfolio).
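The seed-word classification of the caravan post can be sketched in a few lines of Python. Only the seed words for "product quality" stem from the example above; the seed words of the other categories, the prefix matching used as a crude substitute for stemming, and the function name are assumptions made for illustration.

```python
import re

SEED_WORDS = {
    "product quality": {"condition", "damage", "scratch"},  # from the example
    # Seed words below are hypothetical placeholders.
    "order": {"order", "delivery", "shipping"},
    "price": {"price", "expensive", "cheap"},
    "complaint": {"complaint", "disappointed", "refund"},
}


def matching_categories(post, seed_words=SEED_WORDS):
    """Return every category whose seed words occur in the post, with hit counts.

    A token matches a seed word if it starts with it, so that "scratches"
    matches "scratch". This is a crude stand-in for proper stemming.
    """
    tokens = re.findall(r"[a-z]+", post.lower())
    hits = {}
    for category, seeds in seed_words.items():
        count = sum(1 for t in tokens if any(t.startswith(s) for s in seeds))
        if count:
            hits[category] = count
    return hits


POST = ("We have bought the product second-hand but it was in a top condition "
        "with no scratches or notable damages despite its long usage.")
result = matching_categories(POST)
```

For the example post, only "product quality" is returned, with three seed-word hits. Because the function returns all matching categories instead of a single one, a post that touches several topics is reported under each of them, which corresponds to the multi-label situation mentioned above.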
Additionally, the use of artificial intelligence (AI) for social media monitoring has come up as a further research stream in recent years (cf. Perakakis et al. 2019; Capatina et al. 2020; Al-Ghamdi 2021; Micu et al. 2018; Davenport 2018). For instance, Micu et al. (2018) analyze correlations between users' experience in social media marketing and their knowledge about machine learning algorithms as well as the frequency of applying these procedures in day-to-day business. The authors' major aim is to derive future AI functionalities for social media monitoring software in the areas of audience, image and sentiment analysis (cf. Micu et al. 2018; Capatina et al. 2020). Perakakis et al. (2019) emphasize the potential of AI for social media monitoring and introduce a software tool called "Social Intelligence Advisor" to support digital marketers. The authors also emphasize, however, that much research still has to be done regarding AI usage for social media monitoring (cf. Perakakis et al. 2019). Similarly, Geru et al. (2018) perform an analysis of posts and photos on Instagram to gain insights for marketing research with the help of a machine learning algorithm. In addition, Abd El-Jawad et al. (2018) compare machine learning and deep learning algorithms for social media analysis and develop a combined approach building on text mining and neural networks. To sum up, the literature recognizes the general effectiveness of AI techniques for social media analysis, whereby further efforts are required to create corresponding solutions and quantify the impact for social media marketing more precisely (cf. Al-Ghamdi 2021).

Combining Social Media Analysis Approaches
The literature recognizes the potential of combining different kinds of analysis methods for social media data (Stieglitz et al. 2018). Generally, the combination of analysis methods is widely used in research due to the fact that mixing methods results in a more accurate and complete depiction of the phenomenon under investigation (Johnson 1995; Johnson and Christensen 2000; Patton 1990; Tashakkori and Teddlie 1998). Correspondingly, in different research fields, combinations of different analysis methods were successfully applied, e.g., for IS research methodologies (cf. Gable 1994), process-oriented quality management (cf. Stracke 2006) or Method Engineering (cf. Brinkkemper 1996; Ralyté et al. 2003; Tolvanen et al. 1996). The use of complementary methods is generally thought, as confirmed by all the above-mentioned examples, to lead the investigation to more valid results either by extending the investigation's perspectives or by uncovering new or deeper dimensions. This rests on the premise that the weaknesses in each single method will be compensated by the counter-balancing strength of another (Jick 1979). Since combining methods is time-consuming and compensation relations between methods are not always obvious, a multi-method approach is not without some shortcomings and may not be suitable for all research purposes (Jick 1979). The selection of the appropriate methods for combination needs to be carefully justified and made explicit in terms of the definite research aim.

In the field of social media analysis, Chen et al. (2014), for instance, investigate the major concerns and worries of students in their academic and private lives by looking at Twitter feeds. For this purpose, they use content analysis (and the naïve Bayes algorithm) to classify the tweets and analyze the categories with the help of statistical frequency measures (total/relative number of tweets for each category, etc.) (cf. Chen et al. 2014). Similarly, Tinati et al. (2014) propose to analyze Twitter data via content analysis and quantitative measures (e.g., number of followers, interaction with others, retweets, etc.) to provide a richer and more contextualized view of the data. Borgmann et al. (2016) analyze the Twitter posts at a medical conference via a combination of descriptive statistics and content analysis to uncover those topics which the participants discussed most often. Further, Brown et al. (2019) show how the combination of various text analysis approaches on Instagram posts helps to recognize young people experiencing a serious psychological crisis. Additionally, the research of Tseng et al. (2019) integrates various kinds of social media data for building a hierarchical structure with sustainable supply chain capabilities in the context of the textile industry. Lastly, an example from the food industry is provided by He et al. (2013), who introduce a text mining process for social media content using different kinds of analyses, to enable competitive comparisons for the pizza industry. Even though research on the combination of different kinds of analysis methods has a long tradition, existing research primarily focuses on domain-specific problems that are solved with a predefined set of analysis methods.
A general approach that proposes a multi-faceted combination of analysis methods, to be applied independent of a certain domain, is missing in the social media analysis literature to the best of our knowledge. With this research, we aim to close this gap.
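To illustrate what a combination of qualitative and quantitative analyses may look like in its simplest form, the following Python sketch joins category and sentiment labels with share counts. It mirrors the examples named in this paper (number of shares for negative sentiments only, relative number of posts per category), but the data, field names and functions are hypothetical and do not describe how UR:SMART is implemented.

```python
from collections import Counter, defaultdict

# Hypothetical posts: in practice, "category" and "sentiment" would result from
# classification and sentiment analysis, "shares" from the platform's metrics.
POSTS = [
    {"category": "service", "sentiment": "negative", "shares": 12},
    {"category": "service", "sentiment": "positive", "shares": 3},
    {"category": "price", "sentiment": "negative", "shares": 7},
    {"category": "service", "sentiment": "negative", "shares": 5},
    {"category": "price", "sentiment": "neutral", "shares": 1},
]


def shares_by_category(posts, sentiment):
    """Sum the shares per category, restricted to posts with the given sentiment."""
    totals = defaultdict(int)
    for post in posts:
        if post["sentiment"] == sentiment:
            totals[post["category"]] += post["shares"]
    return dict(totals)


def relative_frequencies(posts):
    """Relative number of posts per category."""
    counts = Counter(post["category"] for post in posts)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


negative_shares = shares_by_category(POSTS, "negative")
category_shares = relative_frequencies(POSTS)
```

In this toy data set, negative posts about "service" were shared 17 times and negative posts about "price" 7 times, a finding that neither the sentiment labels nor the share counts would reveal on their own.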

Available Social Media Analysis Tools
As already mentioned in Sect. 1, there is a vast number of available analytics tools in the context of social media on the market. We took an up-close look at the most popular tools (e.g., Falcon.io, Facelift, Buffer, Keyhole, Brandwatch) and analyzed their functionality in terms of textual data processing (see Table 1). For this, we installed demo versions of the tools as well as interviewed sales representatives from these companies on the specific features.
Considering Sect. 2.1, sentiment analysis, classification and clustering have been introduced as established social media analysis approaches that are promising for the development of a mixed method approach. Furthermore, the literature in Sect. 2.2 highlights the benefits of using quantitative measures (e.g., number of followers, retweets, etc.) to complement findings from the content or text analysis, for instance. Accordingly, we compare existing social media tools with the help of the above-mentioned features, i.e., support of (1) "multi-language sentiment analysis", (2) "classification", (3) "clustering" and (4) "quantitative analysis", and further add the criterion of whether a (5) "mixed method approach" is supported or not, which was derived from our research question (see Table 1).
None of the tools provides a mixed method approach, considering both a quantitative (e.g., number of likes) and qualitative (e.g., text classification) analysis as focused in this research. Additionally, multi-language sentiment analysis is not supported by some tools. This is especially important for this study since the analysis of German social media posts needs to be considered for our context. Further, it should be mentioned that the functionality of some tools extends the focus of the analysis at large. Sales representatives from "Falcon.io" stated that the tool's focus is the publishing functionality for social media posts. Other sales representatives, for instance from "Facelift", stated that their tool is considered a social media management tool and therefore offers further domain-specific analysis functionalities. To sum up, it can be stated that the available tools provide a vast number of highly beneficial functionalities for digital marketing, which, however, are not focused on the type of social media analysis that is pursued in our research (e.g., no mixed method approach). Hence, the comparison of the tools unveils the need for a dedicated mixed method social media analysis tool in case deeper insights of the data analysis are required by a company.

Design Science Research
The design science (DS) paradigm has its roots in engineering and the science of the artificial (Simon 1996) and is fundamentally a problem-solving paradigm (Hevner and Chatterjee 2010). Research projects that follow the DS paradigm are concerned with the design, development, implementation, use, and evaluation of socio-technical systems in organizational contexts. Design scientists produce and apply knowledge of tasks or situations to create effective artifacts (March and Smith 1995). These artifacts are delineated in different structured forms, ranging from software, formal logic, and rigorous mathematics to informal natural language descriptions (Hevner et al. 2004).
An important step in DS research is to prove the utility, quality, and efficacy of the artifact via well-executed evaluation methods. Since the artifact's performance is related to the environment in which it is used, an incomplete understanding of the environment can induce inappropriately designed artifacts (March and Smith 1995). Therefore, Hevner's "design cycle" (Hevner 2007) substantiates the importance of constructing and evaluating the artifact, and suggests balancing the efforts spent for both activities, which must additionally be convincingly based in relevance and rigor (Hevner 2007).
However, DS research aims not only to increase the technical knowledge base through the designed artifact but also contributes to the scientific knowledge base by performing the research process rigorously (e.g., by reflecting the construction and/or evaluation of the artifact). Especially in recent sources, the focus on the contribution to the scientific knowledge base (or scientific theories) is strongly discussed (Hevner 2007; Gregor and Hevner 2013; Baskerville et al. 2018) and possible contributions are vaguely outlined. It is noted as the key differentiator between the professional practice of building IT artifacts and DS research (Hevner 2007; Gregor and Hevner 2013) and represents Hevner's "rigor cycle" (Hevner 2007).
Last but not least, the use and evaluation of the artifact in the application domain lead to several practical contributions, which constitute Hevner's "relevance cycle" (Hevner 2007) and which can be seen as self-evident objectives of a DS research project.

Research Procedure
Since we strive for the development of a social media analysis solution, we followed the DS research paradigm (Gregor and Hevner 2013; Hevner et al. 2004) and aligned our research activities with the procedure as proposed by Peffers et al. (2007). This procedure provides a commonly accepted framework for conducting research based on DS principles since it was developed with a consensus-building approach (cf. Peffers et al. 2007). The procedure is the result of a synthesis containing well-agreed-upon process elements in DS (Peffers et al. 2007). Peffers et al. (2007) propose a procedure consisting of six activities in a nominal sequence (see Fig. 1). In the first activity, the specific research problem and the value of the solution are defined. As mentioned in the introduction, we argue that efficient tools and methods need to be provided to tackle the rising amount of digital data so that deeper insights into customers' needs can be gained (step 1 - "problem", see Sect. 1). The second activity aims to define the solution objectives. To establish design objectives and to derive the requirements that have to be fulfilled to achieve them for our solution UR:SMART (step 2 - "objectives of a solution", see Sect. 4), we conducted several interviews with practice partners to identify their expectations and to gain insights regarding current limitations of social media analysis efforts. Based on that, we identified the core functionalities to be implemented. Then, in the third activity, the artifact (UR:SMART) was designed based on the requirements as identified beforehand (e.g., classification of data, sentiment analysis) (step 3 - "design & development", see Sect. 5). To demonstrate the applicability as well as the usefulness of UR:SMART and to prove that the solution works, as specified in activity four, we applied our solution at a financial institution (step 4 - "demonstration", see Sect. 6).
UR:SMART is still used by this financial institution and, in the meantime, also by further cooperation partners. Activity five aims to observe and measure how well our artifact supports a solution to the identified problem. As part of a broad evaluation (step 5 - "evaluation", see Sect. 7), we present the results of a SUMI usability study. Further evaluations with firms from various branches are currently running. Once these are completed and the software has been optimized based on the feedback received, the tool will be made publicly available (step 6 - "communication"). This corresponds to the sixth and last activity, which aims to communicate the findings of our research.
The orientation towards the procedure by Peffers et al. (2007) also makes it possible to align our research with the guidelines of Hevner et al. (2004) and Hevner (2007), respectively. According to the design cycle, we present our artifact as the result that has gone through the process of demonstration (application of UR:SMART at a financial institution in Sect. 6) and evaluation (SUMI usability study in Sect. 7). In view of the relevance cycle, we identified several requirements (from practice) that guided the design of the artifact (see Sect. 4), and the practical application of our artifact brought up several contributions for practice (see Sect. 8.1). In view of the rigor cycle, we used several methods and techniques to rigorously construct and evaluate our artifact (e.g., deductive content analysis, conceptual modeling, SUMI usability study) and derived initial findings as contributions to theory (see Sect. 8.2).

Definition of Design Requirements
To establish design requirements for UR:SMART, we consulted our network of practice partners from various branches such as technology, consumer products and mechanical engineering, which operate in both the B2C and B2B sector. In total, eleven companies agreed to take part in our study. All these companies openly stated their commitment to social media analysis as part of their digital marketing strategy or planned to profoundly invest in corresponding data analysis efforts in the near future. Therefore, they were considered suitable candidates for our investigation.
For the derivation of requirements, we used (1) semi-structured interviews and (2) literature about social media analysis tools (e.g., Aggarwal and Zhai 2012;Batrinca and Treleaven 2015;Guesalaga and Kapelianis 2016;Fan and Gordon 2014;Kohli et al. 2018;Maynard et al. 2012;Stieglitz et al. 2018). Semi-structured interviews put us in a position to adapt the questions to a company's specific context and ask for more in-depth information on demand, which helped to uncover specific challenges of social media analyses in practice as well as desired key functionalities of a possible solution (Corbin and Strauss 2015;Meuser and Nagel 2009). Our interview partners were mainly social media managers with profound knowledge in this field. Moreover, in some cases we were able to interview C-level executives.
First, we asked whether the companies were using automated tools to analyze their social media data. Second, the specific aim and field of use of each type of social media analysis performed was studied (e.g., on products, brand building, reputation or recruiting). Another object of investigation was the analysis method, describing the way in which social media content was being analyzed (either manually or in an automated way). Furthermore, limitations of the tools and desired enhancements of their analysis functionality were considered.
For the systematic analysis of the data, we transcribed the interview results and drew upon a twofold procedure for the visualization of the results as shown in Table 2. Following Mayring (2000), we used a deductive content analysis for the questions regarding the use of a social media tool and the analysis method used. These questions were asked in a binary manner (e.g., yes/no, automated/manual). In contrast, we used an inductive content analysis for the questions regarding the focus and desires of the companies in order to adequately consider their individual character and goals. However, the derived results should not be seen as exhaustive since they only reflect the situations of the interview partners.
By doing so, we were able to discover several problems as well as limitations faced by these companies concerning social media analysis. Overall, it became evident that the efforts required for a manual data analysis are still a big challenge throughout all sectors independent of the company size. Most companies rely on manual efforts for data analyses, as only 36% of the companies interviewed use a commercial social media analysis tool, whereas the rest purely rely on the integrated analytics functionalities of the individual social media platforms (e.g., Facebook Analytics), which are generic and very limited. Regarding the field of use of the social media activities, most companies put an emphasis on brand and reputation building as well as improving their products based on the "voice of the customer" (Pande et al. 2014). In this regard, the desire to gain more detailed insights into the customers' opinions, which extend the findings retrieved from a manual analysis, became evident during the interviews. Several questions such as, "How can product criticism be automatically detected for specific product improvements?" and "How can a shitstorm be detected, so that it does not result in a loss of reputation?" arose, which can only be purposefully answered with the help of an automated analysis approach.
Based on the interviews and findings from literature, we came up with the core design requirements (DR) as shown in Table 3, which guided the upcoming design phase.

Table 3
No. | Design requirement | Description

Data import/extraction
1 | Automatized data extraction and preprocessing | A tool must enable the automatized extraction of data from the channels Twitter and Facebook. Moreover, the preprocessing of the data ("tokenization", "stop word elimination", "stemming" and "normalization") is to be made possible (cf. Aggarwal and Zhai 2012;Batrinca and Treleaven 2015;Fan and Gordon 2014;Kohli et al. 2018)
2 | Import data from open accessible internet forums | Further, the opportunity to import data from open accessible internet forums via a CSV-interface should be supported by the tool. That way, additional insights about customer expectations, e.g., captured in comments in fan forums, were strived for (e.g., Maynard et al. 2012)

Language and user adaptation
3 | Support of English and German languages | As the interview partners operate on an international level, the tool needs to enable the analysis of English and German language posts (e.g., Coşkun and Ozturan 2018;Maynard et al. 2012)
4 | User administration & generation of individual reports | Administrators of the tool are to be put into the position to configure it for particular user groups (e.g., grant access to certain social media channels or analyses). In this context, the easy modification and adaptation of automatically generated reports are also desired (e.g., Stieglitz et al. 2014)
5 | Deactivation of the spell-checker on demand | The spell-checking functionality should be able to be deactivated upon request (e.g., Fan and Gordon 2014;Singh and Sachan 2019)

Data analysis
7 | Classification | There needs to be the option to automatically classify customer posts and to define new classes on demand (e.g., Feldman and Sanger 2007). As a first shot, the following general categories to classify customer posts were determined in consultation with the above-mentioned practice partners: (1) product, (2) service, (3) processes, (4) suppliers, (5) competitors, (6) retailers, (7) campaigns, (8) brand, (9) events, (10) User Generated Content (UGC), (11) contests and (12) topics related to provincial specifications. On this basis, firms should be able to easily define enterprise-specific subcategories within the tool to realize a classification of posts in terms of a particular business field (cf. Maynard et al. 2012;Mosquera and Moreda 2012)
8 | Support of a mixed method analysis | To gain deeper knowledge about customers, a combination of different analysis methods (mixed method approach) should be provided by the tool (e.g., Stieglitz et al. 2014;Sect. 2). Particularly the scenarios "Product Commendation/Criticism" and "Topic Identification" (see Sect. 4.2) are to be supported upon request of our research.

Hence, the requirements either affected the functionalities "data import/extraction" (DR 1 and 2), "language and user adaptation" or "data analysis". Whereas the benefits of sentiment analysis and classification of social media posts are widely acknowledged (Batrinca and Treleaven 2015;Maynard et al. 2012;Mittal and Patidar 2019), the potential of a mixed method approach (see DR 8), as expected by our interviewees, is outlined in more depth in the next section.

Context Scenarios as a Basis for Method Combination
To specify the mixed method analysis approach (see Table 3, DR 8), we derived context scenarios from our interviews (Sect. 4.1) which capture the future usage situation of the tool with regard to a combination of social media analysis methods (Schilling 2016). The goal was to describe the purposeful integration of social media analysis methods in the form of a "story" to uncover inconsistencies and to precisely define the value of a mixed method approach for employees' daily routines (cf. Schilling 2016). In this respect, user personas were also deduced that depict the motivation, practices and preferences of users to integrate social media analysis approaches in more depth (cf. Schilling 2016). As a result, two context scenarios that combine analyses of structured social media data with analyses of the posts' semantics, and thus enable a more sophisticated social media analysis, were stated to be particularly important (see also Table 3), namely "Product Commendation/Criticism" and "Topic Identification":

• In terms of "Product Commendation/Criticism", a company seeks to learn whether its products and services are appreciated by (potential) customers or not. One advantage of using social media analysis for this purpose is that information gathered from social media platforms usually reflects not only the opinion of existing customers but also of other interested parties or the general public. The initial step of an analysis to answer questions such as "Which products are praised/criticized most and why?" is to perform a sentiment analysis on all available social media posts in the context of the company to identify and extract those posts that entail a positive or negative sentiment. Depending on whether product commendation or product criticism is of interest, either only positive or only negative posts are referenced in a further analysis, decreasing efforts for extracting insights from the data. To identify specific products affected by commendations or criticism, the second step in this scenario is a classification of the remaining posts according to product categories. In this context, the categories have to be determined by each company individually. To properly address complex product families and to answer more precise questions such as "Which feature of product X gets negative feedback?", this step can be extended by using one or more subcategories (e.g., different versions of a product or specific product features). That way, it is also possible to get valuable information on a company's product offerings and to address content-specific aspects. The final step in this scenario is a quantitative analysis of the classified posts to answer additional questions such as "How severe is the criticism and how should the countermeasures be prioritized?". This step assesses, among others, data such as "likes" and "shares" that are related to the considered posts. That way, a better understanding of a particular commendation or criticism is enabled. For example, a negative post on a product that receives a high number of "likes" or "shares" indicates that many people share the negative opinion, which suggests that a quick response fixing the problem is required. The described steps of the analysis as well as optional paths of this scenario are depicted in Fig. 2.

• The purpose of "Topic Identification" is to detect those topics that are primarily discussed on a company's social networking sites. Considering an organization's image, it is of special interest how a company, its products and services, managers and employees are mentioned in such discussions. This issue may bring up a vast range of potential topics and, hence, an exhaustive list of predefined categories cannot be provided in advance. For this reason, the first analysis step in this case is a clustering, which does not require any predefined categories but analyzes the social media posts with the purpose of grouping them in clusters on the basis of content similarity (cf. Bär et al. 2012;Mayring 2000). Thus, all relevant topic areas can be detected, even if they may have been previously unknown. Topics of interest may be diverse, ranging from a company's pricing model to employee friendliness or a company's ecological policy, to name but a few examples. After identifying the topics, a sentiment analysis of each cluster is performed to determine to what extent the individual topics have a positive or negative connotation. In this regard, especially the ratio of positive to negative posts is an important indicator. The final step, similar to our first scenario, is the quantitative analysis of "likes" and "shares" associated with the posts in each cluster, to better assess the importance of the respective topic (see Fig. 2).
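The three steps of the "Product Commendation/Criticism" scenario can be read as a small filter-and-rank pipeline. The following Python sketch is purely illustrative and restricted to the criticism case: the `Post` fields, the function name `product_criticism` and the ranking by the sum of likes and shares are assumptions made here, and the sentiment score and category of each post are taken as already computed.

```python
from dataclasses import dataclass


@dataclass
class Post:
    text: str
    senti_score: float  # assumed to be precomputed by a sentiment analysis
    category: str       # assumed to be precomputed by a classification
    likes: int = 0
    shares: int = 0


def product_criticism(posts, category="product"):
    """Illustrative three-step analysis for product criticism."""
    # Step 1: sentiment analysis result, keep only negative posts.
    negative = [p for p in posts if p.senti_score < 0]
    # Step 2: classification result, keep only the category of interest.
    in_category = [p for p in negative if p.category == category]
    # Step 3: quantitative analysis, rank by reactions (likes + shares)
    # so that widely shared criticism appears first.
    return sorted(in_category, key=lambda p: p.likes + p.shares, reverse=True)
```

Under these assumptions, the first element of the returned list would be the negative product post with the most reactions, i.e., the candidate for the quickest response.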

Fig. 2
Steps of the analyses required by interviewees

Design Activities
First, a literature review regarding sentiment analysis was conducted to identify relevant approaches and algorithms suitable to analyze the sentiment of customer posts (cf. vom Brocke et al. 2009). An investigation of 196 relevant publications led to a total of 17 potentially suitable approaches for the sentiment analysis. Due to the characteristics of social media posts (e.g., shortness, emojis, company specific language), dictionary-based approaches represent a generally accepted approach for the automated sentiment analysis of such textual content (Feldman 2013). Depending on the sentiment of each single token (e.g., word, emoticon), an aggregated sentiment-value is calculated with the value indicating a positive (> 0), neutral (0) or negative (< 0) post (Feldman 2013). Considering the classification of data, 130 relevant publications were examined during a second literature review, leading to nine potentially suitable approaches. To provide the capability of adapting the classification to individual or quickly changing contexts (e.g., upcoming campaigns or quickly changing trends), UR:SMART focuses on the assembly of data for predefined classes (Feldman and Sanger 2007;Heyer et al. 2006;Read et al. 2012). Therefore, a set of generally valid main categories (e.g., service, product or campaigns), independent of company or branch specifics, were worked out in cooperation with our practice partners (see Sect. 4). Additionally, to handle the individual topics and needs of each company, subcategories for each main category can be acquired. These subcategories are highly specialized and tailored to the companies' specifics as well as the aims of their social media channels.
Furthermore, it was necessary to conceive an overall data model for the underlying database to ensure consistent data formats, the free combination of various analysis methods as well as suitable data treatment for analysis combinations. The data model consists of five classes (see Fig. 3) and each class represents a set of objects: 1) "social media account" for which an object could be the company's Twitter or Facebook account, 2) "company-post" for which an object could be a post to announce an open quiz, 3) "comment" for which an object could be the customer response to the quiz, 4) "main category" for which an object could be a distinct category (e.g., "product") in which corresponding posts and comments are classified and 5) "subcategory" for which an object could be a credit card or loan (if we think of a financial institution as an example).

Fig. 3
Class model

Furthermore, all classes have relations to other classes. Two of them can be defined more precisely as compositions and one as a generalization or specialization, respectively. A company post is part of a social media account and each comment is clearly assigned to a company post. Comments cannot exist as separate parts detached from a specific company post and company posts cannot exist separately from a certain social media account. Each company post and each comment is classified by a main category and a subcategory. The class "subcategory" is considered a specialized form of the superclass "category".

Fig. 4
Wireframe of the GUI
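To make the relations between the five classes more tangible, the following Python sketch mirrors the class model with dataclasses. It is only an illustration under assumptions: all class, attribute and method names are hypothetical, the composition is approximated by creating posts and comments exclusively through their owning object, and the generalization is expressed through inheritance. It does not reproduce the actual database schema of UR:SMART.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MainCategory:
    name: str


@dataclass
class SubCategory(MainCategory):
    # Generalization: a subcategory is a specialized category that
    # additionally refers to its main category.
    parent: Optional[MainCategory] = None


@dataclass
class Comment:
    text: str
    main_category: Optional[MainCategory] = None
    sub_category: Optional[SubCategory] = None


@dataclass
class CompanyPost:
    text: str
    main_category: Optional[MainCategory] = None
    sub_category: Optional[SubCategory] = None
    comments: List[Comment] = field(default_factory=list)

    def add_comment(self, text: str) -> Comment:
        # Composition: a comment only exists as part of a company post.
        comment = Comment(text)
        self.comments.append(comment)
        return comment


@dataclass
class SocialMediaAccount:
    platform: str
    posts: List[CompanyPost] = field(default_factory=list)

    def add_post(self, text: str) -> CompanyPost:
        # Composition: a company post only exists as part of an account.
        post = CompanyPost(text)
        self.posts.append(post)
        return post
```

A usage example in the spirit of the text would be an account for "Facebook" with a quiz announcement as post, a customer response as comment, and "loan" as a subcategory of the main category "product".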
A further important step focused on the design of the GUI of UR:SMART. For that purpose, we used wireframes to arrive at a first shot of the GUI. The intention was to provide intuitive navigation with only a limited set of buttons and elements. One version of a wireframe that was further specified throughout the development process is shown in Fig. 4. Hence, in the upper area of the screen (see Fig. 4), a design proposal for the selection feature for the social media channels along with an indication of the desired timeframe is shown (see Table 3 -DR No. 1). Further, it was planned to show the analysis results immediately with separate graphics being used for the sentiment analysis as well as the classification of posts (see lower screen of Fig. 4) (DR No. 6 & 7). Hence, a pie chart was chosen to visualize the distribution of posts across the sentiments (e.g., positive, negative, etc.) and bar charts were considered suitable for visualizing the classification results, with the bars indicating the absolute number of posts assigned to a particular category (e.g., service, product, etc.). Such bar charts should be available for all sentiments alike. Further, a rough sketch of the user menu is shown on the left-hand side. Depending on whether the "extraction" or "results" item is selected, only the corresponding elements (selection functionality or results visualization) should be visible on the main screen. Finally, the "logo" was defined as a placeholder for an individual company's emblem.

Development Activities
Generally, we used Java for developing UR:SMART to ensure high performance and to provide standardized libraries and interfaces. The functionality of creating graphical representations from the social media analysis was implemented as a platform-independent web application. To realize the data model, we used an H2-mysql database, which guaranteed data compatibility through predefined and documented data interfaces. Accordingly, the input as well as the output of an analysis are standardized, and the data size can be purposefully reduced, which leads to a significantly faster analysis.
To enable the analyses, the data extraction and preprocessing functionality needed to be implemented and the relevant steps required for data preprocessing considered (cf. Aggarwal and Zhai 2012). Thereby, tokenization decomposes all textual data into smaller parts, for example single words, and removes unneeded symbols and special characters (Carstensen et al. 2009). Additionally, stop word reduction eliminates words that do not carry opinions by using publicly available stop word lists (Angulakshmi and ManickaChezian 2014). Subsequently, a stemming process eliminates prefixes and suffixes, reducing all words to their stem or basic form (Akaichi et al. 2013). Finally, a normalization algorithm completes the step of Data Preprocessing and transforms all remaining text into lower case characters (Angulakshmi and ManickaChezian 2014).
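The four preprocessing steps can be chained as shown in the following minimal Python sketch. It is not the UR:SMART implementation (which was written in Java): the tiny stop word list and the naive suffix stripping are placeholders for publicly available stop word lists and a real stemming algorithm, and all function names are assumptions. Note that this simple tokenizer also discards emoticons, which the tool handles separately.

```python
import re

# Illustrative stop word list; real lists are far longer.
STOP_WORDS = {"the", "is", "a", "an", "and", "of", "to", "this"}
# Naive suffix list standing in for a real stemming algorithm.
SUFFIXES = ("ing", "ed", "s")


def tokenize(text):
    # Split into word tokens; symbols and special characters are dropped.
    return re.findall(r"[A-Za-zÄÖÜäöüß]+", text)


def remove_stop_words(tokens):
    return [t for t in tokens if t.lower() not in STOP_WORDS]


def stem(token):
    # Strip the first matching suffix if a reasonably long stem remains.
    for suffix in SUFFIXES:
        if token.lower().endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def normalize(tokens):
    # Final step: transform all remaining text into lower case.
    return [t.lower() for t in tokens]


def preprocess(text):
    tokens = tokenize(text)
    tokens = remove_stop_words(tokens)
    tokens = [stem(t) for t in tokens]
    return normalize(tokens)
```

For instance, `preprocess("The web shop is crashing again!")` yields `['web', 'shop', 'crash', 'again']` with the placeholder lists above.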
For the implementation of the sentiment analysis, we used the widely accepted implementation of a dictionary-based approach "SentiWordNet 3.0". SentiWordNet 3.0 represents a lexical resource for an automated sentiment classification (Baccianella et al. 2010). However, SentiWordNet 3.0 only provides a lexical resource for English. To support German social media posts as well, we used SentiWS, a German language resource for analyzing the sentiment of German texts (Remus et al. 2010). As SentiWS did not match the structural requirements of the SentiWordNet 3.0 approach, we adapted SentiWS by converting the structure of the German dictionary to fit that of SentiWordNet 3.0 (Remus et al. 2010). Both resources contain lists of words carrying a positive or negative opinion, respectively. Despite the mentioned techniques, most approaches for sentiment analysis cannot handle some special content immediately (e.g., emoticons). Therefore, a feature extraction functionality considering the definition of feature types and the selection of specific features (e.g., emoticons, parts of speech, sentiment-carrying expressions) is necessary (Selvam and Abirami 2013). Due to the frequent occurrence of these features in our datasets, we integrated specific dictionaries to meet certain characteristics (e.g., dialect or emojis) of textual data. To establish a proper feature resource, we examined our dataset and extracted the most common features.
To identify irony and slang, the dictionary was extended with expressions pointing to special events (e.g., product launch) of the branches considered as well as irony patterns (cf. Trevisan et al. 2014). To obtain irony patterns, we analyzed sample sets of posts of our cooperating partners; the patterns found were then added to the dictionary. A commonly encountered structure for irony patterns in the sample was a composition of an extremely negative and positive expression in a post (e.g., "The web shop is down again! Wow, this is great!!").
Moreover, we classified the occurring emoticons as positive or negative to identify the emotions expressed within the posts. The sentiment of each word (as well as each special text component) is expressed by the variable "sentiScore", a number within a predefined range of [-2; + 2], with a high number (near + 2) representing a very positive and a low number (towards -2) a rather negative sentiment (Feldman 2013). In this respect, posts that included words with highly positive and extremely negative sentiments at the same time were marked as potential candidates for irony. Consequently, the overall sentiment of textual data is reportable and ascribed to the categories "strong positive", "positive", "neutral", "negative" and "strong negative". The gathered data is stored in a database and can be graphically displayed in pie charts and an ECG-like representation featuring a temporal scale of sentiment progression.
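A dictionary-based scoring of this kind can be sketched as follows. The Python example is a simplified illustration, not the tool's implementation: the mini-dictionary, the averaging of token scores, the threshold of 1.0 between "positive" and "strong positive" and the threshold of 1.5 for "extreme" expressions are all assumptions, since the text does not specify them.

```python
# Hypothetical mini-dictionary; scores lie in the range [-2, +2].
SENTI_DICT = {
    "great": 1.8, "good": 1.0, "down": -1.6, "bad": -1.0,
    ":)": 1.0, ":(": -1.0,
}


def senti_score(tokens, dictionary=SENTI_DICT):
    # Aggregate the scores of all known tokens (here: arithmetic mean).
    scores = [dictionary[t] for t in tokens if t in dictionary]
    if not scores:
        return 0.0
    return sum(scores) / len(scores)


def label(score, strong=1.0):
    # Map the aggregated score to one of the five sentiment categories.
    if score >= strong:
        return "strong positive"
    if score > 0:
        return "positive"
    if score == 0:
        return "neutral"
    if score > -strong:
        return "negative"
    return "strong negative"


def is_irony_candidate(tokens, dictionary=SENTI_DICT, extreme=1.5):
    # Flag posts that contain both a highly positive and an extremely
    # negative expression at the same time.
    scores = [dictionary[t] for t in tokens if t in dictionary]
    return any(s >= extreme for s in scores) and any(s <= -extreme for s in scores)
```

With these placeholder values, the tokens of the example post "The web shop is down again! Wow, this is great!!" would receive a slightly positive aggregated score but would be flagged as a potential candidate for irony.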
To realize the classification functionality in UR:SMART, we combined multinomial naïve bayes (MNB) with a dictionary-based seed word library to identify the category of textual data. This algorithm computes the posterior probability of a class, based on the distribution of the words in the social media post. The position of the words in the text is not considered since the MNB works with a "bag of words" assumption. However, in contrast to other naïve bayes variants, the MNB takes the word frequencies into account and thus, also allows duplicates. For details on MNB we refer to Aggarwal and Zhai (2012).
The seed word library includes specific seed words for all defined main categories and subcategories and therefore allows an assignment of posts and comments to these (Zagibalov and Carroll 2008). The main category "product", for example, is extended by the integration of several subcategories, including company-specific product lists, parts lists as well as product accessories. Starting from the preprocessed data, all words are analyzed regarding these seed words, enabling a strong customization of the classification. Additionally, by identifying similar words surrounding existing seed words, the seed word library is constantly enhanced by company-specific expressions (Isoaho et al. 2019;Liu 2012). As a result, topics that are currently popular among customers, e.g., within a social media channel, can be identified and graphically displayed. The results of the sentiment analysis and the assignment to the defined classes are then brought together in an overall view, featuring the most represented categories and subcategories within each sentiment section. Additionally, all underlying textual data are accessible with the help of various sort and filter algorithms.
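The multinomial naïve Bayes step described above can be illustrated with a compact Python sketch. It is a textbook-style version under assumptions: Laplace smoothing with `alpha = 1.0`, the handling of unknown words and all names are choices made for this example, and the combination with the seed word library is not shown because the text does not specify it.

```python
import math
from collections import Counter, defaultdict


class MultinomialNaiveBayes:
    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing (assumption, not from the text)

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(documents, labels):
            # Bag of words: positions are ignored, frequencies are kept.
            self.word_counts[label].update(tokens)
        self.vocabulary = {w for c in self.word_counts.values() for w in c}
        return self

    def log_scores(self, tokens):
        # Log prior plus log likelihoods, proportional to the log posterior.
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)
            for token in tokens:
                if token not in self.vocabulary:
                    continue  # unknown words are skipped in this sketch
                score += math.log(
                    (counts[token] + self.alpha)
                    / (total_words + self.alpha * vocab_size)
                )
            scores[label] = score
        return scores

    def predict(self, tokens):
        scores = self.log_scores(tokens)
        return max(scores, key=scores.get)
```

Because duplicates are counted, a word that occurs twice in a post contributes its log likelihood twice, which is the property that distinguishes the multinomial variant from variants that only record the presence of a word.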
Screenshots of "UR:SMART" are shown in Fig. 5:

• Screenshot 1 shows the tool's login page. As mentioned, the tool can be configured for specific user groups (see Table 3, DR No. 4) and the access to specific social media channels or certain analysis functionalities can be restricted or granted.

• The second screenshot displays the screen to select the social media channels for data extraction (e.g., Facebook) as well as the desired timeframe for the data (e.g., all posts within the last three weeks).

• The fourth screenshot shows an excerpt of the dashboard, in particular a pie chart, which represents the results of the sentiment analysis, showing the share of posts that have been classified as "strong positive", "positive", "neutral", "negative" and "strong negative" (DR No. 6). Moreover, key indicators such as the total number of posts analyzed, the average and median sentiment score across all posts as well as the average sentiment score for the ten most positive as well as negative posts are presented. In this dashboard screen, the results of the mixed method analyses are also presented to users (DR No. 8).

• Screenshot 5 (lower part) shows the classification of posts according to defined categories and subcategories for each sentiment (bar charts) (DR No. 7). Thereby, it becomes evident which categories the (strong) positive, neutral or (strong) negative statements refer to. The posts assigned to each bar of the chart (e.g., positive posts for category "product") may then be referenced (see screenshot 3) and screened individually. Furthermore (upper part), a time diagram (ECG) is shown that depicts the frequency of (strong) positive, neutral and (strong) negative posts over the course of time.

Demonstration of UR:SMART at a German Financial Institution
To demonstrate the practical applicability of UR:SMART, we cooperated with a German financial institution that focuses on private savings, building society savings and credit services. As financial institutions in Germany are struggling with the market pressure exerted by online banks and the start-up driven fintech industry, they see a huge potential in assessing users' data to derive new insights by means of various analysis methods. As a modern way to communicate with customers, the German financial institution built up a social media channel to familiarize their customers with this new form of interaction. Further, the financial institution has been continuously striving to increase user numbers and to foster users' social media activity. To gather detailed insights into the partners' customer data, we extracted 635 datasets from their Facebook account, as Facebook is their prioritized social media account, including the numbers of fans, posts, comments as well as the corresponding metadata (e.g., number of likes, number of shares and number of comments) for each entry from January 2017 to January 2018.
To present the results in a structured manner, we applied UR:SMART to this data set. To this end, analysis methods that focus on the semantics of the posts (sentiment analysis and classification) were employed and a quantitative analysis of structured data was conducted. An overview of the results is shown in Table 4, which is structured as follows: First, the various categories, which were previously defined, are presented. The first column ("# of posts") indicates the number of posts within each category, sorted in descending order. Next, the numbers of the corresponding "# of likes", "# of shares" and "# of comments", separated into positive (+), neutral (O) and negative (-) sentiments, are presented. As the performed analysis determines a sentiment score for every post including all comments, the columns "+ totalSentiScore" and "- totalSentiScore" show the average tonality.

Table 4
Overall results

To identify commendation or criticism towards specific products as described in the first context scenario in Sect. 4.2, first (1) a sentiment analysis was performed, which resulted in a total of 176 positive posts (avg. score of + 0.73) as well as 93 negative posts (avg. score of -0.43), indicating a large number of positive posts on our cooperating partner's Facebook site. The variables "+ totalSentiScore" and "- totalSentiScore" have values within the predefined range of [-2; + 2]. Although this information is useful to measure customers' overall mood, a clear hint at specific commendations or criticism concerning products was still missing. Therefore, in a second step (2), a classification was performed to identify all posts that can be assigned to the category "products". This category was previously defined by analyzing the product portfolio of our cooperating partner. As a result of this combined analysis, we could identify 40 positive posts (avg. score + 1.22) as well as six negative posts (avg. score -0.43) directly related to the category "product", indicating that most customers are very satisfied with the products, although there is occasional criticism. Even though this information is interesting, a clear hint at specific products was still missing. Therefore, as a third step (3), an additional classification into subcategories was performed, resulting in a fine-grained overview (see Table 5).
After the classification into subcategories was done, it became obvious that customers were rating products such as checking accounts, wealth creation as well as loans as positive. For instance, checking accounts are one of our cooperating partner's main products, offered for free in contrast to competitors, resulting in a high average positive score of + 1.09. Additionally, wealth creation and loans are important business fields. As the financial institution offers various saving plans as well as financing plans for both private and business customers, the affiliated scores were positive as well (+ 0.51; + 1.21).
In contrast, it also became evident that products such as credit cards (-0.57) and insurance (-1.58) are discussed in a more critical way. On the one hand, customers complain about credit card fees and missing functionalities (e.g., wireless payment as well as mobile payment) and on the other hand also doubt the necessity of specific types of insurance.
At first glance, this criticism seems negligible based on the rather low number of negative posts. However, single posts can also have a tremendous impact by generating word of mouth within social media. To validate the results and gain an even better understanding of the findings, a quantitative analysis was performed as a last step (4). As part of this analysis, not only the number of posts with the corresponding sentiment and assigned categories were considered, but also the customers' reactions (likes, shares as well as comments) towards all posts. Therefore, it was possible to determine the specific influence of posts within the social media channel or the community in general. Figure 6 shows the number of positive/negative reactions belonging to various subcategories of the general category "products".

Table 5
Detailed results subcategories "product"

When comparing this extended data analysis with the previous results from step three, it now became apparent that the topic "wealth creation" had received a high proportion of positive reactions (537 likes, 16 shares, 8 comments) for example, even though this topic was accountable for only 20% (8 out of 23) of all posts in this category. The same held true when looking at "loans". Although only accountable for less than 1% of all positive posts, 180 shares (more than 80% of all positive shares) were attributable. In contrast to this positive customer feedback, it was also possible to enrich the negative criticism by using the extended data analysis. At first glance, criticism concerning "credit card" seemed balanced with three negative posts (avg. score -0.57) compared to four positive posts (avg. score 0.90). But when taking a closer look at negative reactions to the subcategories for "products", it was striking that one of the key sources of criticism occurred in the subcategory "credit card" (about 57%), mainly because of high fees and missing mobile payment solutions.
To summarize, our cooperation partner identified various positively rated product categories, which are potential candidates for future marketing as well as social media campaigns based on their popularity in this research. Additionally, negatively rated categories such as "credit card" or "insurance" were also identified and are being reworked to better match customer needs. Further, the financial institution stated that all of the mixed method analysis approaches as well as the resulting outcomes are very interesting and highly beneficial for the enhanced analysis of social media data. The practitioners underlined that the approaches "ranking of reactions within a sentiment" as well as "distribution of categories vs. distribution of reactions" were highly beneficial to them, as they had just started to expand their social media activities and, hence, required comprehensive knowledge of their customers' needs and interests.

First Evaluation - SUMI Usability Study
Fig. 6 Positive/negative reactions in the subcategories "products"

Currently, UR:SMART is in use at five collaboration partners to evaluate the far-reaching applicability and usefulness of the tool in practice. The long-term results of this practical application will be acquired in the near future. Generally, the development of our tool was based on the design requirements as shown in Table 3, which were all fulfilled as described in Sect. 5.2. Accordingly, all expectations posed in regards to UR:SMART were technically realized.
However, as a further aspect of the evaluation, we assessed the tool's usability in a larger laboratory experiment by means of a SUMI study (cf. Kirakowski and Corbett 1993; Wohlin et al. 2012). The SUMI questionnaire, as designed by the Human Factors Research Group (HFRG) at the University College Cork, comprises 50 different items (e.g., "I feel in command of this software when I am using it") to assess users' satisfaction with a software according to the dimensions "efficiency", "affect", "helpfulness", "control", and "learnability" (Kirakowski and Corbett 1993). Likert scales are defined to rate each item ("agree", "disagree" and "undecided") (van Veenendaal 1998). The "perceived quality of use" is determined on the basis of the so-called "global scale", which is calculated considering 25 selected questionnaire items (van Veenendaal 1998). That way, conclusions regarding the tool's general usability can be drawn (van Veenendaal 1998). To analyze the results, SUMI builds on a normative database, which sets the tool to be evaluated into relation to more than 150 other applications (Sauro and Lewis 2012; Cavallin et al. 2007). An overview of the SUMI questionnaire and method is available at: http://sumi.uxp.ie.
In the last couple of years, manifold metrics, quality dimensions and frameworks have been developed to assess the quality of software (e.g., Dubey et al. 2012; Franke and Weise 2011; ISO 2011; Wang et al. 2012). Thereby, standardized questionnaires as offered by SUMI, the ASQ (American Society for Quality) and the System Usability Scale (SUS) approach (cf. Sauro and Lewis 2012; Brooke 1996) help to operationalize commonly accepted quality perspectives. We chose SUMI for our investigation because it has been created and validated on a Europe-wide basis (van Veenendaal 1998) and has become established as a widely recognized approach for assessing user satisfaction in recent years (Mansor et al. 2012).
Seventy-two master's degree students in business administration and information systems from a German university participated in our usability study of UR:SMART. These students were attending a course dealing with the fundamentals of business process improvement. The material of the experiment was based on a case study that was designed against the background of one of our cooperation partners stemming from the fun sports industry. Students were asked to analyze the comments on the company's Facebook page over a freely selectable timeframe of three months. They were supposed to take the role of a social media manager and prepare the analysis results in the form of a report for the company's decision makers. The students could earn extra credits for the course, which was an incentive to take the experiment seriously (cf. Wohlin et al. 2012). SUMI expects the users to have some experience with the tool to be evaluated (van Veenendaal 1998). Accordingly, the students attending the experiment received an introduction to UR:SMART with accompanying training material. The students were supposed to work on their own and submit their solutions together with the filled-out questionnaires. In contrast to the solutions of the case study, the questionnaires were anonymized and treated independently to mitigate participants' concerns about negative consequences resulting from a poor rating of the tool. In total, 72 solutions and questionnaires were used for the subsequent analysis. The data from the questionnaires was entered into the SUMI online form and the results of the study were made available by the HFRG. Figure 7 provides a summary of the results.
The global scale of our tool (60.31) was clearly above the value of 50, which is considered to be the average value according to the SUMI reference database (cf. Arh and Blažič 2008; Kirakowski and Corbett 1993; van Veenendaal 1998). Further, UR:SMART was judged to purposefully support the analysis of social media data (dimension "efficiency", mean 58.83) and its graphical user interface was considered attractive by users (dimension "affect", mean 60.46). Moreover, the tool was seen as rather self-explanatory (dimension "helpfulness", mean 58.60) and easy to control (dimension "control", mean 56.33), and only little effort was required to get acquainted with the tool's functionalities (dimension "learnability", mean 65.40) (see Fig. 7).
To sum up, the usability study led to encouraging results and the tool was clearly judged to purposefully support social media analyses. With 72 participants attending the experiment, we achieved a considerable sample set to draw valid conclusions. Nevertheless, the evaluation of UR:SMART regarding its applicability and usefulness (e.g., Sonnenberg and Brocke 2012) at companies of different sizes and branches with a comparably large sample set of practitioners is planned for the near future. In this respect, we will apply additional effectiveness measurements from software usability such as "task effectiveness" or "temporal efficiency" (cf. Bevan 1995) for further evaluation. Whereas "task effectiveness" considers the degree to which the tool supports an employee in daily routines (e.g., 60% of the tasks were supported by the tool), "temporal efficiency" relates the task effectiveness to task time, i.e., the time needed for creating the results (Bevan 1995). In this way, the tool's usability can be quantified by concrete measurements, which complement subjective user ratings.
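The two measures can be made concrete with a small sketch. It follows the simplified reading given above (share of supported tasks, related to task time); the function names, the unit of time and the example numbers are assumptions for illustration, and Bevan (1995) should be consulted for the exact definitions.

```python
def task_effectiveness(supported_tasks: int, total_tasks: int) -> float:
    """Percentage of tasks in which the tool supported the user (simplified reading)."""
    if total_tasks <= 0:
        raise ValueError("total_tasks must be positive")
    return 100.0 * supported_tasks / total_tasks


def temporal_efficiency(effectiveness: float, task_time_minutes: float) -> float:
    """Task effectiveness related to task time, here in percentage points per minute."""
    if task_time_minutes <= 0:
        raise ValueError("task_time_minutes must be positive")
    return effectiveness / task_time_minutes


if __name__ == "__main__":
    # Hypothetical observation: 6 of 10 tasks supported, 30 minutes of task time.
    effectiveness = task_effectiveness(6, 10)
    print(effectiveness)                           # 60.0
    print(temporal_efficiency(effectiveness, 30))  # 2.0
```

Comparing these values across tool versions or user groups would yield the kind of objective measurements that are meant to complement the SUMI ratings.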

Discussion and Contribution
Starting with the drawbacks of many current social media analysis efforts in practice (see Sect. 4), which take a perspective on either the structured social media data or the semantics of the posts, this paper introduces a tool integrating both these aspects. Based on real-world problems as well as characteristics concerning the analysis of social media data, which we identified by interviewing practitioners, beneficial scenarios for a purposeful combination of different analysis methods were introduced. In this context, AI, which describes scalable machine learning techniques, such as recent unsupervised learning techniques for analyzing unstructured data, is a trending topic in recent research and receives more and more attention in digital transformation initiatives (cf. Henning 2018). However, the promising unsupervised learning methods currently work as a so-called "black box", as the analysis itself as well as the decision-making paths are hidden from the user and cannot be actively influenced. Because of the compliance required for business decisions, however, it is necessary that these remain transparent and comprehensible, especially when handling special contexts such as slang, emojis or branch-specific language (e.g., Stieglitz et al. 2018). For these reasons, our solution favors the integration of lexicon-based approaches to ensure an extensive and flexible adaptation of the analysis approaches to company-specific as well as language-specific characteristics.
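The transparency argument for lexicon-based approaches can be illustrated with a minimal sketch. It is not the scoring procedure of UR:SMART; the lexicon entries, their weights and the function names `build_lexicon` and `score_post` are hypothetical, and averaging the matched term weights is only one of several possible scoring rules.

```python
import re

# Hypothetical general-purpose lexicon (term -> polarity weight).
BASE_LEXICON = {"good": 1.0, "great": 1.5, "bad": -1.0, "sick": -1.0}

# Hypothetical company-specific entries, e.g. slang used in a fun sports community.
CUSTOM_LEXICON = {"sick": 1.0, "fees": -0.5}

TOKEN_PATTERN = re.compile(r"\w+")


def build_lexicon(base, overrides):
    """Merge a base lexicon with company-specific entries; overrides win."""
    merged = dict(base)
    merged.update(overrides)
    return merged


def score_post(text, lexicon):
    """Return (average polarity, matched terms) so every score can be traced."""
    tokens = TOKEN_PATTERN.findall(text.lower())
    hits = [(token, lexicon[token]) for token in tokens if token in lexicon]
    score = sum(weight for _, weight in hits) / len(hits) if hits else 0.0
    return score, hits


if __name__ == "__main__":
    lexicon = build_lexicon(BASE_LEXICON, CUSTOM_LEXICON)
    print(score_post("That trick was sick", BASE_LEXICON))  # (-1.0, [('sick', -1.0)])
    print(score_post("That trick was sick", lexicon))       # (1.0, [('sick', 1.0)])
```

Because the matched terms are returned together with the score, an analyst can see why a post was rated as it was and correct the lexicon where the rating is wrong.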
Although processing speed as well as memory performance and sizes have grown steadily over the past few years, the reduction of processing expenditure is still a major target when it comes to data analyses, particularly when using scalable processing as a service. Therefore, defining a specific scope of the analysis is a significant step. With our solution, it is possible to purposefully filter the amount of data after every analysis step and thus reduce the amount of data to the necessary minimum.
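The idea of narrowing the data after each analysis step can be sketched as follows. The step functions, the keyword rules and the sample posts are purely hypothetical placeholders for the actual analyses and do not reflect how UR:SMART implements them.

```python
def run_steps(posts, steps):
    """Run (name, analyze, keep) steps in order and filter after each analysis."""
    remaining = [dict(post) for post in posts]  # work on copies
    trace = [("input", len(remaining))]
    for name, analyze, keep in steps:
        for post in remaining:
            analyze(post)  # annotate the post with the result of this step
        remaining = [post for post in remaining if keep(post)]
        trace.append((name, len(remaining)))
    return remaining, trace


def categorize(post):
    # Placeholder for a category analysis.
    post["category"] = "products" if "card" in post["text"].lower() else "other"


def rate(post):
    # Placeholder for a sentiment analysis.
    post["sentiment"] = -1.0 if "fee" in post["text"].lower() else 1.0


STEPS = [
    ("category = products", categorize, lambda p: p["category"] == "products"),
    ("negative sentiment", rate, lambda p: p["sentiment"] < 0),
]

SAMPLE_POSTS = [
    {"text": "The credit card fees are too high"},
    {"text": "Love the new card design"},
    {"text": "Nice branch opening event"},
]

if __name__ == "__main__":
    remaining, trace = run_steps(SAMPLE_POSTS, STEPS)
    print(trace)  # [('input', 3), ('category = products', 2), ('negative sentiment', 1)]
```

In this sketch the sentiment step only processes the posts that survived the category filter, which is where the reduction in processing effort would come from.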

Contribution for Practice
The research brings up several contributions for practice. First, the efforts and resources required for performing social media analysis that is introduced by more and more companies in the course of their digitalization efforts can be significantly reduced by our solution through an automated and target-oriented analysis. Thereby, specific issues came up (e.g., "Product Commendation/Criticism" and "Topic Identification") for which a combined mixed method approach truly contributes to a better understanding of the social media data, a topic many companies are still struggling with these days (cf. Ried 2019). In this respect, our approach facilitates the identification of the "voice of the customer" (Pande et al. 2014) based on social media data, because access to customers' current moods and expectations is enabled. Based on the findings, process improvement efforts may be triggered, or countermeasures taken to avoid customer dissatisfaction among others.
Second, applying a mixed method approach generates beneficial insights that are superior to the findings obtained from a single analysis approach that prioritizes either structured or unstructured data (e.g., Chen et al. 2014; Tinati et al. 2014). As an example, through the target-oriented combination of analysis approaches, it is not only possible to cluster user suggestions on how to improve the product or service portfolio by topic and sentiment, but also to immediately deduce a ranking of these propositions, e.g., based on "likes" and "shares". That way, employees receive a more profound and individual foundation for decision-making based on the information captured in social media posts.
Third, customers' reactions to social media marketing or advertising campaigns become evident (cf. Castronovo and Huang 2012; Shareef et al. 2019). This is highly beneficial for companies, because knowledge about the influence of social media campaigns on the success of introducing new products is still limited (Baum et al. 2019). Based on the topics captured in customers' reactions to corresponding advertisement postings, which are recognized by our tool, practitioners directly receive feedback from customers. This helps to better assess the type of information about a new product or service (e.g., availability, functionality, price, etc.), which is truly perceived as relevant by the target group. Further, companies may learn how to present this information via social media channels in an appealing way that is really appreciated by consumers, raises their attention and elicits a reaction. All this will help to design future marketing campaigns more purposefully.
Fourth, the SUMI usability study brought up encouraging results concerning the design of our tool. The "global scale" rating indicated a high "perceived quality of use" from the users' perspective. The tool was judged to be easy to handle, self-explanatory, visually appealing and helpful for analyzing social media posts (see Sect. 7). The proposed design may thus serve as a reference for constructing social media analysis tools. In particular, the arrangement of the elements and menu fields of the GUI, which enables easy extraction of social media posts, as well as the graphical presentation of the results in the form of pie charts, histograms and time diagrams put users in the position to easily navigate the tool and derive insights from the data analysis. Additional findings are indicated by the key figures as provided by the quantitative analysis. Furthermore, the functionality captured by the design along with the data model provided those features by which users felt ideally supported when working on a social media analysis case study in our experiment. However, the SUMI study also led to suggestions for further development of the tool. For instance, it was proposed to add export functionality for the result presentations to common MS Office packages. Moreover, suggestions to further decompose certain subcategories came up. Finally, users wished for pop-up windows with instructions for use to further improve the handling. This feedback will be considered in future steps to improve "UR:SMART".
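The ranking of user suggestions mentioned under the second contribution can be sketched in a few lines. The record layout, the weights and the function name `rank_within_groups` are assumptions for illustration; a simple weighted sum of likes and shares stands in for whatever ranking criterion a company prefers.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical suggestions with topic, sentiment and reaction counts.
SUGGESTIONS = [
    {"text": "Offer mobile payment", "topic": "credit card", "sentiment": "negative", "likes": 40, "shares": 3},
    {"text": "Lower the annual fee", "topic": "credit card", "sentiment": "negative", "likes": 25, "shares": 30},
    {"text": "More savings plans", "topic": "wealth creation", "sentiment": "positive", "likes": 10, "shares": 1},
]


def rank_within_groups(posts, like_weight=1.0, share_weight=1.0):
    """Cluster posts by (topic, sentiment) and rank each cluster by weighted reactions."""
    group_key = itemgetter("topic", "sentiment")

    def reaction_score(post):
        return like_weight * post["likes"] + share_weight * post["shares"]

    ranked = {}
    # groupby only merges adjacent items, so the posts are sorted by the group key first.
    for group, members in groupby(sorted(posts, key=group_key), key=group_key):
        ranked[group] = sorted(members, key=reaction_score, reverse=True)
    return ranked


if __name__ == "__main__":
    for group, members in rank_within_groups(SUGGESTIONS).items():
        print(group, [post["text"] for post in members])
```

Changing the weights shifts the emphasis between approval (likes) and reach (shares), which is a design choice left to the analyst.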

Contribution for Research
In addition to practice, our approach also offers several benefits for academia. First, the combination of data analysis approaches as well as the integration of methods in general has proven useful for various research disciplines (Stieglitz et al. 2014). This concerns the development of scientific procedures such as triangulation and inter-method mixing (cf. Jick 1979; Johnson and Turner 2003) but also topics such as Method Engineering and process-oriented quality management among others (cf. Brinkkemper 1996; Johannsen 2011; Ralyté et al. 2003; Tolvanen et al. 1996). In this paper, we combine different data analysis methods for social media (e.g., Stieglitz et al. 2014). Concretely, we propose to integrate different perspectives on social media data. We could show that this allows the retrieval of additional information from the data (e.g., the reasons for customers' negative attitude towards certain offerings), complementing those findings gained by applying a certain analysis approach in isolation. This fosters a company's "understanding" of the massive amount of data extracted from social media channels, which is a topic firms have to actively approach according to a current study by CapGemini (2020). Generally, companies strive to harness the large quantity of data to strengthen customer relationships these days (Wedel and Kannan 2016) and therefore increasingly hire data scientists to build up competences in data-driven decision-making (Provost and Fawcett 2013). It is not fully clear yet in academic literature, however, "which types of analytics work for which types of problems and data" and "what new methods are needed for analyzing new types of data" (Wedel and Kannan 2016, p. 97). Yet these issues are of vital importance for companies to prevail on the market (Wedel and Kannan 2016).
In this respect, social networks such as Facebook and Instagram largely contribute to the fast growth of data volumes within companies (Felt 2016) as they play a decisive role in content marketing and product advertisement (Baltes 2015; Statista 2018b) but also for customer service purposes (Statista 2017). At that point, we contribute to one of the central questions of social media analysis research, namely how data analysis methods can be combined to purposefully derive insights from social media data (cf. Stieglitz et al. 2014, 2018).
Second, the automated analysis of social media posts still holds various challenges when it comes to its practical application (Stieglitz et al. 2018). In our research, we built on a dictionary-based approach for the sentiment analysis. While freely available lexical resources exist for that purpose (e.g., SentiWordNet), these are not customized for particular industries or company types. Hence, to raise the accuracy level of our approach, the lexical resource (SentiWordNet 3.0) had to be adapted to match the particular needs of our collaborating partners, who mainly came from southern Germany. So, researchers are called to enhance freely available dictionaries by branch-, region- and company-specific language peculiarities to raise the accuracy levels of such resources, so that they can be used by researchers straight away.
Third, while most companies are nowadays able to process and store a large amount of data (e.g., with the help of Apache Hadoop), they still do not know how to extract valuable insights from the data sets (CapGemini 2019; Ried 2019). This can also be seen as one reason for the lack of success of digital transformation initiatives (CapGemini 2019). Based on a series of interviews with eleven cooperation partners, we deduced highly relevant application scenarios for analyzing social media content that help companies, for instance, to purposefully trigger improvement efforts based on the analyses. While the technical side of social media analysis is a lively discussed topic (Feldman and Sanger 2007; Liu 2012; Medhat et al. 2014), the question of how to use and combine the existing analysis approaches to truly derive beneficial insights for firms still lacks a theoretical foundation (cf. Ried 2019; Stieglitz et al. 2014, 2018). Hence, the research brings up valuable indications for promising application scenarios in social media analysis.
Fourth, our tool represents a helpful instrument to identify the "voice of the customer" (Pande et al. 2014) based on freely available social media data. This is particularly helpful for business process management (BPM) and quality management (QM) initiatives, as their success largely depends on the proper analysis of consumers' expectations and needs (cf. Pande et al. 2014). Accordingly, process weaknesses can be eliminated, and the process design aligned with customer requirements more purposefully. While BPM research often focuses on the question of how to utilize employees' process knowledge to mitigate process weaknesses (cf. Seethamraju and Marjanovic 2009) by means of systematic procedure models (e.g., Adesola and Baines 2005; Coskun et al. 2008; Harrington 1991; Zellner 2011), of how to use and develop "patterns" to support the "act of improvement" (Bergener et al. 2015; Forster 2006; Höhenberger and Delfmann 2015; Lang et al. 2015) or of how to apply process mining techniques to acknowledge deviations between an as-is and a should-be process (van der Aalst 2012), future work may focus on the integration of BPM approaches and social media analysis in more depth.
In view of the ongoing debate in DS about the need for a theoretical contribution to research that goes beyond the technical contribution (i.e., the artifact) as outcomes of a DS research project (cf. Baskerville et al. 2018), we can summarize the following contributions to theories:
• For social media theory two contributions emerged: (1) We deduced highly relevant context scenarios for analyzing social media content and helped to identify and structure new questions for analyzing social media data. (2) We purposefully combined data analysis methods and extended the number of available methods, which made it possible to answer the newly arisen questions.
• For BPM and QM, our research is a helpful instrument to identify the "voice of the customer" and therefore, opened access to new data sources. This can be seen as a starting point to explore and possibly extend the techniques and methods of BPM and QM that use customer voices to exploit the full potential of the new data sources. According to the ISO 9000 standard, the establishment of a customer focus is one of the major principles of a quality management system (cf. DIN EN ISO 9000:2005). Hence, firms "depend on their customers and therefore should understand current and future customer needs, should meet customer requirements and strive to exceed customer expectations" (DIN EN ISO 9000:2005, p. 5). As a result, steps to thoroughly analyze the voice of the customer have been integrated into widely-established QM approaches (cf.

Conclusion and Outlook
In the paper at hand, the design and implementation of the social media analysis tool "UR:SMART", which allows the combination of different kinds of analyses as a mixed method approach, was described. We started with an explication of the problem statement. In this regard, based on several interviews with sales and technical experts of diverse tool providers (see Sect. 1), we observed that the currently available social media tools support various kinds of data analysis but neglect the combination of different perspectives. Based on interviews with practitioners (Table 2), we derived requirements for a supporting tool and identified current challenges of social media analyses. Based on these insights, the tool was designed and implemented. The practical applicability of our solution was demonstrated by referring to the example of a German financial institution. As one important aspect of a more comprehensive evaluation of the tool's far-reaching applicability, we assessed the usability of our solution by means of a SUMI study with 72 students at a German university. However, there are also some limitations to this research. So far, we have carried out an in-depth evaluation of our tool at one company only. Further evaluations with firms from various branches are currently being performed as the software is in use at several of our cooperating partners. Our solution is not subject to a branch-specific imprint and its underlying mechanisms are suitable for both service and production settings, assuring its inter-sectoral usability. Potential application scenarios and challenges of social media analyses were derived from interviews conducted with eleven companies. Although this is a sample set of considerable size, completeness of the challenges and requirements cannot be guaranteed. The SUMI usability study was performed with students in a master's program, which is also a limitation. Therefore, a corresponding SUMI study with practitioners is an open issue.
In the future, we will further evaluate our solution in real-life social media projects with companies of different sizes and across branches. The prototype's contribution to supporting the elicitation of business-relevant information is to be precisely assessed for different cases, in addition to the previously described scenarios. Moreover, we will investigate more closely how companies may use the information received from social media channels for entrepreneurial decision-making. In this respect, we will investigate to what degree it is possible to automatically derive action recommendations from the analysis results, e.g., the launch of process improvement initiatives or marketing campaigns. Further, a central problem to address will be the integration of social media data analyses into strategic decision-making. Finally, we plan to extend the software with quality measures such as the number of program calls, frequency of use and the availability rate, among others.
Funding Open Access funding enabled and organized by Projekt DEAL.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.