Abstract
Intention mining is a promising research area of data mining that aims to determine end-users’ intentions from their past activities stored in logs, which record users’ interactions with the system. Search engine logs are a major source for inferring users’ past search activities and predicting their intentions, which helps vendors and manufacturers present their products to users effectively. The area has become steadily more relevant with the growing trend of online purchasing, and noticeable research has been accomplished over the last two decades. To the best of our knowledge, no systematic literature review provides a comprehensive review of the intention mining domain. This article presents a systematic literature review based on 109 high-quality research papers selected after rigorous screening. The analysis reveals eight prominent categories of intention. Furthermore, a taxonomy of the approaches and techniques used for intention mining is discussed. Similarly, six important types of datasets used for this purpose are discussed. Lastly, future challenges and research gaps are presented for researchers working in this domain.
Introduction
Social media have become an essential tool in everyone’s life. People share their ideas, information, and plans on different social media forums, and they express their feelings, suggestions, attitudes, likes, and dislikes in online social communities. As a result, a bulk of data comprising textual documents, images, videos, and sound is extracted for different user queries [1]. Data mining techniques are used to find user patterns in large amounts of data. Intention mining is one such technique, used to determine a user's implicit or explicit intention from a given set of data. An intent can be defined as an activity, wish, or aim that a user wants to carry out in the future [2, 3]. Intention mining is a major research area in data mining that is applied in web search environments to expose the future goals of users [3]. Figure 1 shows the process of detecting intention from a given dataset. People mostly use natural language (written or oral) to express their intentions. As an example, a few expressions are categorized into different intentions in Table 1. When a user submits a query to a search engine to retrieve related content, the query depicts his search query intention [4]. A comment or query expressing the need or wish to buy a product is called a purchase intention [5, 6]. Behavioral intentions depict user behavior, such as hiding and unfriending contacts on Facebook or using a digital smartwatch and free-trial-based software services on the internet [7]. Continuance intention is a special type of intention that describes a user's willingness to continue with an e-service, for example, predicting whether a user is willing to continue commenting on posts on Facebook [8, 9]. Implicit human intention, unlike explicit intention, is expressed indirectly. For example, if a user says about his mobile phone, “my phone is disturbing me badly, and the new mobile model of this company is amazing”, it implies that the user wants to purchase the latest mobile [10].
Datasets are the source from which intended user goals are extracted. Most intentions are derived from real-time datasets collected from multiple sources. The most commonly used datasets for search intentions are weblogs, tweets, Facebook comments, and blogs [11, 12]. For detecting query intention, the vital dataset sources are search engine query logs, click graphs, and model-based experiments. Mobile activity logs are used to find users’ behavioral intentions related to mobile usage [9].
In this study, a comprehensive analysis is performed to discuss the frameworks and methods used to mine intentions. Most of the included studies use algorithms based on machine learning techniques such as supervised learning, unsupervised learning, rule-based frameworks, and neural networks [13,14,15,16]. Statistical methods also play a vital role in intention detection [16,17,18,19]. According to the papers included in this study, supervised learning is one of the most widely used approaches for intention mining.
This systematic literature review aims to gather the knowledge about intention mining in one place to facilitate future researchers. It will also help vendors and manufacturers gain insight into user intentions related to their products. The main motivation of this study is to detect user intentions regarding the usage of social media and mobile devices, that is, whether users want to keep using social services (Facebook, Twitter, search engines, online shopping forums). A further motivation is to bring all the intention mining techniques and approaches together in one place to find out what types of algorithms and frameworks are used to extract users’ intentions. The dataset is one of the most important components of intention mining. A worthy dataset is key to accurate and promising results, and another motivation of this study is to identify the datasets from different forums (social media, questionnaire surveys, and mobile logs) used to detect users’ intentions.
This systematic literature review (SLR) focuses on presenting comprehensive knowledge of intention mining in terms of intention categories, datasets used, and applied approaches and techniques. This study followed the methodology of [8] for an unbiased collection of articles. Multiple search strings were formalized according to the search syntax of the digital libraries to extract relevant studies. Of the 4362 retrieved papers, 109 were selected based on the screening criteria. The selected papers were evaluated qualitatively and empirically from multiple aspects. The primary focus of this study is to present a systematic literature review of existing intention mining studies and to contribute eight proposed intention categories, six dataset types, and a taxonomy of techniques and approaches used to detect user intentions. The novelty of this study is that, to the best of our knowledge, there is no systematic literature review of this domain. This study differs from related studies in that other SLRs focused on detecting users' goals using process mining. The studies [2,3,4,5] classified intentions into categories, but they make no significant contribution on the techniques and approaches used to infer intentions, and no comprehensive literature is available on the classification of datasets used to extract user intentions.
The primary focus of this manuscript is to discover and define the intention mining categories, present a taxonomy of approaches, and highlight research challenges and gaps in the intention mining domain.
- We have proposed eight intention categories: purchase intention, behavioral intention, search intention, continuance intention, implicit human intention, query intention, mobile usage intention, and general intention. These categories can help vendors and manufacturers enhance their products according to user requirements. Furthermore, they would be beneficial for search engines and mobile phone companies in serving users better.
- We have proposed a taxonomy of state-of-the-art approaches and techniques to mine the user's intention. Researchers can use this taxonomy to select techniques and approaches according to their dataset and context of use.
- This study also proposes a classification of datasets into six categories, including search engine logs, social media data, model-based generated data, questionnaire survey data, and generic datasets. It can be beneficial for training models and helps in choosing a suitable type of dataset for research.
- Lastly, research challenges and gaps have been identified to help future researchers.
The rest of the paper is structured as follows: “Related work” reviews related work. “Methodology” presents the methodology of the paper and the selection process of relevant studies. “Results” provides detailed answers to the research questions. “Discussion” presents the discussion and future challenges. “Threats to validity” presents threats to validity, and lastly, the article concludes by summarizing the literature.
Related work
According to the Oxford dictionary, the word intention is defined as “A thing intended, an aim or plan”. Several frameworks and methods have been proposed and developed to detect user intention from multiple types of datasets [6]. Formal research efforts in the domain of intention mining have been few and far between, and most of the evidence gathered explores intention mining from the perspective of categorization, datasets, and techniques.
Epure et al. [1] argued that the foundation of each process is intentional and that processes should be modeled from an intentional point of view. The authors claimed that earlier research neglected event logs, which are the basis of intention mining research. Khodabandelou et al. [2] described intention mining as an emerging research area of data mining. Intentional process models are the key models of intention mining and are used to capture the reasoning behind user activities. A hidden Markov model was used to extract user intention from traces of activities stored in event logs.
Bags et al. [3] surveyed the literature on purchase intention for durable goods at a surface level. E-commerce forum datasets were used to identify intentions to facilitate retailers. Social network mining and sentiment analysis were used to predict user intention and brand perception scores. A suitable regression model was identified to predict product attributes. One of their key findings from the mobile dataset is that users are most interested in camera attributes, sensors, and image stabilization. Huang et al. [4] classified intention into four categories using a convolutional neural network (CNN). The dataset was created manually and consisted of 5408 sentences from reports generated on GitHub. The proposed approach was also used to improve an automated software engineering task that rectifies misclassified reports. Mainly, the authors automated the classification of professional developers' intentions.
Papadimitriou et al. [5] reviewed the approaches and techniques used to infer the intention behind a user's task. Goal-aware systems were discussed thoroughly instead of focusing on any specific area. Di Sorbo et al. [6] investigated the concept of intention mining from the perspective of software developers. Intentions related to the contents of developers' emails were classified into six categories. The proposed approach required a lot of manual effort to detect intention. Moreover, the authors did not present any taxonomy of datasets to help researchers and developers. In contrast, we classified user intentions by generalizing the domain, working with social media data, online survey data, mobile usage patterns, and implicit human actions to detect intentions. Ghasemi et al. [7] presented goal-oriented mining from the perspective of process mining. Rigorous research was conducted on 24 articles selected from popular search engines. The research questions revealed that process mining in association with user goals does not have a coherent line of research, whereas intention mining has more significant and mature goal-oriented models. The results indicated that combining process mining and intention mining might bring more opportunities to system stakeholders.
Table 2 compares related articles with our study with respect to three parameters: categorization of intentions, dataset classification, and techniques used to infer intentions. The authors in [1] examined goal-oriented mining with respect to process mining. They determined that intention mining has a coherent line of research for detecting users' goals. Still, intentions were not classified to indicate which types of initiatives a user may take in the future, and there is no significant discussion of the datasets used to infer user intention. The study [2] comprehensively discussed intention mining along with its processes. Article [3] discussed purchase intentions for durable goods, especially mobile phones. Still, it did not focus on other types of user intentions that arise while purchasing a product, such as search intention and query intention. Four intention categories were analyzed and discussed in [4] on a GitHub dataset; however, a discussion of dataset classification and of the techniques and approaches used to deduce intention was missing. The article [5] addressed techniques and approaches to deduce user intention from goal-aware systems but did not discuss intention categories or datasets. In [6], six intention categories related purely to developers' email content were discussed, but the intentions of common people regarding email were ignored, and datasets and techniques were also not analyzed. The study [7] presented intention mining as the foundation of process modeling and determined that event logs are the key dataset for inferring users' intentions.
Methodology
The main purpose of a systematic literature review (SLR) is to identify the challenges and gaps that need more investigation and research. An SLR covers the quantity, quality, and type of research on the addressed topic [8]. This systematic literature review was conducted by following the guidelines of [8] for unbiased data collection and presentation of the obtained results. Figure 2 presents the research methodology of this study. The first step, which is the core of an SLR, is to define the research objectives and questions and then select the digital libraries from which to extract the relevant literature. The second phase is to design a search string and then carry out the screening phase to filter the retrieved articles. Finally, the articles are extracted and mapped, and the systematic review is reported.
Research objectives
This study aims to elaborate on intention mining in the context of research problems, solutions, and significance. The objectives of conducting an SLR on intention mining are as follows:
O1: To build up a knowledge catalog that will facilitate the other researchers in the context of intention mining.
O2: To classify user intentions into different categories to shape research in the intention mining domain.
O3: To characterize the datasets used to detect the implicit or explicit intentions.
O4: To depict existing solutions in the field of Intention Mining (IM), and clarify the similarities and differences between them using a characterization framework.
O5: To develop a taxonomy that highlights the adopted state-of-the-art approaches and methods.
Research questions (RQ)
To achieve the primary aim of the SLR, this study defines three research questions as follows:
RQ1: What intention categories have been addressed in the last decade on intention mining?
RQ2: What data mining/machine learning approaches currently exist to support the effectiveness of different intention mining techniques?
RQ3: What types of datasets are used to detect multiple types of human intentions?
Design search string
Intention mining is an emerging research area of data mining. It has substantial effects on the world of social media, facilitating users and organizations in multiple aspects [1, 2]. To develop an authentic knowledge base of IM, this study used the digital libraries of four major publishers: IEEE, ACM, ScienceDirect, and SpringerLink.
This study's intended search phrase was "intention mining," but this query was too restrictive and retrieved only a few results from scientific research databases. Therefore, synonyms of intention and mining were used to build the search strings. Synonyms of intention did not prove very beneficial for accessing relevant publications because almost all authors used the term intent or intention to depict the user’s future goals. A chain of secondary keywords was designed to complete the phrase with intention. The main secondary word was mining, and alternate words such as mine, extract*, and discover* were chosen.
The search string was designed as a combination of primary keywords (Kp) and secondary keywords (Ks), as shown in Figure 3. Intention and intent are used as primary keywords, while mining, mine, discover*, and extract* are used as secondary keywords. The wildcard * is used with discover and extract to cover similar terms such as discovery, discovered, discovering, extracted, extracting, and extracts. Each Kp was combined with each Ks, i.e., ∀ Kp ∧ ∀ Ks.
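The effect of the trailing wildcard can be illustrated with a small sketch. It assumes that the libraries treat * as "any sequence of characters", and it uses Python's shell-style matcher as a stand-in; the actual matching rules of each digital library may differ, and the term list is only the set of examples named in the text plus one non-matching word.

```python
from fnmatch import fnmatchcase

# Wildcard keywords named in the text.
patterns = ["discover*", "extract*"]

# Example terms from the text, plus "mining" as a term no wildcard should cover.
terms = ["discovery", "discovered", "discovering",
         "extracted", "extracting", "extracts", "mining"]

# For each term, collect the wildcard patterns that cover it.
matches = {term: [p for p in patterns if fnmatchcase(term, p)] for term in terms}

for term, hits in matches.items():
    print(f"{term}: {hits if hits else 'not covered by a wildcard'}")
```

Terms such as mining and mine are not covered by either wildcard, which is why they appear as separate secondary keywords.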
Following the guidelines of [8], the search string format was designed for each forum. Search strategies were checked and revised until the final strings were obtained. The final search strings for the four selected forums are given in Table 3. For IEEE Xplore, the format of the search string in terms of primary and secondary keywords is as follows: (PK1 OR PK2) AND (SK1 OR SK2 OR SK3 OR SK4 OR SK5 OR SK6 OR SK7).
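As a rough illustration of this format, the following sketch assembles a boolean query from the two keyword groups. The function name `build_search_string` is hypothetical, and only the four secondary keywords named in the text are used; the exact strings submitted to each library are those listed in Table 3, which may contain additional keywords and library-specific syntax.

```python
def build_search_string(primary, secondary):
    """Join each keyword group with OR, then combine the two groups with AND."""
    primary_group = " OR ".join(primary)
    secondary_group = " OR ".join(secondary)
    return f"({primary_group}) AND ({secondary_group})"


# Keywords as named in the text; the real strings may include more terms.
primary_keywords = ["intention", "intent"]
secondary_keywords = ["mining", "mine", "discover*", "extract*"]

query = build_search_string(primary_keywords, secondary_keywords)
print(query)
```

Keeping the keywords in lists makes it straightforward to revise the string when a search strategy is checked and refined, as described above.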
Screening phase
The screening step was carried out to select, from the articles retrieved with the proposed search strings, those most relevant to this study's objective. The screening phase filters articles based on inclusion/exclusion criteria, journal and conference repute, and quality assessment criteria applied to article contents. This study focused strictly on intention mining, so articles that did not directly address the IM problem were excluded. All repeated studies were also eliminated by screening based on title and abstract.
Screening (inclusion/exclusion criteria)
This SLR examined multiple studies covering quantitative and qualitative research methods, but some publications are not of sufficient quality to include in the review. Therefore, inclusion/exclusion criteria are defined to select the most relevant papers.
A study was chosen for the systematic review if it met the following inclusion criteria.
- As data mining is a vast field consisting of several mining strategies, the article must focus mainly on intention mining.
- The literature must be published in a computing journal or conference.
- The dataset is one of the essential components of intention mining, so articles that focused on social datasets to detect human intention were included.
- The language must be English.
An article was excluded from the analysis if it met any of the following exclusion criteria.
- Book chapters, posters, magazines, courses, and early access articles were excluded.
- The paper addresses data mining fields other than intention mining.
- The article presents models for teaching robots to detect human intentions.
- The article has only a general focus on data mining.
Screening (journal citation report/ranked conferences)
To ensure the quality of the review, we considered mostly CORE-ranked conferences and journals included in the Journal Citation Reports (JCR). Conference rankings were checked at http://www.conferenceranks.com/ and http://portal.core.edu.au/conf-ranks/6784, whereas https://www.scimagojr.com/ was used to check the quartile of each journal.
Screening (quality assessment criteria)
Quality assessment criteria (QAC) form one of the important screening phases of the literature review. They are used to ensure the quality of the selected studies [8]. Assessments were performed on multiple parameters of each selected paper. The crucial aspects included the background, methodology, dataset, dataset analysis, implementation techniques, results, and conclusion.
The QAC of study [8] was adopted to ensure the quality of the included studies. Each study was assessed on the contents of its sections, such as the background and literature review sections, on whether the methodology is clear to understand, on whether the dataset is valid and supports a quality descriptive analysis for IM, on whether the implementation of techniques is systematic or partial, and on whether the results are clearly defined. Finally, the article's conclusion was checked to see whether it was supported by the contents of the paper, as mentioned in Table 4. A quality score was assigned for each criterion to decide on paper inclusion or exclusion. Binary digits (0, 1) were used as quality assessment scores for each criterion. As eight assessment parameters were used in the study, the total methodological score was eight. Article quality was considered high if score ≥ 7, moderate if 5 ≤ score < 7, and low if score ≤ 4, in which case the article was excluded from the selected list.
The three screening phases resulted in a refined and filtered set of quality articles. Figure 4 presents the facts and figures of the articles extracted from each forum: a total of 4362 research articles were retrieved as a result of the search strings given to the mentioned online digital libraries. Duplicate articles were then screened and removed from the selected list. A total of 797 repeated articles were retrieved from all repositories.
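The binary scoring scheme can be sketched as follows. This is a minimal illustration, not the authors' tooling: the function name `quality_level` is hypothetical, and the moderate band is assumed to cover totals of 5 and 6, which is the only range left between the stated high (≥ 7) and low (≤ 4) thresholds.

```python
def quality_level(scores):
    """Sum eight binary criterion scores and map the total to a quality level."""
    if len(scores) != 8:
        raise ValueError("expected one score per assessment parameter (8 in total)")
    if any(s not in (0, 1) for s in scores):
        raise ValueError("each criterion score must be 0 or 1")
    total = sum(scores)
    if total >= 7:
        return total, "high"
    if total >= 5:  # assumed band: totals of 5 or 6
        return total, "moderate"
    return total, "low"


# Hypothetical score vectors, one entry per criterion.
for example in ([1, 1, 1, 1, 1, 1, 1, 0],
                [1, 1, 1, 0, 1, 1, 0, 0],
                [1, 0, 1, 0, 0, 1, 0, 0]):
    total, level = quality_level(example)
    print(f"scores={example} total={total} level={level}")
```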
After completion of phase I screening, a total of 1334 research papers were omitted. Only research papers published in a conference or journal were included in the study; therefore, phase II removed 656 articles that did not meet the JCR or CORE conference ranking criteria. Phase III screened the articles according to the quality criteria mentioned in Table 4 and removed 1466 articles from the selected paper list. In the end, the remaining 109 articles were used to conduct the systematic literature review.
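The reported counts can be checked with a short calculation. The sketch assumes that the 797 duplicates are counted separately from the 1334 papers omitted in phase I, which is the reading under which the figures add up to the final 109 articles; the mapping of phase I to the inclusion/exclusion criteria is likewise an assumption based on the order of the screening subsections.

```python
retrieved = 4362

# (stage, number of articles removed) as reported in the text.
removed = [
    ("duplicate removal", 797),
    ("phase I (assumed: inclusion/exclusion criteria)", 1334),
    ("phase II (JCR / CORE ranking)", 656),
    ("phase III (quality assessment)", 1466),
]

remaining = retrieved
for stage, count in removed:
    remaining -= count
    print(f"after {stage}: {remaining} articles remain")

total_removed = sum(count for _, count in removed)
print(f"removed in total: {total_removed}, selected: {remaining}")
```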
Results
Most of the selected articles indicated increasing interest in applying intention mining to social media platforms, search engine logs, robot-generated data, mobile usage data, and many other areas to extract users' real-time intentions. Researchers developed many efficient frameworks based on supervised learning, unsupervised learning, neural networks, image processing, and statistical methods to determine users’ intentions. However, the importance of this topic indicates that more research is required in this area. Overall, this study covers 109 topic-related articles that helped answer the three research questions described in the methodology section.
Intention categories
RQ1: What intention categories have been addressed in the last decade on intention mining?
Social media users perform multiple activities in daily life to achieve their goals. Each action reveals an intention of what the user is willing to do in the future. Based on current activities and past behaviors, this work characterizes the selected intention studies published during the last decade into eight categories: purchase intention, behavioral intention, search intention, continuance intention, implicit human intention, query intention, mobile usage intention, and general intention. Table 5 shows the eight intention categories along with the frequency of selected articles.
Purchase intentions
Online shopping has enabled users to purchase anything with a single click. The first steps of e-shopping are to browse the online store and then select and purchase a product. Purchase intentions (PI) are used to detect whether a user really wants to buy a product or is just surfing the web pages. PI helps manufacturers and vendors improve their products according to user requirements [3]. Wah et al. [9] studied users' intention to purchase a car. Logistic regression, decision trees, and neural networks were used to develop a model called “Intention of purchase (IOP)”. The proposed model achieved an accuracy of 91.79% in predicting whether a user will purchase a car or cancel the order after booking it. Guo et al. [13] proposed a new search behavior model to detect whether a searcher has an immediate purchase intention for the searched product or only a research intention. Loyola et al. [14] presented an encoder-decoder neural architecture to detect browse and purchase intentions from e-commerce data. Guo et al. [15] proposed a deep intent neural network to predict users' real-time purchase intent. The touch interface of handheld devices has been used to capture interactive behaviors to refine purchase intentions. Studies [16,17,18,19,20] discussed purchase intentions in the context of cell phone app interfaces and web interfaces during the product selection phase.
Behavioral intentions
Social media has encapsulated multiple user behaviors in the purchase history or logs. Although user behavior is time-consuming to capture and predict and hard to log, numerous studies have tried to extract behavioral intentions using classification, clustering, and statistical techniques. Li et al. [21] presented a novel interactive framework to facilitate communication between a human and an assistive device. It was used to reduce the effort that elderly and disabled people need to interact with a machine, based on gaze movements. Chen et al. [22] introduced an AIR recommender based on an attentional recurrent neural network to predict the user's behavioral intention. Sun et al. and Wang et al. [23, 24] built classifiers based on user feedback data retrieved from user click sequences and query logs, using neural networks, statistical techniques, and image processing methods to mine behavioral intentions.
Li et al. [25] presented a graph intention network-based model to detect behavioral intentions in the context of click-through rate (CTR). Real-world data from an e-commerce platform was used to assess the proposed model, which delivered promising results. Giannopoulos et al. [26] proposed a client-centered intent-aware query framework to shield user data privacy in personalized web search [25]. Hashemi et al. [27] proposed a multiple intent model to infer users' behavioral intention from the America Online (AOL) search query log. Peng et al. [28] proposed a structural equation-based model to discover the factors of discontinuance intention towards social network sites (SNS) with respect to autonomous and controlled motivations. In [10], Fan et al. addressed the factors that influence decision making in the context of SNS. Models based on statistical techniques were proposed to discover behavioral intentions towards adopting e-learning systems, suggestion intention, participation intention, and switch intention towards social media [29,30,31,32,33,34,35]. Peña et al. [36] detected user intentions to hide and unfriend Facebook contacts from users' logs and identified that people prefer to use the hide option instead of the unfriend option. Zhu et al. [37] detected behavioral intentions towards a cloud-based virtual learning environment. A framework based on the statistical technology acceptance model has been used to infer the intention to use free-trial-based technology services. Peng et al. described users' behavioral intention to switch from one IT service provider to another. The intention to use smart TV and primary school teachers' behavioral intentions towards mobile usage have also been investigated [28, 38, 39]. Kim et al. [40] classified user intentions into multiple categories according to their domain. Ren et al. [41] presented a model that worked on real-time commercial search engine log data. Statistical techniques applied to questionnaire survey datasets are robust for detecting multiple types of human behavioral intentions, such as the intention to use learning management systems (LMS) [42,43,44,45].
Implicit intentions
Implicit intentions are goals that users do not mention directly but that are hidden in explicit activities. Liu et al. [46] revealed users' implicit intention to adopt a pension insurance program. The fuzzy comprehensive evaluation method was combined with the analytic hierarchy process (AHP) to assess the insured wishes index system [11, 12]. Luong et al. [47] designed a Bayesian network-based, context-specific implicit intention recognition model to mine the user's implicit intention. Chen et al. [48] used a semi-supervised question-asking framework to detect the user's implicit intention from the Yahoo! Answers community question answering (CQA) dataset. Zhuang et al. [49] proposed an easy life app to perceive implicit intent without any explicit user input on the phone.
Search intentions
Search intentions are the results returned by the query given to the search engines [50]. Children are active users of digital devices, but due to their limited vocabulary, they cannot retrieve the desired results from the internet. Dragovic et al. [51] presented a search intent module to meet children's social media requirements. Murata et al. [52] worked on detecting search intentions from a Japanese commercial search engine log. Qian et al. [53,54,55] presented efficient and effective machine learning-based models to predict dynamic query and generic search intents, to search for a product on an e-commerce portal, and to explore search intentions on touch-enabled digital devices. Aljouma et al. [56] proposed a verb ontology and domain ontology model representing semantic concepts related to verbs and business vocabulary across all business domains. These types of services help connect low-level service description languages and web services to achieve business-related goals.
The studies [57,58,59,60,61] developed a system named SciNet, applied on top of scientific databases of over 60 million articles, for interactive user modeling. A comprehensive search behavior model has been proposed to classify search intentions from Google logs, and search intent annotation has been applied to an image dataset. The studies [62,63,64,65,66] used statistical methods to find the user satisfaction level along with search intention from dynamic behaviors. Search intentions have been classified into three categories (target finding, decision making, and exploration) to discover different user interaction patterns with perceived user satisfaction.
Continuance intention
Continuance intentions represent a user's willingness to continue or discontinue a service. This type of intention is detected from real-time user data available on social media and indicates whether the user is ready to use a specific service regularly or wants to discontinue it in the future. Basak et al. [66] used a structural equation modeling approach to detect the continuance intention of Facebook users. It was concluded that attitude and satisfaction were the reasons for continued Facebook usage. Hong et al. [67] described continuance intentions towards using Facebook in unethical groups called dangerous virtual communities. Statistical results revealed that online and general social anxiety were negatively correlated with continuance intentions. Lu et al. [68] integrated the Technology Acceptance Model (TAM), the Task-Technology Fit (TTF) model, Massive Open Online Course (MOOC) features, and social motivation to investigate continuance intentions to use MOOCs.
Wu et al. [69] worked on identifying the continued use of current mobile services. The dataset was collected from 512 customers of the Kuwait communication market. Abbas et al. [70] aimed to determine potential reasons and examined the moderating role of information overload and social overload. Authors have determined users' continuance intention in the context of SoLoMo services and government e-learning services and predicted students' intention to use mobile cloud storage services [71,72,73]. Li et al. [74] used the fuzzy-set qualitative comparative analysis (fsQCA) technique to predict continuance intention towards social media use. The results revealed enjoyment as the most significant factor in the continued use of social media. De Oliveira et al. [75] used statistical techniques to detect the continuance intention to use Facebook. Ifinedo et al. [20] used three theoretical frameworks (social-cognitive theory, the technology acceptance model, and motivation theory) to detect students' continuance intentions to use blogs. Cao et al. [76] depicted users' discontinuance intentions regarding the usage of social media services. Boakye et al. [77] used the characteristics of mobile location-based services (LBS) to investigate continuance intentions towards LBS. Compatibility and perceived interactivity were identified as two influential LBS parameters for downloading and using location-based services on mobile. Swar et al. [78] used a model based on information processing theory and the theory of planned behavior to extract the factors behind continuance intention towards the usage of online health services. Hong et al. [72], Kang et al. [79], Wu et al. [80], and Hur et al. [81] discussed continuance intentions regarding mobile usage, online health services, and the use of e-learning systems.
Mobile usage intentions
Ling et al. [82] used a classification method to detect users' mobile usage intention. Statistical frameworks have been proposed to determine primary school teachers' intention to adopt mobile devices as a learning technology, intention towards mobile payment adoption, and mobile shopping continuance intention [83,84,85,86].
Query intentions
Query intentions play a vital role in helping users search web content on the internet. Search engines suggest queries by matching the keywords given by users. Zhao et al. [87] used statistical techniques to detect user switching intention towards mobile cloud storage using a push-pull-mooring framework. A user question-asking framework was developed to detect users' implicit intention regarding Community Question Answer (CQA) from the Yahoo question-answer dataset. Query intents help inexperienced users search for related content on the internet. Lee et al. [88] proposed the Similarity-Aware Query Intent Discovery (SQUID) system to mine the internet surfing patterns of such users and detect their query intents. Fariha et al. [89] and Trabelsi et al. [90] described how search engines help internet surfers find information from a question-answer session by predicting the optimal answer to the user’s question. Jiang et al. [91] proposed a search query log structure to perceive query intent from social media search logs using statistical techniques. Yu et al. [92] proposed the Search-Coexistence Knowledge Evolution (SCKE) framework to find search intents for query patterns from search engine click-through log data. The public demands a detailed storyboard right after any news breaks.
General intentions
Gu et al. [93] proposed Cross Domain Random Walk (CDRW) to extract search query patterns from search click-through logs. Zan et al. [94] observed that intent recommenders have been trending on social media to suggest services when the user opens the app without any input. Heterogeneous networks were used to mine complex and rich content to recommend intents. Liu et al. [95] proposed an intent recommender to analyze user behavior towards mobile usability. Advertisement has become an essential part of search engines and social media. Advertisement intention aims to adjust the relevance of advertisements to the user query [96, 97]. Izquierdo et al. [98] addressed the issue of relating advertisements to the user search query using a feature extraction mechanism, and automated the TV ad schedule using mathematical programming. Social communities share information, ideas, opinions, and attitudes to make a suggestion or propose a solution to anyone’s problem [99,100,101,102,103,104,105,106,107]. Statistical structural models used to detect tourist intentions, intention to use transportation, and intention to participate in online travel companies are addressed in [45, 84, 108]. The intention to mine quality event logs and the intention to recommend using a smartwatch are discussed in [109, 110]. Mishael et al. [111] showed that temporal intentions indicate the periodic change in the page status of a Twitter post. Stable intention, intention changing from current to past, and undefined intention were detected from Twitter log data to predict the periodic change from Twitter post to Twitter share. Habib et al. [112] used deep neural network techniques to extract social media intention from social media logs.
Approaches and techniques used in intention mining
RQ2: What data mining/machine learning approaches currently exist to support the effectiveness of different intention mining techniques?
As illustrated for RQ1, multiple types of intention have been detected from the datasets. Researchers have used various approaches and techniques for intention detection, such as machine learning, statistical techniques, heuristics, image processing methods, fuzzy logic, and deep learning. Machine learning has been considered the most effective and efficient approach to data mining due to its high accuracy. Many machine learning techniques, such as supervised, unsupervised, and semi-supervised learning, natural language processing, and neural networks, have been used to infer intention. The techniques and approaches used in the 109 selected research articles are discussed in the following sections.
Supervised learning
Supervised learning is a simple approach to machine learning. It trains the machine on training data with known output labels. Classification is one of the most commonly used supervised learning techniques. In [18], the authors used logistic regression, decision trees, and neural networks to propose an intent of purchase (IOP) model that detects the user's car purchase intention. In [20], the authors proposed a supervised query intended for kids (QuIK) model to help children aged 6–15 formulate their queries in search engines, which leads to more concise and relevant search results. In [21], a novel semi-supervised sequence clustering approach was presented to extract and group interaction sequences of users, then assign the predefined task and visualize it intuitively. Recommendation (MEIR) was proposed to automatically recommend user intention according to the previous history. In [22], a convolutional neural network and maximum entropy-based model was used to perceive suggestion intentions from Vietnamese text data. An encoder–decoder neural architecture was proposed to mine users' browse or purchase intention.
A convolutional attention model [35] was used to compare touch and click interactions with mouse and keyboard log data to find the similarities and differences in user interaction. Multiple intent modeling has been used to collect candidate intent features efficiently without human supervision [36]. A Bayesian network-based, context-specific implicit intention recognition model has been developed to mine context-based implicit user intentions [37]. The mobile touch interaction model [28] was used to detect the user's search intentions by analyzing the touch data of the mobile device.
In [84], a novel software engine, EUI, with a robust Open Directory Project (ODP) classifier, was developed to mine user intentions by analyzing mobile usage data. In [85], the authors used the support vector machine classification method to detect commercial intents relevant to the user's research and purchase behavior. A classification-based system, semantic similarity-aware query intent discovery (SQuID) [88], was proposed to detect user query intent. It takes examples from users as input and consults the database to discover deeper associations. These associations reveal the semantic context of the given input, from which the user's query intent is inferred. Linear discriminant analysis (LDA) [89] was used to classify event-related synchronization (ERS) and desynchronization (ERD) patterns while users were lifting different weights. LDA-CRC and LDA-SRC were used to develop an early warning system to detect criminal activity [99]. In [92], a framework for query feature extraction was developed to mine advertisers' intentions.
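To make the idea of classification with known output labels concrete, the following is a minimal sketch of a multinomial Naive Bayes intent classifier. It is not the method of any of the cited studies; the training queries, the two labels, and the function names are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(examples):
    """Fit a multinomial Naive Bayes model on (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        tokens = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab

def predict_intent(model, text):
    """Return the label with the highest log-posterior for the query."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior
        # Laplace (add-one) smoothing over the training vocabulary.
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in text.lower().split():
            if token in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy training set: labeled search queries.
TRAIN = [
    ("buy cheap laptop online", "purchase"),
    ("order new phone today", "purchase"),
    ("best price for used car", "purchase"),
    ("how does a laptop battery work", "informational"),
    ("history of the mobile phone", "informational"),
    ("what is a hybrid car", "informational"),
]
model = train_naive_bayes(TRAIN)
print(predict_intent(model, "buy a new car online"))
```

A real system would need far more labeled data and a proper tokenizer; the sketch only shows how labeled examples drive the prediction.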
Unsupervised learning
Unsupervised learning is a significant and complex machine learning approach used to detect hidden patterns in datasets with unknown output labels. Clustering is the most commonly used unsupervised learning technique. A data-driven approach to automated TV ad scheduling [31] used intention learning on top of mathematical optimization and clustering to imitate scheduling experts' decision-making process. The dynamic intent mining method [52] was used to mine query intent from search query logs. The ranking model for user intent [53] exploits user feedback in the form of click data to cluster ranking models for historic queries according to user intent. Another cluster-based model, heterogeneous graph-based soft clustering [54], was developed to combine Wikipedia, web, and query data to learn search intents. A novel approach, the map miner method (MMM) [55], was used to construct an intentional process model from the process log. MMM used a hidden Markov model to cluster user activities. The study [106] proposed intent-sensitive word embedding to learn user satisfaction intention. In [107], unsupervised sequence query clustering groups queries with the same interest to produce, for each intent, a pattern consisting of a sequence of semantic concepts and/or lexical items.
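As an illustration of clustering without output labels, the sketch below runs plain k-means on a handful of two-dimensional points. The feature meaning (share of clicks on shopping results versus reference pages), the data values, and the choice of initial centroids are assumptions made for this example and do not come from the reviewed studies.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Recompute each centroid as the mean of its cluster;
        # keep the old centroid if a cluster ends up empty.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical per-query features:
# (share of clicks on shopping results, share of clicks on reference pages)
points = [
    (0.9, 0.1), (0.8, 0.2), (0.85, 0.1),
    (0.1, 0.9), (0.2, 0.8), (0.15, 0.95),
]
centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
for center, members in zip(centroids, clusters):
    print(center, members)
```

On this toy input the two groups separate cleanly; real click logs are noisier, and the result depends on the initial centroids.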
Semi-supervised learning
Semi-supervised learning combines supervised and unsupervised learning techniques: it uses a small amount of labeled data together with a large amount of unlabeled data. Discovering query intent patterns from general search has been a challenging task; therefore, cross-domain random walk (CDRW) [93] is used to extract query patterns from search engine click-through log data. In [29], a convolutional neural network and maximum entropy-based model was developed to perceive suggestion intentions in Vietnamese text data. In [30], an encoder-decoder neural architecture was proposed to mine users' browse or purchase intents and behaviors from large-scale e-commerce datasets.
The study [31] proposed a predictive model based on semi-supervised learning to detect the intent of a user's new question in a question-answer community. In [32], a novel semi-supervised sequence clustering approach was presented to extract and group interaction sequences of users, then assign the predefined task and visualize it intuitively. Recommendation (MEIR) was proposed to automatically recommend user intention according to the previous history.
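One common way to combine a few labeled examples with many unlabeled ones is self-training, sketched below with a nearest-centroid classifier. This is a generic illustration rather than the approach of any cited study; the feature vectors, the labels "purchase" and "browse", and the `margin` threshold are hypothetical.

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return tuple(sum(dim) / len(vectors) for dim in zip(*vectors))

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def self_train(labeled, unlabeled, margin=0.1, rounds=5):
    """Self-training: pseudo-label confident points, then refit the centroids.

    labeled: dict mapping label -> list of vectors (at least two labels).
    unlabeled: list of vectors without labels.
    """
    labeled = {label: list(vs) for label, vs in labeled.items()}
    pool = list(unlabeled)
    for _ in range(rounds):
        centroids = {label: centroid(vs) for label, vs in labeled.items()}
        remaining = []
        for x in pool:
            ranked = sorted(centroids, key=lambda label: sq_dist(x, centroids[label]))
            best, runner_up = ranked[0], ranked[1]
            # Accept the pseudo-label only if the best centroid is clearly closer.
            if sq_dist(x, centroids[runner_up]) - sq_dist(x, centroids[best]) > margin:
                labeled[best].append(x)
            else:
                remaining.append(x)
        if len(remaining) == len(pool):  # nothing new was labeled; stop early
            break
        pool = remaining
    return labeled, pool

# Hypothetical session features: (share of product-page views, share of article views)
seed = {"purchase": [(0.9, 0.1)], "browse": [(0.1, 0.9)]}
unlabeled = [(0.8, 0.2), (0.2, 0.7), (0.5, 0.5)]
labeled, still_unlabeled = self_train(seed, unlabeled)
print(labeled)
print(still_unlabeled)
```

The ambiguous point halfway between the two groups stays unlabeled, which is the intended behavior: self-training should only absorb examples the current model is confident about.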
Natural language processing
In [87], a lightweight service composition framework was designed to mine the user's intended goal from natural text using natural language processing techniques. End users usually have to perform many sub-tasks to achieve a goal; therefore, the system mines the tasks along with non-functional constraints to guide the selection of services.
Fuzzy logic
In [100], the authors revealed the user's implicit intention to adopt the pension insurance program. The fuzzy comprehensive evaluation method was used along with the analytic hierarchy process (AHP) to assess the insured wishes index system.
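The core arithmetic of a fuzzy comprehensive evaluation can be sketched as follows. The criterion weights (which AHP would normally supply), the membership matrix, the grade names, and the use of the weighted-average operator are all assumptions for illustration; they are not taken from [100].

```python
def fuzzy_comprehensive_evaluation(weights, membership):
    """Combine criterion weights W with membership matrix R as B = W . R.

    weights: one weight per criterion, summing to 1.
    membership: one row per criterion; each row gives the degree of
    membership of that criterion in every evaluation grade.
    """
    n_grades = len(membership[0])
    return [
        sum(w * row[g] for w, row in zip(weights, membership))
        for g in range(n_grades)
    ]

# Hypothetical evaluation grades and criterion weights (assumed to come from AHP).
grades = ["willing", "neutral", "unwilling"]
weights = [0.5, 0.3, 0.2]
# Hypothetical membership degrees; each row sums to 1.
membership = [
    [0.7, 0.2, 0.1],  # criterion 1
    [0.2, 0.5, 0.3],  # criterion 2
    [0.1, 0.4, 0.5],  # criterion 3
]

result = fuzzy_comprehensive_evaluation(weights, membership)
# Maximum-membership principle: pick the grade with the largest score.
decision = grades[result.index(max(result))]
print(result, decision)
```

Other composition operators (for example max-min) are also used in the fuzzy evaluation literature; the weighted average is chosen here because every criterion contributes to the result.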
Statistical methods
In [81], the authors presented the HMM-AIP model, based on a hidden Markov model, to track and predict attack intentions. In [82], the authors extracted consumers' mobile reading intention using the technology acceptance model. Human behavior and attitude were found to be the major factors influencing these intentions. In [9], the authors developed the IOP model using logistic regression, decision trees, and neural networks to predict whether a user will purchase a car or cancel the order after booking it. In [10], a collaborative intent nowcasting model based on statistical techniques was developed to extract the complex relation between intent and context. In [91,92,93,94,95,96,97,98,99,100,101,102,103,104,105], multiple statistical techniques were used to propose intention detection frameworks, such as the S-O-R model-based framework, the decomposed theory of planned behavior, self-determination theory, TPB, TAM, CTO, regression, correlation, composite reliability, variance extracted, AVE, t value, social cognitive theory, and many more.
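Hidden Markov models recur among the statistical methods above. The sketch below shows generic Viterbi decoding, which recovers the most likely sequence of hidden intentions from observed events. It is not the HMM-AIP model of [81]; the states, observations, and all probabilities are hypothetical.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        row = {}
        for s in states:
            row[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
        best.append(row)
    # Backtrack from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for row in reversed(best[1:]):
        state = row[state][1]
        path.append(state)
    return path[::-1]

# Hypothetical two-state model of a user's hidden intention.
states = ("browsing", "attacking")
start_p = {"browsing": 0.8, "attacking": 0.2}
trans_p = {
    "browsing": {"browsing": 0.7, "attacking": 0.3},
    "attacking": {"browsing": 0.2, "attacking": 0.8},
}
emit_p = {
    "browsing": {"page_view": 0.7, "port_scan": 0.2, "failed_login": 0.1},
    "attacking": {"page_view": 0.1, "port_scan": 0.5, "failed_login": 0.4},
}

events = ["page_view", "port_scan", "failed_login"]
path = viterbi(events, states, start_p, trans_p, emit_p)
print(path)
```

For long event sequences the products of probabilities underflow, so practical implementations work with log-probabilities instead.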
Image processing
In [83], a gaze-based intention inference framework was developed to infer the intentions of elderly and disabled people from gaze movements. The study [84] addressed user query intention when seeking information on search engines. A heuristic-based interactive user intention understanding model was developed to help web surfers reach their search goals.
Deep learning
In [98], a deep-learning-based methodology was proposed to predict vehicles' lane change intentions on the road. In the deep intent prediction network (DIPN), a touch-interactive, behavior-based model was presented to predict real-time user product purchasing intentions. A comprehensive hierarchical attention method was used to combine real-time user behaviors more effectively and efficiently. Experiments on large-scale commercial datasets revealed that DIPN significantly outperforms the baseline methods.
Dataset used in intention mining
RQ3: What types of datasets are used to detect multiple types of human intentions?
A dataset is a collection of related information constructed using social media, weblogs, user logs, and questionnaire survey methods. Because the dataset is a basic element of intention mining, its accuracy and efficiency matter greatly for obtaining good results from the proposed method or framework. The studies selected in this SLR were classified by dataset into six types: search engine logs, social media data, model-based generated data, questionnaire survey method, mobile usage datasets, and generic datasets. Figure 5 presents the complete layout of the classified datasets included in this study.
Search engine log data
Search engine logs are collections of user queries. Such data can be used to detect user search patterns and intentions. Before applying their techniques, researchers preprocess the dataset to remove errors and prepare it according to the requirements of the method or framework. In [41], an offline and online click-through rate dataset was retrieved from search engines to deduce behavioral intention. Japanese commercial search engine log data was used to detect search and web advertisement intentions. In [42], a search engine click-through log was used to detect query intent patterns. In [44], the authors described intent recommendations using two real-life datasets, MovieLens and Tmall. The articles [45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61] used commercial logs and personal assistant click and view logs as datasets to predict intention with its real-time context.
In [62], search engine log data was used to determine the relevance between advertisements and user search query intention. In [64], the Yahoo! query log was used to retrieve the web results most related to the user's personalized needs. In [65], the authors used Google and Firefox log data from 440 undergraduate and graduate users with adequate knowledge of search engines. In [102], SogouQ Chinese query log data was used to perceive user role explicit intent queries with the simplified word n-gram role model (SWNR) framework. In [104, 105], an image search dataset was used to mine the intention of disabled people using gaze cues and to study the influence of image search on human behavior. In [60], the authors presented a probabilistic method to reveal the most optimized searches resulting from an unspecified user query. Multiple users were asked to enter alternate queries with different keywords to form a dataset, which was then used for suggestions in search query intent. In [106], a user question-answer framework was proposed to deduce query intention. The Yahoo question-answer dataset was used to deduce users' upcoming intentions.
Model-based generated data
Another approach to obtaining a dataset is to construct a model according to the research requirements. Researchers develop a model to obtain scenario-based data and then apply tools and techniques to it. In [58], a task-based model was constructed to develop a dataset. Users were given 30 min to solve the research task over 60 million documents. The gathered dataset was used in interactive intent modeling for information discovery. In [53], the authors developed a model of multiple users to obtain their online shopping behavior data and then used the dataset in a product search taxonomy to find out whether a user will purchase a product or is merely browsing to pass the time.
Questionnaire survey method (QSM)
A questionnaire survey is another method of forming datasets for statistical analysis. In QSM, researchers design a questionnaire consisting of relevant questions and have it verified by experts. The questionnaire is then distributed to the relevant people, and their feedback is recorded as the dataset for further use. In [70], Pearson’s R and Kendall’s T techniques were used to model purchase intention through online questionnaire surveys. In [82], the questionnaire survey method was used to collect datasets. The sample consisted of students and office workers who are active users of mobile reading. In [100], the questionnaire method was used to collect data to improve the scientific credibility of the index system. In [71], 60 questionnaires were filled in by participants aged 19–60 to collect datasets used to infer strongly related context features.
Another set of questionnaires was used to measure the effectiveness of the proposed model. In [72], online questionnaires were collected to model the differences between the social networking service (SNS) model and the online technologies usage model. In [73], the sample consisted of university students, their friends, and family members who used mobile PCSS. In [74], an online survey (with 402 participants) was used to detect the reasons for reposting a marketing message on social media. Facebook is one of the most used social media forums on the internet. In [20, 72, 75,76,77,78,79], online surveys were conducted to detect user intentions towards Facebook usage. The topics addressed were the use of Facebook, continuance intention towards Facebook in dangerous virtual communities, and the influence of life satisfaction on Facebook use.
In [36], the author proposed a model to measure the factors behind hiding or unfriending a Facebook contact. Data was collected from the real-time model. Studies [72, 80,81,82,83] proposed models to perceive user continuance intention towards mobile usage. The proposed models examined whether cultural values affect the continuance intention to shop on mobile devices, use mobile data services, and adopt mobile payment services for different activities. Continuance intention to use MOOCs, continuance intention to continue communication in the Kuwait market, and continuance intention towards SoLoMo services are addressed in [84,85,86,87,88]. Datasets collected by online questionnaire surveys consisted of two parts: the first contained demographic questions about the participants, whereas the second contained questions measuring the research model's constructs [89]. Studies [90,91,92,93,94] addressed users' intention towards the use of smartwatches, behavioral intentions regarding IT services, the continuance intention of pre-service teachers to use mobile phones, and mobile gamers' epistemic curiosity.
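The two correlation measures named above for questionnaire data, Pearson's r and Kendall's tau, can be computed as in the following sketch. The Likert-scale responses are hypothetical, and the Kendall variant shown is tau-a, which makes no correction for tied ranks; the cited study may have used a different variant.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length samples."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) / 2
    return (concordant - discordant) / n_pairs

# Hypothetical 5-point Likert responses from five participants.
satisfaction = [1, 2, 3, 4, 5]
purchase_intention = [2, 1, 4, 3, 5]

r = pearson_r(satisfaction, purchase_intention)
tau = kendall_tau_a(satisfaction, purchase_intention)
print(round(r, 3), round(tau, 3))
```

Pearson's r measures linear association on the raw scores, while Kendall's tau depends only on rank order, which is why the two are often reported side by side for ordinal survey data.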
Generic datasets
Many other datasets collected from real-life activities are grouped here as generic datasets. In [11, 12], a car dealer company dataset was used to mine car purchasing intentions. In [10], the authors described intent recommendations using two real-life datasets, MovieLens and Tmall. Real-world EEG data were used in [30, 97] to predict patient intention level and human movement intention, respectively, and in [14], Japanese TV network data were used to schedule TV advertisements.
Mobile usage dataset
The mobile phone is one of the most used devices of this era. Data generated by mobile devices through touch interaction has attracted researchers' attention as a way to detect hidden mobile usage patterns. In [42,43,44,45,46], the authors used mobile usage patterns to understand user intent for cell phones and for online shopping on mobile devices, and to predict the user's implicit intention without any explicit input on the mobile phone. In [54], the authors used a mobile touch interaction dataset to detect relevant web searches in the mobile touch environment.
Social media dataset
Social media datasets are the most robust for detecting user intentions from daily life activities on different social media forums such as Facebook, Twitter, and e-commerce websites. This study selected many articles that used social media datasets to deduce social media intentions. The studies [10, 87, 95] investigated WikiHow and Wikipedia documents to deduce users' daily living intentions. In [88], the Yahoo! query click daily log was used to identify users' intention to get precise knowledge soon after information was announced.
E-commerce is one of the robust social media services used to facilitate online shopping [46]. In [54], three datasets, Arnetminer, Patent, and Random, were used to predict interactive user intention. The study [56] used the eBay dataset to accurately recommend the product searched for by the user, and also used nine million sessions of a well-known e-commerce portal to deduce purchase intents. In [57], the authors used Sina Weibo, an e-commerce product database, to detect purchase intention; [66] used the Yahoo! search log to detect search intention; [67] mined turnover intention from IT job learners' data; and [68] used the Chinese dataset NTCIR, a spatiotemporal dataset, and biomedical microblogs to find behavioral intention. Figure 6 depicts the classification of the 109 selected studies into the proposed six types of datasets.
Discussion
This section summarizes the results deduced from the above discussion. Intention mining is a decade-old research field. This study aims to investigate the three main factors of intention mining: categories, approaches and techniques, and datasets.
Proposed intention categories
Eight intention categories are proposed to classify users' intentional activities: purchase intention, behavioral intention, search intention, continuance intention, human implicit intention, query intention, mobile usage intention, and general intention. This classification was made by rigorously reviewing each article to understand the intention type it discusses. The analysis shows that the most discussed category is generic intention, with a 31% share. Behavioral intention was presented in 13% of the selected papers and human implicit intention in another 13%, whereas search intention accounted for 12% and continuance intention for 10%. Query intention, mobile usage intention, and purchase intention were reported in about 7%, 5%, and 3% of the papers, respectively.
Proposed taxonomy
With respect to approaches and techniques, this paper identified machine learning, deep learning, image processing, statistical, and heuristic techniques as the ones most frequently used for intention mining. Machine learning is important for extracting hidden patterns from huge datasets. The studies included in this review each used whichever ML approach best fit their framework or method. The ML approaches discussed are classification, clustering, neural networks, NLP, and semi-supervised learning. The analysis shows that almost 70% of the included studies used machine learning approaches to detect user intentions. The technology acceptance model (TAM) is a statistical technique that is robust for data analysis in intention mining. Many other statistical techniques, such as regression, correlation, hidden Markov models, qualitative and quantitative analysis, and search coexistence, are used in almost 40% of the papers included in this article. Fuzzy logic techniques and image processing techniques were each used in about 1% of the research articles to mine user intentions. Figure 5 presents a state-of-the-art taxonomy of the approaches and techniques used in intention mining.
Proposed types of datasets
The dataset is a core element of the intention mining process. The questionnaire survey method is one of the major sources of datasets on multiple topics. This method is used to collect data for frameworks based on statistical techniques. About 38% of the included articles used online questionnaire surveys to collect data from users as required, while 18% used search engine log data and 22% used social media datasets to extract user intention. Only 4% of the studies used model-based generated data, and 9% used mobile usage data. About 8% of the articles used generic datasets such as email composing data, EEG signal data, images, and PREVENTION datasets.
Table 6 presents the overall classification results of selected papers to quality assessment parameters.
Issues and challenges
According to the reviewed literature, this article identified the following issues and challenges related to intention mining:
- Predict the behavioral intention of software users: whether they will purchase and install the new version of the software after using its beta version.
- Identify a chef's behavioral intentions while cooking food: whether he will keep or change the taste.
- Many approaches identified purchase or buy intentions, but only a few discussed users' intentions to sell; therefore, sell intentions need to be addressed.
- Investigate property dealers' behavioral intention: whether or not a dealer intends to purchase or sell a property.
- Automate the cluster assignment of TV advertisement intentions using natural language processing and deploy such a system on the TV network.
- Facebook and Twitter data have been used from different perspectives to mine user intention. Alongside these forums, WhatsApp and Instagram have also attracted users' attention, and a vast number of users rely on these apps for social connectivity. Therefore, a large amount of data is available to analyze users' intention to use WhatsApp and Instagram and their intention to share data (images, videos). Chat data from both forums can also be used to infer user intention.
Deep learning is a modern approach to machine learning. It is used for prediction because it removes much of the manual work of extracting features and related artifacts. However, deep learning techniques are used far less often in intention mining than conventional machine learning; therefore, researchers are encouraged to use deep learning to extract intention patterns.
Threats to validity
Recognizing threats to validity is important for making a study robust [113,114,115,116]. Three kinds of threats to validity are identified in this section.
Construct validity
In the context of an SLR, threats to construct validity relate to the classification of selected articles [114]. In this study, two authors identified primary and secondary search keywords. Six terms related to intention mining were used to construct search strings. The search strings were run on well-reputed digital libraries such as IEEE Xplore, ACM, ScienceDirect, and SpringerLink. We found most of the research articles related to intention mining. We also searched for relevant papers in data mining and machine learning research venues to reduce the risk of missing related publications. We selected JCR journals and ranked conference articles, which indicates the good quality of the included articles.
Internal validity
Internal validity concerns the data extraction and analysis process, in which two authors worked on the classification of selected studies and on data extraction, whereas one author reviewed the results [114]. The Kappa coefficient between the two authors who worked on collecting and classifying the related articles is 0.92. This Kappa value indicates a high level of agreement and confidence among the authors regarding the selected studies.
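For readers unfamiliar with the Kappa coefficient, the sketch below computes Cohen's kappa for two raters as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The include/exclude decisions are hypothetical and do not reproduce the 0.92 reported above; whether the study used exactly this two-rater formulation is also an assumption.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items."""
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical screening decisions by two authors on ten candidate papers.
author_1 = ["include"] * 5 + ["exclude"] * 5
author_2 = ["include"] * 4 + ["exclude"] * 6

kappa = cohens_kappa(author_1, author_2)
print(round(kappa, 2))
```

Here the raters agree on nine of ten papers (p_o = 0.9) while chance agreement is 0.5, so kappa is lower than the raw agreement rate, which is the point of the correction.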
Conclusion validity
Conclusion validity in an SLR relates to recognizing improper relationships that may lead to an incorrect conclusion. To reduce this threat, a proper data extraction and selection process was followed, as discussed under internal validity.
Conclusions
Intention mining is a promising field of research that aims to detect the future actions of users. This article has presented a systematic literature review on intention mining by comprehensively reviewing 109 high-quality articles from well-reputed forums, selected using a systematic methodology. The primary focus of this study is to discuss intention mining in terms of its categories, approaches, techniques, and datasets. The contribution of this study is to classify user intention into eight categories: purchase intention, behavioral intention, search intention, continuance intention, implicit human intention, query intention, mobile usage intention, and general intention. Reviewing the included studies showed that behavioral intention is the most discussed intention type, which aims to detect human behavior to purchase a product, use Facebook, or search daily news. After that, a comprehensive review was performed of the approaches and techniques proposed to deduce intention from past human activities. In contrast to other studies, this SLR presented a taxonomy that maps state-of-the-art techniques such as machine learning, deep learning, image processing, and statistical techniques for intention mining. Statistical and machine learning techniques were observed to be the most frequently used for intention detection. Furthermore, a detailed discussion was provided on the classification of datasets used to infer user intention. The datasets were classified into six major categories: search engine logs, social media data, model-based generated data, questionnaire survey method, mobile usage datasets, and generic datasets. The questionnaire survey method was observed to be a major source of data collection. Finally, promising future directions have been discussed for researchers working in the domain of intention mining.
This study is an effort to gather intention mining knowledge in intention categories, techniques, approaches, and datasets.
Abbreviations
- IC: Intention category
- LDA: Latent Dirichlet allocation
- CRC: Collaborative representation classifier
- SRC: Sparse representation classifier
- HIS: Interactive heuristic search
- QUIK: Query intended for kids
- DFT: Discrete Fourier transform
- TPB: Theory of planned behavior
- LSTM: Long-short term memory
- HMM: Hidden Markov model
- TAM: Technology acceptance model
References
Epure EV, Hug C, Deneckère R, Brinkkemper S (2013) Intention-mining: a solution to process participant support in process aware information systems. Department of Information and Computing Sciences Utrecht University, Utrecht
Khodabandelou G, Hug C, Deneckere R, Salinesi C (2013) Process mining versus intention mining. Enterprise, business-process and information systems modeling. Springer, Berlin, pp 466–480
Bag S, Tiwari MK, Chan FT (2019) Predicting the consumer’s purchase intention of durable goods: an attribute-level analysis. J Bus Res 94:408–419
Huang Q, Xia X, Lo D, Murphy GC (2018) Automating intention mining. IEEE Trans Softw Eng 46:1098–1119
Papadimitriou D, Koutrika G, Mylopoulos J, Velegrakis Y (2016) The goal behind the action: toward goal-aware systems and applications. ACM Trans Database Syst 41(4):1–43
Di Sorbo A, Panichella S, Visaggio CA, Di Penta M, Canfora G, Gall HC (2015) Development emails content analyzer: intention mining in developer discussions (T). In: 2015 30th IEEE/ACM international conference on automated software engineering (ASE), pp 12–23. IEEE
Ghasemi M, Amyot D (2019) From event logs to goals: a systematic literature review of goal-oriented process mining. Requir Eng 25:1–27
Alammary A (2019) Blended learning models for introductory programming courses: a systematic review. PLoS ONE 14(9):e0221765
Wah YB, Ismail NH, Fong S (2011) Predicting car purchase intent using data mining approach. In: 2011 eighth international conference on fuzzy systems and knowledge discovery (FSKD), vol 3. IEEE, pp 1994–1999
Fan S, Zhu J, Han X, Shi C, Hu L, Ma B, Li Y (2019) Metapath-guided heterogeneous graph neural network for intent recommendation. In: Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining, pp 2478–2486
Lai C, Chen X, Chen X, Wang Z, Wu X, Zhao S (2015) A fuzzy comprehensive evaluation model for flood risk based on the combination weight of game theory. Nat Hazards 77(2):1243–1259
Guo Q, Agichtein E (2010) Exploring searcher interactions for distinguishing types of commercial intent. In: Proceedings of the 19th international conference on World wide web. ACM, pp 1107–1108
Guo Q, Agichtein E (2010) Exploring searcher interactions for distinguishing types of commercial intent. In: Proceedings of the 19th international conference on World wide web. ACM, pp 1107–1108
Loyola P, Liu C, Hirate Y (2017) Modeling user session and intent with an attention-based encoder-decoder architecture. In: Proceedings of the eleventh ACM conference on recommender systems. ACM, pp 147–151
Guo L, Hua L, Jia R, Zhao B, Wang X, Cui B (2019) Buying or browsing?: predicting real-time purchasing intent using attention-based deep network with multiple behavior. In: Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery and data mining, pp 1984–1992
Zhao XW, Guo Y, He Y, Jiang H, Wu Y, Li X (2014) We know what you want to buy: a demographic-based system for product recommendation on microblogs. In: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 1935–1944
Kimura A, Mukawa N, Yuasa M, Masuda T, Yamamoto M, Oka T, Wada Y (2014) Clerk agent promotes consumers’ ethical purchase intention in unmanned purchase environment. Comput Hum Behav 33:1–7
Hsu MH, Chang CM, Chu KK, Lee YJ (2014) Determinants of repurchase intention in online group-buying: the perspectives of DeLone and McLean IS success model and trust. Comput Hum Behav 36:234–245
Gao L, Waechter KA, Bai X (2015) Understanding consumers’ continuance intention towards mobile purchase: a theoretical framework and empirical study—a case of China. Comput Hum Behav 53:249–262
Ifinedo P (2017) Examining students’ intention to continue using blogs for learning: perspectives from technology acceptance, motivational, and social-cognitive frameworks. Comput Hum Behav 72:189–199
Li S, Zhang X (2014) Implicit human intention inference through gaze cues for people with limited motion ability. In: 2014 IEEE international conference on mechatronics and automation. IEEE, pp 257–262
Chen T, Yin H, Chen H, Yan R, Nguyen QVH, Li X (2019) AIR: attentional intention-aware recommender systems. In: 2019 IEEE 35th international conference on data engineering (ICDE). IEEE, pp 304–315
Sun Y, Yuan NJ, Xie X, McDonald K, Zhang R (2017) Collaborative intent prediction with real-time contextual data. ACM Trans Inf Syst (TOIS) 35(4):30
Wang HJ, Lo J (2010) Exploring citizens' intention to use government websites in Taiwan: an empirical study. In: Proceedings of the 12th international conference on information integration and web-based applications and services. ACM, pp 524–531
Li F, Chen Z, Wang P, Ren Y, Zhang D, Zhu X (2019) Graph intention network for click-through rate prediction in sponsored search. In: Proceedings of the 42nd international ACM SIGIR conference on research and development in information retrieval, pp 961–964
Giannopoulos G, Brefeld U, Dalamagas T, Sellis T (2011) Learning to rank user intent. In: Proceedings of the 20th ACM international conference on Information and knowledge management. ACM, pp 195–200
Hashemi SH, Williams K, El Kholy A, Zitouni I, Crook PA (2018) Measuring user satisfaction on smart speaker intelligent assistants using intent sensitive query embeddings. In: Proceedings of the 27th ACM international conference on information and knowledge management. ACM, pp 1183–1192
Peng X, Zhao YC, Zhu Q (2016) Investigating user switching intention for mobile instant messaging application: taking WeChat as an example. Comput Hum Behav 64:206–216
Lettner F, Grossauer C, Holzmann C (2014) Mobile interaction analysis: towards a novel concept for interaction sequence mining. In: Proceedings of the 16th international conference on human–computer interaction with mobile devices and services. ACM, pp 359–368
Ngo TL, Pham KL, Takeda H, Pham SB, Phan XH (2017) On the identification of suggestion intents from Vietnamese conversational texts. In: Proceedings of the eighth international symposium on information and communication technology. ACM, pp 417–424
Zhang D, Yao L, Chen K, Wang S (2018) Ready for use: subject-independent movement intention recognition via a convolutional attention model. In: Proceedings of the 27th ACM international conference on information and knowledge management. ACM, pp 1763–1766
Dermentzi E, Papagiannidis S, Toro CO, Yannopoulou N (2016) Academic engagement: differences between intention to adopt social networking sites and other online technologies. Comput Hum Behav 61:321–332
Cheng S, Lee SJ, Choi B (2019) An empirical investigation of users’ voluntary switching intention for mobile personal cloud storage services based on the push–pull–mooring framework. Comput Hum Behav 92:198–215
Wang W, Chen RR, Ou CX, Ren SJ (2019) Media or message, which is the king in social commerce?: an empirical study of participants’ intention to repost marketing messages on social media. Comput Hum Behav 93:176–191
Agudo-Peregrina ÁF, Hernández-García Á, Pascual-Miguel FJ (2014) Behavioral intention, use behavior and the acceptance of electronic learning systems: differences between higher education and lifelong learning. Comput Hum Behav 34:301–314
Peña J, Brody N (2014) Intentions to hide and unfriend Facebook connections based on perceptions of sender attractiveness and status updates. Comput Hum Behav 31:143–150
Zhu DH, Chang YP (2014) Investigating consumer attitude and intention toward free trials of technology-based services. Comput Hum Behav 30:328–334
Choi J, Kim S (2016) Is the smartwatch an IT product or a fashion product? A study on factors affecting the intention to use smartwatches. Comput Hum Behav 63:77
Sánchez-Prieto JC, Olmos-Migueláñez S, García-Peñalvo FJ (2017) MLearning and pre-service teachers: an assessment of the behavioral intention using an expanded TAM model. Comput Hum Behav 72:644–654
Kim YB, Lee SH (2017) Mobile gamer’s epistemic curiosity affecting continuous play intention. Focused on players’ switching costs and epistemic curiosity. Comput Hum Behav 77:32–46
Ren P, Chen Z, Ma J, Wang S, Zhang Z, Ren Z (2015) Mining and ranking users’ intents behind queries. Inf Retriev J 18(6):504–529
Yi F, Yin L, Wen H, Zhu H, Sun L, Li G (2018) Mining human periodic behaviors using mobility intention and relative entropy. In: Pacific-Asia conference on knowledge discovery and data mining. Springer, Cham, pp 488–499
Cui C, Mao W, Zheng X, Zeng D (2017) Mining user intents in online interactions: applying to discussions about medical event on SinaWeibo platform. In: International conference on Smart Health. Springer, Cham, pp 177–183
Cigdem H, Topcu A (2015) Predictors of instructors’ behavioral intention to use learning management system: a Turkish vocational college example. Comput Hum Behav 52:22–28
Santoso AS, Nelloh LAM (2017) User satisfaction and intention to use peer-to-peer online transportation: a replication study. Procedia Comput Sci 124:379–387
Liu R, Zhang X, Li S (2014) Use context to understand user's implicit intentions in activities of daily living. In: 2014 IEEE international conference on mechatronics and automation. IEEE, pp 1214–1219
Luong TL, Truong QT, Dang HT, Phan XH (2016) Domain identification for intention posts on online social media. In: Proceedings of the seventh symposium on information and communication technology. ACM, pp 52–57
Chen L, Zhang D, Mark L (2012) Understanding user intent in community question answering. In: Proceedings of the 21st international conference on World Wide Web. ACM, pp 823–828
Zhuang J, Mei T, Hoi SC, Xu YQ, Li S (2011) When recommendation meets mobile: contextual and personalized recommendation on the go. In: Proceedings of the 13th international conference on ubiquitous computing. ACM, pp 153–162
Yang Y, Tang J (2015) Beyond query: interactive user intention understanding. In: 2015 IEEE international conference on data mining. IEEE, pp 519–528
Dragovic N, Madrazo Azpiazu I, Pera MS (2016) Is sven seven?: a search intent module for children. In: Proceedings of the 39th international ACM SIGIR conference on research and development in information retrieval. ACM, pp 885–888
Murata M, Toda H, Matsuura Y, Kataoka R, Mochizuki T (2010) Detecting periodic changes in search intentions in a search engine. In: Proceedings of the 19th ACM international conference on Information and knowledge management. ACM, pp 1525–1528
Qian Y, Sakai T, Ye J, Zheng Q, Li C (2013) Dynamic query intent mining from a search log stream. In: Proceedings of the 22nd ACM international conference on information and knowledge management. ACM, pp 1205–1208
Ren X, Wang Y, Yu X, Yan J, Chen Z, Han J (2014) Heterogeneous graph-based intent learning with queries, web pages and wikipedia concepts. In: Proceedings of the 7th ACM international conference on Web search and data mining. ACM, pp 23–32
Guo Q, Jin H, Lagun D, Yuan S, Agichtein E (2013) Mining touch interaction data on mobile devices to predict web search result relevance. In: Proceedings of the 36th international ACM SIGIR conference on research and development in information retrieval. ACM, pp 153–162
Aljoumaa K, Assar S, Souveyet C (2010) Publishing intentional services using new annotation for WSDL. In: Proceedings of the 12th international conference on information integration and web-based applications and services. ACM, pp 881–884
Guo Q, Agichtein E (2010) Ready to buy or just browsing?: detecting web searcher goals from interaction data. In: Proceedings of the 33rd international ACM SIGIR conference on research and development in information retrieval. ACM, pp 130–137
Ruotsalo T, Peltonen J, Eugster MJ, Głowacka D, Reijonen A, Jacucci G, Kaski S et al (2015) Scinet: interactive intent modeling for information discovery. In: Proceedings of the 38th International ACM SIGIR conference on research and development in information retrieval. ACM, pp 1043–1044
Wu Z, Liu Y, Zhang Q, Wu K, Zhang M, Ma S (2019) The influence of image search intents on user behavior and satisfaction. In: Proceedings of the twelfth ACM international conference on web search and data mining. ACM, pp 645–653
Kato MP, Tanaka K (2016) To suggest, or not to suggest for queries with diverse intents: optimizing search result presentation. In: Proceedings of the ninth ACM international conference on web search and data mining. ACM, pp 133–142
Cheung JCK, Li X (2012) Sequence clustering and labeling for unsupervised query intent discovery. In: Proceedings of the fifth ACM international conference on Web search and data mining. ACM, pp 383–392
Su N, He J, Liu Y, Zhang M, Ma S (2018) User intent, behaviour, and perceived satisfaction in product search. In: Proceedings of the eleventh ACM international conference on web search and data mining. ACM, pp 547–555
Ashkan A, Clarke CL (2013) Impact of query intent and search context on clickthrough behavior in sponsored search. Knowl Inf Syst 34(2):425–434
Wang CJ, Chen HH (2014) Intent mining in search query logs for automatic search script generation. Knowl Inf Syst 39(3):513–542
Chapelle O, Ji S, Liao C, Velipasaoglu E, Lai L, Wu SL (2011) Intent-based diversification of web search results: metrics and algorithms. Inf Retriev 14(6):572–592
Basak E, Calisir F (2015) An empirical study on factors affecting continuance intention of using Facebook. Comput Hum Behav 48:181–189
Hong JC, Hwang MY, Hsu CH, Tai KH, Kuo YC (2015) Belief in dangerous virtual communities as a predictor of continuance intention mediated by general and online social anxiety: the Facebook perspective. Comput Hum Behav 48:663–670
Lu J, Yu CS, Liu C, Wei J (2017) Comparison of mobile shopping continuance intention between China and USA from an espoused cultural perspective. Comput Hum Behav 75:130–146
Wu B, Chen X (2017) Continuance intention to use MOOCs: Integrating the technology acceptance model (TAM) and task technology fit (TTF) model. Comput Hum Behav 67:221–232
Abbas HA, Hamdy HI (2015) Determinants of continuance intention factor in Kuwait communication market: case study of Zain-Kuwait. Comput Hum Behav 49:648–657
Yang HL, Lin RX (2017) Determinants of the intention to continue use of SoLoMo services: consumption values and the moderating effects of overloads. Comput Hum Behav 73:583–595
Hong JC, Tai KH, Hwang MY, Kuo YC, Chen JS (2017) Internet cognitive failure relevant to users’ satisfaction with content and interface design to reflect continuance intention to use a government e-learning system. Comput Hum Behav 66:353–362
Arpaci I (2016) Understanding and predicting students’ intention to use mobile cloud storage services. Comput Hum Behav 58:150–157
Li H, Li L, Gan C, Liu Y, Tan CW, Deng Z (2018) Disentangling the factors driving users’ continuance intention towards social media: a configurational perspective. Comput Hum Behav 85:175–182
de Oliveira MJ, Huertas MKZ (2015) Does life satisfaction influence the intention (We-Intention) to use Facebook? Comput Hum Behav 50:205–210
Cao X, Sun J (2018) Exploring the effect of overload on the discontinuous intention of social media users: an SOR perspective. Comput Hum Behav 81:10–18
Boakye KG (2015) Factors influencing mobile data service (MDS) continuance intention: an empirical study. Comput Hum Behav 50:125–131
Swar B, Hameed T, Reychav I (2017) Information overload, psychological ill-being, and behavioral intention to continue online healthcare information search. Comput Hum Behav 70:416–425
Kang JYM, Mun JM, Johnson KK (2015) In-store mobile usage: Downloading and usage intention toward mobile location-based retail apps. Comput Hum Behav 46:210–217
Wu K, Vassileva J, Zhao Y (2017) Understanding users’ intention to switch personal cloud storage services: evidence from the Chinese market. Comput Hum Behav 68:300–314
Hur HJ, Lee HK, Choo HJ (2017) Understanding usage intention in innovative mobile app service: comparison between millennial and mature consumers. Comput Hum Behav 73:353–361
Ling M, Lin D (2010) Empirical research of mobile reading consumers' behavior intention. In: 2010 international conference on multimedia information networking and security. IEEE, pp 301–305
Ha J, Lee JH, Shim KS, Lee S (2010) EUI: an embedded engine for understanding user intents from mobile devices. In: Proceedings of the 19th ACM international conference on Information and knowledge management. ACM, pp 1935–1936
Agag G, El-Masry AA (2016) Understanding consumer intention to participate in online travel community and effects on consumer intention to purchase travel online and WOM: an integration of innovation diffusion theory and TAM with trust. Comput Hum Behav 60:97–111
Zhou Q, Xu Z, Yen NY (2019) User sentiment analysis based on social network information and its application in consumer reconstruction intention. Comput Hum Behav 100:177–183
Chung N, Han H, Joun Y (2015) Tourists’ intention to visit a destination: the role of augmented reality (AR) application for a heritage site. Comput Hum Behav 50:588–599
Zhao Y, Wang S, Zou Y, Ng J, Ng T (2016) Mining user intents to compose services for end-users. In: 2016 IEEE international conference on web services (ICWS). IEEE, pp 348–355
Lee YK, Hsieh PH, Huang CH, Chuang KT (2016) On instant knowledge evolution from learning user search intent. In: 2016 IEEE international conference on web services (ICWS). IEEE, pp 562–569
Fariha A, Sarwar SM, Meliou A (2018) SQuID: semantic similarity-aware query intent discovery. In: Proceedings of the 2018 international conference on management of data. ACM, pp 1745–1748
Trabelsi C, Moulahi B, Yahia SB (2011) Folksonomy query suggestion via users’ search intent prediction. In: International conference on flexible query answering systems. Springer, Berlin, pp 388–399
Jiang D, Yang L (2016) Query intent inference via search engine log. Knowl Inf Syst 49(2):661–685
Yu W, Yu M, Zhao T, Jiang M (2020) Identifying referential intention with heterogeneous contexts. In: Proceedings of the web conference 2020, pp 962–972
Gu S, Yan J, Ji L, Yan S, Huang J, Liu N, Chen Z et al (2011) Cross domain random walk for query intent pattern mining from search engine log. In: 2011 IEEE 11th international conference on data mining. IEEE, pp 221–230
Zan X, Gao F, Han J, Sun Y (2009) A hidden Markov model based framework for tracking and predicting of attack intention. In: 2009 international conference on multimedia information networking and security. IEEE, vol 2, pp 498–501
Liu R, Zhang X, Webb J, Li S (2015) Context-specific intention awareness through web query in robotic caregiving. In: 2015 IEEE international conference on robotics and automation (ICRA). IEEE, pp 1962–1967
Rastogi A (2015) Contributor's performance, participation intentions, influencers, and project performance. In: Proceedings of the 37th international conference on software engineering, vol 2. IEEE Press, pp 919–922
Koyas E, Hocaoglu E, Patoglu V, Cetin M (2013) Detection of intention level in response to task difficulty from EEG signals. In: 2013 IEEE international workshop on machine learning for signal processing (MLSP). IEEE, pp 1–6
Izquierdo R, Quintanar A, Parra I, Fernández-Llorca D, Sotelo MA (2019) Experimental validation of lane-change intention prediction methodologies based on CNN and LSTM. In: 2019 IEEE intelligent transportation systems conference (ITSC). IEEE, pp 3657–3662
Chen SH, Santoso A, Lee YS, Wang JC (2015) Latent dirichlet allocation based blog analysis for criminal intention detection system. In: 2015 international carnahan conference on security technology (ICCST). IEEE, pp 73–76
Guo Y (2012) Research on the evaluation for intention of the insured of new rural pension insurance program. In: 2012 fourth international conference on multimedia information networking and security. IEEE, pp 736–739
Vattikonda BC, Kodipaka S, Zhou H, Dave V, Guha S, Snoeren AC (2015) Interpreting advertiser intent in sponsored search. In: Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 2177–2185
Yu H, Ren F (2012) Role-explicit query identification and intent role annotation. In: Proceedings of the 21st ACM international conference on Information and knowledge management. ACM, pp 1163–1172
Suzuki Y, Wee WM, Nishioka I (2019) TV advertisement scheduling by learning expert intentions. In: Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 3071–3081
Khodabandelou G, Hug C, Deneckère R, Salinesi C (2014) Unsupervised discovery of intentional process models from event logs. In: Proceedings of the 11th working conference on mining software repositories. ACM, pp 282–291
Mishra A, Jain SK (2015) An approach for intention mining of complex comparative opinion why type questions asked on product review sites. In: International conference on intelligent text processing and computational linguistics. Springer, Cham, pp 257–271
Forman G, Nachlieli H, Keshet R (2015) Clustering by intent: a semi-supervised method to discover relevant clusters incrementally. In: Joint European conference on machine learning and knowledge discovery in databases. Springer, Cham, pp 20–36
Ahmad IS, Bakar AA, Yaakub MR (2020) Movie revenue prediction based on purchase intention mining using YouTube trailer reviews. Inf Process Manag 57(5):102278
Sintonen S, Immonen M (2013) Telecare services for aging people: assessment of critical factors influencing the adoption intention. Comput Hum Behav 29(4):1307–1317
Septiani R, Handayani PW, Azzahro F (2017) Factors that affecting behavioral intention in online transportation service: case study of GO-JEK. Procedia Comput Sci 124:504–512
Díaz-Rodriguez OE, Hernández MGP (2020) Quality event log to intention mining: a study case. In: 2020 international conference on computer science, engineering and applications (ICCSEA). IEEE, pp 1–6
Muzaffar SI, Shahzad K, Malik K, Mahmood K (2020) Intention mining: a deep learning-based approach for smart devices. J Ambient Intell Smart Environ 12:1–13 (Preprint)
Mishael Q, Ayesh A (2020) Investigating classification techniques with feature selection for intention mining from Twitter feed. arXiv preprint arXiv:2001.1038
Habib A, Jelani N, Khattak AM, Akbar S, Asghar MZ (2020) Exploiting deep neural networks for intention mining. In: Proceedings of the 2020 9th international conference on software and computer applications, pp 26–30
Farooq MS, Riaz S, Abid A, Abid K, Naeem MA (2019) A survey on the role of IoT in agriculture for the implementation of smart farming. IEEE Access 7:156237–156271
Farooq MS, Salam M, Ur Rehman S, Fayolle A, Jaafar N, Ayupp K (2018) Impact of support from social network on entrepreneurial intention of fresh business graduates. Educ Train 60:335–353
Lytras MD, Visvizi A, Jussila J (2020) Social media mining for smart cities and smart villages research. Soft Comput 24:10983–10987
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Appendix
See Table 6.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Rashid, A., Farooq, M.S., Abid, A. et al. Social media intention mining for sustainable information systems: categories, taxonomy, datasets and challenges. Complex Intell. Syst. 9, 2773–2799 (2023). https://doi.org/10.1007/s40747-021-00342-9
DOI: https://doi.org/10.1007/s40747-021-00342-9