Abstract
Consumer reviews online may contain suggestions useful for improving commercial products and services. Mining suggestions is challenging due to the absence of large labeled and balanced datasets. Furthermore, most prior studies attempting to mine suggestions have focused on a single domain, such as Hotel or Travel. In this work, we introduce a novel over-sampling technique to address the problem of class imbalance, and propose a multi-task deep learning approach for mining suggestions from multiple domains. Experimental results on a publicly available dataset show that our over-sampling technique, coupled with the multi-task framework, outperforms state-of-the-art open domain suggestion mining models in terms of the F-1 measure and AUC.
M. Leekha, M. Goswami and M. Jain contributed equally and would like to be considered joint first authors.
1 Introduction
Consumers often express their opinions towards products and services through online reviews and discussion forums. These reviews may include useful suggestions that can help companies better understand consumer needs and improve their products and services. However, manually mining suggestions amid vast numbers of non-suggestions is cumbersome, akin to finding needles in a haystack. Therefore, designing systems that can automatically mine suggestions is essential. The recent SemEval [6] challenge on Suggestion Mining saw many researchers using different techniques to tackle the domain-specific task (in-domain Suggestion Mining). However, open-domain suggestion mining, which obviates the need for developing separate suggestion mining systems for different domains, is still an emerging research problem. We formally define the problem of open-domain suggestion mining as follows:
Definition 1
(Open-domain Suggestion Mining). Given a set of reviews \(\mathcal {R} = \{ r_1, r_2 \ldots r_n\}\) from multiple domains in \(\mathcal {D} = d_1 \cup d_2 \cup \ldots \cup d_m \), train a classifier C using \(\mathcal {D}\) to predict the nature (suggestion or non-suggestion) of each review \(r_i\).
Building on the work of [5], we design a framework to detect suggestions from multiple domains. We formulate a multitask classification problem to identify both the domain and nature (suggestion or non-suggestion) of reviews. Furthermore, we also propose a novel language model-based text over-sampling approach to address the class imbalance problem.
2 Methodology
2.1 Dataset and Pre-processing
We use the first publicly available, annotated dataset for suggestion mining from multiple domains, created by [5]. It comprises reviews from four domains, namely hotel, electronics, travel and software. During pre-processing, we remove all URLs (e.g. https:// ...) and punctuation marks, convert the reviews to lower case and lemmatize them. We also pad the text with start \({{\mathbf {\mathtt{{{S}}}}}}\) and end \({{\mathbf {\mathtt{{{E}}}}}}\) symbols for over-sampling.
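The pre-processing steps above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the symbol spellings `<S>` and `<E>`, the URL pattern, and the `lemmatize` hook are assumptions, and the lemmatizer defaults to the identity function because the paper does not name the tool used.

```python
import re
import string

# Assumed URL pattern; the paper only says that URLs are removed.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Hypothetical spellings of the start (S) and end (E) symbols.
START, END = "<S>", "<E>"


def preprocess(review, lemmatize=lambda tok: tok):
    """Clean one review and pad it with start/end symbols.

    `lemmatize` is a placeholder hook (identity by default); plug in a
    real lemmatizer here.
    """
    text = URL_RE.sub(" ", review)  # strip URLs before touching punctuation
    # Removes ASCII punctuation only; typographic quotes would survive.
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = [lemmatize(tok) for tok in text.lower().split()]
    return [START] + tokens + [END]


print(preprocess("Please add a gym! See https://example.com/info for ideas."))
```

URLs are removed before punctuation so that a link is dropped as a whole rather than being reduced to a run of letters.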
2.2 Over-Sampling Using Language Model: LMOTE
One of the major challenges in mining suggestions is the imbalanced distribution of classes, i.e. the number of non-suggestions greatly outweighs the number of suggestions (see Table 1). To this end, studies frequently utilize the Synthetic Minority Over-sampling Technique (SMOTE) [1] to over-sample the minority class samples using the text embeddings as features. However, SMOTE works in the Euclidean space and therefore does not allow an intuitive understanding and representation of the over-sampled data, which is essential for qualitative and error analysis of the classification models. We introduce a novel over-sampling technique, Language Model-based Over-sampling Technique (LMOTE), exclusively for text data, and note performance comparable to (and sometimes slightly better than) SMOTE. We use LMOTE to over-sample the number of suggestions before training our classification model. For each domain, LMOTE uses the following procedure to over-sample suggestions:
Find Top \(\eta \) \(\texttt {n}\)-Grams: From all reviews labelled as suggestions (positive samples), select the \(\eta =100\) most frequently occurring \(\texttt {n}\)-grams (\(\texttt {n}=5\)). For example, the phrase “nice to be able to” occurred frequently in many domains.
Train Language Model on Positive Samples: Train a BiLSTM language model on the positive samples (suggestions). The BiLSTM model predicts the probability distribution of the next word (\(w_t\)) over the whole vocabulary (\(V \cup \{{{\mathbf {\mathtt{{{E}}}}}}\}\)) based on the last \(\texttt {n}=5\) words (\(w_{t-5},\ldots , w_{t-1}\)), i.e., the model learns the distribution \(P(w_i \ | \ w_{t-5} \ w_{t-4} \ w_{t-3} \ w_{t-2} \ w_{t-1})\), and the predicted word is \(w_t = \underset{w_i}{{{\,\mathrm{arg\,max}\,}}} \, P(w_i \ | \ w_{t-5} \ w_{t-4} \ w_{t-3} \ w_{t-2} \ w_{t-1})\).
Generate Synthetic Text Using Language Model and Frequent \(\texttt {n}\)-Grams: Using the language model and a randomly chosen frequent 5-gram as the seed, we generate text by repeatedly predicting the most probable next word (\(w_t\)), until the end symbol \({{\mathbf {\mathtt{{{E}}}}}}\) is predicted.
Table 2 lists the most frequent 5-grams and the corresponding suggestions ‘sampled’ using LMOTE. In our study, we generate synthetic positive reviews until the numbers of suggestion and non-suggestion samples in the training set are equal.
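As an illustration of the first step and of the balancing target, the following sketch counts 5-grams over tokenized suggestions and computes how many synthetic suggestions are needed. The function names are hypothetical, and counting n-grams over unpadded tokens is an assumption; the paper does not say whether the start and end symbols take part in the counts.

```python
from collections import Counter


def top_ngrams(suggestions, eta=100, n=5):
    """Return the `eta` most frequent n-grams over tokenized suggestions."""
    counts = Counter()
    for tokens in suggestions:
        # Slide a window of width n over the review.
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return [gram for gram, _ in counts.most_common(eta)]


def num_synthetic_needed(n_suggestions, n_non_suggestions):
    """Synthetic suggestions required to equalise the two classes."""
    return max(0, n_non_suggestions - n_suggestions)


# Toy data for illustration only; these are not reviews from the dataset.
toy = [
    "it would be nice to be able to sort photos".split(),
    "nice to be able to export data".split(),
]
print(top_ngrams(toy, eta=1))
print(num_synthetic_needed(n_suggestions=2, n_non_suggestions=7))
```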
Algorithm 1 summarizes the LMOTE over-sampling methodology. The following is a brief description of the sub-procedures used in the algorithm:
- NGrams\((\mathcal {D}_{sugg}, \eta , n)\): Returns the top \(\eta \) n-grams from the set of suggestions, \(\mathcal {D}_{sugg}\).
- TrainLanguageModel\((\mathcal {D}_{sugg}, n)\): Trains an n-gram BiLSTM language model on \(\mathcal {D}_{sugg}\).
- random\((n\_grams)\): Randomly selects an n-gram from the input set.
- LMOTEGenerate\((language\_model, seed)\): Takes as input the trained language model and a randomly chosen n-gram from the set of top \(\eta \) n-grams as seed, and generates a review until the end tag \({{\mathbf {\mathtt{{{E}}}}}}\) is produced. The procedure is repeated until we have a total of \(\mathcal {N}\) suggestion reviews.
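The control flow of Algorithm 1 can be sketched as below. To keep the example self-contained, a simple count-based lookup table stands in for the BiLSTM language model; this substitution, the `<E>` spelling, the `max_len` safety cap and all function names are assumptions made for illustration, not details from the paper.

```python
import random
from collections import Counter, defaultdict

END = "<E>"  # hypothetical spelling of the end symbol E


def train_count_lm(suggestions, n=5):
    """Stand-in for TrainLanguageModel: map each n-word context to its most
    frequent next word. The paper trains a BiLSTM; this is NOT that model."""
    table = defaultdict(Counter)
    for tokens in suggestions:
        padded = list(tokens) + [END]
        for i in range(len(padded) - n):
            table[tuple(padded[i:i + n])][padded[i + n]] += 1
    return {ctx: nxt.most_common(1)[0][0] for ctx, nxt in table.items()}


def lmote_generate(lm, seed, n=5, max_len=50):
    """Greedy decoding from a seed n-gram until END is predicted.
    `max_len` is a cap added here because greedy decoding can loop."""
    words = list(seed)
    while len(words) < max_len:
        nxt = lm.get(tuple(words[-n:]), END)  # unseen context: stop
        if nxt == END:
            break
        words.append(nxt)
    return words


def lmote(suggestions, seeds, target_total, n=5, rng=None):
    """Append synthetic suggestions until `target_total` suggestions exist."""
    rng = rng or random.Random(0)
    lm = train_count_lm(suggestions, n)
    out = [list(s) for s in suggestions]
    while len(out) < target_total:
        out.append(lmote_generate(lm, rng.choice(seeds), n))
    return out


# Toy data for illustration only.
real = [
    "it would be nice to be able to sort photos".split(),
    "nice to be able to export data".split(),
]
seeds = [("nice", "to", "be", "able", "to")]
balanced = lmote(real, seeds, target_total=5)
print(len(balanced), " ".join(balanced[-1]))
```

With such a small corpus the generated text simply replays a training continuation; a neural language model trained on many suggestions would be expected to produce more varied samples.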
2.3 Mining Suggestion Using Multi-task Learning
Multi-task learning (MTL) has been successful in many applications of machine learning, since sharing representations between auxiliary tasks allows models to generalize better on the primary task. Figure 1B shows a 3-dimensional UMAP [4] visualization of text embeddings of suggestions, coloured by their domain. These embeddings are outputs of the penultimate layer (dense layer before the final softmax layer) of the single-task (STL) ensemble baseline. It can be clearly seen that suggestions from different domains may have varying feature representations. We therefore hypothesize that we can identify suggestions better by leveraging domain-specific information using MTL. In the MTL setting, given a review \(r_i\) in the dataset \(\mathcal {D}\), we aim to identify both the domain of the review and its nature.
2.4 Classification Model
We use an ensemble of three architectures, namely a CNN [2] to mirror the spatial perspective and preserve the n-gram representations; an Attention Network to learn the most important features automatically; and a BiLSTM-based text RCNN [3] model to capture the context of a text sequence (Fig. 2). In the MTL setting, the ensemble has two output softmax layers, to predict the domain and nature of a review. The STL baselines, in contrast, have only a single softmax layer to predict the nature of the review. We use ELMo [7] word embeddings trained on the dataset as input to the models.
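To make the two-output design concrete, the sketch below applies two softmax heads to one shared feature vector and sums their cross-entropy losses. It is a plain-Python illustration only: the shared vector stands in for the ensemble's penultimate-layer output, and the additive loss with a `domain_weight` factor is an assumption, since the paper does not state how the two task losses are combined.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, features):
    """One dense layer; `weights` holds one row per output class."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def mtl_loss(shared, nature_head, domain_head, nature_y, domain_y,
             domain_weight=1.0):
    """Joint loss over a shared representation.

    `nature_head` and `domain_head` are (weights, bias) pairs.
    `domain_weight` is an assumed hyper-parameter, not from the paper.
    """
    p_nature = softmax(linear(*nature_head, shared))
    p_domain = softmax(linear(*domain_head, shared))
    loss = (-math.log(p_nature[nature_y])
            - domain_weight * math.log(p_domain[domain_y]))
    return loss, p_nature, p_domain


# Toy shared features; 2 nature classes and 4 domains as in the dataset.
shared = [0.5, -1.0, 2.0]
nature_head = ([[0.1, 0.0, 0.3], [-0.2, 0.4, 0.0]], [0.0, 0.0])
domain_head = ([[0.2, 0.1, 0.0], [0.0, 0.0, 0.1],
                [0.3, -0.1, 0.2], [0.0, 0.2, -0.3]], [0.0] * 4)
loss, p_nature, p_domain = mtl_loss(shared, nature_head, domain_head,
                                    nature_y=1, domain_y=0)
print(round(loss, 4), len(p_nature), len(p_domain))
```

A single-task baseline corresponds to dropping the domain head, which is the same as setting `domain_weight` to zero in this sketch.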
3 Results and Discussion
We conducted experiments to assess the impact of over-sampling, the performance of LMOTE and the multi-task model. We used the same train-test split as provided in the dataset for our experiments. All comparisons have been made in terms of the F-1 score of the suggestion class, for a fair comparison with prior work on representational learning for open domain suggestion mining [5] (see Baseline in Table 3). For a more insightful evaluation, we also compute the Area under the Receiver Operating Characteristic (ROC) curve for all models used in this work. Tables 3 and 4 and Figs. 3 and 1A summarize the results of our experiments, which yield several interesting findings:
Over-Sampling Improves Performance. To examine the impact of over-sampling, we compared the performance of our ensemble classifier with and without over-sampling, i.e. we compared results under the STL, STL + SMOTE and STL + LMOTE columns. Our results confirm that, in general, over-sampling suggestions to obtain a balanced dataset improves the performance (F-1 score and AUC) of our classifiers.
LMOTE Performs Comparably to SMOTE. We compared the performance of SMOTE and LMOTE in the single task settings (STL + SMOTE and STL + LMOTE) and found that LMOTE performs comparably to SMOTE (and even outperforms it in the electronics and software domains). LMOTE also has the added advantage of resulting in intelligible samples which can be used to qualitatively analyze and troubleshoot deep learning based systems. For instance, consider suggestions created by LMOTE in Table 2. While the suggestions may not be grammatically correct, their constituent phrases are nevertheless semantically sensible.
Multi-task Learning Outperforms Single-Task Learning. We compared the performance of our classifier in single and multi-task settings (STL + LMOTE and MTL + LMOTE) and found that multi-task learning improves the performance of our classifier. We qualitatively analysed the single and multi-task models, and found many instances where, by leveraging domain-specific information, the multi-task model was able to accurately identify suggestions. For instance, consider the following review: “Bring a Lan cable and charger for your laptop because house-keeping doesn’t provide it.” While the review appears to be an assertion (non-suggestion), by predicting its domain (hotel), the multi-task model was able to accurately classify it as a suggestion.
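For reference, the two measures used in these comparisons, the F-1 score of the suggestion class and the area under the ROC curve, can be computed as in the following sketch. The function names and toy labels are illustrative assumptions; the AUC uses the standard pairwise-ranking formulation, which may differ in implementation from whatever tooling the authors used.

```python
def f1_positive(y_true, y_pred, positive=1):
    """F-1 score of the positive (suggestion) class only."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores, positive=1):
    """AUC as the probability that a random positive outranks a random
    negative, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and scores for illustration only (1 = suggestion).
y_true = [1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.1]
print(f1_positive(y_true, y_pred), roc_auc(y_true, scores))
```

Reporting F-1 on the suggestion class alone keeps the majority non-suggestion class from masking poor minority-class performance.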
4 Conclusion
In this work, we proposed a multi-task learning framework for open-domain suggestion mining, along with LMOTE, a novel language model-based over-sampling technique for text. Our experiments revealed that multi-task learning combined with LMOTE over-sampling outperformed the considered alternatives in terms of both the F-1 score of the suggestion class and AUC.
References
1. Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: SMOTE: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002)
2. Kim, Y.: Convolutional neural networks for sentence classification. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (2014). https://doi.org/10.3115/v1/d14-1181
3. Lai, S., Xu, L., Liu, K., Zhao, J.: Recurrent convolutional neural networks for text classification. In: Twenty-Ninth AAAI Conference on Artificial Intelligence (2015)
4. McInnes, L., Healy, J., Melville, J.: UMAP: uniform manifold approximation and projection for dimension reduction. arXiv preprint arXiv:1802.03426 (2018)
5. Negi, S.: Suggestion mining from text. Ph.D. thesis, National University of Ireland Galway (NUIG) (2019)
6. Negi, S., Daudert, T., Buitelaar, P.: SemEval-2019 task 9: suggestion mining from online reviews and forums. In: SemEval@NAACL-HLT (2019)
7. Peters, M.E., et al.: Deep contextualized word representations. arXiv preprint arXiv:1802.05365 (2018)
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Leekha, M., Goswami, M., Jain, M. (2020). A Multi-task Approach to Open Domain Suggestion Mining Using Language Model for Text Over-Sampling. In: Jose, J., et al. Advances in Information Retrieval. ECIR 2020. Lecture Notes in Computer Science(), vol 12036. Springer, Cham. https://doi.org/10.1007/978-3-030-45442-5_28
DOI: https://doi.org/10.1007/978-3-030-45442-5_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-45441-8
Online ISBN: 978-3-030-45442-5