A Multi-task Approach to Open Domain Suggestion Mining Using Language Model for Text Over-Sampling

Leekha, Maitree; Goswami, Mononito; Jain, Minni

doi:10.1007/978-3-030-45442-5_28

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12036)

Included in the following conference series:

European Conference on Information Retrieval

Abstract

Online consumer reviews may contain suggestions useful for improving commercial products and services. Mining suggestions is challenging due to the absence of large, labeled and balanced datasets. Furthermore, most prior studies attempting to mine suggestions have focused on a single domain, such as hotel or travel. In this work, we introduce a novel over-sampling technique to address the problem of class imbalance, and propose a multi-task deep learning approach for mining suggestions from multiple domains. Experimental results on a publicly available dataset show that our over-sampling technique, coupled with the multi-task framework, outperforms state-of-the-art open-domain suggestion mining models in terms of the F-1 measure and AUC.

M. Leekha, M. Goswami and M. Jain contributed equally and would like to be considered joint first authors.

1 Introduction

Consumers often express their opinions towards products and services through online reviews and discussion forums. These reviews may include useful suggestions that can help companies better understand consumer needs and improve their products and services. However, manually mining suggestions amid vast numbers of non-suggestions is cumbersome, akin to finding needles in a haystack. Therefore, designing systems that can automatically mine suggestions is essential. The recent SemEval [6] challenge on Suggestion Mining saw many researchers using different techniques to tackle the domain-specific task (in-domain suggestion mining). However, open-domain suggestion mining, which obviates the need for developing separate suggestion mining systems for different domains, is still an emerging research problem. We formally define the problem of open-domain suggestion mining as follows:

Definition 1

(Open-domain Suggestion Mining). Given a set of reviews \(\mathcal {R} = \{ r_1, r_2, \ldots , r_n\}\) from multiple domains in \(\mathcal {D} = d_1 \cup d_2 \cup \ldots \cup d_m \), train a classifier C using \(\mathcal {D}\) to predict the nature of each review \(r_i\).

Building on the work of [5], we design a framework to detect suggestions from multiple domains. We formulate a multi-task classification problem to identify both the domain and the nature (suggestion or non-suggestion) of reviews. We also propose a novel language model-based text over-sampling approach to address the class imbalance problem.

2 Methodology

2.1 Dataset and Pre-processing

We use the first publicly available and annotated dataset for suggestion mining from multiple domains, created by [5]. It comprises reviews from four domains, namely hotel, electronics, travel and software. During pre-processing, we remove all URLs (e.g. https:// ...) and punctuation marks, convert the reviews to lower case and lemmatize them. We also pad the text with start \({{\mathbf {\mathtt{{{S}}}}}}\) and end \({{\mathbf {\mathtt{{{E}}}}}}\) symbols for over-sampling.
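The pre-processing steps described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `preprocess`, the URL pattern, and the spellings `<S>` and `<E>` for the start and end symbols are assumptions, and the lemmatizer is passed in as a callable (identity by default) because the paper does not name the one it used.

```python
import re
import string

# Assumed spellings of the start and end symbols; the paper only calls them S and E.
START, END = "<S>", "<E>"

# Assumed URL pattern: anything beginning with http://, https:// or www.
URL_RE = re.compile(r"(?:https?://|www\.)\S+")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(review, lemmatize=lambda token: token):
    """Remove URLs and punctuation, lower-case, lemmatize, then pad.

    `lemmatize` is a hypothetical hook mapping one token to its lemma;
    the default leaves tokens unchanged.
    """
    text = URL_RE.sub(" ", review)               # drop URLs first
    text = text.translate(PUNCT_TABLE).lower()   # strip punctuation, lower-case
    tokens = [lemmatize(tok) for tok in text.split()]
    return [START] + tokens + [END]              # pad for over-sampling


print(preprocess("Please add Free WiFi! See https://example.com/info now."))
# ['<S>', 'please', 'add', 'free', 'wifi', 'see', 'now', '<E>']
```

The padding is added after punctuation removal so that the angle brackets in the assumed symbols are not stripped.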

Table 1. Datasets and their sources used in our study [5]. The class ratio column highlights the extent of class imbalance in the datasets. The travel datasets have lower inter-annotator agreement than the rest, indicating that they may contain confusing reviews which are hard to confidently classify as suggestions or non-suggestions. This is also reflected in our classification results.

Table 2. Most frequent 5-grams and their corresponding suggestions sampled using LMOTE. While the suggestions as a whole may not be grammatically correct, their constituent phrases are nevertheless semantically sensible.

2.2 Over-Sampling Using Language Model: LMOTE

One of the major challenges in mining suggestions is the imbalanced distribution of classes, i.e., the number of non-suggestions greatly outweighs the number of suggestions (see Table 1). To address this, studies frequently use the Synthetic Minority Over-sampling Technique (SMOTE) [1] to over-sample the minority class using text embeddings as features. However, SMOTE works in Euclidean space and therefore does not allow an intuitive understanding and representation of the over-sampled data, which is essential for qualitative and error analysis of the classification models. We introduce a novel over-sampling technique, the Language Model-based Over-sampling Technique (LMOTE), designed exclusively for text data, and note performance comparable to (and sometimes slightly better than) SMOTE. We use LMOTE to over-sample suggestions before training our classification model. For each domain, LMOTE uses the following procedure to over-sample suggestions:

Find Top \(\eta \) \(\texttt {n}\)-Grams: From all reviews labelled as suggestions (positive samples), select the \(\eta =100\) most frequently occurring \(\texttt {n}\)-grams (\(\texttt {n}=5\)). For example, the phrase “nice to be able to” occurred frequently in many domains.

Train Language Model on Positive Samples: Train a BiLSTM language model on the positive samples (suggestions). The BiLSTM model predicts the probability distribution of the next word (\(w_t\)) over the whole vocabulary (\(V \cup {{\mathbf {\mathtt{{{E}}}}}}\)) based on the last \(\texttt {n}=5\) words (\(w_{t-5},\ldots , w_{t-1}\)), i.e., the model learns to predict the probability distribution \(P(w_i \ | \ w_{t-5} \ w_{t-4} \ w_{t-3} \ w_{t-2} \ w_{t-1})\), such that \(w_t = \underset{w_i}{{{\,\mathrm{arg\,max}\,}}} \, P(w_i \ | \ w_{t-5} \ w_{t-4} \ w_{t-3} \ w_{t-2} \ w_{t-1})\).

Generate Synthetic Text Using Language Model and Frequent \(\texttt {n}\)-Grams: Using the language model and a randomly chosen frequent 5-gram as the seed, we generate text by repeatedly predicting the most probable next word (\(w_t\)), until the end symbol \({{\mathbf {\mathtt{{{E}}}}}}\) is predicted.
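The three steps above can be illustrated with a small, self-contained sketch. It is not the authors' implementation: a count-based next-word table stands in for the BiLSTM language model so that the example runs without a deep learning library, and the names `top_ngrams`, `train_count_lm` and `generate`, the `<E>` spelling of the end symbol, and the `max_len` cap (added because greedy decoding can loop) are all assumptions.

```python
from collections import Counter, defaultdict

END = "<E>"  # assumed spelling of the end symbol


def top_ngrams(suggestions, eta=100, n=5):
    """Return the eta most frequent n-grams among tokenized suggestions."""
    counts = Counter(
        gram
        for tokens in suggestions
        for gram in zip(*(tokens[i:] for i in range(n)))
        if END not in gram  # a seed should not already contain the end symbol
    )
    return [gram for gram, _ in counts.most_common(eta)]


def train_count_lm(suggestions, n=5):
    """Stand-in for the BiLSTM: count which word follows each n-word context."""
    table = defaultdict(Counter)
    for tokens in suggestions:
        for i in range(len(tokens) - n):
            table[tuple(tokens[i:i + n])][tokens[i + n]] += 1
    return table


def generate(lm, seed, n=5, max_len=50):
    """Greedily append the most probable next word until END is predicted."""
    words = list(seed)
    while len(words) < max_len:
        followers = lm.get(tuple(words[-n:]))
        if not followers:          # unseen context: stop early
            break
        next_word = followers.most_common(1)[0][0]  # arg max over next words
        if next_word == END:
            break
        words.append(next_word)
    return words


corpus = [
    "it would be nice to have wifi <E>".split(),
    "it would be nice to have parking <E>".split(),
    "it would be nice to have wifi <E>".split(),
]
lm = train_count_lm(corpus)
seed = ("it", "would", "be", "nice", "to")
print(" ".join(generate(lm, seed)))  # it would be nice to have wifi
```

Because decoding is greedy, a given seed always yields the same text under this sketch; variety in the synthetic reviews comes from the random choice of seed among the frequent 5-grams.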

Table 2 lists the most frequent 5-grams and their corresponding suggestions ‘sampled’ using LMOTE. In our study, we generate synthetic positive reviews until the numbers of suggestion and non-suggestion samples in the training set are equal.

Algorithm 1 summarizes the LMOTE over-sampling methodology. The sub-procedures used in the algorithm are briefly described below:

NGrams\((\mathcal {D}_{sugg}, \eta , n)\): Returns the top \(\eta \) n-grams from the set of suggestions, \(\mathcal {D}_{sugg}\).
TrainLanguageModel\((\mathcal {D}_{sugg}, n)\): Trains an n-gram BiLSTM language model on \(\mathcal {D}_{sugg}\).
random\((n\_grams)\): Randomly selects an n-gram from the input set.
LMOTEGenerate\((language\_model, seed)\): Takes as input the trained language model and a randomly chosen n-gram from the set of top \(\eta \) n-grams as seed, and generates a review until the end tag \({{\mathbf {\mathtt{{{E}}}}}}\) is produced. The procedure is repeated until we have a total of \(\mathcal {N}\) suggestion reviews.
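The outer loop implied by these sub-procedure descriptions might look like the sketch below. Algorithm 1 itself is not reproduced in this text, so this is a reading of the descriptions rather than the authors' pseudocode. The function name `lmote` and its parameter names are assumptions, and the three sub-procedures are passed in as callables so that any language model (BiLSTM or otherwise) can be plugged in.

```python
import random


def lmote(suggestions, n_target, ngrams_fn, train_lm_fn, generate_fn,
          eta=100, n=5, seed=0):
    """Over-sample `suggestions` until there are `n_target` of them.

    ngrams_fn(suggestions, eta, n)   -> list of frequent n-grams  (NGrams)
    train_lm_fn(suggestions, n)      -> trained language model    (TrainLanguageModel)
    generate_fn(language_model, gram) -> one synthetic review     (LMOTEGenerate)
    All three are hypothetical hooks supplied by the caller.
    """
    rng = random.Random(seed)                 # seeded for reproducibility
    n_grams = ngrams_fn(suggestions, eta, n)
    if not n_grams:
        raise ValueError("no n-grams found; reviews may be shorter than n")
    language_model = train_lm_fn(suggestions, n)

    augmented = list(suggestions)             # keep the original suggestions
    while len(augmented) < n_target:
        seed_gram = rng.choice(n_grams)       # random(n_grams)
        augmented.append(generate_fn(language_model, seed_gram))
    return augmented


# Toy stand-ins, only to show the control flow.
balanced = lmote(
    suggestions=[["add", "wifi"], ["add", "parking"]],
    n_target=5,
    ngrams_fn=lambda data, eta, n: [("nice", "to", "have")],
    train_lm_fn=lambda data, n: None,
    generate_fn=lambda lm, gram: list(gram) + ["wifi"],
)
print(len(balanced))  # 5
```

To balance a training set as described in the paper, `n_target` would be set to the number of non-suggestion samples in that domain.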

2.3 Mining Suggestion Using Multi-task Learning

Multi-task learning (MTL) has been successful in many applications of machine learning, since sharing representations between auxiliary tasks allows models to generalize better on the primary task. Figure 1B illustrates a 3-dimensional UMAP [4] visualization of text embeddings of suggestions, coloured by their domain. These embeddings are outputs of the penultimate layer (the dense layer before the final softmax layer) of the single-task learning (STL) ensemble baseline. The visualization shows that suggestions from different domains can have different feature representations. We therefore hypothesize that we can identify suggestions better by leveraging domain-specific information using MTL. Accordingly, in the MTL setting, given a review \(r_i\) in the dataset \(\mathcal {D}\), we aim to identify both the domain of the review and its nature.

2.4 Classification Model

We use an ensemble of three architectures, namely a CNN [2] to mirror the spatial perspective and preserve the n-gram representations; an Attention Network to learn the most important features automatically; and a BiLSTM-based text RCNN [3] model to capture the context of a text sequence (Fig. 2). In the MTL setting, the ensemble has two output softmax layers, which predict the domain and the nature of a review. The STL baselines, by contrast, have only a single softmax layer to predict the nature of the review. We use ELMo [7] word embeddings trained on the dataset as input to the models.
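The difference between the two settings lies in the output layers: the MTL model attaches two softmax heads to one shared representation. The sketch below illustrates this with plain Python lists in place of a deep learning framework. It is an illustration under assumptions: the shared feature vector stands in for the ensemble's penultimate layer, the function names are hypothetical, and the joint loss (a sum of the two cross-entropies with a `domain_weight` factor) is an assumed formulation, since the text does not state how the two task losses are combined.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, features):
    """Dense layer: one output per row of `weights`."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def multitask_forward(shared, nature_head, domain_head):
    """Apply both softmax heads to the same shared representation."""
    p_nature = softmax(linear(*nature_head, shared))  # suggestion vs. non-suggestion
    p_domain = softmax(linear(*domain_head, shared))  # hotel, electronics, travel, software
    return p_nature, p_domain


def multitask_loss(p_nature, p_domain, y_nature, y_domain, domain_weight=1.0):
    """Assumed joint objective: nature loss plus weighted domain loss."""
    return (-math.log(p_nature[y_nature])
            - domain_weight * math.log(p_domain[y_domain]))


shared = [0.2, -1.0, 0.5]                              # toy shared features
nature_head = ([[0.0] * 3, [0.0] * 3], [0.0, 0.0])     # 2 classes
domain_head = ([[0.0] * 3 for _ in range(4)], [0.0] * 4)  # 4 domains
p_nature, p_domain = multitask_forward(shared, nature_head, domain_head)
print(p_nature, p_domain)  # [0.5, 0.5] [0.25, 0.25, 0.25, 0.25]
```

An STL baseline corresponds to dropping `domain_head` and the second loss term, which leaves only the nature prediction.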

3 Results and Discussion

We conducted experiments to assess the impact of over-sampling, the performance of LMOTE and the multi-task model. We used the same train-test split as provided in the dataset for our experiments. All comparisons are made in terms of the F-1 score of the suggestion class, for a fair comparison with prior work on representational learning for open-domain suggestion mining [5] (see Baseline in Table 3). For a more insightful evaluation, we also compute the area under the Receiver Operating Characteristic (ROC) curve for all models used in this work. Tables 3 and 4 and Figs. 3 and 1A summarize the results of our experiments, which yield several interesting findings:

Over-Sampling Improves Performance. To examine the impact of over-sampling, we compared the performance of our ensemble classifier with and without over-sampling, i.e., we compared results under the STL, STL + SMOTE and STL + LMOTE columns. Our results confirm that, in general, over-sampling suggestions to obtain a balanced dataset improves the performance (F-1 score and AUC) of our classifiers.

LMOTE Performs Comparably to SMOTE. We compared the performance of SMOTE and LMOTE in the single task settings (STL + SMOTE and STL + LMOTE) and found that LMOTE performs comparably to SMOTE (and even outperforms it in the electronics and software domains). LMOTE also has the added advantage of resulting in intelligible samples which can be used to qualitatively analyze and troubleshoot deep learning based systems. For instance, consider suggestions created by LMOTE in Table 2. While the suggestions may not be grammatically correct, their constituent phrases are nevertheless semantically sensible.

Table 3. Performance evaluation using F-1 score. Multi-task learning with LMOTE outperforms other alternatives in open-domain suggestion mining. Furthermore, owing to potentially confusing reviews in the travel domain (Table 1), its F-1 scores are significantly lower than those of the other domains.

Table 4. Performance evaluation using area under ROC with \(95\%\) confidence intervals. Multi-task learning with LMOTE outperforms other alternatives in open-domain suggestion mining. Multi-task learning leads to a significant improvement in AUC over its single task counterpart. (AUCs for baseline models proposed by [5] were unavailable.)

Multi-task Learning Outperforms Single-Task Learning. We compared the performance of our classifier in single- and multi-task settings (STL + LMOTE and MTL + LMOTE) and found that multi-task learning improves the performance of our classifier. We qualitatively analysed the single- and multi-task models, and found many instances where, by leveraging domain-specific information, the multi-task model was able to accurately identify suggestions. For instance, consider the following review: “Bring a Lan cable and charger for your laptop because house-keeping doesn’t provide it.” While the review appears to be an assertion (non-suggestion), by predicting its domain (hotel), the multi-task model was able to accurately classify it as a suggestion.
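For reference, the two metrics used throughout this section can be computed as in the sketch below. It assumes that label 1 denotes the suggestion class and that the classifier outputs a score for that class. The function names are illustrative, the AUC is computed through its rank interpretation (the probability that a random suggestion is scored above a random non-suggestion), and the 95% confidence intervals reported in Table 4 are not reproduced here.

```python
def f1_positive(y_true, y_pred):
    """F-1 score of the positive (suggestion) class, labelled 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts, over all (positive, negative) pairs, how often the positive
    sample receives the higher score; ties count as half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


y_true = [1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0]
scores = [0.9, 0.1, 0.4, 0.6, 0.2]
print(f1_positive(y_true, y_pred))        # 0.5
print(round(roc_auc(y_true, scores), 4))  # 0.8333
```

Reporting the F-1 score of the suggestion class alone, rather than accuracy, matters here because the non-suggestion class dominates the data.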

4 Conclusion

In this work, we proposed a multi-task learning framework for open-domain suggestion mining along with LMOTE, a novel language model-based over-sampling technique for text. Our experiments revealed that multi-task learning combined with LMOTE over-sampling outperformed the considered alternatives in terms of both the F-1 score of the suggestion class and AUC.

References

1. Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: SMOTE: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002)
2. Kim, Y.: Convolutional neural networks for sentence classification. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (2014). https://doi.org/10.3115/v1/d14-1181
3. Lai, S., Xu, L., Liu, K., Zhao, J.: Recurrent convolutional neural networks for text classification. In: Twenty-Ninth AAAI Conference on Artificial Intelligence (2015)
4. McInnes, L., Healy, J., Melville, J.: UMAP: uniform manifold approximation and projection for dimension reduction. arXiv preprint arXiv:1802.03426 (2018)
5. Negi, S.: Suggestion mining from text. Ph.D. thesis, National University of Ireland Galway (NUIG) (2019)
6. Negi, S., Daudert, T., Buitelaar, P.: SemEval-2019 task 9: suggestion mining from online reviews and forums. In: SemEval@NAACL-HLT (2019)
7. Peters, M.E., et al.: Deep contextualized word representations. arXiv preprint arXiv:1802.05365 (2018)

Author information

Authors and Affiliations

Delhi Technological University, New Delhi, India
Maitree Leekha, Mononito Goswami & Minni Jain

Corresponding author

Correspondence to Maitree Leekha.

Editor information

Editors and Affiliations

University of Glasgow, Glasgow, UK
Joemon M. Jose
University College London, London, UK
Emine Yilmaz
Universidade NOVA de Lisboa, Lisbon, Portugal
João Magalhães
Universidad Autónoma de Madrid, Madrid, Spain
Pablo Castells
University of Padua, Padua, Italy
Nicola Ferro
Universidade de Lisboa, Lisbon, Portugal
Mário J. Silva
Universidade NOVA de Lisboa, Lisbon, Portugal
Flávio Martins

Cite this paper

Leekha, M., Goswami, M., Jain, M. (2020). A Multi-task Approach to Open Domain Suggestion Mining Using Language Model for Text Over-Sampling. In: Jose, J., et al. Advances in Information Retrieval. ECIR 2020. Lecture Notes in Computer Science(), vol 12036. Springer, Cham. https://doi.org/10.1007/978-3-030-45442-5_28

DOI: https://doi.org/10.1007/978-3-030-45442-5_28
Published: 08 April 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-45441-8
Online ISBN: 978-3-030-45442-5
eBook Packages: Computer Science, Computer Science (R0)
