Identifying intentions in forum posts with cross-domain data

Phuong, Tu Minh; Linh, Le Cong; Bach, Ngo Xuan

doi:10.1007/s10732-019-09410-3

Published: 04 April 2019

Volume 28, pages 171–192 (2022)

Journal of Heuristics

Tu Minh Phuong¹,
Le Cong Linh² &
Ngo Xuan Bach¹

Abstract

In this paper, we present a method to identify forum posts expressing user intentions in online discussion forums. The results of this task, for example buying intentions, can be exploited for targeted advertising or other marketing tasks. Our method utilizes labeled data from other domains to help the learning task in the target domain by using a Naive Bayes (NB) framework to combine the data statistics. Because the distributions of data vary from domain to domain, it is important to adjust the contributions of different data sources when constructing the learning model in order to achieve accurate results. Here, we propose to adjust the parameters of the NB classifier by optimizing an objective, which is equivalent to maximizing the between-class separation, using stochastic gradient descent. Experimental results show that our method outperforms several competitive baselines on a benchmark dataset consisting of forum posts from four domains: Cellphone, Electronics, Camera, and TV. In addition, we explore the possibility of combining NB posteriors computed during the optimization process with another classifier, namely Support Vector Machines (SVMs). Experimental results show the usefulness of optimized NB class posteriors when used as features for SVMs in cross-domain settings.
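The idea of combining data statistics from several domains inside an NB model can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the objective or the update rule, so the per-domain weights below are fixed by hand rather than learned by stochastic gradient descent, and the function names, the toy vocabulary, and all counts are hypothetical.

```python
import numpy as np

def weighted_nb_fit(count_list, weights, alpha=1.0):
    """Combine per-domain word counts into NB class-conditional log-probabilities.

    count_list: one array per domain, each of shape (n_classes, vocab_size).
    weights:    one non-negative scalar per domain (hand-set here; an
                assumption, the paper tunes the NB parameters by optimization).
    alpha:      Laplace smoothing constant.
    """
    combined = sum(w * c for w, c in zip(weights, count_list))
    smoothed = combined + alpha
    return np.log(smoothed / smoothed.sum(axis=1, keepdims=True))

def nb_posterior(log_cond, log_prior, x):
    """Class posteriors for a bag-of-words count vector x."""
    scores = log_prior + log_cond @ x
    scores = scores - scores.max()  # stabilise the exponentials
    p = np.exp(scores)
    return p / p.sum()

# Hypothetical toy data. Vocabulary: ["want", "buy", "broken", "review"].
# Row 0 = non-intention posts, row 1 = intention posts.
target_counts = np.array([[1.0, 0.0, 3.0, 4.0],
                          [2.0, 3.0, 0.0, 1.0]])    # small target domain
source_counts = np.array([[5.0, 2.0, 20.0, 30.0],
                          [25.0, 30.0, 2.0, 3.0]])  # larger source domain

# The source domain is down-weighted because its distribution may differ.
log_cond = weighted_nb_fit([target_counts, source_counts], weights=[1.0, 0.5])
log_prior = np.log(np.array([0.5, 0.5]))

# A post containing "want" and "buy" once each.
post = nb_posterior(log_cond, log_prior, np.array([1.0, 1.0, 0.0, 0.0]))
print(post)
```

In this sketch the weight on the source counts plays the role of the "contribution" of that data source; posteriors such as `post` are the kind of quantity the abstract describes feeding to an SVM as features.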

Notes

In experiments, we set \(\gamma = 0.01\).
Cellphone: http://www.howardforums.com/forums.php.
Electronics: http://www.avsforum.com/avs-vb/.
Camera: http://forum.digitalcamerareview.com/.
TV: http://www.avforums.com/forums/tvs/.
As shown in Chen et al. (2013), Naive Bayes is a suitable method for the task of intention detection in discussion forums.
We used LIBSVM (Chang and Lin 2011) with a linear kernel. Software available at: https://www.csie.ntu.edu.tw/~cjlin/libsvm/.

References

Bach, N.X., Phuong, T.M.: Leveraging user ratings for resource-poor sentiment classification. In: Proceedings of the 19th International Conference on Knowledge-Based and Intelligent Information and Engineering Systems (KES), pp. 322–331 (2015)
Bach, N.X., Hai, N.D., Phuong, T.M.: Personalized recommendation of stories for commenting in forum-based social media. Inf. Sci. 352–353, 48–60 (2016a)
Bach, N.X., Hai, V.T., Phuong, T.M.: Cross-domain sentiment classification with word embeddings and canonical correlation analysis. In: Proceedings of the 7th International Symposium on Information and Communication Technology (SoICT), pp. 159–166 (2016b)
Bach, N.X., Linh, L.C., Phuong, T.M.: Cross-domain intention detection in discussion forums. In: Proceedings of the Eighth International Symposium on Information and Communication Technology (SoICT), pp. 173–180 (2017)
Blitzer, J., Dredze, M., Pereira, F.: Biographies, bollywood, boom-boxes and blenders: domain adaptation for sentiment classification. In: Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), pp. 440–447 (2007)
Blum, A., Mitchell, T.: Combining labeled and unlabeled data with co-training. In: Proceedings of the Eleventh Annual Conference on Computational Learning Theory (COLT) (1998)
Chang, C., Lin, C.: Libsvm: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2(3), 1–27 (2011)
Chen, Z., Liu, B.: Topic modeling using topics from many domains, lifelong learning and big data. In: Proceedings of the 31st International Conference on Machine Learning (ICML) (2014)
Chen, Z., Liu, B.: Lifelong Machine Learning. Morgan and Claypool, San Rafael (2017)
Chen, Z., Liu, B., Hsu, M., Castellanos, M., Ghosh, R.: Identifying intention posts in discussion forums. In: Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), pp. 1041–1050 (2013)
Chen, Z., Ma, N., Liu, B.: Lifelong learning for sentiment classification. In: Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), pp. 750–756 (2015)
Ding, X., Liu, T., Duan, J., Nie, J.Y.: Mining user consumption intention from social media using domain adaptive convolutional neural network. In: Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, pp. 2389–2395 (2015)
Easley, D., Kleinberg, J.: Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press, Cambridge (2010)
Ghani, R.: Using error-correcting codes for text classification. In: Proceedings of the Seventeenth International Conference on Machine Learning (ICML), pp. 303–310 (2000)
Gimpel, K., Schneider, N., O’Connor, B., Das, D., Mills, D., Eisenstein, J., Heilman, M., Yogatama, D., Flanigan, J., Smith, N.: Part-of-speech tagging for twitter: annotation, features, and experiments. In: Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), pp. 42–47 (2011)
Hamrouna, M., Gouider, M.S., Said, L.B.: Large scale microblogging intentions analysis with pattern based approach. In: Proceedings of International Conference on Knowledge Based and Intelligent Information and Engineering Systems (KES), pp. 1249–1257 (2016)
Hollerit, B., Kroll, M., Strohmaier, M.: Towards linking buyers and sellers: detecting commercial intent on twitter. In: Proceedings of the World Wide Web Conference (WWW), pp. 629–632 (2013)
Jiang, J.: A literature survey on domain adaptation of statistical classifiers. Technical report, University of Illinois Urbana-Champaign (2008)
Li, Q., Wang, J., Chen, Y., Lin, Z.: User comments for news recommendation in forum-based social media. Inf. Sci. 180(24), 4929–4939 (2010)
Li, L., Wang, D., Li, T., Knox, D., Padmanabhan, B.: Scene: a scalable two-stage personalized news recommendation system. In: Proceedings of the Thirty-Fourth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 125–134 (2011)
Li, C.X., Du, Y.J., Liu, J., Zheng, H., Wang, S.D.: A novel approach of identifying user intents in microblog. In: Proceedings of International Conference on Intelligent Computing (ICIC), pp. 391–400 (2016)
Liu, B.: Sentiment Analysis and Opinion Mining: Synthesis Lectures on Human Languages Technologies. Morgan and Claypool, San Rafael (2012)
Luong, T.L., Tran, T.H., Truong, Q.T., Truong, T.M.N., Phi, T.T., Phan, X.H.: Learning to filter user explicit intents in online Vietnamese social media texts. In: Proceedings of the Asian Conference on Intelligent Information and Database Systems (ACIIDS), pp. 13–24 (2016)
Nakov, P., Ritter, A., Rosenthal, S., Sebastiani, F., Stoyanov, V.: Semeval-2016 task 4: sentiment analysis in twitter. In: Proceedings of SemEval-2016, pp. 1–18 (2016)
Pan, S.J., Yang, Q.: A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 22(10), 1345–1359 (2010)
Pan, S.J., Tsang, I.W., Kwok, J.T., Yang, Q.: Domain adaptation via transfer component analysis. IEEE Trans. Neural Netw. 22(2), 199–210 (2011)
Ritter, A., Clark, S., Mausam, Etzioni, O.: Named entity recognition in tweets: an experimental study. In: Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pp. 1524–1534 (2011)
Wang, S., Manning, C.: Baselines and bigrams: simple, good sentiment and topic classification. In: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (ACL): Short Papers, vol. 2, pp. 90–94 (2012)
Wang, J., Cong, G., Zhao, W.X., Li, X.: Mining user intents in twitter: a semi-supervised approach to inferring intent categories for tweets. In: Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, pp. 339–345 (2015)
Zhu, X.: Semi-supervised learning literature survey. Technical report, University of Wisconsin-Madison (2008)

Author information

Authors and Affiliations

Department of Computer Science and Machine Learning and Applications Lab, Posts and Telecommunications Institute of Technology, Hanoi, Vietnam
Tu Minh Phuong & Ngo Xuan Bach
FPT Software Research Lab, Hanoi, Vietnam
Le Cong Linh

Corresponding author

Correspondence to Tu Minh Phuong.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Phuong, T.M., Linh, L.C. & Bach, N.X. Identifying intentions in forum posts with cross-domain data. J Heuristics 28, 171–192 (2022). https://doi.org/10.1007/s10732-019-09410-3

Received: 12 March 2018
Revised: 01 February 2019
Accepted: 01 April 2019
Published: 04 April 2019
Issue Date: April 2022
DOI: https://doi.org/10.1007/s10732-019-09410-3
