Identifying intentions in forum posts with cross-domain data

Journal of Heuristics

Abstract

In this paper, we present a method for identifying posts that express user intentions in online discussion forums. The results of this task, for example detected buying intentions, can be exploited for targeted advertising and other marketing tasks. Our method uses labeled data from other domains to help the learning task in the target domain, combining the data statistics within a Naive Bayes (NB) framework. Because data distributions vary from domain to domain, the contributions of different data sources must be adjusted when constructing the learning model to achieve accurate results. We propose to adjust the parameters of the NB classifier by optimizing an objective that is equivalent to maximizing the between-class separation, using stochastic gradient descent. Experimental results show that our method outperforms several competitive baselines on a benchmark dataset consisting of forum posts from four domains: Cellphone, Electronics, Camera, and TV. In addition, we explore combining the NB posteriors computed during the optimization process with another classifier, namely Support Vector Machines (SVMs). Experimental results show that the optimized NB class posteriors are useful when used as features for SVMs in cross-domain settings.
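
The idea of combining count statistics from several domains inside a Naive Bayes classifier can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the toy posts, and the per-domain weights (1.0 for the target domain, 0.5 for the source domain) are assumptions made for illustration. The weights are fixed by hand here, whereas the abstract describes tuning the NB parameters with stochastic gradient descent on an objective that is not reproduced in this sketch.

```python
import math
from collections import Counter


def count_words(posts):
    """Collect per-class word counts and per-class post counts.

    `posts` is a list of (tokens, label) pairs, label 1 = intention post.
    """
    word_counts = {0: Counter(), 1: Counter()}
    doc_counts = Counter()
    for tokens, label in posts:
        word_counts[label].update(tokens)
        doc_counts[label] += 1
    return word_counts, doc_counts


def combine(domain_stats, weights):
    """Weighted sum of the count statistics of several domains.

    The weights are hypothetical, hand-picked values (an assumption).
    """
    word_counts = {0: Counter(), 1: Counter()}
    doc_counts = Counter()
    for (wc, dc), w in zip(domain_stats, weights):
        for c in (0, 1):
            for word, n in wc[c].items():
                word_counts[c][word] += w * n
            doc_counts[c] += w * dc[c]
    return word_counts, doc_counts


def posterior_intent(tokens, word_counts, doc_counts, vocab_size, alpha=1.0):
    """P(label = 1 | tokens) under multinomial NB with additive smoothing."""
    total_docs = doc_counts[0] + doc_counts[1]
    log_scores = {}
    for c in (0, 1):
        total_words = sum(word_counts[c].values())
        score = math.log((doc_counts[c] + alpha) / (total_docs + 2 * alpha))
        for t in tokens:
            score += math.log(
                (word_counts[c][t] + alpha) / (total_words + alpha * vocab_size)
            )
        log_scores[c] = score
    # Normalize in log space to avoid underflow on long posts.
    m = max(log_scores.values())
    exp = {c: math.exp(s - m) for c, s in log_scores.items()}
    return exp[1] / (exp[0] + exp[1])


# Toy labeled posts (invented for illustration, not from the benchmark).
target = [(["plan", "buy", "phone"], 1), (["phone", "review"], 0)]
source = [(["want", "buy", "tv"], 1), (["tv", "broke"], 0)]

vocab = {t for tokens, _ in target + source for t in tokens}
stats = [count_words(target), count_words(source)]
wc, dc = combine(stats, weights=[1.0, 0.5])  # target counted fully, source at half weight

p = posterior_intent(["buy", "camera"], wc, dc, len(vocab))
print(f"P(intention | 'buy camera') = {p:.3f}")
```

Down-weighting the source-domain counts lets evidence from other domains inform the estimates without overwhelming the target-domain statistics. A posterior such as `p` is also the kind of quantity that could be passed to a second classifier as a feature.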

Notes

  1. In experiments, we set \(\gamma = 0.01\).

  2. Cellphone: http://www.howardforums.com/forums.php.

  3. Electronics: http://www.avsforum.com/avs-vb/.

  4. Camera: http://forum.digitalcamerareview.com/.

  5. TV: http://www.avforums.com/forums/tvs/.

  6. As shown in Chen et al. (2013), Naive Bayes is a suitable method for intention detection in discussion forums.

  7. We used LIBSVM (Chang and Lin 2011) with a linear kernel. The software is available at: https://www.csie.ntu.edu.tw/~cjlin/libsvm/.

References

  • Bach, N.X., Phuong, T.M.: Leveraging user ratings for resource-poor sentiment classification. In: Proceedings of the 19th International Conference on Knowledge-Based and Intelligent Information and Engineering Systems (KES), pp. 322–331 (2015)

  • Bach, N.X., Hai, N.D., Phuong, T.M.: Personalized recommendation of stories for commenting in forum-based social media. Inf. Sci. 352–353, 48–60 (2016a)

  • Bach, N.X., Hai, V.T., Phuong, T.M.: Cross-domain sentiment classification with word embeddings and canonical correlation analysis. In: Proceedings of the 7th International Symposium on Information and Communication Technology (SoICT), pp. 159–166 (2016b)

  • Bach, N.X., Linh, L.C., Phuong, T.M.: Cross-domain intention detection in discussion forums. In: Proceedings of the Eighth International Symposium on Information and Communication Technology (SoICT), pp. 173–180 (2017)

  • Blitzer, J., Dredze, M., Pereira, F.: Biographies, bollywood, boom-boxes and blenders: domain adaptation for sentiment classification. In: Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), pp. 440–447 (2007)

  • Blum, A., Mitchell, T.: Combining labeled and unlabeled data with co-training. In: Proceedings of the Eleventh Annual Conference on Computational Learning Theory (COLT) (1998)

  • Chang, C., Lin, C.: LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2(3), 1–27 (2011)

  • Chen, Z., Liu, B.: Topic modeling using topics from many domains, lifelong learning and big data. In: Proceedings of the 31st International Conference on Machine Learning (ICML) (2014)

  • Chen, Z., Liu, B.: Lifelong Machine Learning. Morgan and Claypool, San Rafael (2017)

  • Chen, Z., Liu, B., Hsu, M., Castellanos, M., Ghosh, R.: Identifying intention posts in discussion forums. In: Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), pp. 1041–1050 (2013)

  • Chen, Z., Ma, N., Liu, B.: Lifelong learning for sentiment classification. In: Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), pp. 750–756 (2015)

  • Ding, X., Liu, T., Duan, J., Nie, J.Y.: Mining user consumption intention from social media using domain adaptive convolutional neural network. In: Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, pp. 2389–2395 (2015)

  • Easley, D., Kleinberg, J.: Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press, Cambridge (2010)

  • Ghani, R.: Using error-correcting codes for text classification. In: Proceedings of the Seventeenth International Conference on Machine Learning (ICML), pp. 303–310 (2000)

  • Gimpel, K., Schneider, N., O’Connor, B., Das, D., Mills, D., Eisenstein, J., Heilman, M., Yogatama, D., Flanigan, J., Smith, N.: Part-of-speech tagging for twitter: annotation, features, and experiments. In: Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), pp. 42–47 (2011)

  • Hamrouna, M., Gouider, M.S., Said, L.B.: Large scale microblogging intentions analysis with pattern based approach. In: Proceedings of International Conference on Knowledge Based and Intelligent Information and Engineering Systems (KES), pp. 1249–1257 (2016)

  • Hollerit, B., Kroll, M., Strohmaier, M.: Towards linking buyers and sellers: detecting commercial intent on twitter. In: Proceedings of the World Wide Web Conference (WWW), pp. 629–632 (2013)

  • Jiang, J.: A literature survey on domain adaptation of statistical classifiers. Technical report, University of Illinois Urbana-Champaign (2008)

  • Li, Q., Wang, J., Chen, Y., Lin, Z.: User comments for news recommendation in forum-based social media. Inf. Sci. 180(24), 4929–4939 (2010)

  • Li, L., Wang, D., Li, T., Knox, D., Padmanabhan, B.: Scene: a scalable two-stage personalized news recommendation system. In: Proceedings of the Thirty-Fourth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 125–134 (2011)

  • Li, C.X., Du, Y.J., Liu, J., Zheng, H., Wang, S.D.: A novel approach of identifying user intents in microblog. In: Proceedings of International Conference on Intelligent Computing (ICIC), pp. 391–400 (2016)

  • Liu, B.: Sentiment Analysis and Opinion Mining: Synthesis Lectures on Human Languages Technologies. Morgan and Claypool, San Rafael (2012)

  • Luong, T.L., Tran, T.H., Truong, Q.T., Truong, T.M.N., Phi, T.T., Phan, X.H.: Learning to filter user explicit intents in online Vietnamese social media texts. In: Proceedings of the Asian Conference on Intelligent Information and Database Systems (ACIIDS), pp. 13–24 (2016)

  • Nakov, P., Ritter, A., Rosenthal, S., Sebastiani, F., Stoyanov, V.: Semeval-2016 task 4: sentiment analysis in twitter. In: Proceedings of SemEval-2016, pp. 1–18 (2016)

  • Pan, S.J., Yang, Q.: A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 22(10), 1345–1359 (2010)

  • Pan, S.J., Tsang, I.W., Kwok, J.T., Yang, Q.: Domain adaptation via transfer component analysis. IEEE Trans. Neural Netw. 22(2), 199–210 (2011)

  • Ritter, A., Clark, S., Mausam, Etzioni, O.: Named entity recognition in tweets: an experimental study. In: Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pp. 1524–1534 (2011)

  • Wang, S., Manning, C.: Baselines and bigrams: simple, good sentiment and topic classification. In: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (ACL): Short Papers, vol. 2, pp. 90–94 (2012)

  • Wang, J., Cong, G., Zhao, W.X., Li, X.: Mining user intents in twitter: a semi-supervised approach to inferring intent categories for tweets. In: Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, pp. 339–345 (2015)

  • Zhu, X.: Semi-supervised learning literature survey. Technical report, University of Wisconsin-Madison (2008)

Author information

Corresponding author

Correspondence to Tu Minh Phuong.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Phuong, T.M., Linh, L.C. & Bach, N.X. Identifying intentions in forum posts with cross-domain data. J Heuristics 28, 171–192 (2022). https://doi.org/10.1007/s10732-019-09410-3

  • DOI: https://doi.org/10.1007/s10732-019-09410-3
