Cloud-based learning system for answer ranking

Cluster Computing

Abstract

Community question answering (Q&A) is a new knowledge-sharing model in which large numbers of questions and answers accumulate through user submissions. When a user submits a new question, the Q&A system uses a learned ranking model to return a ranked list of candidate answers. Traditional ranking algorithms rely on large amounts of labeled data to train the model. However, a ranking model trained in a source domain may perform poorly in a target domain when the target domain lacks labeled training samples. To address this challenge, this paper proposes a transfer learning algorithm for ranking based on feature selection. We assume that the source and target domains share a low-dimensional feature representation. Because user features carry knowledge common to both domains, we integrate them into the answer feature space, and the shared features are then used to transfer knowledge to the target domain. Furthermore, to improve computational efficiency on the large volume of community Q&A data, the learning model is distributed and run on Spark. Experimental results show that the proposed method can effectively exploit cross-domain knowledge to improve ranking performance.
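The idea of appending user features to answer features and training one ranker on pooled source-domain and target-domain preference pairs can be illustrated with a minimal sketch. This is not the paper's algorithm: it swaps in a plain linear pairwise ranker with hinge loss, omits the feature-selection step and the Spark distribution entirely, and every function name, feature, and data value below is a hypothetical chosen for illustration.

```python
import random

def join_features(answer_feats, user_feats):
    """Append user features to answer features (hypothetical layout)."""
    return list(answer_feats) + list(user_feats)

def make_pairs(answers):
    """For one question, build difference vectors x_better - x_worse
    for every pair of answers whose relevance labels differ."""
    pairs = []
    for xi, yi in answers:
        for xj, yj in answers:
            if yi > yj:
                pairs.append([a - b for a, b in zip(xi, xj)])
    return pairs

def train_pairwise(pairs, dim, epochs=50, lr=0.1, reg=0.01, seed=0):
    """Linear pairwise ranker: SGD on hinge loss max(0, 1 - w.d) plus L2."""
    rng = random.Random(seed)
    w = [0.0] * dim
    pairs = list(pairs)
    for _ in range(epochs):
        rng.shuffle(pairs)
        for d in pairs:
            margin = sum(wk * dk for wk, dk in zip(w, d))
            for k in range(dim):
                # The hinge term contributes only while the pair is inside the margin.
                grad = reg * w[k] - (d[k] if margin < 1.0 else 0.0)
                w[k] -= lr * grad
    return w

def rank(w, candidates):
    """Return candidate indices sorted by descending linear score."""
    scores = [sum(wk * xk for wk, xk in zip(w, x)) for x in candidates]
    return sorted(range(len(candidates)), key=lambda i: -scores[i])

# Toy data (made up): answer features = [length, term overlap],
# user feature = [reputation]; labels are relevance grades.
source_q = [
    (join_features([0.9, 0.8], [0.9]), 2),
    (join_features([0.5, 0.4], [0.5]), 1),
    (join_features([0.2, 0.1], [0.1]), 0),
]
# The target domain has only one labeled pair.
target_q = [
    (join_features([0.7, 0.3], [0.8]), 1),
    (join_features([0.6, 0.4], [0.2]), 0),
]

# Pool preference pairs from both domains over the shared feature space.
pairs = make_pairs(source_q) + make_pairs(target_q)
w = train_pairwise(pairs, dim=3)

# Two target-domain candidates that differ only in the user feature.
new_candidates = [
    join_features([0.5, 0.5], [0.1]),
    join_features([0.5, 0.5], [0.9]),
]
order = rank(w, new_candidates)
print(order)
```

In this toy setup the two new candidates have identical answer features, so their order is decided by the weight learned for the user feature, which is the kind of cross-domain signal the abstract describes. A distributed version would need to partition the pair generation and gradient steps across workers, which this sketch does not attempt.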

Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 61365010).

Author information

Corresponding author

Correspondence to Lei Su.

Cite this article

Yuan, L.W., Su, L., Zhang, Y. et al. Cloud-based learning system for answer ranking. Cluster Comput 20, 2253–2266 (2017). https://doi.org/10.1007/s10586-017-0888-2

  • DOI: https://doi.org/10.1007/s10586-017-0888-2
