A graph model with integrated pattern and query-based technique for extracting answer to questions in community question answering system

  • Original Article
  • Social Network Analysis and Mining

Abstract

This paper presents an extension of the community question-answering (CQA) system we developed previously. The system is a graph-based method that builds ranked answers for related questions using Kullback–Leibler (KL) divergence. Answer extraction in this work involves five steps: identifying the question core, building the question query, query-based answer extraction (QBAE), pattern-based answer extraction (PBAE), and combined answer extraction. The source data were existing data from ResearchGate, a socio-academic networking website that gives researchers a platform to collaborate, ask questions, and offer answers. In an evaluation on 2786 questions, when 80% of patterns and keywords were considered, QBAE and PBAE extracted 2765 and 2766 correct answers, respectively, while the combined QBAE + PBAE method extracted 2782. When 90% of patterns and keywords were used, QBAE and PBAE extracted 2782 and 2784 correct answers, respectively, whereas the QBAE + PBAE method extracted 2786. Our method also identified 229 questions without answers. Overall, the evaluation shows that the model achieves high accuracy and precision.
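To make the KL-divergence ranking idea concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say how the distributions are built, so everything here is an assumption: the function names (`unigram_model`, `kl_divergence`, `rank_answers`), whitespace tokenization, additive smoothing with a hypothetical `alpha` parameter, and the choice to rank candidate answers by ascending KL divergence from the question's unigram distribution.

```python
import math
from collections import Counter


def unigram_model(text, vocab, alpha=1.0):
    """Additively smoothed unigram distribution over a fixed vocabulary.

    Assumption: whitespace tokenization and add-alpha smoothing; the paper's
    actual language model is not described in the abstract.
    """
    counts = Counter(text.lower().split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same keys."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def rank_answers(question, answers, alpha=1.0):
    """Order candidate answers by ascending KL divergence from the question."""
    vocab = set(question.lower().split())
    for answer in answers:
        vocab.update(answer.lower().split())
    vocab = sorted(vocab)

    q_model = unigram_model(question, vocab, alpha)
    scored = [
        (kl_divergence(q_model, unigram_model(answer, vocab, alpha)), answer)
        for answer in answers
    ]
    # Lower divergence means the answer's word distribution is closer
    # to the question's, so it is ranked first.
    return [answer for _, answer in sorted(scored, key=lambda pair: pair[0])]


if __name__ == "__main__":
    question = "how to compute kl divergence"
    candidates = [
        "bananas are yellow fruit",
        "kl divergence is how to compute distance",
    ]
    for answer in rank_answers(question, candidates):
        print(answer)
```

Smoothing keeps every probability strictly positive, so the logarithm is always defined even when an answer shares no words with the question. A real system would likely add stop-word removal and stemming before building the distributions.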

Data availability

Not applicable.

Funding

The research was partly funded by the COMSTEC-TWAS Joint Research Grants Programme 14–014 RG/TC/ITC/AF/AC_C- UNESCO FR: 3240283404.

Author information

Contributions

BO conceived and designed the work and edited the manuscript. Tobore Igbe carried out the implementation of the design and wrote part of the main manuscript. BA contributed to the design and wrote part of the main manuscript. OD edited the manuscript.

Corresponding author

Correspondence to Bolanle Ojokoh.

Ethics declarations

Conflict of interest

The authors do not have any conflicts of interest in this submission.

About this article

Cite this article

Ojokoh, B., Igbe, T., Afolabi, B. et al. A graph model with integrated pattern and query-based technique for extracting answer to questions in community question answering system. Soc. Netw. Anal. Min. 13, 45 (2023). https://doi.org/10.1007/s13278-023-01046-3

  • DOI: https://doi.org/10.1007/s13278-023-01046-3
