Improving the clarity of questions in Community Question Answering networks

  • Research
  • Journal of Intelligent Information Systems

Abstract

Every day, thousands of questions are asked on Community Question Answering (CQA) networks, making these questions and their answers extremely valuable for information seekers around the world. However, a significant proportion of these questions do not elicit proper answers. There are several reasons for this, and lack of clarity in the question is one of the most important. This study focuses on enhancing the clarity of unclear questions in CQA networks. In the first step, unique features are extracted from each question by combining DistilBERT sentence embeddings, produced with Siamese and triplet network structures, with HDBSCAN, a clustering algorithm that is effective on noisy datasets and less sensitive to density variations. Questions are then categorized as clear or unclear using an Extremely Randomized Trees ensemble model, known for its robustness to class imbalance, with more than 90% accuracy. Next, information that could enhance the clarity of an unclear question is extracted by comparing it with similar, clearer questions using Dynamic Time Warping, a versatile technique for time series analysis that is applicable across various domains. Finally, the extracted information is incorporated into the feature vector of the unclear question using histogram-coverage methods to enhance its clarity. When a question is made clearer, the missing information and its importance are shown to the questioner, who can then use it to clarify the question.
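The abstract does not say how Dynamic Time Warping is applied to question features, so the following is only a minimal sketch of the textbook algorithm, assuming each question is already represented as a plain numeric sequence and that the local cost is the absolute difference between elements. The names `dtw_distance` and `most_similar` are hypothetical and are not taken from the article.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook dynamic-programming DTW between two numeric sequences.

    Assumption: the local cost is the absolute difference |a_i - b_j|.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def most_similar(query, candidates):
    """Return the index of the candidate with the smallest DTW distance to query."""
    distances = [dtw_distance(query, c) for c in candidates]
    return min(range(len(candidates)), key=distances.__getitem__)
```

Under these assumptions, an unclear question's sequence would be passed as `query` and the sequences of clear questions as `candidates`. The quadratic-time table is adequate for short feature vectors; long sequences would normally call for a windowed variant.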

Data Availability

No datasets were generated or analysed during the current study.

Notes

  1. Datasets link: https://gustav1.ux.uis.no/downloads/ecir2019-qac/ecir2019-qac-data.zip

Funding

This research has been done under the research project QG.21.58 “Researching and developing clustering integrating constraints and deep learning algorithms” of Vietnam National University, Hanoi.

Author information

Contributions

Alireza Khabbazan designed the model settings, collected the data, conceived the experiments, analyzed the results, and prepared the original draft. Ahmad Ali Abin conceptualized the problem, conceived the original idea, and defined the problem. Viet-Vu Vu administered the research and carried out the experiments. All authors reviewed the manuscript.

Corresponding author

Correspondence to Ahmad Ali Abin.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethical approval

This study was conducted in accordance with general ethical guidelines applicable to community question-answering research. It did not involve human or animal subjects, hence specific ethical committee approval was not required.

Availability of supporting data

The datasets used during the current study are available. Additionally, the datasets generated, model settings, and training processes are available from the corresponding author upon reasonable request.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Khabbazan, A., Abin, A.A. & Vu, VV. Improving the clarity of questions in Community Question Answering networks. J Intell Inf Syst (2024). https://doi.org/10.1007/s10844-024-00847-y

  • DOI: https://doi.org/10.1007/s10844-024-00847-y
