Retrieving Rising Stars in Focused Community Question-Answering

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNAI, volume 9622)

Abstract

In Community Question Answering (CQA) forums, there is typically a small fraction of users who provide high-quality posts and earn a very high reputation status from the community. These top contributors are critical to the community since they drive the development of the site and attract traffic from Internet users. Identifying these individuals could be highly valuable, but this is not an easy task. Unlike publication or social networks, most CQA sites lack information regarding peers, friends, or collaborators, which can be an important indicator of future success or performance. In this paper, we attempt to perform this analysis by extracting different sets of features to predict future contribution. The experiment covers 376,000 users who remain active on Stack Overflow for at least one year and together contribute more than 21 million posts. One of the highlights of our approach is that we can identify rising stars after short observations. Our approach achieves high accuracy, 85%, when predicting whether a user will become a top contributor after a few weeks of observation. In a slightly different problem, in which we observe a few posts by a user, our method achieves accuracy higher than 90%. Our approach provides higher accuracy than baseline methods, including a popular time series analysis. Furthermore, our methods are robust to different classifier algorithms. Identifying the rising stars early could help CQA administrators gain an overview of the site's future and ensure that enough incentive and support are given to potential contributors.
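
The prediction task summarized above can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions, not the method used in the paper: the `Post` fields, the three features, the 28-day observation window, and the rule that the top fraction of users by final reputation counts as top contributors are all hypothetical choices made here for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Post:
    # Hypothetical record layout; the paper's actual schema is not given here.
    user_id: int
    created: datetime
    score: int          # net votes received by the post
    is_answer: bool     # True for an answer, False for a question


def early_features(posts, joined, window_days=28):
    """Summarize the posts a user made within `window_days` of joining."""
    cutoff = joined + timedelta(days=window_days)
    early = [p for p in posts if joined <= p.created < cutoff]
    n = len(early)
    return {
        "n_posts": n,
        "answer_ratio": sum(p.is_answer for p in early) / n if n else 0.0,
        "mean_score": sum(p.score for p in early) / n if n else 0.0,
    }


def label_top_contributors(final_reputation, top_fraction=0.1):
    """Label the top `top_fraction` of users by final reputation (assumed rule)."""
    ranked = sorted(final_reputation, key=final_reputation.get, reverse=True)
    k = max(1, round(len(ranked) * top_fraction))
    top = set(ranked[:k])
    return {user: user in top for user in final_reputation}


def accuracy(predicted, actual):
    """Fraction of users whose predicted label matches the actual label."""
    hits = sum(predicted[user] == actual[user] for user in actual)
    return hits / len(actual)


# Toy usage with made-up values, purely to show the shapes involved.
joined = datetime(2015, 1, 1)
posts = [
    Post(1, datetime(2015, 1, 5), 3, True),
    Post(1, datetime(2015, 1, 20), 1, False),
    Post(1, datetime(2015, 3, 1), 10, True),   # outside the 28-day window
]
features = early_features(posts, joined)
labels = label_top_contributors({1: 5000, 2: 40, 3: 10, 4: 7, 5: 1}, top_fraction=0.2)
```

In a fuller pipeline, feature dictionaries like `features` would be fed to a classifier trained against labels like `labels`, and `accuracy` would be computed on held-out users.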

Keywords

  • Random Forest
  • Classification Algorithm
  • Information Seek
  • Random Forest Method
  • Feature Importance

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

  1. http://highscalability.com/blog/2014/7/21/stackoverflow-update-560m-pageviews-a-month-25-servers-and-i.html

  2. https://archive.org/details/stackexchange

Acknowledgments

This work is partially funded by the US National Science Foundation (NSF) BCC-SBE award no. 1244704.

Author information

Corresponding author

Correspondence to Long T. Le.

Copyright information

© 2016 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Le, L.T., Shah, C. (2016). Retrieving Rising Stars in Focused Community Question-Answering. In: Nguyen, N.T., Trawiński, B., Fujita, H., Hong, T.-P. (eds) Intelligent Information and Database Systems. ACIIDS 2016. Lecture Notes in Computer Science, vol 9622. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-49390-8_3

  • DOI: https://doi.org/10.1007/978-3-662-49390-8_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-49389-2

  • Online ISBN: 978-3-662-49390-8

  • eBook Packages: Computer Science (R0)