Words Worth Attention: Predicting Words of the Week on the Russian Wiktionary

  • Dmitry Ustalov
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 468)

Abstract

Collaborative lexicography projects such as Wiktionary are becoming strong competitors to traditional semantic resources, just as Wikipedia has already become a competitor to expert-built knowledge bases. Keeping data obtained from the general public at good quality is a challenging problem because of the fuzzy nature of crowdsourcing. The presented study focuses on predicting the word of the week articles on the Russian Wiktionary by treating this problem as a binary classification task. The best proposed model is based on the Naïve Bayes classifier and has weighted average precision, recall, and F1-measure values of 87% when evaluated on the provided dataset.

Keywords

Russian Wiktionary; semantic resources; word of the week; quality assessment

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Dmitry Ustalov (1, 2, 3)
  1. Krasovsky Institute of Mathematics and Mechanics, Ekaterinburg, Russia
  2. Ural Federal University, Ekaterinburg, Russia
  3. NLPub, Ekaterinburg, Russia
