Abstract
Collaborative lexicography projects such as Wiktionary are becoming strong competitors to traditional semantic resources, just as Wikipedia has already become a competitor to expert-built knowledge bases. Maintaining the quality of data contributed by the general public is challenging because of the fuzzy nature of crowdsourcing. This study focuses on predicting the word-of-the-week articles on the Russian Wiktionary by treating the problem as a binary classification task. The best proposed model is based on the Naïve Bayes classifier and achieves weighted average precision, recall, and F1-measure values of 87% when evaluated on the provided dataset.
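The weighted average metrics mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual definition, in which per-class precision, recall, and F1 are averaged with weights proportional to each class's support. The function name `weighted_prf` and the toy labels are hypothetical and are not taken from the paper's dataset or code.

```python
from collections import Counter


def weighted_prf(y_true, y_pred):
    """Support-weighted average of per-class precision, recall, and F1."""
    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    w_prec = w_rec = w_f1 = 0.0
    for cls in sorted(support):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        fn = sum(1 for t, p in pairs if t == cls and p != cls)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        weight = support[cls] / total  # class share of the dataset
        w_prec += weight * prec
        w_rec += weight * rec
        w_f1 += weight * f1
    return w_prec, w_rec, w_f1


# Hypothetical toy labels: 1 = word of the week, 0 = ordinary article.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
p, r, f = weighted_prf(y_true, y_pred)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Under this definition the weighted recall equals plain accuracy, which is one reason weighted averages can look favourable when the classes are imbalanced.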
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Ustalov, D. (2014). Words Worth Attention: Predicting Words of the Week on the Russian Wiktionary. In: Klinov, P., Mouromtsev, D. (eds) Knowledge Engineering and the Semantic Web. KESW 2014. Communications in Computer and Information Science, vol 468. Springer, Cham. https://doi.org/10.1007/978-3-319-11716-4_17
DOI: https://doi.org/10.1007/978-3-319-11716-4_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11715-7
Online ISBN: 978-3-319-11716-4
eBook Packages: Computer Science, Computer Science (R0)