Assessing the Quality of Thai Wikipedia Articles Using Concept and Statistical Features
The quality of Thai Wikipedia articles is currently judged by users themselves. Because the number of articles grows every day, users need an automatic evaluation method. Components of Wikipedia articles such as headers, pictures, references, and links are useful indicators of article quality. However, readers also need content complete enough to cover all of the concepts relevant to an article, so this work investigates concept features. The aim of this research is to classify Thai Wikipedia articles into two classes: high-quality and low-quality. Three article domains (Biography, Animal, and Place) are tested with decision tree and Naïve Bayes classifiers. We found that Naïve Bayes achieves a higher TP rate than the decision tree in every domain. Moreover, we found that the concept feature plays an important role in the quality classification of Thai Wikipedia articles.
Keywords: Quality of Thai Wikipedia articles, Naïve Bayes, Decision tree, Concept feature, Statistical feature
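
To make the two feature families and the reported metric concrete, here is a minimal Python sketch. It is not the authors' implementation: the `Article` fields, the `BIOGRAPHY_CONCEPTS` list, and the choice to encode the concept feature as a coverage ratio are all assumptions made for illustration. Only the TP rate formula, TP / (TP + FN) for the high-quality class, is standard.

```python
from dataclasses import dataclass


@dataclass
class Article:
    # Statistical features: simple counts of article components.
    headers: int
    pictures: int
    references: int
    links: int
    # Concepts detected in the article text (detection itself is out of scope here).
    concepts_found: set


# Hypothetical concept list for the Biography domain (illustrative only).
BIOGRAPHY_CONCEPTS = {"birth", "education", "career", "works", "death"}


def feature_vector(article, domain_concepts):
    """Return the four statistical counts followed by concept coverage in [0, 1].

    Coverage is an assumed encoding: the fraction of expected domain
    concepts that the article mentions.
    """
    coverage = len(article.concepts_found & domain_concepts) / len(domain_concepts)
    return [article.headers, article.pictures, article.references,
            article.links, coverage]


def tp_rate(y_true, y_pred, positive="high"):
    """TP rate (recall) for the positive class: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0
```

Vectors produced by `feature_vector` could be passed to any decision tree or Naïve Bayes implementation, and `tp_rate` could then be computed per domain on held-out predictions to compare the two classifiers.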