Thai Wikipedia Link Suggestion Framework

  • Arnon Rungsawang
  • Sompop Siangkhio
  • Athasit Surarerk
  • Bundit Manaskasemsak
Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 253)


The paper presents a framework that uses Thai Wikipedia articles as a knowledge source to train a machine learning classifier for link suggestion. Given an input document, the framework automatically extracts the important concepts in the text, determines the corresponding Wikipedia page for each concept, and suggests those pages as destination links for additional information. Preliminary experiments with a prototype on a test set of Thai Wikipedia articles show that the framework achieves up to 90 % link suggestion accuracy.
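The two stages named in the abstract (concept extraction, then choice of a destination page) can be illustrated with a minimal sketch. It is not the authors' method: it assumes the input is already segmented into tokens, replaces the trained classifier with a fixed link-probability threshold, and resolves ambiguity by picking the most frequent link target. The names `link_probability`, `suggest_links`, and the `ANCHORS` statistics are hypothetical and exist only for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical statistics mined from Wikipedia (illustrative numbers only):
# for each anchor phrase, the number of articles it appears in, the number of
# articles in which it is linked, and the pages those links point to.
ANCHORS: Dict[str, dict] = {
    "machine learning": {"appears_in": 100, "linked_in": 80,
                         "targets": Counter({"Machine learning": 80})},
    "learning": {"appears_in": 1000, "linked_in": 20,
                 "targets": Counter({"Learning": 20})},
    "bangkok": {"appears_in": 50, "linked_in": 45,
                "targets": Counter({"Bangkok": 44, "Bangkok (film)": 1})},
}


def link_probability(stats: dict) -> float:
    """Fraction of articles containing the phrase that also link it."""
    if stats["appears_in"] == 0:
        return 0.0
    return stats["linked_in"] / stats["appears_in"]


def suggest_links(tokens: List[str], anchors: Dict[str, dict],
                  threshold: float = 0.5,
                  max_len: int = 3) -> List[Tuple[str, str]]:
    """Return (phrase, target page) pairs for phrases worth linking.

    Assumption: a fixed threshold stands in for a trained classifier, and the
    most frequent target stands in for sense disambiguation.
    """
    suggestions: List[Tuple[str, str]] = []
    i = 0
    while i < len(tokens):
        matched = False
        # Prefer the longest known phrase starting at position i.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            stats = anchors.get(phrase)
            if stats and link_probability(stats) >= threshold:
                target, _count = stats["targets"].most_common(1)[0]
                suggestions.append((phrase, target))
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return suggestions


example = suggest_links(["machine", "learning", "in", "bangkok"], ANCHORS)
```

In this sketch, "machine learning" is matched as a whole phrase before the low-probability single word "learning" is considered, so the longer and more link-worthy concept wins. A real system for Thai would first need a word segmenter, since Thai text does not mark word boundaries with spaces.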


Keywords: Thai Wikipedia, Wikify, Wikification, Sense disambiguation, Keyword extraction, Link suggestion, Machine learning



We would like to thank all anonymous reviewers for their comments and suggestions, which helped improve the final version of the paper. We would also like to thank the departments of computer engineering at Kasetsart University and Chulalongkorn University for the excellent research environment.



Copyright information

© Springer Science+Business Media Dordrecht 2013

Authors and Affiliations

  • Arnon Rungsawang (1)
  • Sompop Siangkhio (2)
  • Athasit Surarerk (2)
  • Bundit Manaskasemsak (1)
  1. Massive Information and Knowledge Engineering, Department of Computer Engineering, Faculty of Engineering, Kasetsart University, Bangkok, Thailand
  2. Engineering Laboratory in Theoretical Enumerable System, Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok, Thailand
