Abstract
This paper presents the results of an initial study on applying rough sets theory to generate a thesaurus automatically from a corpus. The main objective of the study is to investigate the relation between keywords (defined by human experts as highly related to a particular topic) and the sets generated on the basis of rough sets theory. The analysis compared the results across all available sets. We conclude that applying rough sets theory is a rational way to construct a thesaurus automatically, as it can enrich a concept and proved able to cover the keywords given by the human experts.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Virginia, G., Nguyen, H.S. (2012). Investigating the Potential of Rough Sets Theory in Automatic Thesaurus Construction. In: Gaol, F. (eds) Recent Progress in Data Engineering and Internet Technology. Lecture Notes in Electrical Engineering, vol 157. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28798-5_15
DOI: https://doi.org/10.1007/978-3-642-28798-5_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28797-8
Online ISBN: 978-3-642-28798-5
eBook Packages: Engineering, Engineering (R0)