Investigating the Effectiveness of Thesaurus Generated Using Tolerance Rough Set Model

  • Gloria Virginia
  • Hung Son Nguyen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6804)

Abstract

We treated the tolerance matrix generated by the tolerance rough set model as a kind of associative thesaurus. The effectiveness of the thesaurus was measured with two performance measures commonly used in information retrieval, recall and precision, applied here to terms rather than documents. A corpus of keywords that human experts judged to be highly related to particular topics served as the ground truth of this study. The analysis was based on comparing these values across all of the sets created. Beyond the individual findings, this paper is intended as a basis for the view that generating a thesaurus automatically using rough set theory is a promising approach. We also mention some directions for future study.
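
As an illustration of measuring recall and precision over terms, the following is a minimal sketch and not the authors' implementation. It assumes the common co-occurrence formulation of the tolerance rough set model, in which two terms are tolerant of each other when they appear together in at least `theta` documents. The toy documents, the threshold value, the expert keyword set, and the function names `tolerance_classes` and `term_precision_recall` are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def tolerance_classes(docs, theta):
    """Map each term to its tolerance class (assumed co-occurrence definition).

    Two terms are placed in each other's class when they co-occur in at
    least `theta` documents; every term also belongs to its own class.
    """
    cooc = Counter()
    vocab = set()
    for doc in docs:
        terms = set(doc)
        vocab |= terms
        # Count each unordered pair of distinct terms once per document.
        for a, b in combinations(sorted(terms), 2):
            cooc[(a, b)] += 1
    classes = {t: {t} for t in vocab}
    for (a, b), n in cooc.items():
        if n >= theta:
            classes[a].add(b)
            classes[b].add(a)
    return classes


def term_precision_recall(suggested, relevant):
    """Precision and recall of a suggested term set against expert keywords."""
    hits = len(suggested & relevant)
    precision = hits / len(suggested) if suggested else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical toy corpus: each document is a list of index terms.
docs = [
    ["rough", "set", "thesaurus"],
    ["rough", "set", "tolerance"],
    ["thesaurus", "retrieval"],
    ["rough", "tolerance", "retrieval"],
]
classes = tolerance_classes(docs, theta=2)

# Hypothetical expert-defined keywords for the topic term "rough".
expert_keywords = {"set", "tolerance", "approximation"}
related = classes["rough"] - {"rough"}  # exclude the query term itself
p, r = term_precision_recall(related, expert_keywords)
print(sorted(related), p, r)
```

In this sketch precision is the share of thesaurus-suggested terms that the experts also listed, and recall is the share of expert keywords that the thesaurus recovered.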

Keywords

rough sets, tolerance rough set model, thesaurus

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Gloria Virginia (1)
  • Hung Son Nguyen (1)
  1. Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
