Investigating the Potential of Rough Sets Theory in Automatic Thesaurus Construction

  • Conference paper
Recent Progress in Data Engineering and Internet Technology

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 157)

Abstract

This paper presents the results of an initial study on applying rough sets theory to generate a thesaurus automatically from a corpus. The main objective of the study is to investigate the relation between keywords (defined by human experts as highly related to a particular topic) and the sets generated using rough sets theory. The analysis compared the results across all available sets. We conclude that rough sets theory is a reasonable basis for automatic thesaurus construction: it can enrich a concept and was able to cover the keywords given by the human experts.
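As an illustration of how sets of related terms can be generated from a corpus and compared against expert keywords, the sketch below uses a tolerance rough set model over term co-occurrence. This is an assumption for illustration only: the abstract does not state which rough set model, threshold, or coverage measure the authors used. The function names, the co-occurrence threshold `theta`, and the toy documents are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def tolerance_classes(docs, theta):
    """Map each term to its tolerance class: the term itself plus every
    term that co-occurs with it in at least `theta` documents.
    (`theta` is a hypothetical threshold, not taken from the paper.)"""
    cooc = defaultdict(int)
    vocab = set()
    for doc in docs:
        terms = set(doc)
        vocab |= terms
        # Count each unordered term pair once per document.
        for a, b in combinations(sorted(terms), 2):
            cooc[(a, b)] += 1
    classes = {t: {t} for t in vocab}
    for (a, b), n in cooc.items():
        if n >= theta:
            classes[a].add(b)
            classes[b].add(a)
    return classes


def lower_approximation(classes, X):
    """Terms whose tolerance class lies entirely inside X."""
    X = set(X)
    return {t for t, cls in classes.items() if cls <= X}


def upper_approximation(classes, X):
    """Terms whose tolerance class overlaps X (the enriched concept)."""
    X = set(X)
    return {t for t, cls in classes.items() if cls & X}


def keyword_coverage(generated, keywords):
    """Fraction of expert keywords that appear in the generated set."""
    keywords = set(keywords)
    return len(keywords & set(generated)) / len(keywords) if keywords else 0.0


# Toy corpus (hypothetical): each document is a list of index terms.
docs = [
    ["rough", "set", "thesaurus"],
    ["rough", "set", "corpus"],
    ["thesaurus", "corpus", "keyword"],
    ["rough", "set", "keyword"],
]
classes = tolerance_classes(docs, theta=2)
upper = upper_approximation(classes, {"rough"})
print(sorted(upper))                                  # ['rough', 'set']
print(keyword_coverage(upper, {"set", "thesaurus"}))  # 0.5
```

In this toy corpus only "rough" and "set" co-occur often enough to be tolerant of each other, so the upper approximation of {"rough"} grows to include "set" while the lower approximation stays empty. Comparing such generated sets with an expert keyword list is one simple way to read "covering the keywords"; the measure the authors actually used may differ.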

Corresponding author

Correspondence to Gloria Virginia.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Virginia, G., Nguyen, H.S. (2012). Investigating the Potential of Rough Sets Theory in Automatic Thesaurus Construction. In: Gaol, F. (eds) Recent Progress in Data Engineering and Internet Technology. Lecture Notes in Electrical Engineering, vol 157. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28798-5_15

  • DOI: https://doi.org/10.1007/978-3-642-28798-5_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28797-8

  • Online ISBN: 978-3-642-28798-5

  • eBook Packages: Engineering, Engineering (R0)
