Rough and Fuzzy Sets for Data Mining of a Controlled Vocabulary for Textual Retrieval

  • Chapter
Soft Computing in Information Retrieval

Part of the book series: Studies in Fuzziness and Soft Computing (STUDFUZZ, volume 50)

Abstract

We present an approach to text retrieval that incorporates data mining of a controlled vocabulary, i.e., vocabulary mining, in order to improve retrieval performance. In general, formal queries presented to a retrieval system are not optimized for retrieval efficiency or effectiveness. Vocabulary mining allows us to transform the query via operations such as generalization or specialization. We offer a new framework for vocabulary mining that combines rough sets and fuzzy sets, allowing us to use rough set approximations when the documents and queries are described using weighted, i.e., fuzzy, representations. We also explore generalized rough sets, variable precision models, and the coordination of multiple vocabulary views. Finally, we present a preliminary analysis of the application of our proposed framework to a modern controlled vocabulary, the Unified Medical Language System. The proposed framework supports the systematic study and application of different vocabulary views within the textual information retrieval model.
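
To make the idea of rough set approximations over weighted representations concrete, the following is a minimal sketch rather than the chapter's own method. It assumes the common rough fuzzy set construction, in which the lower approximation of a fuzzy set takes the minimum membership within each equivalence class and the upper approximation takes the maximum. The vocabulary partition, the term weights, and the function name `rough_fuzzy_approximations` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def rough_fuzzy_approximations(
    classes: List[List[str]], weights: Dict[str, float]
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Approximate a weighted (fuzzy) term set over a vocabulary partition.

    classes: equivalence classes of vocabulary terms (assumed, e.g. synonym groups).
    weights: fuzzy membership of each term in a query or document; terms
             that are absent are treated as having weight 0.0.
    Returns (lower, upper) membership maps over all vocabulary terms.
    """
    lower: Dict[str, float] = {}
    upper: Dict[str, float] = {}
    for cls in classes:
        memberships = [weights.get(term, 0.0) for term in cls]
        lo, hi = min(memberships), max(memberships)
        for term in cls:
            lower[term] = lo  # weight supported by every term in the class
            upper[term] = hi  # weight supported by at least one term in the class
    return lower, upper


# Hypothetical partition of a tiny vocabulary into synonym classes.
classes = [["tumor", "neoplasm"], ["fever", "pyrexia"], ["aspirin"]]
# Hypothetical weighted query.
query = {"tumor": 0.8, "fever": 0.5, "pyrexia": 0.5}

lower, upper = rough_fuzzy_approximations(classes, query)
print("lower:", lower)
print("upper:", upper)
```

Under these assumptions the upper approximation extends the weight of "tumor" to its synonym "neoplasm", which can be read as a generalization of the query, while the lower approximation keeps only the weight shared by a whole class, which can be read as a specialization.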

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Srinivasan, P., Kraft, D., Chen, J. (2000). Rough and Fuzzy Sets for Data Mining of a Controlled Vocabulary for Textual Retrieval. In: Crestani, F., Pasi, G. (eds) Soft Computing in Information Retrieval. Studies in Fuzziness and Soft Computing, vol 50. Physica, Heidelberg. https://doi.org/10.1007/978-3-7908-1849-9_15

  • DOI: https://doi.org/10.1007/978-3-7908-1849-9_15

  • Publisher Name: Physica, Heidelberg

  • Print ISBN: 978-3-7908-2473-5

  • Online ISBN: 978-3-7908-1849-9

  • eBook Packages: Springer Book Archive
