A Hybrid Framework Using SOM and Fuzzy Theory for Textual Classification in Data Mining

  • Chapter
Modelling with Words

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2873)

Abstract

This paper presents a hybrid framework combining the self-organising map (SOM) and fuzzy theory for textual classification. Clustering with self-organising maps is applied to produce multiple targets. We propose that an amalgamation of SOM and association rule theory may hold the key to a more generic solution, one less reliant on initial supervision and redundant user interaction. The results of clustering stem words from text documents could be used to derive association rules that designate the applicability of documents to the user. A four-stage process is then detailed, giving a generic example of how associations may be derived graphically from a repository of text documents, or even from a set of synopses of many such repositories. This research demonstrates the feasibility of applying such processes to data mining and knowledge discovery.
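
The abstract does not spell out the four stages, so the following is only a minimal sketch of the two ideas it names: clustering stem words with a small SOM, then scoring an association rule between the resulting clusters. The function names (`train_som`, `best_matching_unit`, `rule_confidence`), the toy stems, the 2x2 map size and the decay schedules are all assumptions made for illustration and are not taken from the chapter.

```python
import math
import random

def best_matching_unit(weights, x):
    """Index of the map unit whose weight vector is closest to x (Euclidean)."""
    dists = [sum((w - xi) ** 2 for w, xi in zip(unit, x)) for unit in weights]
    return dists.index(min(dists))

def train_som(vectors, rows=2, cols=2, epochs=100, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small rectangular SOM; returns one weight vector per map unit."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [[rng.random() for _ in range(dim)] for _ in grid]
    total = epochs * len(vectors)
    step = 0
    for _ in range(epochs):
        order = list(range(len(vectors)))
        rng.shuffle(order)
        for i in order:
            x = vectors[i]
            frac = step / total
            lr = lr0 * (1.0 - frac)               # learning rate decays linearly
            sigma = sigma0 * (1.0 - frac) + 1e-3  # neighbourhood radius shrinks
            b = best_matching_unit(weights, x)
            for u, (r, c) in enumerate(grid):
                d2 = (r - grid[b][0]) ** 2 + (c - grid[b][1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood
                weights[u] = [w + lr * h * (xi - w) for w, xi in zip(weights[u], x)]
            step += 1
    return weights

def rule_confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent over a list of sets."""
    with_a = [t for t in transactions if antecedent in t]
    if not with_a:
        return 0.0
    return sum(1 for t in with_a if consequent in t) / len(with_a)

# Hypothetical toy data: each stem is described by its presence in 4 documents.
stems = ["cluster", "map", "neuron", "rule", "support", "confid"]
vectors = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0],
           [0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1]]

weights = train_som(vectors)
unit_of = {s: best_matching_unit(weights, v) for s, v in zip(stems, vectors)}

# Each document becomes the set of map units its stems fall into.
doc_units = [{unit_of[s] for s, v in zip(stems, vectors) if v[d]} for d in range(4)]
conf = rule_confidence(doc_units, unit_of["cluster"], unit_of["map"])
```

Stems with identical occurrence patterns always land on the same map unit, so each document reduces to a small set of unit indices. Rules over those sets are then scored with the usual confidence ratio. A fuzzy variant would replace the hard unit assignment with graded memberships, which this sketch does not attempt.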

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Chen, YP.P. (2003). A Hybrid Framework Using SOM and Fuzzy Theory for Textual Classification in Data Mining. In: Lawry, J., Shanahan, J., Ralescu, A.L. (eds) Modelling with Words. Lecture Notes in Computer Science, vol 2873. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39906-3_8

  • DOI: https://doi.org/10.1007/978-3-540-39906-3_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-20487-9

  • Online ISBN: 978-3-540-39906-3

  • eBook Packages: Springer Book Archive
