Using Concept Lattices for Text Retrieval and Mining

  • Claudio Carpineto
  • Giovanni Romano
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3626)


The potential of formal concept analysis (FCA) for information retrieval (IR) has been highlighted by a number of research studies since its inception. With the proliferation of small, specialised text databases available in electronic format and the advent of Web-based graphical interfaces, FCA has become even more appealing and practical for searching text collections. The main advantage of FCA for IR is the possibility of eliciting context, which may be used both to improve the retrieval of specific items from a text collection and to drive the mining of its contents. In this paper, we focus on the unique features of FCA for building contextual IR applications as well as on its most critical aspects. The development of an FCA-based application for mining the web results returned by a major search engine is envisaged as the next big challenge for the field.


Keywords: Concept Lattice, Document Lattice, Test Collection, Formal Concept Analysis, Semistructured Data
These keywords were added by machine and not by the authors.





Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Claudio Carpineto (1)
  • Giovanni Romano (1)
  1. Fondazione Ugo Bordoni, Rome, Italy
