Abstract
We model a Digital Library as a formal context in which objects are documents and attributes are terms describing documents contents. A formal concept is very close to the notion of a collection: the concept extent is the extension of the collection; the concept intent consists of a set of terms, the collection intension. The collection intension can be viewed as a simple conjunctive query which evaluates precisely to the extension. However, for certain collections no concept may exist, in which case the concept that best approximates the extension must be used. In so doing, we may end up with a too imprecise concept, in case too many documents denoted by the intension are outside the extension. We then look for a more precise intension by exploring 3 different query languages: conjunctive queries with negation; disjunctions of negation-free conjunctive queries; and disjunctions of conjunctive queries with negation. We show that a precise description can always be found in one of these languages for any set of documents. However, when disjunction is introduced, uniqueness of the solution is lost. In order to deal with this problem, we define a preferential criterion on queries, based on the conciseness of their expression. We then show that minimal queries are hard to find in the last 2 of the three languages above.
Keywords
- Digital Library
- Precise Description
- Formal Context
- Formal Concept Analysis
- Conjunctive Query
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
This is a preview of subscription content, access via your institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsPreview
Unable to display preview. Download preview PDF.
References
Bergmark, D.: Collection Synthesis. In: Proceeding of the second ACM/IEEE-CS Joint Conference on Digital Libraries, pp. 253–262. ACM Press, New York (2002)
Blair, D.C.: The challenge of commercial document retrieval, Part II: a strategy for document searching based on identifiable document partitions. Information Processing and Management 38, 293–304 (2002)
Candela, L.: Virtual Digital Libraries. PhD thesis, Information Engineering Department, University of Pisa (2006)
Candela, L., Castelli, D., Pagano, P.: A Service for Supporting Virtual Views of Large Heterogeneous Digital Libraries. In: Koch, T., Sølvberg, I.T. (eds.) ECDL 2003. LNCS, vol. 2769, pp. 362–373. Springer, Heidelberg (2003)
Candela, L., Straccia, U.: The Personalized, Collaborative Digital Library Environment Cyclades and its Collections Management. In: Callan, J., Crestani, F., Sanderson, M. (eds.) Distributed Multimedia Information Retrieval. LNCS, vol. 2924, pp. 156–172. Springer, Heidelberg (2004)
Carpineto, C., Romano, G.: Information retrieval through hybrid navigation of lattice representations. International Journal of Human-Computer Studies 45(5), 553–578 (1996)
Carpineto, C., Romano, G.: A lattice conceptual clustering system and its application to browsing retrieval. Machine Learning 24(2), 95–122 (1996)
Carpineto, C., Romano, G.: Effective reformulation of boolean queries with concept lattices. In: Andreasen, T., Christiansen, H., Larsen, H.L. (eds.) FQAS 1998. LNCS (LNAI), vol. 1495, pp. 83–94. Springer, Heidelberg (1998)
Carpineto, C., Romano, G.: Order-theoretical ranking. Journal of American Society for Information Science 51(7), 587–601 (2000)
Davey, B.A., Priestley, H.A.: Introduction to lattices and order (chapter 3), 2nd edn. Cambridge University Press, Cambridge (2002)
Ganter, B., Wille, R.: Applied lattice theory: Formal concept analysis. http://www.math.tu.dresden.de/~ganter/psfiles/concept.ps
Ganter, B., Wille, R.: Formal Concept Analysis: Mathematical Foundations, 1st edn. Springer, Heidelberg (1999)
Garey, M.R., Johnson, D.S.: Computers and Intractability, A Guide to the Theory of NP-Completeness. W.H. Freeman and Company, New York (1979)
Geisler, G., et al.: Creating Virtual Collections in Digital Libraries: Benefits and Implementation Issues. In: Proceedings of the second ACM/IEEE-CS Joint Conference on Digital Libraries, pp. 210–218. ACM Press, New York (2002)
Godin, R., Gecsei, J., Pichet, C.: Design of a browsing interface for information retrieval. In: Proceedings of SIGIR89, the Twelfth Annual International ACM Conference on Research and Development in Information Retrieval, Cambridge, MA, pp. 32–39. ACM Press, New York (1989)
Lagoze, C., Fielding, D.: Defining Collections in Distributed Digital Libraries. D-Lib Magazine (November 1998)
Meghini, C., Spyratos, N.: Preference-based query tuning through refinement/enlargement in a formal context. In: Dix, J., Hegner, S.J. (eds.) FoIKS 2006. LNCS, vol. 3861, pp. 278–293. Springer, Heidelberg (2006)
Priss, U.: Lattice-based information retrieval. Knowledge Organization 27(3), 132–142 (2000)
Priss, U.: Formal concept analysis in information science. Annual Review of Information Science and Technology 40, 521–543 (2006)
Witten, I.H., Bainbridge, D., Boddie, S.J.: Power to the People: End-user Building of Digital Library Collections. In: Proceedings of the first ACM/IEEE-CS joint conference on Digital libraries, pp. 94–103. ACM Press, New York (2001)
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Meghini, C., Spyratos, N. (2007). Computing Intensions of Digital Library Collections. In: Kuznetsov, S.O., Schmidt, S. (eds) Formal Concept Analysis. ICFCA 2007. Lecture Notes in Computer Science(), vol 4390. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70901-5_5
Download citation
DOI: https://doi.org/10.1007/978-3-540-70901-5_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70828-5
Online ISBN: 978-3-540-70901-5
eBook Packages: Computer ScienceComputer Science (R0)
