Abstract
The notion of collection plays a key role in Digital Libraries, where several kinds of collections are typically found. We claim that all these kinds can be unified into a single abstraction mechanism, endowed with an extension and an intension, similarly to predicates in logic. The extension of a collection is the set of documents that are members of the collection at a given point in time. The intension is a description of the meaning of the collection, that is, the characteristic property that the members of the collection possess and that distinguishes the collection from other collections. The problem then arises of how to derive the intension automatically from a given extension, a problem that must be solved, for example, when a collection is created from a set of documents. It turns out that our notion of collection is very close to the notion of formal concept in Formal Concept Analysis, which provides a well-founded framework for formalizing the problem and useful tools for solving it. We exploit this framework to study the problem of automatically deriving a collection intension from a given extension. We then show how intensions can be exploited for carrying out basic tasks on collections, establishing a connection between Digital Library management and data integration.
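As an illustration of the extension/intension pairing described above, the following is a minimal sketch using the standard derivation operators of Formal Concept Analysis. It is not taken from the chapter: the toy context `CONTEXT`, the attribute names, and the helper names `intent`, `extent`, and `derive_collection` are all assumptions made for this example, and the chapter's own procedure may differ, in particular when the given set of documents is not itself the extent of a formal concept.

```python
from typing import Dict, Set, Tuple

# Hypothetical toy formal context: document id -> descriptive attributes.
CONTEXT: Dict[str, Set[str]] = {
    "d1": {"lattice", "retrieval"},
    "d2": {"lattice", "retrieval", "browsing"},
    "d3": {"retrieval", "peer-to-peer"},
    "d4": {"lattice"},
}


def intent(docs: Set[str], context: Dict[str, Set[str]]) -> Set[str]:
    """Attributes shared by every document in docs.

    For an empty set of documents this is the set of all attributes.
    """
    shared = set().union(*context.values())
    for doc in docs:
        shared &= context[doc]
    return shared


def extent(attrs: Set[str], context: Dict[str, Set[str]]) -> Set[str]:
    """Documents that possess every attribute in attrs."""
    return {doc for doc, doc_attrs in context.items() if attrs <= doc_attrs}


def derive_collection(
    docs: Set[str], context: Dict[str, Set[str]]
) -> Tuple[Set[str], Set[str]]:
    """Return (extension, intension) of the smallest formal concept covering docs."""
    intension = intent(docs, context)
    extension = extent(intension, context)
    return extension, intension


if __name__ == "__main__":
    # {d1, d2} is already a concept extent: the derived extension is unchanged.
    print(derive_collection({"d1", "d2"}, CONTEXT))
    # {d1, d3} is not: the shared description also admits d2.
    print(derive_collection({"d1", "d3"}, CONTEXT))
```

In this sketch, a set of documents that is not a concept extent yields an intension that describes a strictly larger set, which is one way to see why deriving an intension from an arbitrary extension is a non-trivial problem.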
© 2010 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Meghini, C., Spyratos, N. (2010). Unifying the Concept of Collection in Digital Libraries. In: Ras, Z.W., Tsay, LS. (eds) Advances in Intelligent Information Systems. Studies in Computational Intelligence, vol 265. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05183-8_8
DOI: https://doi.org/10.1007/978-3-642-05183-8_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-05182-1
Online ISBN: 978-3-642-05183-8
eBook Packages: Engineering, Engineering (R0)