Abstract

The Digital Libraries area is introduced with a report on the early approaches to designing library automation systems, which can be considered the “ancestors” of present-day systems. After the background to the area has been presented, the main concepts that underlie present digital library systems are introduced, together with a report on the efforts to define the Digital Library Manifesto and the DELOS Digital Library Reference Model. Considerations on a possible way of improving present digital library systems to make them more user-centered are subsequently given. Finally, interoperability and evaluation issues are addressed. The presentation ends with a concluding remark.

1.1 Introduction

In the beginning, Digital Libraries were almost monolithic systems, each one built for a specific kind of information resources—e.g. images or videos—and with very specialized functions developed ad-hoc for those contents. This approach caused a flourishing of systems where the very same functions were developed and re-developed many times from scratch. Moreover, these systems were confined to the realm of traditional libraries, since they were the digital counterpart of the latter, and they had a kind of static view of their role, which was data-centric rather than user-centric.

Afterwards, Digital Libraries moved from being monolithic systems to becoming component and service-based systems, where easily configurable and deployable services can be plugged together and re-used to create a Digital Library. Moreover, Digital Libraries started to be seen as increasingly user-centered systems, where the original content management task is partnered with new communication and cooperation tasks, so that Digital Libraries become a vehicle by which everyone can access, discuss, and enhance information of different forms. Finally, Digital Libraries are no longer perceived as isolated systems but, on the contrary, as systems that need to cooperate with each other to improve the user experience and give personalized services.

The Digital Libraries area has attracted much attention in recent years both from academics and professionals interested in envisaging new tools and systems able to manage diversified collections of documents, artifacts, and data in digital form in a consistent and coherent way. The area can still be considered relatively young, since it is only fifteen to twenty years old, with its origins in the “Digital Library Initiative” in the USA.1 and the constellation of the DELOS initiatives in Europe.2

In this evolving scenario, the design and development of effective services which foster cooperation among users and the integration of heterogeneous information resources becomes a key factor. Relevant examples of this kind are given together with examples that make digital libraries interoperable services.

The presentation is organized as follows: Section 1.2 introduces the Digital Libraries area and reports on the initial approaches to designing library automation systems that can be considered “ancestors” of present-day systems; Section 1.3 presents the main concepts underlying present digital library systems; Section 1.4 reports on the efforts to define the Digital Library Manifesto and the DELOS Digital Library Reference Model; Section 1.5 gives some clues on possible ways of improving present digital library systems to make them more user-centered; Section 1.6 addresses the concept of interoperability in the context of digital library systems; Section 1.7 sets the scene for an evaluation framework to be adopted for evaluating digital library systems; finally, Section 1.8 puts forward some final remarks.

1.2 Digital Libraries in the Beginning

Initial application systems able to manage the permanent data of interest to libraries were named library automation systems and they first appeared world-wide in the 1970s.

In general, those systems were able to manage only the catalog data representing physical library objects (such as books, journals, and reports) that were held in a real and physical library, i.e. a physical place external to the application system, and they could refer to the physical library objects only through the managed catalog data. The functions available through those systems were limited, in particular because of the limited space available on non-volatile random access memory. One example of a system whose management functions were constrained by this limitation was LINGEB, which could manage and retrieve documents only in classes rather than as single items (Agosti et al. 1975); another was DOC-5, the system derived from it several years later (Agosti and Ronchi 1979; Agosti 1980).

Catalog data are now usually referred to as metadata, since they are data that represent other data (Metadata-IFLA-WG 2005). Each library or group of cooperating libraries normally adopts a metadata scheme to make its metadata of general use for its users or for other libraries, institutions, and systems. At that time, catalog data represented only physical objects held in real and physical libraries, so objects held in archives and museums were not represented and managed in software application systems for library automation. Only book collections of archives and museums were managed at that time through library automation systems.

In the 1980s the most advanced library automation systems were designed to include procedures able to collect log data. Log data were collected to manage the system itself and, especially, to monitor the usage of system search facilities by users. The search facility designed for user search and access to catalog data was named Online Public Access Catalog (OPAC) (Hildreth 1985, 1989), and this name is still in use today.

An OPAC is a sophisticated software system designed to provide final users with direct access to the catalog data without the intervention of a professional user and to make available to final users all the data in the catalog database managed by the software system. The catalog database was and still is constructed by professional librarians who use authority control rules in describing author, place names and other relevant catalog data (Guerrini and Sardo 2003); over time the librarians have constructed many authority files where the software system stores all lists of preferred or accepted forms of names and other relevant headings (Baldacci and Sprugnoli 1983). Figure 1.1 shows an extract from an authority file of names which are stored in alphabetical order; the first depicted accepted form is “CARTESANA, MARISA”; the last depicted form is “CARTHARIUS, CAROLUS”, which is not the accepted form, so a cross-reference to “CARTARI, CARLO”, which is the accepted form, is made.
Fig. 1.1

An extract from an authority file of names which are stored in alphabetical order
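
The cross-reference mechanism described for the authority file can be illustrated with a minimal sketch. The in-memory dictionary and the function name `accepted_form` are assumptions made for illustration only; they do not reproduce the data structures of any actual library automation system.

```python
# Hypothetical in-memory authority file: each heading is either an accepted
# form (value None) or a variant that points to its accepted form.
AUTHORITY_FILE = {
    "CARTARI, CARLO": None,                   # accepted form
    "CARTESANA, MARISA": None,                # accepted form
    "CARTHARIUS, CAROLUS": "CARTARI, CARLO",  # variant with cross-reference
}


def accepted_form(heading: str) -> str:
    """Return the accepted form of a heading, following a cross-reference."""
    key = heading.strip().upper()
    if key not in AUTHORITY_FILE:
        raise KeyError(f"heading not under authority control: {heading!r}")
    target = AUTHORITY_FILE[key]
    return key if target is None else target


print(accepted_form("Cartharius, Carolus"))
```

With this arrangement a search on a variant form is silently redirected to the accepted form, which is what allows the catalog to answer consistently regardless of the spelling the user starts from.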

Figure 1.2 shows a screen shot of the interface of a tool that gives access to authority files; the user has requested to see the list of authors that have works inserted in the catalog record database managed by the library automation system where the word “Smith” is a surname, and the letter “R” is the initial of the first name. The software tool is reporting all the variations of the string “Smith R” in alphabetical order together with the occurrences of each form present in the managed database.
Fig. 1.2

A screen shot of the interface of a tool that gives access to authority files
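
The browsing behaviour just described (all forms beginning with a given string, in alphabetical order, with their occurrences) might be sketched as follows. The headings and counts are invented sample data, and `browse` is a hypothetical helper, not the interface of the tool shown in the figure.

```python
from bisect import bisect_left

# Hypothetical headings with the number of catalog records using each form,
# kept in alphabetical order.
HEADINGS = sorted([
    ("SMITH, R.", 12),
    ("SMITH, RICHARD", 4),
    ("SMITH, ROBERT", 7),
    ("SMITH, SARAH", 3),
    ("SMITHERS, R.", 1),
])
NAMES = [name for name, _ in HEADINGS]


def browse(prefix: str) -> list[tuple[str, int]]:
    """Return (heading, occurrences) pairs that start with prefix, in order."""
    prefix = prefix.upper()
    start = bisect_left(NAMES, prefix)  # first heading >= prefix
    result = []
    for name, count in HEADINGS[start:]:
        if not name.startswith(prefix):
            break  # the list is sorted, so no later heading can match
        result.append((name, count))
    return result


for heading, occurrences in browse("Smith, R"):
    print(f"{heading:20} {occurrences}")
```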

The complex database which is managed by a library automation system is a coherent collection of catalog data and authority files which can be searched by the OPAC system to give a more professional and reliable answer to the final user.

Traditional OPAC systems were accessible by registered users and through public login procedures. In both cases it was possible to trace each user/system interaction and each user session was identifiable, because at that stage of software development each application system, even in a distributed environment, was reached through a system-dependent interface. We call this type of access pre-Web access, because only since the introduction of the Web as an Internet application, and of web clients, that is the “browsers”,3 have library automation systems become reachable in a distributed environment through a standard software interface (Tanenbaum 1996). Figure 1.3 shows a screen dump of the interface of the DUO OPAC system, which was in use at the University of Padua in the early 1990s (Agosti and Masotti 1992a, 1992b); this is an example of the type of interaction available to the user through a character-oriented interface before the introduction of the Web.
Fig. 1.3

Screen dump of DUO system: the OPAC system available at the University of Padua before the Web was invented

Figure 1.4 sketches the architecture of a library automation system before the Web was invented; this kind of library automation system can be considered an ancestor of a present-day digital library system.
Fig. 1.4

Architecture of a library automation system before the Web

Figure 1.5 represents the architecture of the most common way of accessing a library automation system nowadays where the software system is made accessible through a web interface making the system itself interoperable with a web server. Figure 1.6 shows the view a final user has of the access through the web interface to the library automation system.
Fig. 1.5

Architecture of the different systems that interoperate to permit the user to access a library automation system through the Web

Fig. 1.6

Web user interface to the library automation system of the University of Padua

Owing to the experience gained in the management of operating systems and of many other application systems that manage permanent data, log procedures are commonly put in place to collect and store data on the usage of application systems by their users. Since it became apparent over time that log data could also be used to study how users use an application, and to better adapt and personalize the system to the objectives the users were expecting to reach, log data started to be collected and analysed well before library automation systems were accessible through the Web. From the beginning of the development of library automation systems, information on the interaction between the system and the user was stored in log files, which recorded the specific queries made by final users with reference to the specific authority files from which the data were extracted. From that time on, information on OPAC queries has been used to better understand how final users actually use the data stored by the library automation system.

Towards the end of the 1980s and the beginning of the 1990s it became apparent that a library automation system could manage not only catalog data or metadata describing physical objects, but also digital files representing physical objects, such as a digital file representing all the content of a book in digital form or a digital file representing an illuminated manuscript. Later on, some objects started to appear directly in digital form (so-called born-digital objects), so the collection of types of descriptions of physical objects and of digital objects themselves was becoming increasingly diversified and complex. Former library automation systems appeared to be limited in managing data related to such a diversified situation, so the need to envisage and design a new generation of systems able to face the new reality of interest was evident.

The new collections of interest were those managed in book libraries, film libraries, music libraries, archives, museums, and so on. The new type of systems able to manage such diversified collections were named Digital Library Systems to highlight that the objects comprising the collection of interest were the many different types of objects that can be maintained in a library together with born-digital objects. Maintaining the term library was later considered misleading by some, but it still remains as the name for identifying this type of system, since no better name has been proposed and widely adopted by the reference community.

1.3 Digital Library Systems

Current digital library systems are complex software systems, often based on a service-oriented architecture, able to manage complex and diversified collections of digital objects. One significant aspect that relates current systems to the old ones is that the representation of the content of the digital objects that constitute the collection of interest is done by professionals. This means that the management of metadata can still be based on the use of authority control rules in describing author, place names and other relevant catalog data.

A digital library system can exploit authority data that keep lists of preferred or accepted forms of names and all other relevant headings, and it can also use more advanced systems of knowledge organization specifically envisaged for digital libraries, thus overcoming the shortcomings that derive from the use of authority files only in a traditional way (Hodge 2000). A more active and new way of using the principles of authority files can make a dramatic difference between digital library systems and search engines in terms of quality of information retrieval and access for the final users; by the way, this aspect is usually overcome with the analysis of log data. In fact a search engine often becomes a specific component of a digital library system, when the digital library system faces the management and search of digital objects by content, much the same way as information retrieval systems and search engines do (Salton and McGill 1983; Baeza-Yates and Ribeiro-Neto 1999; Agosti 2008). In all other types of searches, either the digital library system makes use of authority data to respond to final users in a more consistent and coherent way through a search system that is a sort of a new generation OPAC system, or the system supports the full content search with a service that gives the final users the facilities of a search engine.

If we consider the information space where a digital library system operates, as depicted in Fig. 1.7, it becomes evident that a digital library system operates on contents that require knowledge of user tasks and that are semi-structured. In fact, a digital library system operates at a sort of cross-roads between the structured data managed by catalog and database management systems, and the un-structured data managed in information retrieval systems/search engines and the Web.
Fig. 1.7

Digital library information space

The contents a digital library system is able to manage correspond to the diversified collections of media that can be represented in a digital form. This means that together with traditional textual documents, digital library systems are able to manage images, musical documents, and in general complex objects in the form of video, as shown in Fig. 1.8 where the spread of different types of media is reported. Each medium can require specific services to be supported by the digital library system to match specific user requirements and tasks.
Fig. 1.8

Contents of digital libraries

1.3.1 User Interface to Digital Library Systems

It is worth underlining that the access to each service a digital library system provides is usually supplied through a web browser, and not through a specifically designed interface. This means that the analysis of user interaction with systems that have a Web-based interface requires devising ways to reconstruct sessions in a setting, like the Web, where sessions are not naturally identified and kept (Berendt et al. 2002).
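
One widely used way of reconstructing sessions from Web logs is a time-based heuristic: requests from the same client belong to one session until the client stays inactive for longer than a threshold. The sketch below illustrates that heuristic only; the 30-minute threshold, the use of an IP address as client identifier, and the log layout are all assumptions, not details taken from the systems discussed here.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed inactivity threshold; a common heuristic value, not one from the text.
SESSION_TIMEOUT = timedelta(minutes=30)


def reconstruct_sessions(entries):
    """Group (client_id, timestamp, request) log entries into sessions.

    A new session starts when the same client has been inactive for longer
    than SESSION_TIMEOUT. Returns {client_id: [[entry, ...], ...]}.
    """
    sessions = defaultdict(list)
    for entry in sorted(entries, key=lambda e: (e[0], e[1])):
        client, timestamp, _request = entry
        client_sessions = sessions[client]
        if client_sessions and timestamp - client_sessions[-1][-1][1] <= SESSION_TIMEOUT:
            client_sessions[-1].append(entry)   # continue the open session
        else:
            client_sessions.append([entry])     # start a new session
    return dict(sessions)


# Invented sample log entries.
log = [
    ("10.0.0.1", datetime(2010, 5, 3, 9, 0), "/search?q=dante"),
    ("10.0.0.1", datetime(2010, 5, 3, 9, 10), "/record/42"),
    ("10.0.0.1", datetime(2010, 5, 3, 11, 0), "/search?q=petrarca"),
    ("10.0.0.2", datetime(2010, 5, 3, 9, 5), "/search?q=maps"),
]
sessions = reconstruct_sessions(log)
for client, client_sessions in sessions.items():
    print(client, [len(s) for s in client_sessions])
```

Identifying a client by IP address alone is fragile (proxies and shared machines merge distinct users), which is one reason why session reconstruction on the Web remains an approximation.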

The use of a web interface is advantageous as it requires less effort by the final user accustomed to the use of a web browser to access and use many Web-based applications. The disadvantage is that through a Web-based interface the specific and rich semantics specifically related to the digital library application in use cannot be expressed. Often the digital library software system developer is forced to structure the user interface to the browser characteristics instead of structuring the interaction in a more natural way for the application under development. Another negative effect of a Web-based interface is that a user who is accustomed to frequently interacting with search engines expects to find the functions a search engine supplies, without opening their mind to a system that can give a richer interaction and browsing experience. All these effects of interacting with a digital library system through a Web-based interface need to be taken into account when designing the log system of the digital library application, so that user interaction data can later be studied to improve the use of the digital library system of interest (Agosti et al. 2009).

1.3.2 Significant Examples of Digital Library Systems

The previous discussion demonstrates that a digital library system is a complex system able to support diversified functions and services. To better clarify how complex and powerful a digital library system can be, recently available significant and distinct examples of digital library systems are briefly presented. They are DelosDLMS, The European Library, and Europeana.

DelosDLMS is a prototype for the next generation of digital library management systems, and is the result of the joint effort of partners in the DELOS Network of Excellence4 representing the state of the art in the conception and design of digital library management systems.

The European Library5 is a state-of-the-art, effective service providing access to the catalogs and digital collections of most European national libraries via one central multi-lingual web interface.

The idea for Europeana6 came from a letter to the Presidency of the European Council and to the Commission on 28 April 2005: six Heads of State and Government suggested the creation of a virtual European library, aiming to make Europe’s cultural and scientific resources accessible for all. In late 2005 the European Commission started to promote and support the creation of a European digital library, as a strategic goal within the European Information Society i2010 Initiative, which aims to foster growth and jobs in the information society and media industries.

1.3.2.1 DelosDLMS

DelosDLMS, the prototype Digital Library Management System developed in the context of the DELOS Network of Excellence,7 is a relevant example of the new generation of service-oriented digital library systems. DelosDLMS combines a rich set of features in a combination unavailable in any existing system (Schek and Schuldt 2006). It combines text and audio-visual searching, offers personalized browsing using new information visualization and relevance feedback tools, provides novel interfaces, allows retrieved information to be annotated and processed, and integrates and processes sensor data streams. The system is built over OSIRIS (Weber et al. 2003), an environment initially developed at ETH Zurich and then expanded and maintained at the University of Basel.8 OSIRIS is a middleware for distributed and decentralized process execution that allows the building of process-based digital library applications starting from services (and already existing processes alike), and executes them in a distributed fashion.

The philosophy behind DelosDLMS is that digital library applications can be easily built starting from specialized services produced independently from each other. The basic architecture of DelosDLMS and, in general, of a service-oriented digital library system is depicted in Fig. 1.9.
Fig. 1.9

An example of a service-oriented digital library system

DelosDLMS was developed in two different integration phases, with the results of the first integration phase being reported by Agosti et al. (2007), and the result of the second phase being reported by Binding et al. (2007); the overview of all the services which have been integrated is given in Fig. 1.10.9
Fig. 1.10

DelosDLMS: Overview of the service-oriented digital library management system architecture

The subsequent and present status of development of DelosDLMS is reported by Ioannidis et al. (2008), who present the DelosDLMS digital library management system together with its components, developed by the research groups that operated in a coordinated way in the DELOS Network of Excellence.

1.3.2.2 The European Library

The European Library is a noncommercial organization which provides the services of a physical library and offers search facilities for the resources of many of the European national libraries. Available resources can be either digital or bibliographical, e.g. books, posters, maps, sound recordings, and videos. The European Library is a service of the “Conference of European National Librarians” (CENL)10 and it is hosted by the Koninklijke Bibliotheek, The Netherlands.11

The European Library initiative aims at providing a “low barrier of entry” so that the national libraries can join the federation with only minimal changes to their systems (van Veen and Oldroyd 2004). This means that The European Library exists to open up the universe of knowledge, information and culture of all European national libraries, where a national library is the library specifically established by a country to store its information database. National libraries usually host the legal deposit and the bibliographic control centre of a nation. Currently The European Library gives access to more than 150 million entries across Europe, and the amount of referenced digital collections is constantly increasing.12

The European Library portal13 is an evolving service currently in its version 2.3. Its home page is reported in Fig. 1.11.
Fig. 1.11

The European Library portal home page

The European Library provides three protocols to access the collections in the portal.14 The portal has corresponding components depicted in Fig. 1.12:
  • a web server: this provides users with access to the services;

  • a central index: this harvests catalog records from the national libraries, supports the “Open Archives Initiative Protocol for Metadata Harvesting” (OAI-PMH),15 and provides integrated access to them via “Search/Retrieve via URL” (SRU);

  • a gateway between SRU and Z39.50: this provides access through SRU to national libraries which would otherwise only be accessible through Z39.50.16

Fig. 1.12

Architecture of the European Library portal
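
To make the harvesting step of the central index more concrete, the following sketch builds OAI-PMH `ListRecords` requests. The verb and the `metadataPrefix`, `from`, and `resumptionToken` arguments are defined by the OAI-PMH protocol; the base URL and the helper name `list_records_url` are hypothetical, and nothing here is claimed to reproduce how The European Library actually configures its harvester.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc",
                     from_date=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    When a resumption token is given it is sent alone with the verb,
    because OAI-PMH treats resumptionToken as an exclusive argument.
    """
    if resumption_token is not None:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date is not None:
            params["from"] = from_date  # harvest only records changed since
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint of a national library.
BASE = "https://library.example.org/oai"
print(list_records_url(BASE))
print(list_records_url(BASE, from_date="2010-01-01"))
print(list_records_url(BASE, resumption_token="abc123"))
```

A harvester would issue the first request, store the returned records, and keep following resumption tokens until the repository returns none.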

In addition, the interaction between the portal, the federated libraries, and the user mainly happens on the client side by means of an extensive use of JavaScript and AJAX (Asynchronous JavaScript and XML).17

Once the client, which is a standard web browser, accesses the service and downloads all the necessary information from the web server, all the subsequent requests are managed locally by the client. The client interacts directly with each federated library and the central index, according to the SRU protocol, makes separate AJAX calls towards each federated library or the central index, and manages the responses to such calls in order to present the results to the user and to organize user interaction.
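
The fan-out of separate SRU calls can be sketched as follows. The portal performs these calls in the browser with JavaScript; Python is used here only for illustration. The `operation`, `version`, `query`, `startRecord`, and `maximumRecords` parameters belong to the SRU `searchRetrieve` operation, whereas the endpoints, the protocol version chosen, and the helper names are assumptions.

```python
from urllib.parse import urlencode

# Hypothetical SRU endpoints: the central index and two federated libraries.
TARGETS = {
    "central-index": "https://index.example.org/sru",
    "library-a": "https://library-a.example.org/sru",
    "library-b": "https://library-b.example.org/sru",
}


def sru_search_url(base_url, cql_query, start_record=1, maximum_records=10):
    """Build an SRU searchRetrieve URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",            # assumed version, not stated in the text
        "query": cql_query,
        "startRecord": start_record,
        "maximumRecords": maximum_records,
    }
    return f"{base_url}?{urlencode(params)}"


def fan_out(cql_query):
    """Return one request URL per target, mirroring the separate calls."""
    return {name: sru_search_url(url, cql_query) for name, url in TARGETS.items()}


for name, url in fan_out("dante").items():
    print(name, url)
```

Because each target is queried independently, the client can display results from fast targets while slower ones are still responding, at the cost of having to merge and present partial result sets itself.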

1.3.2.3 Europeana

The European Commission’s goal for Europeana is to make European information resources easier to use in an online environment. It will build on Europe’s rich heritage, combining multicultural and multilingual environments with technological advances and new business models.

Europeana is a Thematic Network funded by the European Commission under the eContentplus programme, as part of the i2010 policy. Originally known as the European digital library network—EDLnet—it is a partnership of 100 representatives of heritage and knowledge organizations and IT experts from throughout Europe. The partners of the thematic network are contributing to the work of solving technical and usability issues. The project is run by a core team based in the national library of the Netherlands, the Koninklijke Bibliotheek. It builds on the project management and technical expertise developed by The European Library.

The development route, site architecture and technical specifications are all published as deliverable outcomes of the project. After the launch of the Europeana prototype, the project’s final task is to recommend a business model that will ensure the sustainability of the website. It will also report on the further research and implementation needed to make Europe’s cultural heritage fully interoperable and accessible through a truly multilingual service. A number of satellite projects have been designed and are under development to reach the different and ambitious aims of Europeana. Among those “Europeana version 1.0”18 is a project that is operating to bring the Europeana prototype to full service; during 2010, a new version of the service was made available giving access to over 10 million digital objects, and in 2011 a fully-operational service with improved multilinguality and semantic web features is released. The contents are coming from libraries, museums, archives and audio-visual collections. The software solutions under development are mostly open source. The effort sees the co-operation among many different institutions, including universities, ministries and heritage strategy bodies.

The Europeana effort is still under development. Figure 1.13 shows the Web home page of the Europeana portal, and Fig. 1.14 shows the screen shot of Europeana version 1.0.
Fig. 1.13

Web home page of Europeana

Fig. 1.14

Screen shot of Europeana version 1.0

1.4 The Digital Library Manifesto and the DELOS Digital Library Reference Model

Given the absence of reference tools or guidelines for the scientists and professionals approaching the field, some scientists operating in the context of the DELOS Network of Excellence decided to work to fill this gap. They conceived the “Digital Library Manifesto” and the “Digital Library Reference Model”. The aim of the former has been to set the foundations and identify the cornerstone concepts within the universe of Digital Libraries to facilitate the integration of research and propose better ways of developing appropriate systems (Candela et al. 2006). The aim of the latter has been to unify and organize the overall body of knowledge gathered together in the sector in a coherent and systematic way (Candela et al. 2007).

Digital Libraries represent the meeting point of a large number of disciplines and fields, e.g. data management, information retrieval, library sciences, document management, information systems, the web, image processing, artificial intelligence, human-computer interaction, and others. It was only natural that these first fifteen years were mostly spent on bridging some of the gaps between the disciplines (and the scientists serving each one), improvising on what Digital Library functionality is supposed to be, and integrating solutions from each separate field into systems to support such functionality, sometimes with the solutions being induced by novel requirements of Digital Libraries. These have been achieved through much exploratory work, primarily in the context of focused efforts devising specialized approaches to address particular aspects of Digital Library functionality.

For example, one of the earliest projects was the Jukebox project, which was launched in 1991, was approved and supported in 1992 by the European Commission,19 and proposed to create a new library service based on access to large sound collections via the network. The final aim was to enlarge library services by making available sound documents which represent relevant products of twentieth century culture, since it was already evident then that sound documents play a relevant role in historical, anthropological and musical research. However, public access to these collections had previously been restricted to a small user community. The Jukebox service was innovative because it represented one of the first opportunities to offer library users easy access to new information resources at the same place. Another interesting example of research activities that preceded effective digital libraries is the feasibility study named ADMV (“Archivio Digitale della Musica Veneta”, digital archive of Venetian music), for the creation of a digital archive of images of manuscripts of musical scores, digitized versions in MIDI format, and recordings of performances of musical works by Venetian composers, such as Marcello and Pellestrina (Agosti et al. 1998). Following the feasibility study, the ADMV system was developed and made available.20

In spite of its short life, the Digital Library Manifesto introduced the entities of discourse of the digital library universe by introducing the relationships between three types of relevant “system” in the area: Digital Library, Digital Library System, and Digital Library Management System. Then it presented the main concepts characterizing the three types of systems: content, user, functionality, quality, policy, and architecture. The Manifesto also introduced the main roles that actors may play within a Digital Library, i.e. end-user, designer, administrator and application developer.

While the reader is referred to (Candela et al. 2006) for a complete presentation of the Manifesto, it seems useful to extract from it and report here two figures that sketch the three relevant systems contributing to the constitution of an effective digital library, and the main concepts of the digital libraries universe. Figure 1.15 represents the three layers of the digital libraries universe, where:
  • A Digital Library (DL) is an organization, which might be virtual, that comprehensively collects, manages and preserves for the long term rich digital content, and offers to its user communities specialized functionality on that content, of measurable quality and according to codified policies.

  • A Digital Library System (DLS) is a software system that is based on a defined (possibly distributed) architecture and provides all functionality required by a particular Digital Library. Users interact with a Digital Library through the corresponding Digital Library System.

  • A Digital Library Management System (DLMS) is a generic software system that provides the appropriate software infrastructure both to produce and administer a Digital Library System incorporating the suite of functionality considered fundamental for Digital Libraries and to integrate additional software offering more refined, specialized or advanced functionality.

Fig. 1.15

Digital Library (DL), Digital Library System (DLS), and Digital Library Management System (DLMS)
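
The relationship between the three layers, together with the six main concepts named by the Manifesto, can be summarized in a small data model. This is only an illustrative sketch: the class and field names are assumptions and do not come from the Manifesto or the Reference Model, which define these notions in prose and concept maps rather than code.

```python
from dataclasses import dataclass
from enum import Enum


class Concept(Enum):
    """The six main concepts named by the Manifesto."""
    CONTENT = "content"
    USER = "user"
    FUNCTIONALITY = "functionality"
    QUALITY = "quality"
    POLICY = "policy"
    ARCHITECTURE = "architecture"


@dataclass(frozen=True)
class DigitalLibraryManagementSystem:
    """Generic software used to produce and administer a DLS."""
    name: str


@dataclass(frozen=True)
class DigitalLibrarySystem:
    """Deployed software providing the functionality of one Digital Library."""
    name: str
    produced_with: DigitalLibraryManagementSystem


@dataclass(frozen=True)
class DigitalLibrary:
    """Organization (possibly virtual) that users reach through its DLS."""
    organization: str
    system: DigitalLibrarySystem


# Hypothetical names, for illustration only.
dlms = DigitalLibraryManagementSystem("ExampleDLMS")
dls = DigitalLibrarySystem("Example Music Archive System", produced_with=dlms)
dl = DigitalLibrary("Example Music Archive", system=dls)
print(dl.organization, "->", dl.system.name, "->", dl.system.produced_with.name)
```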

Figure 1.16 introduces the general concepts of the area (Candela et al. 2006, p. 11).
Fig. 1.16

The digital library universe

The DELOS Digital Library Reference Model is a first step towards the development of a foundational theory and the result of the “collective understanding” matured by the community of scholars who have been contributing to the growth of the sector. The DELOS Digital Library Reference Model identifies a set of notions and relations that are typical of the Digital Libraries area.

The reference model is a conceptual framework for capturing significant entities and their relationships in the universe of digital libraries. The goal is to use current knowledge of the characteristics of a DLMS to develop more concrete models of it. Conceptual maps of the reference model domains are presented and described, providing a brief overview of the concepts of each domain, the relations that bind them as well as the interaction between concepts of different domains. Lastly, the reference model presents concepts and relations in a hierarchical fashion, thus providing an overview of the specialization relations between them. Concept and relation definitions are provided for each of the concepts and relations of the concept maps. Each concept definition contains a brief definition of the concept, its relations to other concepts, the rationale behind the addition of the concept and an example. Each relation, accordingly, is described by a definition, a rationale and an example.

Thus, this model represents the only published reference guide both for scientists and professionals interested in the development and use of digital library systems.

1.5 Digital Library Systems Become User-Centered Systems: Adding Advanced Annotation Functions to Digital Library Systems

Digital library systems are evolving rapidly. Although they are still places where information resources can be stored and made available to end users, present design and development efforts are moving in the direction of transforming them into systems able to support the user in different information-centric activities. This evolution is depicted in Fig. 1.17, where digital library systems focused on specific and specialized data are shown on the left, and systems that users can use concurrently are shown on the right.
Fig. 1.17

Evolution of digital library systems: from a data centric system (left) to a user centric one (right)

Annotations are an effective means of enabling interaction between users and one or more digital library systems, since their use is a widespread and very well-established practice. Annotations are not only a way of explaining and enriching an information resource with personal observations; they are also a means of transmitting and sharing ideas to improve collaborative work practices. Furthermore, annotations allow users to naturally merge and link personal contents with the information resources provided by one or more digital library systems so that a common context unifying all of these contents can be created (Agosti and Ferro 2008b).

Furthermore, annotations cover a very broad spectrum, because they range from explaining and enriching an information resource with personal observations to transmitting and sharing ideas and knowledge on a subject. Moreover, they may cover different scopes and have different kinds of annotative context: they can be private, shared or public, according to the type of intellectual work that is being carried out. In addition, the boundaries between these scopes are not fixed, rather they may vary and evolve with time. Finally, annotations call for active involvement, the degree of which varies according to the aim of the annotation: private annotations require the involvement of the authors, whereas shared or public annotations involve the participation of a whole community. Therefore, annotations are suitable for improving collaboration and co-operation among users.

Annotations allow the creation of new relationships among existing contents, by means of links that connect annotations together and with existing content. In this sense we can consider that existing content and annotations constitute a hypertext, according to the definition of hypertext provided in (Agosti 1996). This hypertext can be exploited not only for providing alternative navigation and browsing capabilities, but also for offering advanced search functionalities (Agosti and Melucci 2001). Furthermore, Marshall (1998) considers annotations as a natural way of creating and growing hypertexts that connect information resources in a digital library system by actively engaging users. Finally, the hypertext existing between information resources and annotations enables different annotation configurations, such as threads of annotations, i.e. chains in which an annotation is made in response to another annotation, and sets of annotations, i.e. bundles of annotations on the same passage of text (Agosti and Ferro 2003; Agosti et al. 2004).
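The two annotation configurations mentioned above, threads and sets, can be illustrated with a minimal sketch. It is not the data model of any specific annotation service: the `Annotation` class, its field names, and the sample data are all hypothetical, and the hypertext is reduced to two kinds of links (annotation to passage, annotation to annotation).

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Annotation:
    """A hypothetical annotation node; all field names are illustrative."""
    ann_id: str
    author: str
    text: str
    target_doc: Optional[str] = None      # annotated document, if any
    target_passage: Optional[str] = None  # passage identifier inside the document
    reply_to: Optional[str] = None        # parent annotation, if this is a reply


def thread_of(ann_id, annotations):
    """Return the annotation ids from the thread root down to ann_id."""
    by_id = {a.ann_id: a for a in annotations}
    chain, seen = [], set()
    current = by_id.get(ann_id)
    # Follow reply links upwards; 'seen' guards against malformed cycles.
    while current is not None and current.ann_id not in seen:
        seen.add(current.ann_id)
        chain.append(current.ann_id)
        current = by_id.get(current.reply_to) if current.reply_to else None
    return list(reversed(chain))


def annotation_sets(annotations):
    """Group annotations made directly on the same passage of the same document."""
    groups = defaultdict(list)
    for a in annotations:
        if a.target_doc is not None and a.target_passage is not None:
            groups[(a.target_doc, a.target_passage)].append(a.ann_id)
    return dict(groups)


ANNOTATIONS = [
    Annotation("a1", "alice", "Compare with the 1996 definition.", "doc1", "p3"),
    Annotation("a2", "bob", "Agreed, see also section 2.", reply_to="a1"),
    Annotation("a3", "carol", "This passage is unclear.", "doc1", "p3"),
    Annotation("a4", "alice", "Clarified in the appendix.", reply_to="a2"),
]

print(thread_of("a4", ANNOTATIONS))   # thread: a1 <- a2 <- a4
print(annotation_sets(ANNOTATIONS))   # set: a1 and a3 on passage p3 of doc1
```

In this sketch a thread is recovered by following reply links back to the root, while a set is simply the group of annotations that share the same target passage.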

Thus, annotations introduce a new content layer aimed at elucidating the meaning of underlying documents, so that annotations can make hidden facets of the annotated documents more explicit. In conclusion, we can consider that annotations constitute a special kind of context, that we call annotative context, for the documents of a digital library, because they provide additional content concerned with the annotated documents. This viewpoint about annotations covers a wide range of annotation kinds, ranging from personal jottings in the margin of a page to scholarly comments made by an expert to explain a passage of a text. Thus, these different kinds of annotations involve different scopes for the annotation itself and, consequently, different kinds of annotative context. If we deal with a personal jotting, the recipient of the annotation is usually the author himself and so this kind of annotation involves a private annotative context; on the other hand, the recipients of a scholarly annotation are usually people who are not necessarily related to the author of the annotation, which thus involves a public annotative context; finally, a team of people working together on a topic can share annotations about it, which in this case involve a collaborative annotative context.

Digital library systems usually offer some basic hypertext and browsing capabilities based on the available structured data, such as authors or references. But they do not normally provide users with advanced hypertext functions, where the information resources are linked on the basis of the semantics of their content and hypertext information retrieval functionalities are available. A relevant aspect of annotations is that they permit the construction over time of a useful hypertext (Agosti et al. 2004), which relates pieces of information of personal interest, which are inserted by final users, to the digital objects which are managed by the software system. In fact, the user annotations allow the creation of new relationships among existing digital objects by means of links that connect annotations together with existing objects. In addition, the hypertext between annotations and annotated objects can be exploited not only for providing alternative navigation and browsing capabilities, but also for offering advanced search functions, able to retrieve more and better ranked objects in response to a user query by also exploiting the annotations linked to them (Agosti and Ferro 2005a, 2006).
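To make the idea of annotation-aware search more concrete, the following toy sketch combines a document's own score with evidence coming from the annotations linked to it. It is only an illustration under strong assumptions: the term-overlap score, the `weight` parameter, and the sample data are hypothetical and are not the ranking strategy of the cited works.

```python
def term_overlap(query, text):
    """Count the distinct query terms that occur in the text (toy relevance score)."""
    query_terms = set(query.lower().split())
    text_terms = set(text.lower().split())
    return len(query_terms & text_terms)


def rank_with_annotations(query, documents, annotations, weight=0.5):
    """Rank documents by their own score plus weighted evidence from annotations.

    documents:   dict mapping document id -> document text
    annotations: dict mapping document id -> list of annotation texts
    weight:      assumed contribution of annotation evidence to the final score
    """
    scores = {}
    for doc_id, text in documents.items():
        own = term_overlap(query, text)
        linked = sum(term_overlap(query, a) for a in annotations.get(doc_id, []))
        scores[doc_id] = own + weight * linked
    # Highest score first; ties are broken by document id for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [(doc_id, score) for doc_id, score in ranked if score > 0]


DOCUMENTS = {
    "d1": "catalogue of venetian music manuscripts",
    "d2": "survey of library automation systems",
    "d3": "opera scores from the eighteenth century",
}
ANNOTATIONS_BY_DOC = {
    "d3": ["venetian music performed in Venice", "see also the music catalogue"],
}

print(rank_with_annotations("venetian music", DOCUMENTS, ANNOTATIONS_BY_DOC))
print(rank_with_annotations("venetian music", DOCUMENTS, {}))
```

With the sample data, document `d3` does not contain the query terms itself and is retrieved only because of its annotations, which is the effect described above: more objects can be returned when the annotations linked to them are exploited.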

Therefore, annotations can turn out to be an effective way of associating this kind of hypertext with a digital library system to enable the active and dynamic use of information resources. In addition, this hypertext can span and cross the boundaries of the single digital library system, if users need to interact with the information resources managed by multiple digital library systems (Agosti and Ferro 2008a). This latter possibility is quite innovative, because it offers the means for interconnecting various digital library systems in a personalized and meaningful way for the end-user, and, as highlighted by Ioannidis et al. (2005), this is a relevant challenge for the digital library systems of the next generation.

An annotation service with these characteristics has been designed and developed at the University of Padua. This service is named “Flexible Annotation Service Tool” (FAST) (Agosti and Ferro 2003, 2005b; Ferro 2005) because it is able to represent and manage annotations which range from metadata to full content. Its flexible and modular architecture makes it suitable for annotating general web resources as well as digital objects managed by different digital library systems. As depicted in Fig. 1.18, the FAST annotation service can be used as a tool for keeping a user’s memory of thoughts, integrating them with and relating them to information resources managed by different digital library systems.
Fig. 1.18

FAST: A user annotation service for many digital library systems

The annotations themselves can be complex multimedia compound objects, with varying degrees of visibility, ranging from private to shared and public annotations, and with different access rights. Figure 1.19 illustrates the situation in which FAST manages the annotations that have been produced by two users and that are on documents managed by two different digital library management systems.
Fig. 1.19

User annotations managed by FAST and digital libraries of annotated documents managed by different systems
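The private, shared and public degrees of visibility can be sketched as a simple read-access check. This is a hypothetical illustration, not the access model of FAST: the `Scope` values, the `shared_with` field, and the rule that authors can always read their own annotations are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    PRIVATE = "private"
    SHARED = "shared"
    PUBLIC = "public"


@dataclass(frozen=True)
class ScopedAnnotation:
    """A hypothetical annotation carrying only the fields needed for visibility."""
    ann_id: str
    author: str
    scope: Scope
    shared_with: frozenset = frozenset()  # group members, used only when SHARED


def can_read(user, annotation):
    """Decide whether 'user' may read the annotation under the assumed rules."""
    if annotation.scope is Scope.PUBLIC:
        return True
    if user == annotation.author:
        return True  # assumption: authors always see their own annotations
    if annotation.scope is Scope.SHARED:
        return user in annotation.shared_with
    return False  # PRIVATE annotation of another user


def visible_ids(user, annotations):
    """Return the ids of the annotations the user may read, in input order."""
    return [a.ann_id for a in annotations if can_read(user, a)]


NOTES = [
    ScopedAnnotation("n1", "alice", Scope.PRIVATE),
    ScopedAnnotation("n2", "alice", Scope.SHARED, frozenset({"bob"})),
    ScopedAnnotation("n3", "bob", Scope.PUBLIC),
]

for user in ("alice", "bob", "carol"):
    print(user, visible_ids(user, NOTES))
```

Because the scope is a plain attribute in this sketch, widening it over time (for example from private to shared) amounts to creating an updated copy of the annotation with a different `scope` value.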

1.6 Interoperability between Digital Library Systems

A relevant aspect that needs to be addressed to support final users with digital library systems that are user-centric is “interoperability”. Interoperability is a complex and multiform concept, which “ISO/IEC 2382-01, Information Technology Vocabulary, Fundamental Terms” defines as follows: “The capability to communicate, execute programs, or transfer data among various functional units in a manner that requires the user to have little or no knowledge of the unique characteristics of those units”. In the context of digital library systems, the definition has been further specified by the EC Working Group on Digital Library Interoperability, which has identified six aspects that can be distinguished and taken into account: interoperating entities, objects of interaction, functional perspective of interoperation, linguistic interoperability (multilingualism), design and user perspectives, and technological standards enabling different kinds of interoperability21 (Gradmann 2007).

The six dimensions are depicted in Fig. 1.20; the EC Working Group on Digital Library Interoperability defines them as follows:
  • Interoperating entities: These can be assumed to be the traditional cultural heritage institutions, such as libraries, museums, archives, and other institutions in charge of preservation of artifacts, that offer digital services, or again the digital repositories (institutional or not), eScience and/or eLearning platforms or simply web services.

  • Information objects: The entities that actually need to be processed in interoperability scenarios. Choices range from the full content of digital information objects (analogue/digitized or born digital) to mere representations of such objects—and these in turn are often conceived as librarian metadata attribute sets, but are sometimes also conceived as “surrogates”.

  • Functional perspective: This may simply be the exchange and/or propagation of digital content. Other functional goals are aggregating digital objects into a common content layer. Another approach is to enable users and/or software applications to interact with multiple digital library systems via unified interfaces or to facilitate operations across federated autonomous digital library systems.

  • Multilinguality-multilingualism: Linguistic interoperability can be thought of in two different ways: as multilingual user interfaces to digital library systems or as dynamic multilingual techniques for exploring the object space of digital library systems. Three types of approaches can be distinguished in the second respect: dynamic query translation for addressing digital library systems that manage different languages, dynamic translation of metadata responding to queries in different languages, or dynamic localization of digital content.

  • User perspective: Interoperability concepts of a digital library system manager differ substantially from those of a content consuming end user. A technical administrator will have a very different view from an end user providing content as an author.

  • Interoperability technology: The technology enabling different kinds of interoperability constitutes a major dimension, with more traditional approaches geared towards librarian metadata interoperability, such as Z39.50 and SRU, alongside harvesting methods based on OAI-PMH and web service-based approaches.

Fig. 1.20

Dimensions of interoperability in the context of digital libraries
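As an illustration of the harvesting approach mentioned under interoperability technology, the sketch below builds OAI-PMH `ListRecords` request URLs. The verb and the argument names (`metadataPrefix`, `from`, `until`, `set`, `resumptionToken`) come from the OAI-PMH protocol; the base URL `https://repository.example.org/oai` and the function name are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None,
                     until_date=None, set_spec=None, resumption_token=None):
    """Build the URL of an OAI-PMH ListRecords request (no network access)."""
    if resumption_token is not None:
        # resumptionToken is an exclusive argument: it replaces all the others.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date is not None:
            params["from"] = from_date      # selective harvesting, lower bound
        if until_date is not None:
            params["until"] = until_date    # selective harvesting, upper bound
        if set_spec is not None:
            params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint, used only for illustration.
BASE_URL = "https://repository.example.org/oai"

print(list_records_url(BASE_URL, from_date="2010-01-01"))
print(list_records_url(BASE_URL, resumption_token="abc123"))
```

A harvester would typically issue the first request, parse the XML response, and keep following the returned resumption tokens until the repository signals that the list is complete.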

1.7 Evaluation of Digital Libraries

Although evaluation activities started soon after the first digital library systems were available, the underlying assumptions and goals of the evaluation approaches were quite disparate. So, in the context of the DELOS activities, efforts were made to analyse the general situation and to propose a framework for evaluating digital library systems with the objective of providing a set of flexible and adaptable guidelines for digital library systems evaluation (Fuhr et al. 2007); a number of recommendations have emerged from the proposed framework, and those of specific interest when setting up a digital library evaluation activity are here recalled:
  • Flexible evaluation frameworks: For complex entities such as digital library systems, the evaluation framework should be flexible, allowing for multi-level evaluations (e.g. by following the six levels proposed by Saracevic (2004), including user and social). Furthermore, any evaluation framework should undergo a period of discussion, revision and validation by the digital libraries community before being widely adopted. Flexibility would help researchers to avoid obsolete studies based on rigid frameworks, and to use models that can “expand” or “collapse” according to their project’s requirements and conditions.

  • Involvement of practitioners and real users: Practitioners have a wealth of experience and domain-related knowledge that is often neglected. Better communication and definition of common terminology, aims and objectives could establish a framework of co-operation and boost this research area.

  • Build on past experiences of large evaluation initiatives: Evaluation initiatives such as TREC,22 CLEF,23 INEX,24 and NTCIR25 have collected a wealth of knowledge about evaluation methodology, which needs to be effectively deployed.

In order to foster evaluation research in general, the following issues should be addressed:
  • Community building in evaluation research: The lack of globally accepted abstract evaluation models and methodologies can be counter-balanced by collecting, publishing and analyzing current research activities. Maintaining an updated inventory of evaluation activities and their interrelations can help to define good practice in the field and to help the research community to reach a consensus.

  • Establishment of primary data repositories: The provision of open access to primary evaluation data (e.g. transaction logs, surveys, monitored events), as is common in other research fields, should be a goal. In this respect, methods to anonymize the primary data must be adopted, as privacy is a strong concern. Common repositories and infrastructures for storing primary and pre-processed data are proposed along with the collaborative formation of evaluation best practices, and modular building blocks to be used in evaluation activities. An exemplary model of such a primary data repository is DIRECT, a digital library which manages scientific data to be used during the evaluation campaign of multilingual search and access systems (Di Nunzio and Ferro 2005; Dussin and Ferro 2009; Agosti and Ferro 2009). Figure 1.21 illustrates some examples of the activities that are supported by the DIRECT system during the CLEF evaluation campaigns.26
    Fig. 1.21

    DIRECT: Examples of supported activities in CLEF campaigns

  • Standardized logging format: Further use and dissemination of common logging standards is also considered useful (Klas et al. 2006). Logging could be extended to include user behavior and system internal activity and to support the personalization and intelligent user interface design processes.

  • Evaluation of user behavior in-the-large: Currently, evaluation is focused too much on user interface and system issues. User satisfaction with respect to how far the user information needs have been satisfied (i.e. information access) must be investigated, independently of the methods used to fulfill these needs. The determination of user strategies and tactics is also recommended (such as search strategies, and browsing behaviors). This relates to evaluation in context, and to the question of identifying dependencies in various contexts (e.g. sociological, business, institutional). Collecting user behavior as implicit rating information can also be used to establish collaborative filtering services in digital library environments.

  • Differences that are specific to the domain of evaluation: An important problem is how to relate a possible model of digital library system to other overlapping models in other areas. How does a digital library system relate to other complex networked information systems (e.g. archives, portals, knowledge bases) and their models? Is it possible to connect or integrate digital library system models to the multitude of related existing models? The answer to this question should also help to define the independent research area of digital library systems evaluation.
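The recommendations on primary data repositories and on a common logging format can be illustrated with a small sketch that prepares a transaction-log event for sharing. It is a minimal example under stated assumptions: the event fields (`user_id`, `ip_address`, `action`, `query`) are hypothetical and do not reproduce any published logging scheme, and the keyed hash provides pseudonymization, which is weaker than full anonymization.

```python
import hashlib
import hmac


def pseudonymize(user_id, secret_key):
    """Replace a user identifier with a stable keyed hash (HMAC-SHA256)."""
    digest = hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]  # shortened for readability in logs


def anonymize_event(event, secret_key):
    """Return a copy of a log event without direct identifiers.

    The same user always maps to the same pseudonym, so sessions and
    behaviour can still be studied without exposing who the user is.
    """
    cleaned = {k: v for k, v in event.items()
               if k not in ("user_id", "ip_address")}
    cleaned["user"] = pseudonymize(event["user_id"], secret_key)
    return cleaned


# Hypothetical key and event, used only for illustration.
SECRET_KEY = b"keep-this-key-out-of-the-shared-repository"
RAW_EVENT = {
    "timestamp": "2010-05-04T10:15:00Z",
    "user_id": "u42",
    "ip_address": "10.0.0.1",
    "action": "search",
    "query": "venetian music",
}

CLEAN_EVENT = anonymize_event(RAW_EVENT, SECRET_KEY)
print(CLEAN_EVENT)
```

Query strings themselves may contain personal information, so a real repository would probably need further checks before releasing the data; the sketch only removes the direct identifiers.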

1.8 Conclusions

A final and general consideration that emerges from what has been presented is that the digital libraries area is very active and dense with new challenges and open problems. Probably the term “digital libraries” is not really adequate in representing the flourishing area that is concerned with user requirements, digital contents, system architectures, functions, policies, quality, and evaluation, but it is a token that evokes the fascinating world of representing and managing the knowledge that humankind has been able to produce and make collectively available.

Footnotes

  2. The initial European initiative for dealing with digital libraries was the DELOS Working Group, active from January 1996 to December 1999, which was funded by the ESPRIT Long Term Research Programme within the Fourth Framework Programme of the Commission of the European Union, URL: http://delos-noe.isti.cnr.it/home/background.html. Due to the success of the DELOS Working Group in terms of acting as a focal point for the European as well as the world-wide digital library community, the Commission approved funding for an initial DELOS Network of Excellence on Digital Libraries, from January 2000 to December 2002, and later on the DELOS Network of Excellence on Digital Libraries, from January 2004 to December 2007, URL: http://www.delos.info/.
  7. DELOS, the Network of Excellence on Digital Libraries which operated from 2004 to 2007 in the context of the Information Society Technologies (IST) Program of the European Commission (Contract G038-507618).
  8. The Databases and Information Systems (DBIS) Group of the University of Basel is taking care of the development of OSIRIS in the context of “OSIRIS next”, a peer-to-peer based open service infrastructure aiming to implement and demonstrate a vision of modern service oriented information systems, URL: http://on.cs.unibas.ch/.
  9. An overall presentation of the DelosDLMS prototype is available at the URL: http://dbis.cs.unibas.ch/.
  10. See http://www.cenl.org/. Since 2006 the Conference of European National Librarians has been included in the list of International Non-Governmental Organizations (INGO) enjoying participatory status with the Council of Europe.
  12. At present 48 national libraries of Europe are collaborating with The European Library; an updated map of the participating national libraries and information on them are available at the URL: http://search.theeuropeanlibrary.org/portal/en/libraries.html.
  14. Useful information on the way a partner has to adapt to the technical infrastructure of The European Library environment can be found in the Handbook available at the URL: http://www.theeuropeanlibrary.org/portal/organisation/handbook/.
  21. The results of the EC Working Group on Digital Library Interoperability are reported in the briefing paper by Stefan Gradmann, entitled “Interoperability. A key concept for large scale, persistent digital libraries”, which can be downloaded from the URL: http://www.digitalpreservationeurope.eu/publications/briefs/interoperability.pdf.
  22. Text REtrieval Conference, URL: http://trec.nist.gov/.
  23. Cross-Language Evaluation Forum, URL: http://www.clef-campaign.org/.
  24. INitiative for the Evaluation of XML Retrieval, from 2002 to 2007 at the URL: http://inex.is.informatik.uni-duisburg.de/, onwards at the URL: http://www.inex.otago.ac.nz/.
  25. NII Test Collection for IR Systems Project, URL: http://research.nii.ac.jp/ntcir/.
  26. More information on the DIRECT system and on the data collections managed by it can be found at the URL: http://direct.dei.unipd.it/.

Acknowledgements

The paper reports on work which originated in the context of the DELOS Network of Excellence on Digital Libraries. The author thanks Costantino Thanos, coordinator of DELOS, for his continuous support and advice.

The reported work has been partially supported by the TELplus Targeted Project for digital libraries, as part of the eContentplus Programme of the European Commission (Contract ECP-2006-DILI-510003) and by EuropeanaConnect Best Practice Network funded by the European Commission within the area of Digital Libraries of the eContentplus Programme (Contract ECP-2008-DILI-52800).

References

  1. Agosti M (1980) L’esperienza pilota dell’Istituto di Statistica dell’Università di Padova nel progetto BAC. In: Alcuni Problemi e Prospettive di Organizzazione e Diffusione Dell’informazione Bibliografica. CLEUP, Padova, Italy, pp 89–104
  2. Agosti M (1996) An overview of hypertext. In: Information Retrieval and Hypertext. Springer, Berlin, pp 27–47
  3. Agosti M (ed) (2008) Information Access Through Search Engines and Digital Libraries. Springer, Berlin
  4. Agosti M, Ferro N (2003) Annotations: Enriching a digital library. In: Proceedings of the European Conference on Digital Libraries, pp 88–100
  5. Agosti M, Ferro N (2005a) Annotations as context for searching documents. In: Proceedings of the Conference on Conceptions of Library and Information Science, pp 155–170
  6. Agosti M, Ferro N (2005b) A system architecture as a support to a flexible annotation service. In: Proceedings of the DELOS Conference on Digital Libraries, pp 147–166
  7. Agosti M, Ferro N (2006) Search strategies for finding annotations and annotated documents: The FAST service. In: Proceedings of the Conference on Flexible Query Answering Systems, pp 270–281
  8. Agosti M, Ferro N (2008a) Adding advanced annotation functionalities to an existing digital library. In: Interdisciplinary Aspects of Information Systems Studies. Springer, Berlin, pp 279–286
  9. Agosti M, Ferro N (2008b) A formal model of annotations of digital content. ACM Transactions on Information Systems 26(1):3–57
  10. Agosti M, Ferro N (2009) Towards an evaluation infrastructure for DL performance evaluation. In: Tsakonas G, Papatheodorou C (eds) Evaluation of Digital Libraries: An Insight to Useful Applications and Methods. Chandos Publishing, Oxford, pp 93–120
  11. Agosti M, Masotti M (1992a) Design and functions of DUO: The first Italian academic OPAC. In: Proceedings of the ACM Symposium on Applied Computing, pp 308–313
  12. Agosti M, Masotti M (1992b) Design of an OPAC database to permit different subject searching accesses in a multi-disciplines universities library catalogue database. In: Proceedings of the ACM Conference on Research and Development in Information Retrieval. ACM Press, New York, NY, pp 245–255
  13. Agosti M, Melucci M (2001) Information retrieval on the Web. In: Agosti M, Crestani F, Pasi G (eds) Lectures on Information Retrieval: Third European Summer-School. ESSIR 2000 (Revised Lectures). Springer, Berlin/Heidelberg, pp 242–285
  14. Agosti M, Ronchi M (1979) Progetto BAC: La Biblioteca Automatica del CINECA. In: Atti del Congresso dell’Associazione Italiana per il Calcolo Automatico, Bari, Italy, pp 367–370
  15. Agosti M, Caovilla E, Crescenti M, Lissandrini L, Rigoni A (1975) LINGEB—linguaggio gestione biblioteche. L’elaborazione Automatica 2(2):1–107
  16. Agosti M, Bombi F, Melucci M, Mian G (1998) Towards a digital library for the venetian music of the eighteenth century (abstract). In: Proceedings of the International Conference on Digital Resources in the Humanities, the Humanities Advanced Technology and Information Institute, Glasgow, Scotland, pp 75–77
  17. Agosti M, Ferro N, Frommholz I, Thiel U (2004) Annotations in digital libraries and collaboratories—facets, models and usage. In: Proceedings of the European Conference on Digital Libraries, pp 244–255
  18. Agosti M, Berretti S, Brettlecker G, del Bimbo A, Ferro N, Fuhr N, Keim D, Klas CP, Lidy T, Milano D, Norrie M, Ranaldi P, Rauber A, Schek HJ, Schreck T, Schuldt H, Signer B, Springmann M (2007) DelosDLMS—the integrated DELOS digital library management system. In: Thanos C, Borri F, Candela L (eds) Proceedings of the DELOS Conference on Digital Libraries. Lecture Notes in Computer Science, vol 4877. Springer, Heidelberg, pp 36–45
  19. Agosti M, Crivellari F, Di Nunzio G, Ioannidis Y, Stamatogiannakis E, Triantafyllidi M, Vayanou M (2009) Searching and browsing digital library catalogues: A combined log analysis for the European library. In: Proceedings of the Italian Research Conference on Digital Libraries, pp 120–135
  20. Baeza-Yates R, Ribeiro-Neto B (1999) Modern Information Retrieval. ACM Press/Addison-Wesley, New York, NY
  21. Baldacci B, Sprugnoli R (1983) Informatica e Biblioteche: Automazione dei Sistemi Informativi Bibliotecari. NIS, Roma
  22. Berendt B, Mobasher B, Nakagawa M, Spiliopoulou M (2002) The impact of site structure and user environment on session reconstruction in web usage analysis. In: Proceedings of the Workshop on Web Mining and Web Usage Analysis, pp 159–179
  23. Binding C, Brettlecker G, Catarci T, Christodoulakis S, Crecelius T, Gioldasis N, Jetter HC, Kacimi M, Milano D, Ranaldi P, Reiterer H, Santucci G, Schek HJ, Schuldt H, Tudhope D, Weikum G (2007) DelosDLMS: infrastructure and services for future digital library systems. http://hci.uni-konstanz.de/downloads/Paper_DelosDLMS_Infrastructure_and_Servcies.pdf, visited on February, 2011
  24. Candela L, Castelli D, Ioannidis Y, Koutrika G, Pagano P, Ross S, Schek HJ, Schuldt H (2006) The digital library manifesto. http://www.delos.info/index.php?option=com_content&task=view&id=345&Itemid, visited on February, 2011
  25. Candela L, Castelli D, Ferro N, Ioannidis Y, Koutrika G, Meghini C, Pagano P, Ross S, Soergel D, Agosti M, Dobreva M, Katifori V, Schuldt H (2007) The DELOS digital library reference model. Foundations for digital libraries. Version 0.98. http://www.delos.info/index.php?option=com_content&task=view&id=345&Itemid=#reference_model, visited on February, 2011
  26. Di Nunzio G, Ferro N (2005) DIRECT: A system for evaluating information access components of digital libraries. In: Proceedings of the European Conference on Digital Libraries, pp 483–484
  27. Dussin M, Ferro N (2009) DIRECT: Applying the DIKW hierarchy to large-scale evaluation campaigns. In: Larsen R, Paepcke A, Borbinha J, Naaman M (eds) Proceedings of the ACM/IEEE-CS Joint Conference on Digital Libraries. ACM Press, New York, NY, p 424
  28. Ferro N (2005) Design choices for a flexible annotation service. In: Proceedings of the Italian Research Conference on Digital Libraries, pp 101–110
  29. Fuhr N, Tsakonas G, Aalberg T, Agosti M, Hansen P, Kapidakis S, Klas CP, Kovács L, Landoni M, Micsik A, Papatheodorou C, Peters C, Sølvberg S (2007) Evaluation of digital libraries. International Journal on Digital Libraries 8(1):21–38
  30. Gradmann S (2007) Interoperability: A key concept for large scale, persistent digital libraries. http://www.digitalpreservationeurope.eu/publications/briefs/interoperability.pdf, visited on February, 2011
  31. Guerrini M, Sardo L (2003) Authority Control. Associazione Italiana Biblioteche, Roma
  32. Hildreth C (1985) Online public access catalogs. Annual Review of Information Science and Technology 20:233–285
  33. Hildreth C (ed) (1989) The Online Catalogue: Developments and Directions. Library Association, London
  34. Hodge G (2000) Systems of knowledge organization for digital libraries: Beyond traditional authority files. Tech rep, Council on Library and Information Resources (CLIR). http://www.clir.org/pubs/reports/pub91/contents.html, visited on December, 2010
  35. Ioannidis Y, Maier D, Abiteboul S, Buneman P, Davidson S, Fox E, Halevy A, Knoblock C, Rabitti F, Schek HJ, Weikum G (2005) Digital library information-technology infrastructures. International Journal on Digital Libraries 5(4):266–274
  36. Ioannidis Y, Milano D, Schek HJ, Schuldt H (2008) DelosDLMS from the DELOS vision to the implementation of a future digital library management system. International Journal on Digital Libraries 9:101–114
  37. Klas C, Albrechtsen H, Hansen P, Kapidakis S, Kovacs L, Kriewel S, Micsik A, Papatheodorou C, Tsakonas G, Jacob E (2006) A logging scheme for comparative digital library evaluation. In: Proceedings of the European Conference on Digital Libraries, pp 17–22
  38. Marshall C (1998) Toward an ecology of hypertext annotation. In: Proceedings of the Conference on Hypertext and Hypermedia, pp 40–49
  39. Metadata-IFLA-WG (2005) Guidance on the nature, implementation and evaluation of metadata schemas in libraries. Final report of the IFLA cataloguing section working group on the use of metadata schemas. Tech rep, IFLA. For the Review and Approval of the IFLA Cataloguing Section
  40. Salton G, McGill M (1983) Introduction to Modern Information Retrieval. McGraw-Hill, New York, NY
  41. Saracevic T (2004) Evaluation of digital libraries: An overview. http://comminfo.rutgers.edu/~tefko/DL_evaluation_Delos.pdf, visited on February, 2011
  42. Schek HJ, Schuldt H (2006) DelosDLMS—infrastructure for the next generation of digital management systems. ERCIM News: Special Issue on the European Digital Library 66:22–24
  43. Tanenbaum A (1996) Computer Networks, 3rd edn. Prentice Hall, Upper Saddle River, NJ
  44. van Veen T, Oldroyd B (2004) Search and retrieval in the European library. A new approach. http://www.dlib.org/dlib/february04/vanveen/02vanveen.html, visited on December, 2010
  45. Weber R, Schuler C, Neukomm P, Schuldt H, Schek HJ (2003) Webservice composition with O’GRAPE and OSIRIS. In: Proceedings of the International Conference on Very Large Databases, pp 1081–1084

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  1. Department of Information Engineering, University of Padua, Padova, Italy
