Abstract
Digital library (DL) support for different information seeking strategies (ISS) has not evolved as fast as the amount of stock DLs offer or the quality of its presentation. However, several studies argue for supporting explorative ISS in conjunction with the directed query-response paradigm. Hence, this paper presents a primarily explorative research system prototype for metadata harvesting that gives researchers multimodal access to DL stock during the research idea development phase, i.e., while the information need (IN) is vague. To address evolving INs, the prototype also allows ISS transitions, e.g., to OPACs, if accuracy is needed.
As its second contribution, the paper presents a curated data set for digital humanities researchers that is automatically enriched with metadata derived by different algorithms, including content-based image features. The automatic enrichment of the originally bibliographic metadata is needed to support the exploration of large metadata stock, as traditional metadata does not always address vague INs.
The presented proof of concept shows that use-case-specific metadata facilitates interaction with large metadata corpora.
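The metadata harvesting mentioned above can be illustrated with a minimal sketch, assuming the harvesting follows OAI-PMH as the notes suggest. It is not the prototype's implementation: the endpoint `https://example.org/oai`, the helper names, and the `SAMPLE` response are hypothetical and exist only for illustration. The sketch builds a `ListRecords` request URL and extracts identifiers, Dublin Core titles, and the resumption token from a response.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical, hand-written response used instead of a live endpoint.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
 <ListRecords>
  <record>
   <header><identifier>oai:example.org:1</identifier></header>
   <metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
     <dc:title>Sample title</dc:title>
    </oai_dc:dc>
   </metadata>
  </record>
  <resumptionToken>token-123</resumptionToken>
 </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for a (hypothetical) OAI-PMH endpoint."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it replaces metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (records, resumption_token) parsed from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        titles = [t.text for t in rec.iterfind(".//dc:title", NS) if t.text]
        records.append({"identifier": identifier, "titles": titles})
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=NS)
    # An empty or missing token means the list is complete.
    return records, (token.strip() or None)


records, token = parse_list_records(SAMPLE)
next_url = list_records_url("https://example.org/oai", resumption_token=token)
```

In a real harvester, the request for `next_url` would be repeated until no resumption token is returned. The harvested records could then be enriched with further derived metadata without altering the original fields.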
Notes
- 1.
- 2.
Open Archives Initiative Protocol for Metadata Harvesting; https://www.openarchives.org/pmh/.
- 3.
Online Public Access Catalog.
- 4.
- 5.
- 6.
- 7.
Although one might ask whether humans do not err as well.
- 8.
For the sake of readability, we decided to publish the full source in the form of a documented Jupyter (http://jupyter.org/) notebook as a supplement to this paper, which keeps the discussion of algorithmic parameters to a minimum.
- 9.
For instance, the RAK main manual, the German counterpart to the Anglo-American Cataloguing Rules, is a 627-page document.
- 10.
Because no ground truth comparable to our corpus is available, the amount of data in the prototype is limited, and the metadata records are extended non-destructively, we decided against fully automating the evaluation. However, the resulting clusters are stable enough to be checked against common authority files.
- 11.
The normalization, which is optional in principle, was carried out primarily to offer researchers a homogeneous image data set.
- 12.
- 13.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Zellhöfer, D. (2016). Exploring Large Digital Libraries by Multimodal Criteria. In: Fuhr, N., Kovács, L., Risse, T., Nejdl, W. (eds) Research and Advanced Technology for Digital Libraries. TPDL 2016. Lecture Notes in Computer Science(), vol 9819. Springer, Cham. https://doi.org/10.1007/978-3-319-43997-6_24
DOI: https://doi.org/10.1007/978-3-319-43997-6_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-43996-9
Online ISBN: 978-3-319-43997-6
eBook Packages: Computer Science (R0)