Hierarchical Structuring of Cultural Heritage Objects within Large Aggregations

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8092)


Huge amounts of cultural content have been digitised and are available through digital libraries and aggregators. However, it is not easy for a user to get an overall picture of what is available or to find related objects. We propose a method for hierarchically structuring cultural objects at different similarity levels. We describe a fast, scalable clustering algorithm with an automated field selection method for finding semantic clusters. We report a qualitative evaluation of the cluster categories based on records from the UK and a quantitative evaluation of the results from the complete Europeana dataset.


Cluster Head, Cultural Heritage, Digital Library, Data Provider, Cluster Category
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. OCLC Research, Leiden, The Netherlands
  2. Europeana Foundation, The Hague, The Netherlands
