Datenbank-Spektrum, Volume 17, Issue 2, pp 113–129

The Hydra.PowerGraph System

Building Digital Archives with Directed and Typed Hypergraphs
  • Holger Meyer
  • Alf-Christian Schering
  • Andreas Heuer
Focus article (Schwerpunktbeitrag)

Abstract

Directed hypergraphs are known from graph theory [11] and are well understood within their own domain [7, 8, 9, 22, 23]. This paper provides an overview of the expressiveness of directed and typed hypergraphs as a modeling paradigm, not only for the content of digital libraries and archives but also for a variety of other applications. Furthermore, hypergraphs are sufficiently expressive to provide an implementation logic for conceptual models such as CIDOC/CRM [18] in the context of museum-related systems and digital archives.

The directed hypergraph model supports typed nodes, where each node type carries its own flexible set of attributes. This allows for an efficient mapping onto object-relational database structures. The model also features a flexible, semi-structured type system for hyperedges. It is accompanied by a set of well-defined graph operations that form an algebra, and by GrafL, a descriptive hypergraph query language. GrafL supports type-, structure-, and value-based queries as well as fundamental graph algorithms.
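
As a rough illustration of the data model described above, a directed hypergraph with typed nodes, per-type attribute sets, and typed hyperedges can be sketched in a few lines of Python. This is not the Hydra.PowerGraph implementation and it is not GrafL: the class `TypedHypergraph`, the `successors` helper, the node types, and the attribute names are all assumptions made for this example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HyperEdge:
    """A directed, typed hyperedge from a set of source nodes to a set of target nodes."""
    edge_type: str
    sources: frozenset
    targets: frozenset


class TypedHypergraph:
    """Illustrative sketch only; all names here are hypothetical."""

    def __init__(self, node_schema):
        # node_schema maps a node type to the attribute names allowed for it.
        self.node_schema = node_schema
        self.nodes = {}   # node id -> (node type, attribute dict)
        self.edges = []   # list of HyperEdge

    def add_node(self, node_id, node_type, **attrs):
        # Dynamic type check: the type must exist and the attributes must be allowed.
        allowed = self.node_schema.get(node_type)
        if allowed is None:
            raise TypeError(f"unknown node type: {node_type}")
        unknown = set(attrs) - set(allowed)
        if unknown:
            raise TypeError(f"attributes {sorted(unknown)} not allowed for {node_type}")
        self.nodes[node_id] = (node_type, dict(attrs))

    def add_edge(self, edge_type, sources, targets):
        # Every endpoint must already be a node of the graph.
        missing = (set(sources) | set(targets)) - set(self.nodes)
        if missing:
            raise KeyError(f"unknown nodes: {sorted(missing)}")
        self.edges.append(HyperEdge(edge_type, frozenset(sources), frozenset(targets)))

    def successors(self, node_id, edge_type=None):
        # One-hop neighbours: targets of every hyperedge whose source set contains node_id.
        result = set()
        for edge in self.edges:
            if node_id in edge.sources and (edge_type is None or edge.edge_type == edge_type):
                result |= edge.targets
        return result


# Hypothetical archive-style data, invented for this sketch.
g = TypedHypergraph({"Person": {"name"}, "Place": {"name"}, "Story": {"title"}})
g.add_node("collector", "Person", name="A collector")
g.add_node("informant", "Person", name="An informant")
g.add_node("place", "Place", name="Some village")
g.add_node("story", "Story", title="A legend")
# A single hyperedge relates two persons to a story and a place at once.
g.add_edge("recorded", {"collector", "informant"}, {"story", "place"})
print(g.successors("informant"))
```

The point of the sketch is that one hyperedge links several sources to several targets in a single relationship, which an ordinary binary graph edge cannot express without auxiliary nodes.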

The suitability of such a hypergraph-based model is illustrated by a large digital ethnological archive system developed in the WossiDiA project [43, 52, 53].

Keywords

Graph databases, Directed hypergraphs, Dynamic type checking, Digital humanities, Digital archive systems

Notes

Acknowledgements

The authors would like to thank Christoph Schmitt, Reinhard Kerb, and Stefanie Janssen of the European Ethnology and Wossidlo Archive, as well as Timothy R. Tangherlini from the University of California, Los Angeles, and Theo Meder from the Meertens Instituut, Amsterdam. We would also like to thank all the students involved in discussing, implementing, testing, and using the WossiDiA system, namely David Vendt, Roland Kiesendahl, Rasha Mukbil, Md. Janghir Alam, Martin Lichtwark, and Steffen Sachse.

References

  1. Abello J, Broadwell P, Tangherlini TR (2012) Computational folkloristics. Commun ACM 55(7):60–70. doi: 10.1145/2209249.2209267
  2. Abiteboul S (1997) Querying semi-structured data. Database Theory ICDT 97:1–18
  3. Alam MJ (2016) Spatio-Temporal Operations in Digital Archive Systems. Master's thesis, University of Rostock, Germany
  4. Angles R, Gutiérrez C (2008) Survey of graph database models. ACM Comput Surv 40(1):1–39
  5. Angles R, Arenas M, Barceló P, Hogan A, Reutter JL, Vrgoc D (2016) Foundations of modern graph query languages. Comput Res Repos. arXiv:1610.06264
  6. Arroyuelo D, Claude F, Maneth S, Mäkinen V, Navarro G, Nguyen K, Sirén J, Välimäki N (2010) Fast in-memory XPath search using compressed indexes. In: Li F, Moro MM, Ghandeharizadeh S, Haritsa JR, Weikum G, Carey MJ, Casati F, Chang EY, Manolescu I, Mehrotra S, Dayal U, Tsotras VJ (eds) Proceedings of the 26th International Conference on Data Engineering, ICDE 2010, Long Beach, CA, March 1-6, 2010. IEEE Computer Society, Washington, DC, pp 417–428
  7. Ausiello G (1988) Directed Hypergraphs: Data Structures and Applications. In: Dauchet M, Nivat M (eds) Proceedings CAAP ’88, 13th Colloquium on Trees in Algebra and Programming, Nancy, March 21-24, 1988. Lecture Notes in Computer Science, vol 299. Springer, Berlin Heidelberg, pp 295–303
  8. Ausiello G, Laura L (2017) Directed hypergraphs: introduction and fundamental algorithms – a survey. Theor Comput Sci 658:293–306. doi: 10.1016/j.tcs.2016.03.016
  9. Ausiello G, D’Atri A, Saccà D (1986) Minimal representation of directed hypergraphs. SIAM J Comput 15(2):418–431
  10. Ben-Amram A, Yoffe S (2011) A simple and efficient Union-Find-Delete algorithm. Theor Comput Sci 412(4):487–492. doi: 10.1016/j.tcs.2010.11.005
  11. Berge C (1989) Hypergraphs – combinatorics of finite sets, 1st edn. North Holland, Amsterdam
  12. Boncz PA, Larriba-Pey J (eds) (2016) Proceedings of the Fourth International Workshop on Graph Data Management Experiences and Systems, Redwood Shores, CA, USA, June 24, 2016. ACM, New York. doi: 10.1145/2960414
  13. Brandstädt A, Le VB, Spinrad JP (1999) Graph classes: a survey. Society for Industrial and Applied Mathematics, Philadelphia
  14. Chauvin B, Flajolet P, Gardy D, Gittenberger B (2004) And/Or Trees Revisited. Comb Probab Comput 13(4-5):475–497. doi: 10.1017/S0963548304006273
  15. Claude F, Navarro G (2010) Fast and compact web graph representations. ACM Trans Web 4(4):1–31. doi: 10.1145/1841909.1841913
  16. Das M, Simitsis A, Wilkinson K (2016) A hybrid solution for mixed workloads on dynamic graphs. In: Proceedings of the Fourth International Workshop on Graph Data Management Experiences and Systems - GRADES ’16. ACM, New York. doi: 10.1145/2960414
  17. Dave A, Jindal A, Li LE, Xin R, Gonzalez J, Zaharia M (2016) Graphframes: an integrated API for mixing graph and relational queries. In: GRADES ’16 Proceedings of the Fourth International Workshop on Graph Data Management Experiences and Systems. ACM, New York. doi: 10.1145/2960414
  18. Doerr M, Ore CE, Stead S (2007) The CIDOC conceptual reference model – a new standard for knowledge sharing. In: Grundy JC, Hartmann S, Laender AHF, Maciaszek LA, Roddick JF (eds) ER (Tutorials, Posters, Panels & Industrial Contributions). CRPIT, vol 83. Australian Computer Society, Sydney, pp 51–56
  19. Fagin R (1983) Degrees of Acyclicity for Hypergraphs and relational database schemes. J Assoc Comput Mach 30(3):514–550. doi: 10.1145/2402.322390
  20. Firth H, Missier P (2016) Workload-aware streaming graph partitioning. In: Palpanas T, Stefanidis K (eds) Proceedings of the Workshops of the EDBT/ICDT 2016 Joint Conference, EDBT/ICDT Workshops 2016, Bordeaux, France, March 15, 2016, CEUR Workshop Proceedings, vol 1558. CEUR-WS.org, Bordeaux
  21. Fotakis D, Pagh R, Sanders P, Spirakis P (2005) Space efficient hash tables with worst case constant access time. Theory Comput Syst 38:229–248. doi: 10.1007/s00224-004-1195-x
  22. Gallo G, Scutella MG (1998) Directed hypergraphs as a modelling paradigm. Riv Mat Sci Econ Soc 21(1-2):97–123
  23. Gallo G, Longo G, Pallottino S, Nguyen S (1993) Directed hypergraphs and applications. Discrete Appl Math 42(2):177–201
  24. Gao J, Zhao Q, Ren W, Swami A, Ramanathan R, Bar-Noy A (2012) Dynamic shortest path algorithms for hypergraphs. In: WiOpt. IEEE, Washington, pp 238–245
  25. Gao J, Zhao Q, Ren W, Swami A, Ramanathan R, Bar-Noy A (2015) Dynamic shortest path algorithms for Hypergraphs. IEEE ACM Trans Netw 23(6):1805–1817. doi: 10.1109/TNET.2014.2343914
  26. Grust T, Rittinger J, Teubner J (2008) Pathfinder: XQuery off the relational shelf. IEEE Data Eng Bull 31(4):7–14
  27. Gubichev A, Then M (2014) Graph pattern matching: do we have to reinvent the wheel? In: Proceedings of workshop on graph data management, experiences and systems. ACM, New York, pp 1–7
  28. Gutierrez C, Hurtado CA, Mendelzon AO (2004) Foundations of semantic web databases. In: ACM Symposium on PODS. ACM, New York, pp 95–106
  29. Haubenschild M, Then M, Hong S, Chafi H (2016) Asgraph: a mutable multi-versioned graph container with high analytical performance. In: Proceedings of the Fourth International Workshop on Graph Data Management Experiences and Systems - GRADES ’16
  30. Hayes J (2004) A Graph Model for RDF. Diploma thesis, Technische Universität Darmstadt, Germany
  31. Hayes J, Gutierrez C (2004) Bipartite graphs as intermediate model for RDF. In: Proceedings of the 3rd Int. Semantic Web Conference (ISWC), number 3298 in LNCS. Springer, Berlin Heidelberg, pp 47–61
  32. Hayes PJ (2004) RDF semantics. W3C Recommendation. http://www.w3.org/TR/rdf-mt/. Accessed: 22 Feb 2017
  33. He H, Singh AK (2008) Graphs-at-a-time: query language and access methods for graph databases. In: Wang JT (ed) Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2008, Vancouver, BC, June 10-12, 2008. ACM, New York, pp 405–418
  34. Hölsch J, Schmidt T, Grossniklaus M (2017) On the performance of analytical and pattern matching graph queries in Neo4j and a relational database. CEUR-WS.org. http://ceur-ws.org/Vol-1810. Accessed: 22 Feb 2017
  35. Georgia Institute of Technology. Stinger. Tech. rep. http://www.stingergraph.com/
  36. Jacobson G (1989) Space-efficient static trees and graphs. In: 30th Annual Symposium on Foundations of Computer Science, 30 Oct.-1 Nov. 1989, IEEE Computer Society, Washington, pp 549–554. doi: 10.1109/SFCS.1989.63533
  37. Jannaschk K, Rathje CA, Thalheim B, Förster F (2011) A generic database schema for CIDOC-CRM data management. Adv Databases Inf Syst 2(789):127–136
  38. Kiesendahl R (2014) Konzeptuelle Modellierung historischer Daten in digitalen, historischen Informationssystemen. Bachelor thesis, University of Rostock, Germany
  39. Kimura K, Koike A (2009) Localized Suffix Array and Its Application to Genome Mapping Problems for Paired-End Short Reads. In: Morishita S, Lee SY, Sakakibara Y (eds) Proceedings of the 20th International Conference on Genome Informatics. Genome Informatics Series, vol 23. Imperial College Press, London, pp 60–71
  40. Laurikari V (2000) NFAs with tagged transitions, their conversion to deterministic automata and application to regular expressions. In: Seventh International Symposium on String Processing and Information Retrieval, SPIRE 2000, A Coruña, Spain, September 27–29, 2000. IEEE Computer Society, Washington, pp 181–187. doi: 10.1109/SPIRE.2000.878194
  41. Levene M, Poulovassilis A (1991) An object-oriented data model formalised through hypergraphs. Data Knowl Eng 6:205–234
  42. Meyer H (2008) Inventarisation of historical maritime landscapes – an information science point of view. In: Meyer H, Springmann MJ, Wernicke H (eds) The Lagomar lagoons – unique maritime cultural landscapes in a scientific focus and in an interdisciplinary comparison. Steffen, Friedland, pp 21–33
  43. Meyer H, Schering AC, Schmitt C (2014) WossiDiA – The Wossidlo Digital Archive. In: Meyer H, Schmitt C, Janssen S, Schering AC (eds) Corpora ethnographica online. Waxmann, Münster New York
  44. Meyer H, Schmitt C (2015) Semantische, räumliche und zeitliche Vernetzung regionalethnologischer Archive. In: Bolenz E, Franken L, Hänel D (eds) Wenn das Erbe in die Wolke kommt – Digitalisierung und kulturelles Erbe. Klartext, Essen, pp 61–86
  45. Meyer H, Springmann MJ, Wernicke H (eds) (2008) The Lagomar lagoons – unique maritime cultural landscapes in a scientific focus and in an interdisciplinary comparison. Steffen, Friedland
  46. Meyer H, Mukbhil R, Schering AC (2017) The Hydra.PowerGraph Data Definition, Manipulation and Query Language GrafL. CS-01-17. University of Rostock, CS Department, Germany
  47. Pagh R, Rodler FF (2004) Cuckoo hashing. J Algorithms 51(2):122–144. doi: 10.1016/j.jalgor.2003.12.002
  48. Pitoura E, Maabout S, Koutrika G, Marian A, Tanca L, Manolescu I, Stefanidis K (eds) (2016) Proceedings of the 19th International Conference on Extending Database Technology, EDBT 2016. Bordeaux, March 15–16, 2016.
  49. Prud’hommeaux E, Seaborne A (2013) SPARQL 1.1 Query Language for RDF. In: W3C Recommendation (Tech. rep.; 26 March 2013)
  50. Raman R, Raman V, Satti SR (2007) Succinct indexable dictionaries with applications to encoding k-ary trees, prefix sums and multisets. ACM Trans Algorithms 3(4):43. doi: 10.1145/1290672.1290680
  51. van Rest O, Hong S, Kim J, Meng X, Chafi H (2016) PGQL: a property graph query language. In: Proceedings of the Fourth International Workshop on Graph Data Management Experiences and Systems - GRADES ’16
  52. Schering AC, Bruder I, Schmitt C, Meyer H, Heuer A (2007) Towards a digital archive for handwritten paper slips with ethnological contents. In: Goh DHL, Cao TH, Sølvberg I, Rasmussen EM (eds) ICADL. Lecture Notes in Computer Science, vol 4822. Springer, Berlin Heidelberg, pp 61–64
  53. Schering AC, Bruder I, Jürgensmann S, Meyer H, Schmitt C (2011) From box to bin – semi-automatic digitization of a huge collection of ethnological documents. In: Xing C, Crestani F, Rauber A (eds) Digital Libraries: For Cultural Heritage, Knowledge Dissemination, and Future Creation. 13th International Conference on Asia-Pacific Digital Libraries, ICADL 2011, Beijing, China, October 24-27, 2011. Lecture Notes in Computer Science, vol 7008. Springer, Berlin Heidelberg, pp 168–171. doi: 10.1007/978-3-642-24826-9
  54. Schick S, Meyer H, Heuer A (2011) Flexible publication workflows using dynamic dispatch. In: Xing C, Crestani F, Rauber A (eds) Digital Libraries: For Cultural Heritage, Knowledge Dissemination, and Future Creation. 13th International Conference on Asia-Pacific Digital Libraries, ICADL 2011, Beijing, China, October 24-27, 2011. Lecture Notes in Computer Science, vol 7008. Springer, Berlin Heidelberg, pp 257–266. doi: 10.1007/978-3-642-24826-9
  55. Schmitt C (2014) Szenarien semantischer Vernetzung zwischen regionalethnographischen und dialektlexikographischen Korpora im Online-Projekt WossiDiA. In: Bühler R, Bürkle R, Leonhardt N (eds) Sprachkultur – Regionalkultur. Neue Felder kulturwissenschaftlicher Dialektforschung. TVV-Verlag, Tübingen, pp 255–286
  56. Tangherlini TR (2013) The folklore Macroscope. The Archer Taylor memorial lecture. West Folk 72(1):7–27
  57. Tangherlini TR, Broadwell PM (2016) WitchHunter: GeoSemantic browsing in a large folklore corpus. J Am Folklore 129(511):14–40
  58. Thompson K (1968) Regular expression search algorithm. Commun ACM 11(6):419–422
  59. Trißl S, Leser U (2007) Fast and practical indexing and querying of very large graphs. In: Chan CY, Ooi BC, Zhou A (eds) Proceedings of the ACM SIGMOD International Conference on Management of Data, Beijing, June 12-14, 2007. ACM, New York, pp 845–856
  60. Vendt D (2013) Hochvernetzte Archivstrukturen und NoSQL-Systeme. Bachelor thesis, University of Rostock, Germany
  61. Wang J, Ntarmos N, Triantafillou P (2016) Indexing query graphs to speedup graph query processing. In: Pitoura E, Maabout S, Koutrika G, Marian A, Tanca L, Manolescu I, Stefanidis K (eds) Proceedings of the 19th International Conference on Extending Database Technology, EDBT 2016. Bordeaux, March 15–16, 2016.
  62. Xing C, Crestani F, Rauber A (eds) (2011) Digital Libraries: For Cultural Heritage, Knowledge Dissemination, and Future Creation. 13th International Conference on Asia-Pacific Digital Libraries, ICADL 2011, Beijing, China, October 24-27, 2011. Lecture Notes in Computer Science, vol 7008. Springer, Berlin Heidelberg. doi: 10.1007/978-3-642-24826-9

Copyright information

© Springer-Verlag Berlin Heidelberg 2017

Authors and Affiliations

  1. Database Research Group, University of Rostock, Rostock, Germany
