The Hydra.PowerGraph System

Building Digital Archives with Directed and Typed Hypergraphs

Abstract

Directed hypergraphs are known from graph theory [11] and are well understood within their own domain [7, 8, 9, 22, 23]. This paper provides an overview of the expressiveness of directed and typed hypergraphs as a modeling paradigm, not only for the content of digital libraries and archives but also for a variety of other applications. Furthermore, hypergraphs are sufficiently expressive to provide an implementation logic for conceptual models like CIDOC/CRM [18] in the context of museum-related systems and digital archives.

The directed hypergraph model supports typed nodes and flexible sets of attributes defined individually per node type. This allows for an efficient mapping onto object-relational database structures. The model also features a flexible, semi-structured type system for hyperedges. It is accompanied by a set of well-defined graph operations forming an algebra, and by a descriptive hypergraph query language, GrafL. This language supports typed, structure-based, and value-based queries as well as fundamental graph algorithms.
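
To make the combination of typed nodes, per-type attribute sets, and typed hyperedges more concrete, the following minimal Python sketch may help. It is an illustration only and not the Hydra.PowerGraph implementation or GrafL: the class names (`TypedHypergraph`, `HyperEdge`), the method names, and the example node and edge types are all hypothetical, and a hyperedge is assumed here to connect a set of source nodes (tail) to a set of target nodes (head).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class HyperEdge:
    """A directed, typed hyperedge from a set of tail nodes to a set of head nodes."""
    edge_type: str
    tail: frozenset  # ids of source nodes
    head: frozenset  # ids of target nodes


@dataclass
class TypedHypergraph:
    """Hypothetical in-memory model; all names are assumptions for illustration."""
    node_types: dict = field(default_factory=dict)  # type name -> allowed attribute names
    nodes: dict = field(default_factory=dict)       # node id -> (type name, attribute dict)
    edges: list = field(default_factory=list)

    def define_node_type(self, name, attributes):
        self.node_types[name] = set(attributes)

    def add_node(self, node_id, node_type, **attrs):
        allowed = self.node_types[node_type]  # raises KeyError for an undefined type
        unknown = set(attrs) - allowed
        if unknown:
            raise ValueError(f"attributes {unknown} not allowed for type {node_type}")
        self.nodes[node_id] = (node_type, attrs)

    def add_edge(self, edge_type, tail, head):
        tail, head = frozenset(tail), frozenset(head)
        missing = (tail | head) - set(self.nodes)
        if missing:
            raise KeyError(f"unknown nodes: {missing}")
        edge = HyperEdge(edge_type, tail, head)
        self.edges.append(edge)
        return edge

    def successors(self, node_id, edge_type=None):
        """Head nodes of every hyperedge whose tail contains node_id."""
        result = set()
        for edge in self.edges:
            if node_id in edge.tail and edge_type in (None, edge.edge_type):
                result |= edge.head
        return result


# Usage: one field note that refers to a person and an object at once.
g = TypedHypergraph()
g.define_node_type("FieldNote", ["text"])
g.define_node_type("Person", ["name"])
g.define_node_type("Object", ["name"])
g.add_node("note1", "FieldNote", text="custom of the dowry rake")
g.add_node("farmhand", "Person", name="unnamed farmhand")
g.add_node("rake", "Object", name="Austharke")
g.add_edge("mentions", tail={"note1"}, head={"farmhand", "rake"})
print(g.successors("note1"))
```

A single hyperedge relates the note to both targets here, whereas an ordinary directed graph would need two separate edges. Storing each node type with its own attribute set is also what makes a table-per-type mapping onto a relational schema plausible.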

The suitability of such a hypergraph-based model is illustrated with a large digital ethnological archive system, which is developed in the WossiDiA project [43, 52, 53].

Notes

  1. The scenario in question here is about the exchange of gifts and wooing. A daytaler or farmhand presents a decorated rake, the “Austharke” (Low German for dowry rake), to his bride-to-be. The information about this rural custom is spread over several field notes with the “Austharke” at the centre.

  2. Missing adequacy, i.e., most concepts of graph and semi-structured data models are not supported when querying such data in relational systems using SQL.

  3. The archive consists of about two million paper slips containing field research notes, which in turn contain up to twenty million references or links to other field notes, in fact represented as a sparsely populated graph.

  4. \(P(V)\) is the powerset, the set of all subsets, of V.

  5. Gallo calls them hyperarcs.

  6. The data structure used for the remainder type is an and-or-tree [14] with grouping nodes and quantifier annotations as edge labels; it is also used for the current type and the content model.

  7. Further reasons for using PostgreSQL are the PostGIS extension for spatial data and the range type supporting temporal operators [3].

  8. We could not track where the idea of the plex structure originally came from, but at least David Matuszek is an online source for it: https://www.cis.upenn.edu/~matuszek/.

  9. https://jena.apache.org/

References

  1. Abello J, Broadwell P, Tangherlini TR (2012) Computational folkloristics. Commun ACM 55(7):60–70. doi:10.1145/2209249.2209267

  2. Abiteboul S (1997) Querying semi-structured data. Database Theory ICDT 97:1–18

  3. Alam MJ (2016) Spatio-Temporal Operations in Digital Archive Systems. Master's thesis, University of Rostock, Germany

  4. Angles R, Gutiérrez C (2008) Survey of graph database models. ACM Comput Surv 40(1):1–39

  5. Angles R, Arenas M, Barceló P, Hogan A, Reutter JL, Vrgoc D (2016) Foundations of modern graph query languages. Comput Res Repos. arXiv:1610.06264

  6. Arroyuelo D, Claude F, Maneth S, Mäkinen V, Navarro G, Nguyen K, Sirén J, Välimäki N (2010) Fast in-memory xpath search using compressed indexes. In: Li F, Moro MM, Ghandeharizadeh S, Haritsa JR, Weikum G, Carey MJ, Casati F, Chang EY, Manolescu I, Mehrotra S, Dayal U, Tsotras VJ (eds) Proceedings of the 26th International Conference on Data Engineering, ICDE 2010 Long Beach CA, March 1‑6, 2010. IEEE Computer Society, Washington, DC, pp 417–428

  7. Ausiello G (1988) Directed Hypergraphs: Data Structures and Applications. In: Dauchet M, Nivat M (eds) Proceedings CAAP ’88, 13th Colloquium on Trees in Algebra and Programming, Nancy, March 21-24, 1988. Lecture Notes in Computer Science, vol 299. Springer, Berlin Heidelberg, pp 295–303

  8. Ausiello G, Laura L (2017) Directed hypergraphs: introduction and fundamental algorithms – a survey. Theor Comput Sci 658:293–306. doi:10.1016/j.tcs.2016.03.016

  9. Ausiello G, D’Atri A, Saccà D (1986) Minimal representation of directed hypergraphs. SIAM J Comput 15(2):418–431

  10. Ben-Amram A, Yoffe S (2011) A simple and efficient Union-Find-Delete algorithm. Theor Comput Sci 412(4):487–492. doi:10.1016/j.tcs.2010.11.005

  11. Berge C (1989) Hypergraphs – combinatorics of finite sets, 1st edn. North Holland, Amsterdam

  12. Boncz PA, Larriba-Pey J (eds) (2016) Proceedings of the Fourth International Workshop on Graph Data Management Experiences and Systems, Redwood Shores, CA, USA, June 24, 2016. ACM, New York. doi:10.1145/2960414

  13. Brandstädt A, Le VB, Spinrad JP (1999) Graph classes: a survey. Society for Industrial and Applied Mathematics, Philadelphia

  14. Chauvin B, Flajolet P, Gardy D, Gittenberger B (2004) And/Or Trees Revisited. Comb Probab Comput 13(4-5):475–497. doi:10.1017/S0963548304006273

  15. Claude F, Navarro G (2010) Fast and compact web graph representations. ACM Trans Web 4(4):1–31. doi:10.1145/1841909.1841913

  16. Das M, Simitsis A, Wilkinson K (2016) A hybrid solution for mixed workloads on dynamic graphs. In: Proceedings of the Fourth International Workshop on Graph Data Management Experiences and Systems - GRADES ´16. ACM, New York. doi:10.1145/2960414

  17. Dave A, Jindal A, Li LE, Xin R, Gonzalez J, Zaharia M (2016) Graphframes: an integrated API for mixing graph and relational queries. In: GRADES ´16 Proceedings of the Fourth International Workshop on Graph Data Management Experiences and Systems. ACM, New York. doi:10.1145/2960414

  18. Doerr M, Ore CE, Stead S (2007) The CIDOC conceptual reference model – a new standard for knowledge sharing. In: Grundy JC, Hartmann S, Laender AHF, Maciaszek LA, Roddick JF (eds) ER (Tutorials, Posters, Panels & Industrial Contributions). CRPIT, vol 83. Australian Computer Society, Sydney, pp 51–56

  19. Fagin R (1983) Degrees of Acyclicity for Hypergraphs and relational database schemes. J Assoc Comput Mach 30(3):514–550. doi:10.1145/2402.322390

  20. Firth H, Missier P (2016) Workload-aware streaming graph partitioning. In: Palpanas T, Stefanidis K (eds) Proceedings of the Workshops of the EDBT/ICDT 2016 Joint Conference, EDBT/ICDT Workshops 2016, Bordeaux, France, March 15, 2016, CEUR Workshop Proceedings, vol 1558. CEUR-WS.org, Bordeaux

  21. Fotakis D, Pagh R, Sanders P, Spirakis P (2005) Space efficient hash tables with worst case constant access time. Theory Comput Syst 38:229–248. doi:10.1007/s00224-004-1195-x

  22. Gallo G, Scutella MG (1998) Directed hypergraphs as a modelling paradigm. Riv Mat Sci Econ Soc 21(1-2):97–123

  23. Gallo G, Longo G, Pallottino S, Nguyen S (1993) Directed hypergraphs and applications. Discrete Appl Math 42(2):177–201

  24. Gao J, Zhao Q, Ren W, Swami A, Ramanathan R, Bar-Noy A (2012) Dynamic shortest path algorithms for hypergraphs. In: WiOpt. IEEE, Washington, pp 238–245

  25. Gao J, Zhao Q, Ren W, Swami A, Ramanathan R, Bar-Noy A (2015) Dynamic shortest path algorithms for Hypergraphs. IEEE ACM Trans Netw 23(6):1805–1817. doi:10.1109/TNET.2014.2343914

  26. Grust T, Rittinger J, Teubner J (2008) Pathfinder: Xquery off the relational shelf. IEEE Data Eng Bull 31(4):7–14

  27. Gubichev A, Then M (2014) Graph pattern matching: do we have to reinvent the wheel? In: Proceedings of workshop on graph data management, experiences and systems. ACM, New York, pp 1–7

  28. Gutierrez C, Hurtado CA, Mendelzon AO (2004) Foundations of semantic web databases. In: ACM Symposium on PODS. ACM, New York, pp 95–106

  29. Haubenschild M, Then M, Hong S, Chafi H (2016) Asgraph: a mutable multi-versioned graph container with high analytical performance. In: Proceedings of the Fourth International Workshop on Graph Data Management Experiences and Systems - GRADES ´16

  30. Hayes J (2004) A Graph Model for RDF. Diploma thesis, Technische Universität Darmstadt, Germany

  31. Hayes J, Gutierrez C (2004) Bipartite graphs as intermediate model for RDF. In: Proceedings of the 3rd Int. Semantic Web Conference (ISWC), number 3298 in LNCS. Springer, Berlin Heidelberg, pp 47–61

  32. Hayes PJ (2004) Rdf semantics. W3C Recommendation. http://www.w3.org/TR/rdf-mt/. Accessed: 22 Febr 2017

  33. He H, Singh AK (2008) Graphs-at-a-time: query language and access methods for graph databases. In: Wang JT (ed) Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD Vancouver BC, June 10-12, 2008. vol 2008. ACM, New York, pp 405–418

  34. Hölsch J, Schmidt T, Grossniklaus M (2017) On the performance of analytical and pattern matching graph queries in neo4j and a relational database. CEUR-WS.org. http://ceur-ws.org/Vol-1810. Accessed: 22 Febr 2017

  35. Georgia Institute of Technology Stinger. Tech. rep. http://www.stingergraph.com/

  36. Jacobson G (1989) Space-efficient static trees and graphs. In: 30th Annual Symposium on Foundations of Computer Science, 30 Oct.-1 Nov. 1989, IEEE Computer Society, Washington, pp 549–554. doi:10.1109/SFCS.1989.63533

  37. Jannaschk K, Rathje CA, Thalheim B, Förster F (2011) A generic database schema for CIDOC-CRM data management. Adv Databases Inf Syst 2(789):127–136

  38. Kiesendahl R (2014) Konzeptuelle Modellierung historischer Daten in digitalen, historischen Informationssystemen. Bachelor thesis, University of Rostock, Germany

  39. Kimura K, Koike A (2009) Localized Suffix Array and Its Application to Genome Mapping Problems for Paired-End Short Reads. In: Morishita S, Lee SY, Sakakibara Y (eds) Proceedings of the 20th International Conference on Genome Informatics. Genome Informatics Series, vol 23. Imperial College Press, London, 2009, pp 60–71

  40. Laurikari V (2000) NFas with tagged transitions, their conversion to deterministic automata and application to regular expressions. In: Seventh International Symposium on String Processing and Information Retrieval, SPIRE 2000, A Coruña, Spain, September 27–29, 2000. IEEE Computer Society, Washington, pp 181–187. doi:10.1109/SPIRE.2000.878194

  41. Levene M, Poulovassilis A (1991) An object-oriented data model formalised through hypergraphs. Data Knowl Eng 6:205–234

  42. Meyer H (2008) Inventarisation of historical maritime landscapes – an information science point of view. In: Meyer H, Springmann MJ, Wernicke H (eds) The Lagomar lagoons – unique maritime cultural landscapes in a scientific focus and in an interdisciplinary comparison. Steffen, Friedland, pp 21–33

  43. Meyer H, Schering AC, Schmitt C (2014) WossiDiA – The Wossidlo Digital Archive. In: Meyer H, Schmitt C, Janssen S, Schering AC (eds) Corpora ethnographica online. Waxmann, Münster New York

  44. Meyer H, Schmitt C (2015) Semantische, räumliche und zeitliche Vernetzung regionalethnologischer Archive. In: Bolenz E, Franken L, Hänel D (eds) Wenn das Erbe in die Wolke kommt – Digitalisierung und kulturelles Erbe. Klartext, Essen, pp 61–86

  45. Meyer H, Springmann MJ, Wernicke H (eds) (2008) The Lagomar lagoons – unique maritime cultural landscapes in a scientific focus and in an interdisciplinary comparison. Steffen, Friedland

  46. Meyer H, Mukbhil R, Schering AC (2017) The Hydra.PowerGraph Data Definition, Manipulation and Query Language GrafL. CS-01-17. University of Rostock, CS Department, Germany

  47. Pagh R, Rodler FF (2004) Cuckoo hashing. J Algorithms 51(2):122–144. doi:10.1016/j.jalgor.2003.12.002

  48. Pitoura E, Maabout S, Koutrika G, Marian A, Tanca L, Manolescu I, Stefanidis K (eds) (2016) Proceedings of the 19th International Conference on Extending Database Technology, EDBT 2016. Bordeaux, March 15–16, 2016.

  49. Prud’hommeaux E, Seaborne A (2013) SPARQL 1.1 Query Language for RDF. In: W3C Recommendation (Tech. rep.; 26 March 2013)

  50. Raman R, Raman V, Satti SR (2007) Succinct indexable dictionaries with applications to encoding k-ary trees, prefix sums and multisets. ACM Trans Algorithms 3(4):43. doi:10.1145/1290672.1290680

  51. van Rest O, Hong S, Kim J, Meng X, Chafi H (2016) PGQL: a property graph query language. In: Proceedings of the Fourth International Workshop on Graph Data Management Experiences and Systems - GRADES ´16

  52. Schering AC, Bruder I, Schmitt C, Meyer H, Heuer A (2007) Towards a digital archive for handwritten paper slips with ethnological contents. In: Goh DHL, Cao TH, Sølvberg I, Rasmussen EM (eds) ICADL. Lecture Notes in Computer Science, vol 4822. Springer, Berlin Heidelberg, pp 61–64

  53. Schering AC, Bruder I, Jürgensmann S, Meyer H, Schmitt C (2011) From box to bin – semi-automatic digitization of a huge collection of ethnological documents. In: Xing C, Crestani F, Rauber A (eds) Digital Libraries: For Cultural Heritage, Knowledge Dissemination, and Future Creation. 13th International Conference on Asia-Pacific Digital Libraries, ICADL 2011, Beijing, China, October 24-27, 2011. Lecture Notes in Computer Science, vol 7008. Springer, Berlin Heidelberg, pp 168–171. doi:10.1007/978-3-642-24826-9

  54. Schick S, Meyer H, Heuer A (2011) Flexible publication workflows using dynamic dispatch. In: Xing C, Crestani F, Rauber A (eds) Digital Libraries: For Cultural Heritage, Knowledge Dissemination, and Future Creation. 13th International Conference on Asia-Pacific Digital Libraries, ICADL 2011, Beijing, China, October 24-27, 2011. Lecture Notes in Computer Science, vol 7008. Springer, Berlin Heidelberg, pp 257–266. doi:10.1007/978-3-642-24826-9

  55. Schmitt C (2014) Szenarien semantischer Vernetzung zwischen regionalethnographischen und dialektlexikographischen Korpora im Online-Projekt WossiDiA. In: Bühler R, Bürkle R, Leonhardt N (eds) Sprachkultur – Regionalkultur. Neue Felder kulturwissenschaftlicher Dialektforschung. TVV-Verlag, Tübingen, pp 255–286

  56. Tangherlini TR (2013) The folklore Macroscope. The Archer Taylor memorial lecture. West Folk 72(1):7–27

  57. Tangherlini TR, Broadwell PM (2016) WitchHunter: GeoSemantic browsing in a large folklore corpus. J Am Folklore 129(511):14–40

  58. Thompson K (1968) Regular expression search algorithm. Commun ACM 11(6):419–422

  59. Trißl S, Leser U (2007) Fast and practical indexing and querying of very large graphs. In: Chan CY, Ooi BC, Zhou A (eds) Proceedings of the ACM SIGMOD International Conference on Management of Data Beijing, June 12-14, 2007. ACM, New York, pp 845–856

  60. Vendt D (2013) Hochvernetzte Archivstrukturen und NoSQL-Systeme. Bachelor thesis, University of Rostock, Germany

  61. Wang J, Ntarmos N, Triantafillou P (2016) Indexing query graphs to speedup graph query processing. In: Pitoura E, Maabout S, Koutrika G, Marian A, Tanca L, Manolescu I, Stefanidis K (eds) (2016) Proceedings of the 19th International Conference on Extending Database Technology, EDBT 2016. Bordeaux, March 15–16, 2016.

  62. Xing C, Crestani F, Rauber A (eds) (2011) Digital Libraries: For Cultural Heritage, Knowledge Dissemination, and Future Creation. 13th International Conference on Asia-Pacific Digital Libraries, ICADL 2011, Beijing, China, October 24-27, 2011. Lecture Notes in Computer Science, vol 7008. Springer, Berlin Heidelberg. doi:10.1007/978-3-642-24826-9

Acknowledgements

The authors would like to thank Christoph Schmitt, Reinhard Kerb, and Stefanie Janssen of the European Ethnology and Wossidlo Archive as well as Timothy R. Tangherlini from the University of California, Los Angeles, and Theo Meder from Meertens Instituut, Amsterdam. We would also like to thank all the students involved in discussing, implementing, testing, and using the WossiDiA system, namely David Vendt, Roland Kiesendahl, Rasha Mukbil, Md. Janghir Alam, Martin Lichtwark, and Steffen Sachse.

Author information

Corresponding author

Correspondence to Holger Meyer.

Additional information

This work was part-financed by the German Research Foundation (DFG) under contract INST-16963/2-1, SCHM-1855/1-2 and the Federal Office of Civil Protection and Disaster Assistance (BBK) under contract III.1 122/04.

About this article

Cite this article

Meyer, H., Schering, AC. & Heuer, A. The Hydra.PowerGraph System. Datenbank Spektrum 17, 113–129 (2017). https://doi.org/10.1007/s13222-017-0253-x

  • DOI: https://doi.org/10.1007/s13222-017-0253-x

Keywords

  • Graph databases
  • Directed hypergraphs
  • Dynamic type checking
  • Digital humanities
  • Digital archive systems