Introduction to Graph Databases

Part of the Lecture Notes in Computer Science book series (LNISA, volume 8714)

Abstract

Graphs are increasingly used in analytic environments, with applications such as social network analysis, fraud detection, industrial management, and knowledge analysis. Graph databases are one important solution to consider for managing large datasets. This course addresses four aspects of graph management. First, it characterizes graphs and the most common operations applied to them. Second, it reviews the technologies for graph management, focusing on the particular case of Sparksee. Third, it analyzes in depth some important applications and how graphs are used to solve them. Fourth, it explains how benchmarking helps reconcile user requirements with the growth of graph management technologies.

Keywords

  • Social Network
  • Degree Distribution
  • Sentiment Analysis
  • Graph Database
  • Graph Query

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

  • Chapter length: 24 pages
Copyright information

© 2014 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Larriba-Pey, J.L., Martínez-Bazán, N., Domínguez-Sal, D. (2014). Introduction to Graph Databases. In: , et al. Reasoning Web. Reasoning on the Web in the Big Data Era. Reasoning Web 2014. Lecture Notes in Computer Science, vol 8714. Springer, Cham. https://doi.org/10.1007/978-3-319-10587-1_4

  • DOI: https://doi.org/10.1007/978-3-319-10587-1_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-10586-4

  • Online ISBN: 978-3-319-10587-1

  • eBook Packages: Computer Science (R0)