GiViP: A Visual Profiler for Distributed Graph Processing Systems

  • Alessio Arleo
  • Walter Didimo
  • Giuseppe Liotta
  • Fabrizio Montecchiani
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10692)

Abstract

Analyzing large-scale graphs provides valuable insights in different application scenarios. While many graph processing systems working on top of distributed infrastructures have been proposed to deal with big graphs, the tasks of profiling and debugging their massive computations remain time-consuming and error-prone. This paper presents GiViP, a visual profiler for distributed graph processing systems based on a Pregel-like computation model. GiViP captures the huge number of messages exchanged throughout a computation and provides an interactive user interface for the visual analysis of the collected data. We show how to take advantage of GiViP to detect anomalies related to the computation and to the infrastructure, such as slow computing units and anomalous message patterns.
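
To make the idea of capturing messages in a Pregel-like computation concrete, the following is a minimal sketch and not GiViP's actual implementation. It assumes a toy single-process simulation of supersteps in which each vertex propagates the largest value it has seen, and it records how many messages travel between each pair of workers in each superstep. The names `run_pregel_max`, `messages_per_superstep`, and `worker_of` are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def run_pregel_max(edges, values, worker_of, max_supersteps=30):
    """Toy Pregel-like run (illustrative assumption, not GiViP code).

    Each vertex adopts the largest value received so far and forwards it
    to its out-neighbors. Returns the final values and a message log keyed
    by (superstep, source worker, destination worker).
    """
    out = defaultdict(list)
    for u, v in edges:
        out[u].append(v)

    values = dict(values)
    log = Counter()           # (superstep, src_worker, dst_worker) -> count
    inbox = defaultdict(list)
    active = set(values)      # every vertex is active in superstep 0

    for step in range(max_supersteps):
        if not active:
            break             # no vertex received a message: computation halts
        outbox = defaultdict(list)
        for v in active:
            best = max([values[v]] + inbox[v])
            changed = best > values[v]
            values[v] = best
            # Send in superstep 0, or whenever the local value improved.
            if step == 0 or changed:
                for w in out[v]:
                    outbox[w].append(best)
                    log[(step, worker_of[v], worker_of[w])] += 1
        inbox = outbox
        active = set(outbox)  # a vertex is reactivated only by incoming messages
    return values, log


def messages_per_superstep(log):
    """Aggregate the log into total messages per superstep."""
    totals = Counter()
    for (step, _src, _dst), count in log.items():
        totals[step] += count
    return dict(sorted(totals.items()))
```

For example, on a three-vertex path `a - b - c` with both edge directions listed and `worker_of = {"a": 0, "b": 0, "c": 1}`, the log separates traffic inside worker 0 from traffic crossing between workers 0 and 1. Per-superstep and per-worker-pair counts of this kind are the sort of data in which a slow computing unit or an anomalous message pattern could show up.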

Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  • Alessio Arleo (1)
  • Walter Didimo (1)
  • Giuseppe Liotta (1)
  • Fabrizio Montecchiani (1)

  1. Università degli Studi di Perugia, Perugia, Italy
