Explaining Successful Docker Images Using Pattern Mining Analysis

  • Riccardo Guidotti
  • Jacopo Soldani
  • Davide Neri
  • Antonio Brogi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11176)


Docker is on the rise in today’s enterprise IT. It allows applications to be shipped inside portable containers, which are run from so-called Docker images. Docker images are distributed through public registries, which also track their popularity. The popularity of an image directly affects its usage, and hence the potential revenue of its developers. In this paper, we present an approach based on frequent pattern mining for understanding how to improve an image in order to increase its popularity. The results of this work can provide valuable insights to Docker image providers, helping them design more competitive software products.
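The abstract does not detail the mining procedure, so the following is only a minimal sketch of what frequent pattern mining over image descriptions could look like, not the authors' method. It assumes each image is reduced to a set of categorical attributes and uses brute-force enumeration rather than an optimized algorithm such as Apriori or FP-growth. The attribute names, the toy data, and the `frequent_itemsets` helper are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each popular image is described by made-up attributes.
popular_images = [
    {"small_size", "official", "has_readme"},
    {"small_size", "has_readme", "automated_build"},
    {"small_size", "official", "has_readme", "automated_build"},
    {"large_size", "has_readme"},
]


def frequent_itemsets(transactions, min_support, max_len=3):
    """Return {itemset: support} for itemsets with support >= min_support.

    Support is the fraction of transactions containing the itemset.
    Brute-force enumeration: fine for a toy example, not for large data.
    """
    n = len(transactions)
    counts = Counter()
    for transaction in transactions:
        items = sorted(transaction)
        for k in range(1, max_len + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


# Attribute combinations shared by at least 75% of the popular images.
patterns = frequent_itemsets(popular_images, min_support=0.75)
for itemset, support in sorted(patterns.items(), key=lambda p: -p[1]):
    print(sorted(itemset), support)
```

Under these assumptions, patterns that are frequent among popular images but rare among unpopular ones would be candidates for the kind of improvement advice the abstract describes.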



Work partly supported by the EU H2020 Program under the funding scheme “INFRAIA-1-2014-2015: Research Infrastructures”, grant agreement 654024 “SoBigData”.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Riccardo Guidotti (1, 2)
  • Jacopo Soldani (1)
  • Davide Neri (1)
  • Antonio Brogi (1)

  1. University of Pisa, Pisa, Italy
  2. KDDLab, ISTI-CNR, Pisa, Italy
