Repairing and Optimizing Hadoop hashCode Implementations

  • Zoltan A. Kocsis
  • Geoff Neumann
  • Jerry Swan
  • Michael G. Epitropakis
  • Alexander E. I. Brownlee
  • Sami O. Haraldsson
  • Edward Bowles
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8636)

Abstract

We describe how contract violations in Java™ hashCode methods can be repaired using a novel combination of semantics-preserving and generative methods, the latter being achieved via Automatic Improvement Programming. The method described is universally applicable. When applied to the Hadoop platform, it produced hashCode functions that are at least as good as both the original, broken method and those produced by a widely used alternative method from the ‘Apache Commons’ library.
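
For readers unfamiliar with the contract in question: Java requires that two objects that are equal according to equals() return the same hashCode(). The sketch below illustrates one way such a violation can arise and a straightforward manual fix. It is not the repair method described in the paper; the class names (BrokenKey, RepairedKey) and the contractHolds helper are hypothetical and exist only for illustration.

```java
import java.util.Objects;

public class HashCodeContractSketch {

    /** Hypothetical key: equals() ignores 'version', but hashCode() uses it. */
    public static final class BrokenKey {
        final String name;
        final int version;

        public BrokenKey(String name, int version) {
            this.name = name;
            this.version = version;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof BrokenKey && name.equals(((BrokenKey) o).name);
        }

        @Override
        public int hashCode() {
            // Contract violation: mixes in a field that equals() does not compare.
            return Objects.hash(name, version);
        }
    }

    /** Hypothetical repair: hashCode() depends only on the fields equals() compares. */
    public static final class RepairedKey {
        final String name;
        final int version;

        public RepairedKey(String name, int version) {
            this.name = name;
            this.version = version;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof RepairedKey && name.equals(((RepairedKey) o).name);
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }
    }

    /** Checks the contract for one pair: a.equals(b) implies equal hash codes. */
    public static boolean contractHolds(Object a, Object b) {
        return !a.equals(b) || a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // Equal keys with different hashes: prints false.
        System.out.println(contractHolds(new BrokenKey("part-0", 1), new BrokenKey("part-0", 2)));
        // Equal keys with equal hashes: prints true.
        System.out.println(contractHolds(new RepairedKey("part-0", 1), new RepairedKey("part-0", 2)));
    }
}
```

A violation of this kind matters on a platform such as Hadoop because hash-based collections and partitioning rely on equal keys hashing identically. The repaired version satisfies the contract but makes no claim about the quality of the hash distribution, which is the aspect the abstract's comparison addresses.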

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Zoltan A. Kocsis (1)
  • Geoff Neumann (1)
  • Jerry Swan (1)
  • Michael G. Epitropakis (1)
  • Alexander E. I. Brownlee (1)
  • Sami O. Haraldsson (1)
  • Edward Bowles (2)
  1. University of Stirling, UK
  2. University of York, UK
