Abstract
Finding similarities between two datasets is an important task in many research areas, particularly data mining, information retrieval, cloud computing, and biometrics. In recent years, however, preserving data protection and privacy while enabling similarity measurement has become a priority for data owners. In this paper, we study the design of an efficient and secure protocol for computing the Hamming distance between two semi-honest parties (a client and a server). The protocol is designed so that neither party learns anything beyond the computed result (privacy is protected) and so that the output of the protocol conforms to the prescribed functionality (correctness is guaranteed). To meet these requirements, we use a multiplicatively homomorphic cryptosystem and include chaff data in the computation. Two experimental results in this paper demonstrate the performance of both the client and the server.
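For readers unfamiliar with the two building blocks named in the abstract, the following minimal Python sketch illustrates (i) the plain Hamming distance between two equal-length vectors and (ii) what "multiplicatively homomorphic" means. It is not the protocol studied in the paper: the abstract does not name the cryptosystem, so textbook ElGamal is used here only as a well-known example of such a scheme, and the parameters `P`, `Q`, `G` and all function names are hypothetical toy choices that are far too small to be secure. Chaff data is not modeled.

```python
import secrets

# Toy group parameters (assumption, illustration only): P = 2*Q + 1 is a safe
# prime and G generates the subgroup of order Q. Real deployments use far
# larger parameters.
P = 467
Q = 233
G = 4


def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def keygen():
    """Return (private key x, public key h = G^x mod P)."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)


def encrypt(h, m):
    """Textbook ElGamal encryption of m (1 <= m < P) under public key h."""
    r = secrets.randbelow(Q - 1) + 1
    return pow(G, r, P), (m * pow(h, r, P)) % P


def decrypt(x, c):
    """Recover m from ciphertext c = (c1, c2) with private key x."""
    c1, c2 = c
    shared = pow(c1, x, P)
    return (c2 * pow(shared, -1, P)) % P  # modular inverse, Python 3.8+


def multiply(ca, cb):
    """Component-wise product of two ciphertexts: encrypts the product of the plaintexts."""
    return (ca[0] * cb[0]) % P, (ca[1] * cb[1]) % P


if __name__ == "__main__":
    print(hamming_distance([1, 0, 1, 1], [1, 1, 0, 1]))
    x, h = keygen()
    product = multiply(encrypt(h, 3), encrypt(h, 5))
    print(decrypt(x, product))
```

The point of the second half is only that multiplying two ciphertexts yields a ciphertext of the product of the plaintexts, without either plaintext being revealed to whoever performs the multiplication. How the paper builds a private Hamming distance protocol on top of such a property is described in the full text, not in this sketch.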
Cite this article
Wong, KS., Kim, M.H. On private Hamming distance computation. J Supercomput 69, 1123–1138 (2014). https://doi.org/10.1007/s11227-013-1063-z
DOI: https://doi.org/10.1007/s11227-013-1063-z