Protecting business intelligence and customer privacy while outsourcing data mining tasks

Qiu, Ling; Li, Yingjiu; Wu, Xintao

doi:10.1007/s10115-007-0113-3

Regular Paper
Published: 16 November 2007

Knowledge and Information Systems, volume 17, pages 99–120 (2008)

Ling Qiu¹,
Yingjiu Li² &
Xintao Wu³

Abstract

Data mining plays an important role in decision making. Because many organizations lack in-house data mining expertise, it is beneficial for them to outsource data mining tasks to external service providers. However, most organizations hesitate to do so for fear of losing business intelligence and compromising customer privacy. In this paper, we present a Bloom filter based solution that enables organizations to outsource the mining of association rules while protecting their business intelligence and customer privacy. Our approach can achieve high precision in data mining at the cost of a larger storage requirement.
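The abstract does not spell out the construction, so the following is only a generic illustration of the data structure it names, not the authors' scheme. It sketches a standard Bloom filter in Python and shows how a transaction encoded as a filter could answer "does this transaction appear to contain this itemset?" without listing items in the clear. The class name `BloomFilter`, the helper `encode`, the sizes `m` and `k`, the use of salted SHA-256, and the sample items are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter over strings (illustrative; m and k are arbitrary choices)."""

    def __init__(self, m: int = 64, k: int = 3) -> None:
        self.m = m      # number of bits in the filter
        self.k = k      # number of hash functions
        self.bits = 0   # a Python int used as an m-bit vector

    def _positions(self, item: str) -> list:
        # Derive k bit positions by salting SHA-256 with the hash index.
        positions = []
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            positions.append(int.from_bytes(digest[:8], "big") % self.m)
        return positions

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: str) -> bool:
        # False means definitely absent; True means present or a false positive.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))

    def covers(self, other: "BloomFilter") -> bool:
        # True when every bit set in `other` is also set in this filter.
        return (self.bits & other.bits) == other.bits


def encode(items, m: int = 64, k: int = 3) -> BloomFilter:
    """Encode a collection of items as a single Bloom filter."""
    bf = BloomFilter(m, k)
    for item in items:
        bf.add(item)
    return bf


# Hypothetical transactions, each stored only as a filter.
transactions = [
    encode(["bread", "milk", "eggs"]),
    encode(["bread", "milk"]),
    encode(["tea"]),
]
itemset = encode(["bread", "milk"])

# Estimated support: it can overcount (false positives) but never undercount.
estimated_support = sum(t.covers(itemset) for t in transactions)
```

In this sketch a larger `m` lowers the false-positive rate, so the estimated support moves closer to the true count, which is one way to read the abstract's trade-off between precision and storage.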

Author information

Authors and Affiliations

School of Mathematics, Physics and Information Technology, James Cook University, Townsville, QLD, 4811, Australia
Ling Qiu
School of Information Systems, Singapore Management University, Singapore, 178902, Singapore
Yingjiu Li
Department of Software and Information Systems, University of North Carolina at Charlotte, Charlotte, NC, 28223, USA
Xintao Wu

Corresponding author

Correspondence to Ling Qiu.

Additional information

This research was supported by the USA National Science Foundation Grants CCR-0310974 and IIS-0546027.

About this article

Cite this article

Qiu, L., Li, Y. & Wu, X. Protecting business intelligence and customer privacy while outsourcing data mining tasks. Knowl Inf Syst 17, 99–120 (2008). https://doi.org/10.1007/s10115-007-0113-3

Received: 07 February 2007
Revised: 04 July 2007
Accepted: 20 September 2007
Published: 16 November 2007
Issue Date: October 2008
DOI: https://doi.org/10.1007/s10115-007-0113-3
