Joint European Conference on Machine Learning and Knowledge Discovery in Databases

ECML PKDD 2012: Machine Learning and Knowledge Discovery in Databases, pp 1–18

AUDIO: An Integrity Auditing Framework of Outlier-Mining-as-a-Service Systems

  • Ruilin Liu,
  • Hui (Wendy) Wang,
  • Anna Monreale,
  • Dino Pedreschi,
  • Fosca Giannotti &
  • Wenge Guo

Part of the Lecture Notes in Computer Science book series (LNAI, volume 7524)

Abstract

Spurred by developments such as cloud computing, there has been considerable recent interest in the data-mining-as-a-service paradigm. Users who lack expertise or computational resources can outsource their data and mining needs to a third-party service provider (server). Outsourcing, however, raises issues about result integrity: how can the data owner verify that the mining results returned by the server are correct? In this paper, we present AUDIO, an integrity auditing framework for the specific task of outsourced distance-based outlier mining. It provides efficient and practical verification approaches to check both the completeness and the correctness of the mining results. The key idea of our approach is to insert a small number of artificial tuples into the outsourced data; these tuples produce artificial outliers and non-outliers that do not exist in the original dataset. The server’s answer is verified by checking for the presence of the artificial outliers and non-outliers, which yields a probabilistic guarantee of the correctness and completeness of the mining result. Our empirical results show the effectiveness and efficiency of our method.
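
The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's construction: the (p, d)-outlier definition used here (a tuple is an outlier when at least a fraction p of the other tuples lie farther than distance d from it) is the common distance-based one, and the placement rule in `plant_artificial`, the function names, and the simplified bound in `detection_probability` are all assumptions made for illustration.

```python
import math

def mine_outliers(data, d, p):
    """Indices of (p, d)-outliers: tuples with at least a fraction p of the
    other tuples lying farther than distance d away (naive O(n^2) scan)."""
    n = len(data)
    found = set()
    for i, x in enumerate(data):
        far = sum(1 for j, y in enumerate(data) if j != i and math.dist(x, y) > d)
        if far >= p * (n - 1):
            found.add(i)
    return found

def plant_artificial(data, d, p, n_out):
    """Append artificial tuples to a copy of data (hypothetical placement rule).
    Returns (augmented, artificial_outlier_ids, artificial_non_outlier_ids)."""
    dim = len(data[0])
    hi = max(abs(c) for row in data for c in row)
    pad = (0.0,) * (dim - 1)
    augmented = list(data)
    art_out, art_non = set(), set()
    # Isolated points spaced 2*d apart beyond the data's range: each one is
    # farther than d from every other tuple, so it is an outlier.
    for k in range(n_out):
        art_out.add(len(augmented))
        augmented.append((hi + 2 * d * (k + 1),) + pad)
    # Coincident points on the opposite side: each has n_non - 1 neighbours at
    # distance 0, and n_non is sized so that fewer than p of the rest are far.
    rest = len(augmented)
    n_non = math.floor(rest * (1 - p) / p) + 2
    for _ in range(n_non):
        art_non.add(len(augmented))
        augmented.append((-(hi + 2 * d),) + pad)
    return augmented, art_out, art_non

def verify(returned, art_out, art_non):
    """Owner-side check on the server's answer.
    complete: every artificial outlier was returned.
    correct: no artificial non-outlier was returned."""
    complete = art_out <= returned
    correct = not (art_non & returned)
    return complete, correct

def detection_probability(a, k):
    """Simplified model (an assumption, not the paper's bound): if the server
    keeps each true outlier independently with probability a, at least one of
    the k artificial outliers goes missing with probability 1 - a**k."""
    return 1 - a ** k
```

The owner would call `plant_artificial` before outsourcing, keep the two index sets secret, and run `verify` on whatever the server returns. Two simplifications matter in practice: the artificial tuples here are trivially distinguishable from real data, and adding them can change the outlier status of original tuples, so a real scheme has to address both.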

Keywords

  • Cloud Computing
  • Association Rule Mining
  • Data Owner
  • Mining Result
  • Frequent Itemset Mining

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Author information

Authors and Affiliations

  1. Stevens Institute of Technology, NJ, USA

    Ruilin Liu & Hui (Wendy) Wang

  2. University of Pisa, Pisa, Italy

    Anna Monreale & Dino Pedreschi

  3. ISTI-CNR, Pisa, Italy

    Fosca Giannotti

  4. New Jersey Institute of Technology, NJ, USA

    Wenge Guo

Editor information

Editors and Affiliations

  1. Intelligent Systems Laboratory, University of Bristol, Merchant Venturers Building, Woodland Road, BS8 1UB, Bristol, UK

    Peter A. Flach

  2. Intelligent Systems Laboratory, University of Bristol, Merchant Venturers Building, Woodland Road, BS8 1UB, Bristol, UK

    Tijl De Bie & Nello Cristianini

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Liu, R., Wang, H. (W.), Monreale, A., Pedreschi, D., Giannotti, F., Guo, W. (2012). AUDIO: An Integrity Auditing Framework of Outlier-Mining-as-a-Service Systems. In: Flach, P.A., De Bie, T., Cristianini, N. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2012. Lecture Notes in Computer Science, vol 7524. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33486-3_1

  • DOI: https://doi.org/10.1007/978-3-642-33486-3_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-33485-6

  • Online ISBN: 978-3-642-33486-3

  • eBook Packages: Computer Science, Computer Science (R0)
