Budget-Constrained Result Integrity Verification of Outsourced Data Mining Computations

  • Bo Zhang
  • Boxiang Dong
  • Wendy Wang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10359)


When data mining is outsourced to an untrusted service provider under the Data-Mining-as-a-Service (DMaS) paradigm, the client needs to verify that the service provider (server) returns correct mining results, which take the form of data mining objects. We consider the setting in which each data mining object carries a weight that reflects its importance. Because the client has a limited verification budget, the server selects a subset of mining results whose total verification cost does not exceed the budget and whose total weight is maximized. This maps to the well-known budgeted maximum coverage (BMC) problem, which is NP-hard, so the server may run a heuristic algorithm to select the subset of mining results for verification. The server has financial incentives to cheat on the heuristic output, so that the client ends up paying more to verify mining results that are less important. Our aim is to verify that the mining results selected by the server indeed satisfy the budgeted maximization requirement. Verifying the result integrity of heuristic algorithms is challenging because their results are non-deterministic. We design a probabilistic verification method that includes negative candidates (NCs), which are guaranteed to be excluded from the budgeted maximization result of ratio-based BMC solutions. Extensive experiments on real-world datasets show that the NC-based verification approach can achieve a high guarantee with small overhead.
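
To make the selection step concrete, the following is a minimal sketch of a ratio-based greedy heuristic for budgeted selection, together with a check for negative candidates in the server's answer. It is an illustration under simplifying assumptions, not the paper's algorithm: mining objects are treated as independent items with a weight and a positive verification cost (no overlapping coverage), and the names greedy_by_ratio and contains_negative_candidate are hypothetical. How NCs are constructed is not shown here.

```python
from typing import Dict, Iterable, List, Set, Tuple


def greedy_by_ratio(items: Dict[str, Tuple[float, float]], budget: float) -> List[str]:
    """Select items in descending weight/cost order while the budget allows.

    `items` maps an object id to (weight, cost); costs are assumed positive.
    This is a heuristic, so the result is not guaranteed to be optimal.
    """
    ranked = sorted(items, key=lambda k: items[k][0] / items[k][1], reverse=True)
    chosen: List[str] = []
    spent = 0.0
    for key in ranked:
        cost = items[key][1]
        if spent + cost <= budget:  # skip anything that would exceed the budget
            chosen.append(key)
            spent += cost
    return chosen


def contains_negative_candidate(selected: Iterable[str], negative_candidates: Set[str]) -> bool:
    """Return True if the selection includes any object known to be excluded.

    Assumes the client already holds a set of NC ids (hypothetical input).
    """
    return any(key in negative_candidates for key in selected)


if __name__ == "__main__":
    # Hypothetical mining objects: id -> (weight, verification cost)
    objects = {"a": (10, 2), "b": (6, 3), "c": (4, 4), "d": (1, 5)}
    print(greedy_by_ratio(objects, budget=6))
    print(contains_negative_candidate(["a", "d"], {"d"}))
```

If a returned selection contains an object that the client planted as an NC, the client has evidence that the server did not follow the ratio-based procedure; a selection without NCs only gives probabilistic assurance.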


Keywords: Data-Mining-as-a-Service (DMaS); Cloud computing; Result integrity; Budgeted maximization



Copyright information

© IFIP International Federation for Information Processing 2017

Authors and Affiliations

  1. Stevens Institute of Technology, Hoboken, USA
  2. Montclair State University, Montclair, USA
