Query-Based Data Valuation Strategy: An Exploratory View

  • Conference paper
Proceedings of International Conference on Data Science and Applications

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 288)

Abstract

In the era of big data, operational and user-related data is generated at huge scale and dimensionality by user platforms, high-end algorithms and computing devices. This data is an essential ‘asset’ that organizations and individuals draw on for diverse analytics. The recent abundance of data raises immense challenges and opportunities for a data-driven organization. Valuing the potential data objects in a very large dataset is one such multifaceted task, owing to the inherent characteristics/features of data objects and the lack of a global measure or mechanism for evaluation. A data valuation scheme helps organizations rank, outline or weight the potential data objects for a computational objective. In this paper, we explore the fundamental aspects of traditional data valuation approaches to investigate the evolution of existing techniques and their implicit aspects. In the process, an automated data valuation strategy is proposed. The strategy evaluates the values of data objects based on the assessment of user queries and ranked attributes of a target database. The key contribution of the work is its capability to evaluate the data value at a desired granularity level, e.g. attribute level, tuple level, record level, etc., on a just-in-time basis for the buyer/consumer. Each data object is assigned a rank value and could be adapted by several consumers/buyers. The paper also discusses the design challenges and issues for the development of similar approaches in the future.
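
The abstract gives no formulas, so the following is only a minimal sketch of how query-driven valuation at attribute and tuple granularity might look. It rests on assumptions that are not taken from the paper: an attribute's rank is the share of logged queries that reference it, and a tuple's value is the sum of the ranks of its non-null attributes. The names `attribute_ranks`, `tuple_value`, `query_log` and the sample rows are hypothetical.

```python
from collections import Counter


def attribute_ranks(query_log):
    """Assumed ranking rule: share of logged queries that reference each attribute."""
    # Count each attribute at most once per query.
    counts = Counter(attr for attrs in query_log for attr in set(attrs))
    total = len(query_log)
    return {attr: n / total for attr, n in counts.items()}


def tuple_value(row, ranks):
    """Assumed tuple-level value: sum of the ranks of the row's non-null attributes."""
    return sum(ranks.get(attr, 0.0) for attr, v in row.items() if v is not None)


# Hypothetical query log: each entry lists the attributes one query touched.
query_log = [
    ["price", "symbol"],
    ["price"],
    ["symbol", "volume"],
    ["price", "volume"],
]
ranks = attribute_ranks(query_log)  # attribute-level values

# Hypothetical rows of the target database.
rows = [
    {"symbol": "ABC", "price": 10.5, "volume": None},
    {"symbol": "XYZ", "price": 7.2, "volume": 300},
]
values = [tuple_value(r, ranks) for r in rows]  # tuple-level values
```

Because the ranks are recomputed from whatever query log is supplied, values in this sketch can be produced on demand for a particular consumer, which is one possible reading of the just-in-time evaluation the abstract mentions.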

Author information

Corresponding author

Correspondence to Vikram Singh.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Verma, N., Singh, V. (2022). Query-Based Data Valuation Strategy: An Exploratory View. In: Saraswat, M., Roy, S., Chowdhury, C., Gandomi, A.H. (eds) Proceedings of International Conference on Data Science and Applications. Lecture Notes in Networks and Systems, vol 288. Springer, Singapore. https://doi.org/10.1007/978-981-16-5120-5_52
