Modeling Data Heterogeneity Using Big DataSpace Architecture

  • Conference paper
Advanced Computing and Communication Technologies

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 452)

Abstract

With the wide use of information technology in advanced analytics, three characteristics of big data have been identified: volume, velocity and variety. The first two, volume and velocity, have received considerable attention, while far less thought has been given to the variety of data available worldwide. Data variety refers to the nature of the data being stored and processed, which takes three distinct forms: structured, semi-structured and unstructured. Current general-purpose solutions for handling data variety are either costlier than customized solutions or less efficient at coping with data heterogeneity. A basic idea, therefore, is to first design data processing systems that provide an abstraction covering a wide range of data types and that support fundamental processing on the underlying heterogeneous data. In this paper, we conceptualize a data management architecture, ‘Big DataSpace’, for big data processing with the capability to combine heterogeneous data from various data sources. We further explain how the Big DataSpace architecture can help in processing heterogeneous and distributed data, a fundamental task in data management.

Author information

Corresponding author

Correspondence to Vishal Sheokand.

Copyright information

© 2016 Springer Science+Business Media Singapore

About this paper

Cite this paper

Sheokand, V., Singh, V. (2016). Modeling Data Heterogeneity Using Big DataSpace Architecture. In: Choudhary, R., Mandal, J., Auluck, N., Nagarajaram, H. (eds) Advanced Computing and Communication Technologies. Advances in Intelligent Systems and Computing, vol 452. Springer, Singapore. https://doi.org/10.1007/978-981-10-1023-1_26

  • DOI: https://doi.org/10.1007/978-981-10-1023-1_26

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-1021-7

  • Online ISBN: 978-981-10-1023-1

  • eBook Packages: Engineering, Engineering (R0)
