SmartData 4.0: a formal description framework for big data

  • Morteza Sargolzaei Javan
  • Mohammad Kazem Akbari
Article

Abstract

Describing big data problems and solutions in a formal language can accelerate innovation and development across many sectors and help launch smarter, data-driven services and applications. SmartData 4.0 is a framework for expressing metadata and relations in a formal language. It can also be viewed as a technique that empowers raw data by wrapping it in a cloak of intelligence. From linear regression to more complex mathematical models, the SmartData Description Framework enables us to define context-aware behaviors linked to data. The framework also supports formalized description of data operations such as data fusion, transformation, and provenance management. We present practical examples step by step throughout the formalization process.

Keywords

Semantic Web; Linked Data; Model-driven engineering; Metadata; Contextualization

Notes

Acknowledgements

First, I would like to thank my wife for her patience and her encouragement of my writing over such a long period. This work has been supported by the joint Cloud Computing workgroup of the High Performance Computing Research Center (HPCRC) and the Cloud Research Center (CRC) under Grant No. Cloud-100516-1771, and by the joint Big Data workgroup of the Iran Telecommunication Research Center (ITRC) with the Open Community of Cloud Computing (OCCC) under Grant No. 228.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Morteza Sargolzaei Javan
    • 1
  • Mohammad Kazem Akbari
    • 1
  1. Department of Computer Engineering and Information Technology, Amirkabir University of Technology, Tehran, Iran
