A Generalized Scalable Software Architecture for Analyzing Temporally Structured Big Data in the Cloud

  • Magnus Westerlund
  • Ulf Hedlund
  • Göran Pulkkis
  • Kaj-Mikael Björk
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 275)

Abstract

Software architectures that allow researchers to explore advanced modeling by scaling horizontally in the cloud can lead to new insights and improved accuracy of modeling results. We propose a generalized, highly scalable information system architecture that researchers can employ in predictive analytics research to work with both historical data and real-time, temporally structured big data. The proposed architecture is fully automated and uses the same analytical software for both training and live predictions.

Keywords

  • predictive analytics
  • temporal data
  • system-level design
  • self-adaptive systems
  • runtime models

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Magnus Westerlund (1)
  • Ulf Hedlund (1)
  • Göran Pulkkis (1)
  • Kaj-Mikael Björk (1)

  1. Arcada University of Applied Sciences, Helsinki, Finland
