Abstract
Artificial Neural Networks (ANNs) are a family of powerful machine learning techniques used to solve many real-world problems. Their applications can be broadly grouped into classification (pattern recognition), prediction, and modeling. As with other machine learning techniques, ANNs are gaining momentum in the Big Data era for analysing and extracting predictions from large data sets. ANNs bring new opportunities for extracting accurate information from Big Data, yet they also face challenges not encountered with traditional data sets. Indeed, the success of learning and modeling Big Data with ANNs varies with training sample size and depends on data dimensionality, complex data formats, data variety, and so on. In particular, ANN performance is directly influenced by data size, which in turn drives memory requirements. In this context, and since a data set may no longer fit into main memory, it is interesting to investigate the performance of ANNs when data is read from main memory versus from disk. This study presents a performance evaluation of an Artificial Neural Network (ANN) with multiple hidden layers when the training data is read from memory or from disk, and it also shows the trade-offs between processing time and data size when using ANNs.
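The memory-versus-disk comparison described above can be sketched with a small, self-contained experiment: train the same one-hidden-layer perceptron twice, once with the training set held in RAM as a NumPy array and once with it memory-mapped from a file on disk, and compare wall-clock training times. The toy dataset, the tiny numpy MLP, and the timing loop below are all illustrative assumptions for this sketch, not the authors' actual experimental setup (the references suggest a Spark MLlib-based configuration).

```python
import os
import tempfile
import time

import numpy as np

rng = np.random.default_rng(0)

# Toy training set: 20k samples, 50 features, binary labels.
X = rng.standard_normal((20_000, 50)).astype(np.float32)
y = (X[:, 0] > 0).astype(np.float32)


def train_mlp(X, y, hidden=32, epochs=10, lr=0.5):
    """One-hidden-layer perceptron trained with plain batch gradient descent."""
    n, d = X.shape
    W1 = rng.standard_normal((d, hidden)).astype(np.float32) * 0.1
    W2 = rng.standard_normal((hidden, 1)).astype(np.float32) * 0.1
    for _ in range(epochs):
        h = np.tanh(X @ W1)                  # hidden-layer activations
        p = 1.0 / (1.0 + np.exp(-(h @ W2)))  # sigmoid output probabilities
        err = p - y[:, None]                 # gradient of log-loss w.r.t. logits
        W2 -= lr * (h.T @ err) / n
        W1 -= lr * (X.T @ ((err @ W2.T) * (1.0 - h ** 2))) / n
    acc = float(np.mean((p[:, 0] > 0.5) == y))
    return acc

# Run 1: training data held entirely in main memory.
t0 = time.perf_counter()
acc_mem = train_mlp(X, y)
t_mem = time.perf_counter() - t0

# Run 2: the same data memory-mapped from a .npy file on disk.
path = os.path.join(tempfile.mkdtemp(), "train.npy")
np.save(path, X)
X_disk = np.load(path, mmap_mode="r")
t0 = time.perf_counter()
acc_disk = train_mlp(X_disk, y)
t_disk = time.perf_counter() - t0

print(f"memory: {t_mem:.3f}s (acc={acc_mem:.2f})  "
      f"disk: {t_disk:.3f}s (acc={acc_disk:.2f})")
```

Note that on a warm OS page cache the gap between the two runs can be small; the effect the study measures becomes pronounced only at scales where the data set no longer fits in RAM and disk reads dominate each training pass.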
Copyright information
© 2019 Springer International Publishing AG, part of Springer Nature
Cite this paper
Goga, K., Xhafa, F., Terzo, O. (2019). An Evaluation of Neural Networks Performance for Job Scheduling in a Public Cloud Environment. In: Barolli, L., Javaid, N., Ikeda, M., Takizawa, M. (eds) Complex, Intelligent, and Software Intensive Systems. CISIS 2018. Advances in Intelligent Systems and Computing, vol 772. Springer, Cham. https://doi.org/10.1007/978-3-319-93659-8_69
Print ISBN: 978-3-319-93658-1
Online ISBN: 978-3-319-93659-8
eBook Packages: Intelligent Technologies and Robotics