A Novel Approach for Gigantic Data Examination Utilizing the Apache Spark and Significant Learning

  • Anilkumar V. Brahmane
  • B. Chaitanya Krishna
Conference paper
Part of the Lecture Notes in Networks and Systems book series (LNNS, volume 98)


With the growing prevalence of Big Data, many advances have been made in this field, and frameworks such as Apache Hadoop and Apache Spark have gained considerable traction over the past decades and become very popular, particularly in industry. It is becoming increasingly clear that effective big data analysis is key to solving artificial intelligence problems. To that end, a multi-algorithm library, MLlib, has been implemented within the Apache Spark framework. Although this library supports many machine learning algorithms, there is still scope to use the Spark setup efficiently for highly time-intensive and computationally expensive procedures such as deep learning. We propose an effective framework that combines the distributed computing capabilities of Apache Spark with the advanced machine learning architecture of a deep multilayer perceptron (MLP), using the popular concept of cascade learning. We conduct an empirical evaluation of our framework on two well-known real-world datasets. The results are encouraging and support the proposed framework, indicating that it improves on conventional big data analysis methods that use either Spark or deep learning as individual components.


Keywords: Deep learning, Apache Spark, Big data


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Department of CSE, KLEF Deemed to be University, Vijaywada, India
