Abstract
Software engineering has evolved over the last 50 years, initially as a response to the so-called software crisis of the 1960s and 1970s (the problems that organizations had producing quality software systems on time and on budget). Software engineering (SE) has been defined as “the application of a systematic, disciplined, quantifiable approach to the development, operation, and maintenance of software; that is, the application of engineering to software”. Software engineering has developed a number of approaches to areas such as software requirements, software design, software testing, and software maintenance. Software development processes such as the waterfall model, incremental development, and the spiral model have been successfully applied to produce high-quality software on time and under budget. More recently, agile software development has gained popularity as an alternative to the more traditional methods for developing complex systems. Within the last decade or so, advances in technologies such as mobile computing, social networks, cloud computing, and the Internet of Things have given rise to massive datasets, which have been given the name Big Data (BD). Big Data has been defined as data with the 3Vs: high volume, velocity, and variety. Big Data sets are so large that even low-probability events are captured in them. These events can be discovered using analytics methods and turned into actionable intelligence that businesses can use to gain a competitive advantage. Unfortunately, the very scale of BD often renders inadequate the SQL-based relational database systems that have formed the backbone of data-intensive systems for the last 30 years, so new NoSQL technologies are required to handle it effectively. In this paper, we explore how well-established SE technology can be adapted to support the successful development of BD projects, as well as how BD techniques can be used to increase the utility of SE processes and techniques. Thus, BD and SE may mutually support and enrich each other.
Cite this article
Arndt, T. Big Data and software engineering: prospects for mutual enrichment. Iran J Comput Sci 1, 3–10 (2018). https://doi.org/10.1007/s42044-017-0003-0
DOI: https://doi.org/10.1007/s42044-017-0003-0