Big Data Analysis Procedure Model for Manufacturing and Logistics: Strategies and Tools for the Practical Application

  • Marco Hübner
  • Philipp Jahn
  • Gregor Tewaag
Conference paper


With the gradual digitalisation of production, more and more data is collected from diverse sources within a production process [1]. Awareness of the potential insights that can be generated from this data has increased massively in recent years [2, 3]. By now, even small enterprises have the capacity to store the routinely incoming data. Crucial for the success of Big Data projects, however, is the proficient processing of this data [4].

In this paper, a model for the general approach to Big Data Analysis is presented, and the essential mathematical tools are described in terms of their performance and suitability. To this end, multiple data-mining models have been analysed and combined into a holistic approach. Additionally, different analysis strategies (e.g. correlation, regression, clustering and decision trees) have been evaluated regarding their uses and limitations.

For verification, the derived model was tested in collaboration with an industry partner on a multistage production process. Prediction models were developed and verified on a test set of the data. The data-mining workbench KNIME was used for the preparation and analysis of the data population.
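The idea of verifying a prediction model on held-out test data can be illustrated with a minimal Python sketch. It is not the KNIME workflow used in the study: the parameter names, the values and the choice of a simple least-squares line are assumptions made purely for illustration.

```python
def fit_line(x, y):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_absolute_error(x, y, slope, intercept):
    """Average absolute deviation between predictions and observations."""
    return sum(abs(b - (slope * a + intercept)) for a, b in zip(x, y)) / len(x)


# Hypothetical measurements (illustrative values only, not data from the paper):
# a parameter recorded at an early process stage and a result at a later stage.
stage_input = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
stage_result = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0, 13.8, 16.1, 18.0, 19.9]

# Build the model on the first seven records, hold out the last three for testing.
train_x, test_x = stage_input[:7], stage_input[7:]
train_y, test_y = stage_result[:7], stage_result[7:]

slope, intercept = fit_line(train_x, train_y)
test_error = mean_absolute_error(test_x, test_y, slope, intercept)

print(f"slope={slope:.3f}, intercept={intercept:.3f}, test MAE={test_error:.3f}")
```

Evaluating the error only on records that were not used for fitting gives a more honest estimate of how the model behaves on new production data.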

It was shown that multivariate linear correlations can be detected and examined using different analysis tools such as correlation coefficient matrices, principal component analysis (PCA) or multidimensional scaling (MDS). Clusters and rule-based decision tree models were found as well. Based on these findings, the assessed production process could be optimised. The derived structure and procedure plan made it possible to combine the advantages of the aforementioned data-mining models. A reduction of the processing time and an improved error prediction were achieved. Additionally, a number of previously unknown relationships between the collected parameters were discovered.
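As a rough illustration of the first of these tools, the following Python sketch computes a matrix of Pearson correlation coefficients for a few process parameters. The parameter names and values are hypothetical and do not come from the study, which used KNIME rather than hand-written code.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return (names, matrix) with the pairwise Pearson coefficients."""
    names = list(columns)
    matrix = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    return names, matrix


# Hypothetical process parameters (illustrative values only, not data from the paper).
process_data = {
    "temperature": [20.0, 21.0, 22.0, 23.0, 24.0, 25.0],
    "cycle_time": [40.5, 42.0, 44.5, 46.0, 48.5, 50.0],
    "scrap_rate": [5.0, 4.6, 4.1, 3.9, 3.2, 3.0],
}

names, matrix = correlation_matrix(process_data)
for name, row in zip(names, matrix):
    print(f"{name:12s}", "  ".join(f"{value:+.2f}" for value in row))
```

Coefficients close to +1 or -1 point to strong linear relationships between two parameters. Such pairs would be natural candidates for closer inspection with regression, PCA or MDS.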


Keywords: Big Data Analysis, Data mining, Process optimisation


References

  1. Bundesministerium für Wirtschaft und Energie: Smart Data - Innovationen aus Daten (2016)
  2. Spath, D.: Arbeitswelten 4.0. Fraunhofer Verlag, Stuttgart (2013)
  3. Projektgruppe Smart Data: Smart Data - Potentiale und Herausforderungen, Vernetzte Anwendungen und Plattformen für die digitale Gesellschaft (2014)
  4. Zikopoulos, P., Eaton, C.: Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data. McGraw-Hill, New York (2012)
  5. Spath, D., Ganschar, O., Gerlach, S., Hämmerle, M., Krause, T., Schlund, S.: Produktionsarbeit der Zukunft-Industrie 4.0. Fraunhofer Verlag, Stuttgart (2013)
  6. Azevedo, A., Filipe Santos, M.: KDD, SEMMA and CRISP-DM: a parallel overview. IADS-DM (2008)
  7. Morik, K.: Der CRISP-DM Prozess für Data Mining. Technische Universität Dortmund, Dortmund (2016)
  8. Krcmar, H.: Knowledge Discovery in Database on the Example of Engineering Change Management. Technische Universität München, München (2010)
  9. Sharafi, A.: Knowledge Discovery in Databases - Eine Analyse des Änderungsmanagements in der Produktentwicklung. Springer Gabler, Wiesbaden (2013)
  10. Sharafi, A., Wolf, P., Krcmar, H.: Knowledge Discovery in Databases on the Example of Engineering Change Management. Industrial Conference on Data Mining-Poster and Industry Proceedings (2010)
  11. Strüby, R.: Data mining mit der SEMMA Methode. Enterprise Miner for Windows NT (1998)
  12. Tombrock, P.: Knowledge Discovery in Datenbanken. Fallstudie, Hochschule für Ökonomie und Management (2016)
  13. Berthold, M.R.: Guide to Intelligent Data Analysis. How to Intelligently Make Sense of Real Data. Texts in Computer Science, vol. 42. Springer-Verlag London Limited, London (2010)
  14. Hinrichs, H.: Datenqualität im Data Warehouse. Dissertation, Universität Oldenburg, Oldenburg (2002)
  15. Hildebrand, K.: Datenqualität im Supply Chain Management. Dissertation, Fachhochschule Darmstadt (2010)
  16. Dias, R.: Nonparametric Regression: Lowess/Loess. Advanced Geographic Data Analysis Scatter-Diagram Smoothing (2014)
  17. Jacoby, B.: Regression: Advanced Methods. Michigan (2017)
  18. Chatterjee, S., Price, B.: Chatterjee-Price: Praxis der Regressionsanalyse, 2nd edn. Lehr- und Handbücher der Statistik. Oldenbourg, München (1995)
  19. Handl, A.: Multivariate Analysemethoden - Theorie und Praxis multivariater Verfahren unter besonderer Berücksichtigung von S-PLUS. Statistik und ihre Anwendungen. Springer-Verlag, Heidelberg (2010)
  20. Cornish, R.: Statistics: Cluster Analysis. Michigan (2007)
  21. Blobel, V., Lohrmann, E.: Statistische und numerische Methoden der Datenanalyse, 2nd edn. V. Blobel, Hamburg (2012)
  22. Assenmacher, W.: Deskriptive Statistik, 4th edn. Springer-Lehrbuch. Springer, Berlin (2010)
  23. Jeseke, M., Grüner, M., Weiß, F.: Big Data in Logistics. A DHL perspective on how to move beyond the hype. Troisdorf (2013)
  24. Stackowiak, R., Manta, V., Licht, A.: Improving Logistics & Transportation Performance with Big Data. Architects Guide and Reference Architecture Introduction. Oracle Enterprise Architecture White Paper, Oracle Corporation, Redwood Shores (2015)
  25. Fleet Owner: Survey: Need for big data in logistics keeps growing. Annual study (2016)
  26. Runkler, T.A.: Data Mining. Modelle und Algorithmen Intelligenter Datenanalyse, 2nd edn. Computational Intelligence. Springer Vieweg, Wiesbaden (2015)
  27. Sataev, X.: Data Mining. Ausarbeitung im Rahmen der Vorlesung “Next Media”. Vortrag, Hochschule für Angewandte Wissenschaften Hamburg, Hamburg (2015)

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Institut für Fabrikanlagen und Logistik (IFA), Garbsen, Germany
  2. Institut für Montagetechnologie (MATCH), Garbsen, Germany
  3. Sartorius Lab Instruments GmbH & Co. KG, Göttingen, Germany
