Sustainable Industrial Processes by Embedded Real-Time Quality Prediction

  • Marco Stolpe
  • Hendrik Blom
  • Katharina Morik
Part of the Studies in Computational Intelligence book series (SCI, volume 645)


Sustainability of industrial production focuses on minimizing greenhouse gas emissions and the consumption of materials and energy. Iron and steel production offers enormous potential for resource savings through production enhancements. This chapter describes how embedding data analysis (data mining, machine learning) in steel production saves resources. The steps of embedded data analysis are presented comprehensively, together with an overview of related work, and the challenges that (steel) production poses for data analysis are investigated. A framework for processing data streams is used for real-time processing. We have developed new algorithms that learn from aggregated data and from vertically distributed data. Two real-world case studies are described: the prediction of the Basic Oxygen Furnace endpoint and the quality prediction in a hot rolling mill process. Neither case study is an academic prototype; both are real-world applications.


Support Vector Machine, Dynamic Time Warping, Sensor Reading, Concept Drift, Basic Oxygen Furnace
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



This work has partially been supported by the DFG, Collaborative Research Center 876, projects B3 and A1. The case study on BOF end time prediction has partially been supported by SMS Siemag and was developed in collaboration with Hans-Jürgen Odenthal, Jochen Schlüter and Norbert Uebber. The application is being pushed forward by Markus Reifferscheidt and Burkhard Dahmen from the SMS group. We thank AG der Dillinger Hüttenwerke, particularly Helmut Lachmund and Dominik Schöne, for testing our methods and for insightful discussions. The hot rolling mill application has been developed in cooperation with Jochen Deuse, Daniel Lieber, Benedikt Konrad and Fabian Bohnen from the TU Dortmund University. We thank Ulrich Reichel and Alfred Weiß from the Deutsche Edelstahl Werke.



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. TU Dortmund University, Dortmund, Germany
