Resource-Aware Steel Production Through Data Mining

Part of the Lecture Notes in Computer Science book series (LNAI, volume 9853)


Today’s steel industry is characterized by overcapacity and increasing competitive pressure. Processes need to be improved continuously, with a focus on efficiency and quality and thereby on competitiveness. Worldwide, about 70 % of steel is produced using the BF-BOF (Blast Furnace - Basic Oxygen Furnace) route. The BOF is the first step of controlling the composition of the steel and has an impact on all further processing steps and on the overall quality of the end product. Multiple sources of process-related variance and harsh conditions for sensors and automation systems lead to a process complexity that is not easy to model with thermodynamic or metallurgical approaches. In this paper we give an insight into how machine-learning-based modeling can improve the output quality, and which constraints and requirements an online application in real time has to satisfy.


  • Real time regression
  • Model predictive control
  • Prescriptive data analytics

1 Introduction

There are several ways to produce steel; a complete overview can be found in [6]. About 70 % of steel is produced using the BF-BOF (Blast Furnace - Basic Oxygen Furnace) route [5]. The first step is to smelt ores to raw iron in a blast furnace. Coke is used as the primary energy source and as a reducing agent: its carbon binds the oxygen of the iron oxides. At the end of this step, liquid raw iron is produced and transported to the BOF. The liquid raw iron has a temperature of 1,200 ℃ and a very high concentration of carbon and other unwanted substances. In the given use case [9], the BOF is charged with 150 tons of liquid raw iron and around 30 tons of scrap metal. The unwanted contents (except carbon) are bound in the slag by blowing pure oxygen onto the mixture of liquid raw iron and scrap metal. The whole mixture is stirred by bottom gas injection. During the process, the raw iron is heated to 1,600 ℃; the required energy is released by the combustion of the carbon contained in the raw iron. After 20 to 30 min, the process is stopped based on an analysis of the off-gas composition. The high temperature makes it very expensive and technically challenging [2] to measure the state of the BOF content directly during the process. Usually, there is only a single measurement at the end of the process. In the given use case, the quality of the output of the BOF process is described by the temperature, the carbon and phosphorus content of the raw steel, and the iron content of the slag at the end of the process. Depending on the difference between the measured and the predefined target values, the process is repeated until all quality indicators are within the specifications. Because there is only a single measurement at the end, the process can only be controlled on the basis of predictions of the quality indicators. The prediction of a single quality indicator can be framed as a learning task.
After one or multiple refinement steps, casting and rolling, the steel is delivered as coil, plate, sections or bars. The BOF process is the first step of controlling the composition of the steel. The quality of its output has an impact on all further processing steps and on the overall quality of the end products. Thus, the quality requirements for the output are usually quite strict. Up to 20 % of the processes [2] may have to be restarted at least once due to quality issues of the output. Hence, improving the prediction is decisive for increasing efficiency and saving resources [7, 10].
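
To make the learning task concrete: every completed BOF process yields one training example, consisting of features known before or during the process and the value of one quality indicator measured at the end. The following minimal sketch fits a one-feature least-squares line to invented numbers. The feature (total blown oxygen), all data values and the names `fit_line` and `predict_temperature` are assumptions made for illustration; they are not the features or models of [7, 9].

```python
from statistics import mean


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


# Invented training data, one entry per completed process (NOT plant data).
# Feature: total blown oxygen in Nm^3 (hypothetical choice).
oxygen_nm3 = [7000.0, 7200.0, 7400.0, 7600.0, 7800.0]
# Label: end-point temperature in degrees Celsius, measured once at the end.
end_temperature = [1580.0, 1590.0, 1600.0, 1610.0, 1620.0]

intercept, slope = fit_line(oxygen_nm3, end_temperature)


def predict_temperature(oxygen):
    """Predict the end-point temperature for a planned oxygen amount."""
    return intercept + slope * oxygen


print(predict_temperature(7500.0))
```

Each of the four quality indicators (temperature, carbon, phosphorus, iron content of the slag) would get its own model of this kind, trained on the same set of completed processes.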

2 Process Control

There are multiple ways to control the outcome of the process directly. Corrective actions have the largest impact if they are executed as early as possible in the process. The most common approach is to precalculate the amount of blown oxygen and of heating, cooling and slagging agents based on thermodynamic and metallurgical calculations [4]. The major challenge is the multiple sources of variance in the process. Wear and tear, weather, shift work, the unknown state and composition of the input materials and the high volume of the BOF lead to conditions that are hard to model with classical metallurgical approaches. Either the models have numerous parameters and are therefore complex to handle, or too few parameters limit their reliability. Nevertheless, the resulting predictions and the corrective actions of the operators usually deliver good results already. But even if the optimal metallurgical model were used, the harsh conditions would still lead to wear or failure of sensors and other automation equipment. If not handled properly, the reduced data quality and sensor reliability significantly reduce the quality of every prediction.
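
As a rough illustration of what such a precalculation involves, the sketch below estimates the oxygen needed for decarburization alone from stoichiometry. It is deliberately simplified and rests on assumptions not taken from the paper: all removed carbon is oxidized to CO (C + 1/2 O2 -> CO), the scrap contributes no carbon, the raw iron contains 4.5 % carbon and the target is 0.05 %. Real metallurgical models [4] also account for the oxidation of other elements, post-combustion and the heat balance. The function name `oxygen_for_decarburization` is hypothetical.

```python
M_C = 12.011           # molar mass of carbon, g/mol
M_O2 = 31.998          # molar mass of O2, g/mol
MOLAR_VOLUME = 22.414  # ideal gas, litres per mol at 0 degrees C and 1 atm


def oxygen_for_decarburization(iron_t, carbon_start_pct, carbon_end_pct):
    """Return (kg O2, Nm^3 O2) to oxidize the removed carbon to CO.

    Assumes C + 1/2 O2 -> CO for all removed carbon (simplification).
    """
    carbon_kg = iron_t * 1000.0 * (carbon_start_pct - carbon_end_pct) / 100.0
    mol_carbon = carbon_kg * 1000.0 / M_C
    mol_o2 = 0.5 * mol_carbon          # half a mole of O2 per mole of C
    o2_kg = mol_o2 * M_O2 / 1000.0
    o2_nm3 = mol_o2 * MOLAR_VOLUME / 1000.0
    return o2_kg, o2_nm3


# 150 t of liquid raw iron as in the use case; carbon contents are assumed.
o2_kg, o2_nm3 = oxygen_for_decarburization(150.0, 4.5, 0.05)
print(f"{o2_kg:.0f} kg O2, {o2_nm3:.0f} Nm^3 O2")
```

Every input of such a calculation (charge weight, carbon content, scrap composition) is itself uncertain in practice, which is one way the sources of variance described above enter the result.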

3 The BOF Process from a Data Point of View

The data of the BOF process comprise continuous and event-based data. These data streams are generated by two different data sources (Level 2 and Level 3 systems [3]). The data streams can be merged and partitioned into a sequence of BOF processes. The event-based data stream contains the results of the composition analysis and of other external measurements, events such as the addition of cooling or heating agents, and meta-data about the state of the BOF itself. The continuous data stream contains all in-process measurements, such as the off-gas composition, the oxygen and cooling water flow and multiple temperatures. A considerable proportion of the 100 raw features is not usable because the sensors are not adequately positioned. Until now, data analysis has only aimed at giving the metallurgical experts a better understanding of the process. Even where learning algorithms were used to model the process, neither automatic feature extraction nor an application of the learned models has been performed [11].
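
Merging the two streams and cutting the result into individual BOF processes can be sketched as follows. The record layout `(timestamp, kind, payload)`, the event names `heat_start` and `heat_end`, and all values are assumptions for illustration; the actual Level 2 and Level 3 interfaces are not described here.

```python
import heapq


def merge_streams(continuous, events):
    """Merge two timestamp-sorted record streams into one sorted stream."""
    return heapq.merge(continuous, events, key=lambda record: record[0])


def partition_into_processes(records):
    """Group records between a start event and the matching end event."""
    processes, current = [], None
    for timestamp, kind, payload in records:
        if kind == "heat_start":
            current = {"id": payload, "records": []}
        elif kind == "heat_end":
            if current is not None:
                processes.append(current)
            current = None
        elif current is not None:
            # Measurements and events outside any process are dropped.
            current["records"].append((timestamp, kind, payload))
    return processes


# Invented example data: timestamps in seconds, values are not plant data.
continuous = [(1, "offgas_co", 0.10), (3, "offgas_co", 0.45),
              (5, "offgas_co", 0.30), (8, "offgas_co", 0.05)]
events = [(2, "heat_start", "H1"), (4, "addition", "lime"),
          (6, "heat_end", "H1"), (7, "heat_start", "H2"),
          (9, "heat_end", "H2")]

processes = partition_into_processes(merge_streams(continuous, events))
for process in processes:
    print(process["id"], len(process["records"]))
```

Because `heapq.merge` is lazy, the same code works on unbounded streams as long as each input is ordered by timestamp.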

4 Offline Analysis and Online Application

The largest improvement in prediction quality comes from tuning the features. For the first time, we have constructed multiple new features that describe the BOF process better and monitor its state more directly [7]. The promising results led to the implementation and application of a prototype at the steel factory itself [9]. The online application of learned models should move beyond merely hand-coding the model into a control program. Some factories have up to 6 BOFs installed. Every BOF is in a different physical state, and the input materials differ from factory to factory. Consequently, every BOF requires a different set of models and different update policies. Managing and updating the models manually would require great effort. The wear and failure of equipment and sensors lead to concept drift [1] or to a complete loss of raw data and of all features extracted from them. Therefore, multiple models for the multiple sensor settings are needed, together with online monitoring and management of these models. To the best of our knowledge, we are the first to have developed an online model management module. We implemented a modular and scalable architecture that is able to connect to multiple legacy systems, store all data efficiently, dynamically extract new features from these raw data, learn new models and apply these models in real time [8].
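
The idea of keeping one model per sensor setting and monitoring the models online can be sketched as below. This is not the architecture of [8]: the class names `ModelRegistry` and `ErrorMonitor`, the feature names, the model coefficients and the thresholds are all hypothetical. The monitor uses a plain rolling mean of absolute errors as a stand-in for a drift detector; it is not the early drift detection method of [1].

```python
from collections import deque


class ModelRegistry:
    """Holds models together with the features each one requires."""

    def __init__(self):
        self._models = []  # list of (frozenset of feature names, function)

    def register(self, required_features, predict):
        self._models.append((frozenset(required_features), predict))

    def select(self, features):
        """Pick the usable model that needs the most features, or None."""
        available = {name for name, value in features.items()
                     if value is not None}
        usable = [(req, fn) for req, fn in self._models if req <= available]
        if not usable:
            return None
        return max(usable, key=lambda entry: len(entry[0]))[1]


class ErrorMonitor:
    """Flags a model once its rolling mean absolute error is too high."""

    def __init__(self, window, threshold):
        self.errors = deque(maxlen=window)
        self.threshold = threshold

    def update(self, predicted, measured):
        self.errors.append(abs(predicted - measured))
        window_full = len(self.errors) == self.errors.maxlen
        mean_error = sum(self.errors) / len(self.errors)
        return window_full and mean_error > self.threshold


def full_model(f):      # hypothetical model with invented coefficients
    return 1500.0 + 0.01 * f["oxygen_flow"] + 50.0 * f["offgas_co"]


def fallback_model(f):  # hypothetical model for a failed off-gas sensor
    return 1510.0 + 0.01 * f["oxygen_flow"]


registry = ModelRegistry()
registry.register({"oxygen_flow", "offgas_co"}, full_model)
registry.register({"oxygen_flow"}, fallback_model)

# The off-gas sensor has failed, so only the fallback model is usable.
model = registry.select({"oxygen_flow": 8000.0, "offgas_co": None})
print(model.__name__)
```

A flagged model would then be retrained or replaced according to the update policy of the respective BOF.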

5 From Predictions to Control Assistance

The operator can use the predictions to evaluate the potential outcome of multiple corrective actions directly. Moving beyond this manual operation on the basis of predictions, we improved the control assistance further. The improvement can be formulated as a multi-objective optimization problem [9]. The predictions are used as a surrogate function for the real values of the quality indicators. Similar to the metallurgical approach, the optimization algorithm uses the amounts of oxygen and additions as variables. The prices of the input materials can be used to calculate the cost of every potential corrective action, and this cost can be included in the optimization problem. The optimization problem is solved continuously (at 1 Hz), and the operator can use the results to adapt the amounts of oxygen and additions as early as possible in the process.
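
One simple way to handle such a problem is to scalarize it with a weighted sum and search a grid of candidate actions, as in the sketch below. The optimization algorithm actually used is not stated here, so this is only an assumed stand-in. The function `surrogate` replaces the learned models with invented linear formulas, and all targets, weights, prices and grid values are made up.

```python
from itertools import product


def surrogate(oxygen_nm3, coolant_t):
    """Hypothetical stand-in for learned models (invented coefficients)."""
    temperature = 1300.0 + 0.05 * oxygen_nm3 - 20.0 * coolant_t
    carbon_pct = max(0.0, 0.40 - 0.00005 * oxygen_nm3)
    return temperature, carbon_pct


def scalarized_cost(oxygen_nm3, coolant_t, targets, weights, prices):
    """Weighted sum of target deviations and material cost."""
    temperature, carbon_pct = surrogate(oxygen_nm3, coolant_t)
    material = prices["oxygen"] * oxygen_nm3 + prices["coolant"] * coolant_t
    return (weights["temperature"] * abs(temperature - targets["temperature"])
            + weights["carbon"] * abs(carbon_pct - targets["carbon"])
            + weights["cost"] * material)


def best_action(oxygen_grid, coolant_grid, targets, weights, prices):
    """Exhaustively evaluate all candidate actions and return the cheapest."""
    return min(product(oxygen_grid, coolant_grid),
               key=lambda a: scalarized_cost(a[0], a[1],
                                             targets, weights, prices))


targets = {"temperature": 1650.0, "carbon": 0.05}
weights = {"temperature": 1.0, "carbon": 1000.0, "cost": 0.001}
prices = {"oxygen": 0.1, "coolant": 50.0}  # invented currency units

action = best_action(range(6000, 8001, 500), [0, 1, 2, 3],
                     targets, weights, prices)
print(action)
```

In an online setting, this evaluation would be repeated once per second with surrogate predictions based on the latest measurements. The weights encode how the operator trades off quality deviations against material cost.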

6 Results and Conclusion

The prediction of the conditions at the BOF end-point has been improved over all development steps and has consistently been better than the classical approach. Nevertheless, the improvement of the prediction quality is only the first step toward successful control and monitoring of BOF processes. The different implementations have been running successfully and reliably over multiple years. Only a modular and scalable architecture and implementation make it possible to cope, in real time, with the given harsh conditions and the individual characteristics of every BOF.


  1. Baena-García, M., del Campo-Ávila, J., Fidalgo, R., Bifet, A., Gavaldà, R., Morales-Bueno, R.: Early drift detection method. In: Fourth International Workshop on Knowledge Discovery from Data Streams, vol. 6, pp. 77–86 (2006)

  2. Chukwulebe, B.O., Robertson, K., Grattan, J.: The methods, aims and practices (MAP) for BOF endpoint control. Iron Steel Technol. 4(11), 60–70 (2007)

  3. International Electrotechnical Commission, et al.: IEC 62264-1 Enterprise-control system integration, Part 1: Models and terminology. IEC, Geneva (2003)

  4. Coudurier, L., Hopkins, D.W., Wilkomirsky, I.: Fundamentals of Metallurgical Processes: International Series on Materials Science and Technology, vol. 27. Elsevier (2013)

  5. De Beer, J.: Future technologies for energy-efficient iron and steel making. In: Potential for Industrial Energy-Efficiency Improvement in the Long Term, pp. 93–166. Springer, Netherlands (2000)

  6. Fruehan, R.J.: The Making, Shaping, and Treating of Steel: Ironmaking volume, vol. 2. AISE Steel Foundation (1999)

  7. Morik, K., Blom, H., Odenthal, H.J., Uebber, N.: Resource-aware steel production through data mining. In: SustKDD Workshop at KDD (2012)

  8. Schlüter, J., Odenthal, H.J., Uebber, N., Blom, H., Beckers, T., Morik, K., AG, S.S.: Reliable BOF endpoint prediction by novel data-driven modeling. In: AISTech Conference Proceedings. AISTech (2014)

  9. Schlüter, J., Odenthal, H.J., Uebber, N., Blom, H., Morik, K.: A novel data-driven prediction model for BOF endpoint. In: Association for Iron & Steel Technology Conference, Pittsburgh, USA, vol. 6 (2013)

  10. Wolff, B., Lorenz, E., Kramer, O.: Statistical learning for short-term photovoltaic power predictions. In: Lässig, J., Kersting, K., Morik, K. (eds.) Computational Sustainability. SCI, vol. 645, pp. 31–45. Springer, Heidelberg (2016). doi:10.1007/978-3-319-31858-5_3

  11. Xu, L., Li, W., Zhang, M., Xu, S., Li, J.: A model of basic oxygen furnace (BOF) end-point prediction based on spectrum information of the furnace flame with support vector machine (SVM). Optik-Int. J. Light Electron Optics 122(7), 594–598 (2011)



This research was supported by SMS-Siemag and in part by the Deutsche Forschungsgemeinschaft (DFG) within the Collaborative Research Center SFB 876 Providing Information by Resource-Constrained Analysis, project B3.

Author information

Corresponding author

Correspondence to Hendrik Blom.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Blom, H., Morik, K. (2016). Resource-Aware Steel Production Through Data Mining. In: , et al. Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2016. Lecture Notes in Computer Science, vol 9853. Springer, Cham.

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-46130-4

  • Online ISBN: 978-3-319-46131-1

  • eBook Packages: Computer Science, Computer Science (R0)