Automated phase-type distribution fitting via expectation maximization

  • Original Article
  • Journal of Reliable Intelligent Environments

Abstract

In numerous practical domains such as reliability and performance engineering, finance, healthcare, and supply chain management, a common challenge revolves around accurately modeling intricate time-based data and event duration. The inherent complexities of real-world systems often make it challenging to use conventional statistical distributions. The phase-type (PH) distributions emerge as a remarkably adaptable class of distributions suited for modeling scenarios like failure or response times. These distributions are helpful in analytical and simulation-driven system evaluation approaches and are frequently used to fit empirical datasets. This paper introduces a strategy that leverages user-friendly tools, graphical adjustment features, and integration with existing tools to streamline the process of fitting PH distributions to empirical data. The simplicity of this procedure empowers domain experts to more accurately model complex systems, resulting in enhanced decision-making, more efficient resource allocation, improved reliability assessments, and optimized system performance across an extensive spectrum of practical domains where the analysis of time-based data remains pivotal. Furthermore, this study presents a method for the automated determination of parameters within a fitted Hyper-Erlang distribution. This method utilizes the Bayesian Information Criterion (BIC) within a Bayesian optimization framework integrated into an Expectation-Maximization (EM) algorithm. Consequently, it enables deriving a given dataset’s probability density function (PDF) through a combination of Hyper-Erlang distributions. Subsequently, the PDF serves as a tool for assessing system performance.
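
The abstract does not spell out the fitting procedure, so the following is only a minimal sketch of the general idea, assuming the standard EM updates for a Hyper-Erlang mixture whose branch shapes are held fixed. The function names (`fit_hyper_erlang`, `select_model`), the quantile-based initialization, and the parameter count used in the BIC (branch weights and rates only) are illustrative assumptions, not the authors' implementation. The paper searches over configurations with Bayesian optimization; here that search is replaced by a plain loop over a hypothetical candidate list to keep the example short.

```python
import math
import numpy as np


def erlang_logpdf(x, shape, rate):
    """Log-density of an Erlang(shape, rate) distribution at x > 0."""
    return (shape * math.log(rate) + (shape - 1) * np.log(x)
            - rate * x - math.lgamma(shape))


def branch_logpdfs(x, weights, shapes, rates):
    """Rows hold log(weight_m * f_m(x_i)) for each branch m."""
    return np.array([np.log(w) + erlang_logpdf(x, r, lam)
                     for w, r, lam in zip(weights, shapes, rates)])


def hyper_erlang_loglik(x, weights, shapes, rates):
    """Log-likelihood of the sample x under the Hyper-Erlang mixture."""
    log_comp = branch_logpdfs(x, weights, shapes, rates)
    return float(np.logaddexp.reduce(log_comp, axis=0).sum())


def fit_hyper_erlang(x, shapes, n_iter=200):
    """EM for branch weights and rates; the integer shapes stay fixed."""
    x = np.asarray(x, dtype=float)
    shapes = np.asarray(shapes, dtype=float)
    m = len(shapes)
    weights = np.full(m, 1.0 / m)
    # Assumption: start the branch means at spread-out sample quantiles.
    rates = shapes / np.quantile(x, (np.arange(m) + 0.5) / m)
    for _ in range(n_iter):
        # E-step: responsibility of each branch for each observation.
        log_comp = branch_logpdfs(x, weights, shapes, rates)
        log_comp -= np.logaddexp.reduce(log_comp, axis=0)
        resp = np.exp(log_comp) + 1e-300  # floor avoids division by zero
        # M-step: closed-form updates for weights and rates.
        mass = resp.sum(axis=1)
        weights = mass / len(x)
        rates = shapes * mass / (resp @ x)
    return weights, rates, hyper_erlang_loglik(x, weights, shapes, rates)


def bic(loglik, n_params, n_obs):
    """Bayesian Information Criterion; lower values are preferred."""
    return n_params * math.log(n_obs) - 2.0 * loglik


def select_model(x, candidates):
    """Fit every candidate shape vector and keep the lowest-BIC fit."""
    best = None
    for shapes in candidates:
        weights, rates, loglik = fit_hyper_erlang(x, shapes)
        # Assumption: M - 1 free weights plus M rates are counted.
        score = bic(loglik, 2 * len(shapes) - 1, len(x))
        if best is None or score < best[0]:
            best = (score, tuple(shapes), weights, rates)
    return best


if __name__ == "__main__":
    # Synthetic durations drawn from a two-branch Hyper-Erlang mixture.
    rng = np.random.default_rng(0)
    data = np.concatenate([rng.gamma(2, 1 / 4.0, 600),
                           rng.gamma(8, 1.0, 400)])
    score, shapes, weights, rates = select_model(
        data, [(1,), (2,), (1, 1), (2, 8)])
    print("shapes:", shapes, "weights:", weights, "rates:", rates)
```

The fitted weights, shapes, and rates define the mixture density, which can then be evaluated through `hyper_erlang_loglik` or sampled for use in a performance model. A Bayesian optimizer would replace the loop in `select_model` when the space of shape vectors is too large to enumerate.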

Data availability

The data detailed in Sect. 6.1 can be sourced from [10, 11]. For a more comprehensive understanding of the cloud infrastructure discussed in Sect. 6.2, readers are referred to Refs. [36, 37].

Notes

  1. https://erdogant.github.io/distfit/pages/html/index.html.

References

  1. Asmussen S, Koole G (1993) Marked point processes as limits of Markovian arrival streams. J Appl Prob 30(2):365–372

  2. Reinecke P, Krauß T, Wolter K (2012) HyperStar: phase-type fitting made easy. In: 2012 Ninth International Conference on Quantitative Evaluation of Systems, pp. 201–202. IEEE

  3. Horváth G, Telek M (2016) BuTools 2: a rich toolbox for Markovian performance evaluation. In: VALUETOOLS, pp. 135–141

  4. Buchholz P, Kriege J, Felko I (2014) Phase-type distributions. Input modeling with phase-type distributions and Markov models: theory and applications. Springer, Cham, pp 5–28

  5. Neuts MF (1975) Probability distributions of phase type. Liber Amicorum Prof. Emeritus H. Florin. Dept. Math. Univ. Louvain, Leuven

  6. Neuts MF (1981) Matrix-geometric solutions in stochastic models, Volume 2 of Johns Hopkins Series in the Mathematical Sciences. Johns Hopkins University Press, Baltimore

  7. Bolch G, Greiner S, De Meer H, Trivedi KS (2006) Queueing networks and Markov chains: modeling and performance evaluation with computer science applications. Wiley, New York

  8. Thümmler A, Buchholz P, Telek M (2005) A novel approach for fitting probability distributions to real trace data with the EM algorithm. In: 2005 International Conference on Dependable Systems and Networks (DSN’05), pp. 712–721. IEEE

  9. Stewart WJ (2009) Probability, Markov chains, queues, and simulation: the mathematical basis of performance modeling. Princeton University Press, Princeton

  10. Maciel PRM (2023) Performance, reliability, and availability evaluation of computational systems, Volume I: performance and background. Chapman and Hall/CRC, Boca Raton

  11. Maciel PRM (2023) Performance, reliability, and availability evaluation of computational systems, Volume II: performance and background. Chapman and Hall/CRC, Boca Raton

  12. Pereira P, Araujo J, Torquato M, Dantas J, Melo C, Maciel P (2020) Stochastic performance model for web server capacity planning in fog computing. J Supercomput 76:9533–9557

  13. Ram M (2019) Reliability engineering: methods and applications. CRC Press, Boca Raton

  14. Maciel PR, Trivedi KS, Matias R, Kim DS (2012) Dependability modeling. Performance and dependability in service computing: concepts, techniques and research directions. IGI Global, Hershey, pp 53–97

  15. Bailey TL, Elkan C, et al (1994) Fitting a mixture model by expectation maximization to discover motifs in biopolymers. UCSD Technical Report

  16. Okamura H, Watanabe R, Dohi T (2014) Variational Bayes for phase-type distribution. Commun Stat Simul Comput 43(8):2031–2044

  17. Okamura H, Dohi T (2016) PH fitting algorithm and its application to reliability engineering. J Oper Res Soc Japan 59(1):72–109

  18. Prados-Garzon J, Ameigeiras P, Ramos-Munoz JJ, Andres-Maldonado P, Lopez-Soler JM (2017) Analytical modeling for virtualized network functions. In: 2017 IEEE International Conference on Communications Workshops (ICC Workshops), pp. 979–985. IEEE

  19. Barde S, Ko YM, Shin H (2020) Fitting discrete phase-type distribution from censored and truncated observations with pre-specified hazard sequence. Oper Res Lett 48(3):233–239

  20. Zhang J, Zheng J, Okamura H, Dohi T (2021) An efficient algorithm for computation of information matrix in phase-type fitting. Int J Comput Methods Eng Sci Mech 22(3):193–199

  21. Bladt M, Rojas-Nandayapa L (2018) Fitting phase-type scale mixtures to heavy-tailed data and distributions. Extremes 21(2):285–313

  22. Albrecher H, Bladt M, Bladt M, Yslas J (2022) Continuous scaled phase-type distributions. Stoch Models 39:293–322

  23. Alkaff A, Qomarudin MN (2020) Modeling and analysis of system reliability using phase-type distribution closure properties. Appl Stoch Models Bus Ind 36:548–569

  24. Wu B, Cui L, Fang C (2020) Generalized phase-type distributions based on multi-state systems. IISE Trans 52(1):104–119

  25. Wang G, Hu L, Zhang T, Wang Y (2021) Reliability modeling for a repairable (k1, k2)-out-of-n: G system with phase-type vacation time. Appl Math Modell 91:311–321

  26. He Q-M, Liu B, Wu H (2022) Continuous approximations of discrete phase-type distributions and their applications to reliability models. Perform Eval 154:102284

  27. Li J, Chen J, Zhang X (2019) Time-dependent reliability analysis of deteriorating structures based on phase-type distributions. IEEE Trans Reliab 69(2):545–557

  28. Zheng Z, Li C, Liu Y, Xi Z (2022) A phase-type expansion approach for the performability of composite web services. IEEE Trans Reliab 71:579–589

  29. Alkaff A, Qomarudin MN, Purwantini E, Wiratno SE (2021) Dynamic reliability modeling for general standby systems. Comput Ind Eng 161:107615

  30. Balali F, Seifoddini H, Nasiri A (2020) Data-driven predictive model of reliability estimation using degradation models: a review. Life Cycle Reliab Saf Eng 9(1):113–125

  31. Reinecke P, Krauß T, Wolter K (2013) Phase-type fitting using HyperStar. In: European Workshop on Performance Engineering, pp. 164–175. Springer

  32. Horváth G, Telek M (2019) Markovian performance evaluation with BuTools. Systems modeling: methodologies and tools. Springer, Cham, pp 253–268

  33. Fang Y (2001) Hyper-erlang distribution model and its application in wireless mobile networks. Wirel Netw 7:211–219

  34. Zhang T, Zhao Q, Shin K, Nakamoto Y (2018) Bayesian-optimization-based peak searching algorithm for clustering in wireless sensor networks. J Sensor Actuator Netw 7(1):2

  35. Bladt M (2022) Phase-type distributions for claim severity regression modeling. ASTIN Bull 52(2):417–448

  36. Pereira P, Araujo J, Melo C, Santos V, Maciel P (2021) Analytical models for availability evaluation of edge and fog computing nodes. J Supercomput 77(9):9905–9933

  37. Pereira P, Melo C, Araujo J, Dantas J, Santos V, Maciel P (2022) Availability model for edge-fog-cloud continuum: an evaluation of an end-to-end infrastructure of intelligent traffic management service. J Supercomput 78(3):4421–4448

  38. Wang L, Luo X, Li Y, Tang J (2020) A reliability modeling method based on phase-type distribution aiming at shock model. IEEE Access 8:154881–154897

  39. Acal C, Ruiz-Castro JE, Maldonado D, Roldán JB (2021) One cut-point phase-type distributions in reliability: an application to resistive random access memories. Mathematics 9(21):2734

Acknowledgements

We want to thank the Coordination of Improvement of Higher Education Personnel—CAPES, National Council for Scientific and Technological Development—CNPq, Fundação de Amparo à Ciência e Tecnologia de Pernambuco—FACEPE, and MoDCS Research Group for their support.

Funding

The first author, Marco Mialaret, received a doctoral scholarship from the Fundação de Amparo à Ciência e Tecnologia do Estado de Pernambuco (FACEPE). This financial support significantly aided the progression and completion of the research presented in the manuscript.

Author information

Contributions

  • Paulo Pereira (Data Provider and Reviewer): Paulo was instrumental in providing the experimental data that formed the backbone of our research. His insights and feedback during the review process were invaluable in refining the manuscript and ensuring its accuracy.

  • Antonio Sa Barreto (Automation Methodology Developer and Reviewer): Antonio played a crucial role in the development of the automation methodology. His expertise in the field and innovative approach significantly enhanced the research’s depth and applicability.

  • Thiago Pinheiro (Software Developer): Thiago was responsible for the development of the Java-based tool used in the research. His technical acumen ensured that the tool was robust, user-friendly, and aligned with the research’s objectives.

  • Paulo Maciel (Advisor): As the guiding force behind the research, Paulo provided consistent direction, mentorship, and oversight throughout the project. His vast experience and keen insights ensured that the research maintained its focus and achieved its objectives.

Corresponding author

Correspondence to Marco Mialaret.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mialaret, M., Pereira, P., Sá Barreto, A. et al. Automated phase-type distribution fitting via expectation maximization. J Reliable Intell Environ (2024). https://doi.org/10.1007/s40860-024-00220-4

  • DOI: https://doi.org/10.1007/s40860-024-00220-4
