Abstract
In Bayesian slip distribution inversion, the Markov chain Monte Carlo (MCMC) method is traditionally used to characterize the posterior probability density function through sampling. However, its computational cost can be high, and it struggles to accommodate large data sets and higher-dimensional parameters of interest. In this study, we introduce variational inference into the study of coseismic slip distribution and present a variational Bayesian slip distribution inversion approach. Synthetic tests show that the variational Bayesian approach efficiently and accurately recovers the designed slip distribution; we therefore conclude that the proposed algorithm is well suited to slip distribution inversion and can outperform MCMC sampling in convergence speed and computational burden. Taking the Dingri earthquake of March 20, 2020, as an example, we further verify the practicability of the variational Bayesian method for real earthquakes. The inversion results show that the main fault slip of the Dingri earthquake occurs at depths of 2–8 km below the surface, the maximum slip is 0.54 m, and the coseismic released seismic moment is 5.58 × 10^17 N·m, corresponding to a moment magnitude of Mw 5.79.
Data availability
The data sets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
References
Akaike H (1980). Likelihood and the Bayes procedure. In: Bernardo JM, DeGroot MH, Lindly DV, Smith AFM, Bayesian Statistics, University Press, Valencia, Spain, pp 143–166
Amey RMJ, Hooper A, Walters RJ (2018) A Bayesian method for incorporating self-similarity into earthquake slip inversions. J Geophys Res Solid Earth 123(7):6052–6071
Amey RMJ, Hooper A, Morishita Y (2019) Going to any lengths: solving for fault size and fractal slip for the 2016, Mw 6.2 Central Tottori earthquake, Japan, using a transdimensional inversion scheme. J Geophys Res Solid Earth 124(4):4001–4016
Amiri-Simkooei AR (2016) Non-negative least-squares variance component estimation with application to GNSS time series. J Geodesy 90(5):451–466
Anzidei M, Boschi E, Cannelli V et al (2009) Coseismic deformation of the destructive April 6, 2009 L’Aquila earthquake (central Italy) from GNSS data. Geophys Res Lett. https://doi.org/10.1029/2009GL039145
Arakawa A, Taniguchi M, Hayashi T et al (2016) Variational Bayesian method of estimating variance components. Anim Sci J 87(7):863–872
Avouac JP, Meng LS, Wei SJ et al (2015) Lower edge of locked main Himalayan Thrust unzipped by the 2015 Gorkha earthquake. Nat Geosci 8(9):708–711
Bagnardi M, Hooper A (2018) Inversion of surface deformation data for rapid estimates of source parameters and uncertainties: a Bayesian approach. Geochem Geophys Geosyst 19(7):2194–2211
Bishop CM (2006) Pattern recognition and machine learning. Springer, New York
Blei DM, Jordan MI (2006) Variational inference for Dirichlet process mixtures. Bayesian Anal 1(1):121–143
Blei DM, Kucukelbir A, McAuliffe JD (2017) Variational inference: a review for statisticians. J Am Stat Assoc 112(518):859–877
Chappell MA, Groves AR, Whitcher B et al (2008) Variational Bayesian inference for a nonlinear forward model. IEEE Trans Signal Process 57(1):223–236
Cheloni D, D’agostino N, D’anastasio E et al (2010) Coseismic and initial post-seismic slip of the 2009 Mw 6.3 L’Aquila earthquake, Italy, from GNSS measurements. Geophys J Int 181(3):1539–1546
Cheloni D, De Novellis V, Albano M et al (2017) Geodetic model of the 2016 Central Italy earthquake sequence inferred from InSAR and GNSS data. Geophys Res Lett 44(13):6778–6787
Chiaraluce L, Di Stefano R, Tinti E et al (2017) The 2016 Central Italy seismic sequence: a first look at the mainshocks, aftershocks, and source models. Seismol Res Lett 88(3):757–771
Cubas N, Lapusta N, Avouac JP et al (2015) Numerical modeling of long-term earthquake sequences on the NE Japan megathrust: Comparison with observations and implications for fault friction. Earth Planet Sci Lett 419:187–198
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J Roy Stat Soc Ser B (Methodol) 39(1):1–22
Elliott JR, Jolivet R, González PJ et al (2016) Himalayan megathrust geometry and relation to topography revealed by the Gorkha earthquake. Nat Geosci 9(2):174–180
Feng WP, Li ZH (2010) A novel hybrid PSO/simplex algorithm for determining earthquake source parameter using InSAR data. Prog Geophys 25(4):1189–1196 (In Chinese)
Feng WP, Li ZH, Elliott JR et al (2013) The 2011 Mw 6.8 Burma earthquake: fault constraints provided by multiple SAR techniques. Geophys J Int 195(1):650–660
Feng WP, Tian YF, Zhang Y et al (2017) A slip gap of the 2016 Mw 6.6 Muji, Xinjiang, China, earthquake inferred from Sentinel-1 TOPS interferometry. Seismol Res Lett 88(4):1054–1064
Fukahata Y, Wright TJ (2008) A non-linear geodetic data inversion using ABIC for slip distribution on a fault with an unknown dip angle. Geophys J Int 173(2):353–364
Fukuda J, Johnson KM (2008) A fully Bayesian inversion for spatial distribution of fault slip with objective smoothing. Bull Seismol Soc Am 98(3):1128–1146
Fukuda J, Johnson KM (2010) Mixed linear–non-linear inversion of crustal deformation data: Bayesian inference of model, weighting and regularization parameters. Geophys J Int 181(3):1441–1458
Funning GJ, Barke RM, Lamb SH et al (2005) The 1998 Aiquile, Bolivia earthquake: a seismically active fault revealed with InSAR. Earth Planet Sci Lett 232(1–2):39–49
Galetzka J, Melgar D, Genrich JF et al (2015) Slip pulse and resonance of the Kathmandu basin during the 2015 Gorkha earthquake, Nepal. Science 349(6252):1091–1095
Gao H, Liao MS, Feng GC (2021) An improved quadtree sampling method for InSAR seismic deformation inversion. Remote Sensing 13(9):1678
Hang Y, Barbot S, Dauwels J et al (2020) Outlier-insensitive Bayesian inference for linear inverse problems (OutIBI) with applications to space geodetic data. Geophys J Int 221(1):334–350
Hastings WK (1970) Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57(1):97–109
Helmert FR (1907) Die Ausgleichungsrechnung nach der Methode der kleinsten Quadrate: mit Anwendungen auf die Geodäsie, die Physik und die Theorie der Messinstrumente. BG Teubner
Jiang GY, Liu L, Barbour AJ et al (2021) Physics-based evaluation of the maximum magnitude of potential earthquakes induced by the Hutubi (China) underground gas storage. J Geophys Res Solid Earth 126(4):e2020JB021379
Jin BT, Zou J (2010) Hierarchical Bayesian inference for ill-posed problems via variational method. J Comput Phys 229(19):7317–7343
Jónsson S, Zebker H, Segall P et al (2002) Fault slip distribution of the 1999 Mw 7.1 Hector Mine, California, earthquake, estimated from satellite radar and GNSS measurements. Bull Seismol Soc Am 92(4):1377–1389
Jordan MI, Ghahramani Z, Jaakkola TS et al (1999) An introduction to variational methods for graphical models. Mach Learn 37(2):183–233
Kaverina A, Dreger D, Price E (2002) The combined inversion of seismic and geodetic data for the source process of the 16 October 1999 Mw 7.1 Hector Mine, California, earthquake. Bull Seismol Soc Am 92(4):1266–1280
Kerman J (2011) Neutral noninformative and informative conjugate beta and gamma prior distributions. Electron J Stat 5:1450–1470
King GCP, Stein RS, Lin J (1994) Static stress changes and the triggering of earthquakes. Bull Seismol Soc Am 84(3):935–953
Koch KR, Kusche J (2002) Regularization of geopotential determination from satellite data by variance components. J Geodesy 76(5):259–268
Kubik K (1970) The estimation of the weights of measured quantities within the method of least squares. Bulletin Géodésique 95(1):21–40
Kullback S, Leibler RA (1951) On information and sufficiency. Ann Math Stat 22(1):79–86
Li ZH, Elliott JR, Feng WP et al (2011) The 2010 Mw 6.8 Yushu (Qinghai, China) earthquake: constraints provided by InSAR and body wave seismology. J Geophys Res 116:B10302
Metropolis N, Rosenbluth AW, Rosenbluth MN et al (1953) Equation of state calculations by fast computing machines. J Chem Phys 21(6):1087–1092
Mosegaard K, Tarantola A (1995) Monte Carlo sampling of solutions to inverse problems. J Geophys Res 100(B7):12431–12447
Nawaz MA, Curtis A (2018) Variational Bayesian inversion (VBI) of quasi-localized seismic attributes for the spatial distribution of geological facies. Geophys J Int 214(2):845–875
Nawaz MA, Curtis A (2019) Rapid discriminative variational Bayesian inversion of geophysical data for the spatial distribution of geological properties. J Geophys Res Solid Earth 124(6):5867–5887
Okada Y (1985) Surface deformation due to shear and tensile faults in a half-space. Bull Seismol Soc Am 75(4):1135–1154
Phillips RF (2002) Least absolute deviations estimation via the EM algorithm. Stat Comput 12(3):281–285
Robert C, Casella G (2013) Monte Carlo statistical methods. Springer Science & Business Media
Schaffrin B (1981) Ausgleichung Mit Bedingungs-Ungleichungen. AVN 88:227–238
Schwintzer P (1990) Sensitivity analysis in least squares gravity modelling by means of redundancy decomposition of stochastic prior information. Internal Report, Deutsches Geodätisches Forschungsinstitut
Sjöberg LE (1984) Non-negative variance component estimation in the Gauss-Helmert adjustment model. Manuscr Geod 9:247–280
Styron R, Pagani M (2020) The GEM global active faults database. Earthquake Spectra 36(1_suppl):160–180
Sun JB, Shen ZK, Bürgmann R et al (2013) A three-step maximum a posteriori probability method for InSAR data inversion of coseismic rupture with application to the 14 April 2010 Mw 6.9 Yushu, China, earthquake. J Geophys Res Solid Earth 118(8):4599–4627
Tikhonov AN (1963) Regularization of incorrectly posed problems. Sov Math Dokl 4:1624–1627
Walters RJ, Elliott JR, D'agostino N et al (2009) The 2009 L'Aquila earthquake (central Italy): a source mechanism and implications for seismic hazard. Geophys Res Lett 36(17)
Wang LY, Gu WW (2020) A-optimal design method to determine the regularization parameter of coseismic slip distribution inversion. Geophys J Int 221(1):440–450
Wang LY, Wu QW (2022) A variational Bayesian approach to self-tuning robust adjustment for joint inversion of nonlinear volcano source model with t-distributed random errors. J Surv Eng 148(2):04021032
Wang Q, Zhang PZ, Freymueller JT et al (2001) Present-day crustal deformation in China constrained by global positioning system measurements. Science 294(5542):574–577
Wang Q, Qiao XJ, Lan QG et al (2011) Rupture of deep faults in the 2008 Wenchuan earthquake and uplift of the Longmen Shan. Nat Geosci 4(9):634–640
Wang LY, Gao H, Feng GC et al (2018) Source parameters and triggering links of the earthquake sequence in central Italy from 2009 to 2016 analyzed with GNSS and InSAR data. Tectonophysics 744:285–295
Wang YZ, Chen S, Chen K (2021) Source model and tectonic implications of the 2020 Dingri Mw5.7 earthquake constrained by InSAR data. Earthquake 41(1):116–128 (In Chinese)
Wen YM, He P, Xu CJ et al (2012) Source parameters of the 2009 L’Aquila earthquake, Italy from Envisat and ALOS satellite SAR image. Chin J Geophys 55(1):53–56 (In Chinese)
Williamson A, Newman A, Cummins P (2017) Reconstruction of coseismic slip from the 2015 Illapel earthquake using combined geodetic and tsunami waveform data. J Geophys Res Solid Earth 122(3):2119–2130
Wright TJ, Lu Z, Wicks C (2004) Constraining the slip distribution and fault geometry of the Mw 7.9, 3 November 2002, Denali Fault earthquake with interferometric synthetic aperture radar and global positioning system data. Bull Seismol Soc Am 94(6B):S175–S189
Xu PL (1998) Truncated SVD methods for discrete linear ill-posed problems. Geophys J Int 135(2):505–514
Xu PL (2009) Iterative generalized cross-validation for fusing heteroscedastic data of inverse ill-posed problems. Geophys J Int 179(1):182–200
Xu PL (2021) A new look at Akaike’s Bayesian information criterion for inverse ill-posed problems. J Frankl Inst 358(7):4077–4102
Xu PL, Shen YZ, Fukuda Y et al (2006) Variance component estimation in linear inverse ill-posed models. J Geodesy 80(2):69–81
Xu CJ, Ding KH, Cai JQ et al (2009) Methods of determining weight scaling factors for geodetic-geophysical joint inversion. J Geodyn 47(1):39–46
Xu CJ, Deng CY, Zhou LX (2016) Coseismic slip distribution inversion method based on the variance component estimation. Geomat Inf Sci Wuhan Univ 41(1):37–44 (In Chinese)
Xu GY, Xu CJ, Wen YM et al (2019) Coseismic and postseismic deformation of the 2016 Mw 6.2 Lampa earthquake, southern Peru, constrained by interferometric synthetic aperture radar. J Geophys Res Solid Earth 124(4):4250–4272
Xu GY, Xu CJ, Wen YM et al (2020a) The complexity of the 2018 Kaktovik earthquake sequence in the northeast of the Brooks Range, Alaska. Geophys Res Lett 47(19):e2020GL088012
Xu Q, Chen Q, Zhao JJ et al (2020b) Sequential modeling of the 2016 Central Italy earthquake cluster using multi-source satellite observations and quantitative assessment of Coulomb stress change. Geophys J Int 221(1):451–466
Yabuki T, Matsu’ura M (1992) Geodetic data inversion using a Bayesian information criterion for spatial distribution of fault slip. Geophys J Int 109(2):363–375
Yin A, Harrison TM (2000) Geologic evolution of the Himalayan-Tibetan orogen. Annu Rev Earth Planet Sci 28(1):211–280
Zhang X, Curtis A (2020) Seismic tomography using variational inference methods. J Geophys Res Solid Earth 125(4):e2019JB018589
Zhang X, Curtis A (2021) Bayesian full-waveform inversion with realistic priors. Geophysics 86(5):1–20
Zhang XT, Jiang XH, Xue Y et al (2020) Summary of the Dingri Ms5.9 earthquake in Tibet on March 20, 2020. Seismol Geomagn Observat Res 41(4):199–209 (In Chinese)
Zhao XB, Curtis A, Zhang X (2022) Bayesian seismic tomography using normalizing flows. Geophys J Int 228(1):213–239
Acknowledgements
This work is supported by the National Natural Science Foundation of China (Grant Nos. 41874001, 42174011, and 42104008). The authors are grateful to the anonymous reviewers and editors for their careful review and valuable suggestions, which improved the quality of this paper. We thank Prof. Wanpeng Feng of Sun Yat-sen University for providing the PSOKINV software. We thank Prof. Yangmao Wen of Wuhan University for providing the L'Aquila seismic data. We thank Dr. Hua Gao for providing the Norcia and Dingri seismic data. We particularly thank Dr. Wangwang Gu for his help in this research and for valuable discussions. Some of the figures in this paper were prepared with the Generic Mapping Tools. The MCMC sampling Bayesian method code in this paper was written with reference to the slipBERI package, for which we are grateful. Part of the formula derivation in this paper refers to the lecture notes of Prof. Yida Xu of the University of Technology Sydney, Australia.
Author information
Authors and Affiliations
Contributions
Longxiang Sun designed the study, performed the experiments, wrote and revised the manuscript; Leyang Wang revised the manuscript. Guangyu Xu and Qiwen Wu analyzed the experimental results.
Corresponding author
Ethics declarations
Conflicts of interest
The authors declare that there are no conflicts of interest.
Appendices
Appendix A
I.
The relevant definitions of variational inference are presented from the perspective of Bayesian theory (Jordan et al. 1999; Blei et al. 2017). Bayes' theorem states:
$$ p({\mathbf{m}}|{\mathbf{d}}) = \frac{{p({\mathbf{d}}|{\mathbf{m}})p({\mathbf{m}})}}{{p({\mathbf{d}})}} = \frac{{p({\mathbf{m}},{\mathbf{d}})}}{{p({\mathbf{d}})}} $$(A1)

where \(p({\mathbf{d}})\) is expressed as follows:
$$ p({\mathbf{d}}) = \frac{{p({\mathbf{m}},{\mathbf{d}})}}{{p({\mathbf{m}}|{\mathbf{d}})}} $$(A1.1)

Taking the logarithm of both sides of Eq. (A1.1) gives:
$$ \begin{aligned} \log p({\mathbf{d}}) = & \log p({\mathbf{m}},{\mathbf{d}}) - \log p({\mathbf{m}}|{\mathbf{d}}) \\ = & \log (\frac{{p({\mathbf{m}},{\mathbf{d}})}}{{q({\mathbf{m}})}}) - \log (\frac{{p({\mathbf{m}}|{\mathbf{d}})}}{{q({\mathbf{m}})}}) \\ \end{aligned} $$(A1.2)

Taking the expectation with respect to \(q({\mathbf{m}})\) on both sides of Eq. (A1.2):
$$ \begin{aligned} \int\nolimits_{{\mathbf{m}}}{\log p({\mathbf{d}})} q({\mathbf{m}})d{\mathbf{m}} = &\int\nolimits_{{\mathbf{m}}} {q({\mathbf{m}})} \log (\frac{{p({\mathbf{m}},{\mathbf{d}})}}{{q({\mathbf{m}})}})d{\mathbf{m}}\\&- \int\nolimits_{{\mathbf{m}}} {q({\mathbf{m}})} \log (\frac{{p({\mathbf{m}}|{\mathbf{d}})}}{{q({\mathbf{m}})}})d{\mathbf{m}}\\ \log p({\mathbf{d}}) = & \int\nolimits_{{\mathbf{m}}}{q({\mathbf{m}})} \log (p({\mathbf{m}},{\mathbf{d}}))d{\mathbf{m}} \\&-\int\nolimits_{{\mathbf{m}}} {q({\mathbf{m}})} \log (q({\mathbf{m}}))d{\mathbf{m}} \\ & - \int\nolimits_{{\mathbf{m}}}{q({\mathbf{m}})} \log (\frac{{p({\mathbf{m}}|{\mathbf{d}})}}{{q({\mathbf{m}})}})d{\mathbf{m}}\\ \end{aligned} $$(A1.3)

where \(q({\mathbf{m}})\) is defined as the variational distribution, which is used to approximate the posterior distribution \(p({\mathbf{m}}|{\mathbf{d}})\); \(\log p({\mathbf{d}})\) is defined as the evidence term, which is a constant; and \(\int\nolimits_{{\mathbf{m}}} {q({\mathbf{m}})}\log (p({\mathbf{m}},{\mathbf{d}}))d{\mathbf{m}} -\int\nolimits_{{\mathbf{m}}} {q({\mathbf{m}})} \log (q({\mathbf{m}}))d{\mathbf{m}}\) is defined as the evidence lower bound (ELBO), denoted \({\text{ELBO(}}q{)}\) (Blei et al. 2017):
$$ \begin{aligned} {\text{ELBO(}}q{)} =\, &\int\nolimits_{{\mathbf{m}}} {q({\mathbf{m}})} \log (p({\mathbf{m}},{\mathbf{d}}))d{\mathbf{m}}\\& -\int\nolimits_{{\mathbf{m}}} {q({\mathbf{m}})} \log (q({\mathbf{m}}))d{\mathbf{m}} \\ =\, & {\text{E}}_{q} [\log p({\mathbf{m}},{\mathbf{d}})] - {\text{E}}_{q} [\log q({\mathbf{m}})] \\ \end{aligned}$$(A1.4)

and \(-\int\nolimits_{{\mathbf{m}}} {q({\mathbf{m}})} \log (\frac{{p({\mathbf{m}}|{\mathbf{d}})}}{{q({\mathbf{m}})}})d{\mathbf{m}}\) is defined as the Kullback–Leibler (KL) divergence, which measures the closeness of the variational distribution \(q({\mathbf{m}})\) to the posterior distribution \(p({\mathbf{m}}|{\mathbf{d}})\), denoted \({\text{KL}}[q({\mathbf{m}})||p({\mathbf{m}}|{\mathbf{d}})]\) (Kullback and Leibler 1951):
$$ \begin{aligned}{\text{KL}}[q({\mathbf{m}})||p({\mathbf{m}}|{\mathbf{d}})] =\, & -\int\nolimits_{{\mathbf{m}}} {q({\mathbf{m}})} \log (\frac{{p({\mathbf{m}}|{\mathbf{d}})}}{{q({\mathbf{m}})}})d{\mathbf{m}}\\ =\, & {\text{E}}_{q} [\log q({\mathbf{m}})] - {\text{E}}_{q} [\log p({\mathbf{m}}|{\mathbf{d}})] \\ \end{aligned}$$(A1.5)
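Equations (A1.3)–(A1.5) imply the identity \(\log p({\mathbf{d}}) = {\text{ELBO}}(q) + {\text{KL}}[q({\mathbf{m}})||p({\mathbf{m}}|{\mathbf{d}})]\) for any variational distribution \(q\). The sketch below checks this decomposition numerically on a toy discrete model; the prior and likelihood values are illustrative assumptions, not part of the slip inversion:

```python
import math

# Toy discrete model: parameter m in {0, 1}, one fixed observation d.
prior = {0: 0.5, 1: 0.5}          # p(m), illustrative values
likelihood = {0: 0.2, 1: 0.7}     # p(d | m) for the observed d, illustrative

joint = {m: likelihood[m] * prior[m] for m in prior}      # p(m, d)
evidence = sum(joint.values())                            # p(d)
posterior = {m: joint[m] / evidence for m in joint}       # p(m | d)

def elbo(q):
    """E_q[log p(m, d)] - E_q[log q(m)], as in Eq. (A1.4)."""
    return sum(q[m] * (math.log(joint[m]) - math.log(q[m])) for m in q)

def kl(q):
    """KL[q(m) || p(m | d)], as in Eq. (A1.5)."""
    return sum(q[m] * (math.log(q[m]) - math.log(posterior[m])) for m in q)

# For ANY variational distribution q: ELBO(q) + KL(q) = log p(d).
for q in ({0: 0.5, 1: 0.5}, {0: 0.9, 1: 0.1}, posterior):
    assert abs(elbo(q) + kl(q) - math.log(evidence)) < 1e-12

# The KL divergence vanishes exactly when q equals the posterior.
assert abs(kl(posterior)) < 1e-12
```

Because \(\log p({\mathbf{d}})\) is a constant, raising the ELBO necessarily lowers the KL divergence, which is why variational inference maximizes the ELBO instead of working with the intractable posterior directly.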
II.
From the perspective of the mean-field approximation, we show that maximizing the ELBO is equivalent to minimizing the KL divergence, and we explain how the variational distribution approximates the posterior probability distribution of the parameters to be solved (Bishop 2006; Blei et al. 2017).
The Evidence Lower Bound,\({\text{ELBO(}}q{)}\), can be expressed as follows:
$$ \begin{aligned} {\text{ELBO(}}q{)} =\, &{\text{E}}_{q} [\log p({\mathbf{m}},{\mathbf{d}})] - {\text{E}}_{q}[\log q({\mathbf{m}})] \\ = \,& \int_{{\mathbf{m}}} {q({\mathbf{m}})}\log (p({\mathbf{m}},{\mathbf{d}}))d{\mathbf{m}}\\& -\int_{{\mathbf{m}}} {q({\mathbf{m}})} \log (q({\mathbf{m}}))d{\mathbf{m}} \\ = & \int {\prod\limits_{i = 1}^{n}{q_{i} (m_{i} )} } \log (p({\mathbf{m}},{\mathbf{d}}))d{\mathbf{m}}\\&- \int {\prod\limits_{i = 1}^{n} {q_{i} (m_{i} )} } \sum\limits_{i =1}^{n} {\log (q_{i} (m_{i} )} )d{\mathbf{m}} \\ \end{aligned}$$(A2)

To simplify the derivation, we denote the first and second terms on the right-hand side of Eq. (A2) as part1 and part2, respectively.
$$ \begin{aligned} {\text{part1}} = & \int {\prod\limits_{i = 1}^{n} {q_{i} (m_{i} )} } \log (p({\mathbf{m}},{\mathbf{d}}))d{\mathbf{m}} \\ = & \int_{{m_{1} }} {\int_{{m_{2} }} { \cdots \int_{{m_{n} }} {\prod\limits_{i = 1}^{n} {q_{i} (m_{i} )} } } } \log (p({\mathbf{m}},{\mathbf{d}}))dm_{1} dm_{2} \cdots dm_{n} \\ = & \int_{{m_{j} }} {q_{j} } (m_{j} )(\int_{{m_{i \ne j} }} { \cdots \int {\log (p({\mathbf{m}},{\mathbf{d}}))} } \prod\limits_{i \ne j}^{n} {q_{i} (m_{i} )} dm_{i} )dm_{j} \\ = & \int_{{m_{j} }} {q_{j} } (m_{j} ){\text{E}}_{{q(\nabla m_{j} )}} [\log (p({\mathbf{m}},{\mathbf{d}}))]dm_{j} \\ \end{aligned} $$(A2.1)

where \({\text{E}}_{{q(\nabla m_{j} )}}\) denotes the expectation with respect to all parameters except the j-th parameter, \(m_{j}\).
$$ \begin{aligned} {\text{part2}} = & \int {\prod\limits_{i = 1}^{n} {q_{i} (m_{i} )} } \sum\limits_{i = 1}^{n} {\log (q_{i} (m_{i} )} )d{\mathbf{m}} \\ = & \sum\limits_{i = 1}^{n} {\left( {\int_{{m_{i} }} {q_{i} (m_{i} )} \log (q_{i} (m_{i} ))dm_{i} } \right)} \\ \end{aligned} $$(A2.2)

In part2, when solving for the variational factor of a specified parameter, \(m_{j}\), the sum of the remaining terms can be regarded as a constant, denoted \({\text{const}}{.}\) Thus, part2 can be expressed as follows:
$$ {\text{part2}} = \int_{{m_{j} }} {q_{j} (m_{j} )} \log (q_{j} (m_{j} ))dm_{j} + {\text{const}}{.} $$(A2.3)

Hence, the portion of the evidence lower bound that depends only on \(m_{j}\), \({\text{ELBO(}}q_{j} {)}\), can be simplified as follows:
$$ \begin{aligned} {\text{ELBO(}}q_{j} {)} =\, &{\text{part1}} - {\text{part2}} \\ =\, & \int_{{m_{j} }} {q_{j} } (m_{j}){\text{E}}_{{q(\nabla m_{j} )}} [\log (p({\mathbf{m}},{\mathbf{d}}))]dm_{j} \\&- \int_{{m_{j} }} {q_{j} (m_{j})} \log (q_{j} (m_{j} ))dm_{j} + {\text{const}}{.} \\ \end{aligned}$$(A2.4)

We now define \({\text{log(}}\tilde{p}_{j} {(}m_{j} {,}{\mathbf{d}}{))} = {\text{E}}_{{q(\nabla m_{j} )}} [\log (p({\mathbf{m}},{\mathbf{d}}))]\); then
$$ \begin{aligned} {\text{ELBO(}}q_{j} {)} = &\int_{{m_{j} }} {q_{j} } (m_{j} ){\text{log(}}\tilde{p}_{j} {(}m_{j}{,}{\mathbf{d}}{))}dm_{j} \\&- \int_{{m_{j} }} {q_{j} (m_{j} )} \log (q_{j} (m_{j} ))dm_{j} + {\text{const}}{.} \\ = & {\text{E}}_{{q_{j}}} [{\text{log(}}\tilde{p}_{j} {(}m_{j} {,}{\mathbf{d}}{))}] \\&-{\text{E}}_{{q_{j} }} [\log (q_{j} (m_{j} ))] + {\text{const}}{.} \\= & - {\text{KL[}}q_{j} (m_{j} )||\tilde{p}_{j} {(}m_{j}{|}{\mathbf{d}}{)]} + {\text{const}}{.} \\ \end{aligned}$$(A2.5)

Since the KL divergence is non-negative, maximizing \({\text{ELBO(}}q_{j} {)}\) is equivalent to minimizing the KL divergence, and the maximum is attained when the KL divergence is zero, that is, when the variational factor matches \(\tilde{p}_{j}\). The optimal factor therefore satisfies:
$$ \log (q_{j} (m_{j} )) = {\text{log(}}\tilde{p}_{j} {(}m_{j} {,}{\mathbf{d}}{))} = {\text{E}}_{{q(\nabla m_{j} )}} [\log (p({\mathbf{m}},{\mathbf{d}}))] $$(A2.6)
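Equation (A2.6) is the basis of coordinate-ascent variational inference: each factor \(q_{j}\) is updated in turn using expectations under the remaining factors. A minimal sketch, assuming a toy bivariate Gaussian target with illustrative mean and precision values (following the factorized-Gaussian example in Bishop 2006, Sect. 10.1), is:

```python
# Mean-field coordinate ascent for a toy bivariate Gaussian posterior
# p(m | d) = N(mu, inv(lam)). Per Eq. (A2.6), each Gaussian factor's mean
# is updated using the current expectation of the other component.
# All numbers below are illustrative assumptions.
mu = [1.0, -2.0]                 # true posterior mean
lam = [[2.0, 0.8], [0.8, 1.5]]   # posterior precision matrix (symmetric, PD)

e1, e2 = 0.0, 0.0                # initial variational means E[m1], E[m2]
for _ in range(50):              # coordinate-ascent sweeps
    e1 = mu[0] - lam[0][1] / lam[0][0] * (e2 - mu[1])  # update factor q(m1)
    e2 = mu[1] - lam[1][0] / lam[1][1] * (e1 - mu[0])  # update factor q(m2)

# The factorized approximation recovers the exact posterior means.
assert abs(e1 - mu[0]) < 1e-9 and abs(e2 - mu[1]) < 1e-9
```

For this Gaussian target the fixed point reproduces the exact marginal means, while the factor variances (not computed here) underestimate the true marginal variances, a known property of mean-field approximations.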
Appendix B
I.
The detailed formula of the variational distribution,\(q({\mathbf{s}})\), corresponding to the slip parameter,\({\mathbf{s}}\), is derived as follows:
$$ \begin{aligned} \log q({\mathbf{s}}) =\, & {\text{E}}_{{q(\nabla {\mathbf{s}})}} \left[ {\log p({\mathbf{s}},{{\varvec{\upsigma}}}^{2} ,\alpha^{2} ,{\mathbf{d}})} \right] \\ =\, & {\text{E}}_{{q(\nabla {\mathbf{s}})}} \left[ {\log p({\mathbf{d}}|{\mathbf{s}},{{\varvec{\upsigma}}}^{2} ) + \log p({\mathbf{s}}|\alpha^{2} )} \right] \\ =\, & {\text{E}}_{{q(\nabla {\mathbf{s}})}} \left[ {\log p({\mathbf{d}}|{\mathbf{s}},{{\varvec{\upsigma}}}^{2} )] + {\text{E}}_{{q(\nabla {\mathbf{s}})}} [\log p({\mathbf{s}}|\alpha^{2} )} \right] \\ \end{aligned} $$(B1)where \({\text{E}}_{{q(\nabla {\mathbf{s}})}} [\log p({\mathbf{d}}|{\mathbf{s}},{{\varvec{\upsigma}}}^{2} )]\) and \({\text{E}}_{{q(\nabla {\mathbf{s}})}} [\log p({\mathbf{s}}|\alpha^{2} )]\) are given by
$$ \begin{aligned} &{\text{E}}_{{q(\nabla {\mathbf{s}})}} [\log p({\mathbf{d}}|{\mathbf{s}},{{\varvec{\upsigma}}}^{2} )] \\&\quad={\text{E}}_{{q(\nabla {\mathbf{s}})}} \left[ \log \left\{\prod\limits_{k = 1}^{K} \exp \left[ - \tfrac{{\sigma_{k}^{2}}}{2}({\mathbf{d}}{}_{k} \right.\right.\right.\\&\qquad\left.\left.\left.- {\mathbf{G}}_{k} {\mathbf{s}})^{T}(\sum\nolimits_{{\mathbf{d}}}^{k} )^{ - 1} ({\mathbf{d}}_{k} -{\mathbf{G}}_{k} {\mathbf{s}}) \right] \right\} \right] +{\text{const}}. \\&\quad= {\text{E}}_{{q(\nabla {\mathbf{s}})}} \left[\sum\limits_{k = 1}^{K} \left( - \tfrac{{\sigma_{k}^{2}}}{2}({\mathbf{d}}{}_{k}\right.\right.\\&\left.\left. - {\mathbf{G}}_{k} {\mathbf{s}})^{T}(\sum\nolimits_{{\mathbf{d}}}^{k} )^{ - 1} ({\mathbf{d}}_{k} -{\mathbf{G}}_{k} {\mathbf{s}}) \right) \right] + {\text{const}}.\\&\quad= - \frac{1}{2}\sum\limits_{k = 1}^{K} \left\{{\text{E}}({{\varvec{\upsigma}}}_{k}^{2} )({\mathbf{d}}{}_{k}^{T}(\sum\nolimits_{{\mathbf{d}}}^{k} )^{ - 1} {\mathbf{d}}_{k} -{\mathbf{d}}{}_{k}^{T} (\sum\nolimits_{{\mathbf{d}}}^{k} )^{ - 1}{\mathbf{G}}_{k} {\mathbf{s}} \right.\\&\qquad\left.- {\mathbf{s}}^{T}{\mathbf{G}}_{k}^{T} (\sum\nolimits_{{\mathbf{d}}}^{k} )^{ - 1}{\mathbf{d}}_{k} + {\mathbf{s}}^{T} {\mathbf{G}}_{k}^{T}(\sum\nolimits_{{\mathbf{d}}}^{k} )^{ - 1} {\mathbf{G}}_{k} {\mathbf{s}})\right\} + {\text{const}}. \\ &\quad= - \frac{1}{2}\sum\limits_{k =1}^{K} \left\{ {\text{E(}}{{\varvec{\upsigma}}}_{k}^{2}{)}{\mathbf{s}}^{T} {\mathbf{G}}_{k}^{T} (\sum\nolimits_{{\mathbf{d}}}^{k})^{ - 1} {\mathbf{G}}_{k} {\mathbf{s}} \right.\\&\qquad\left.-2{\text{E(}}{{\varvec{\upsigma}}}_{k}^{2} {)}{\mathbf{s}}^{T}{\mathbf{G}}_{k}^{T} (\sum\nolimits_{{\mathbf{d}}}^{k} )^{ - 1}{\mathbf{d}}_{k} \right\} + {\text{const}}. \\ \end{aligned}$$(B1.1)and
$$ \begin{aligned}&{\text{E}}_{{q(\nabla {\mathbf{s}})}} [\log p({\mathbf{s}}|\alpha^{2} )]\\&\quad ={\text{E}}_{{q(\nabla {\mathbf{s}})}} \left[ {\log (\exp [ -\frac{{\alpha^{2} }}{2}({\mathbf{Ls}})^{T} {\mathbf{Ls}}])} \right]+ {\text{const}}. \\&\quad= {\text{E}}_{{q(\nabla {\mathbf{s}})}}\left[ { - \frac{{\alpha^{2} }}{2}({\mathbf{Ls}})^{T} {\mathbf{Ls}}}\right] + {\text{const}}. \\&\quad = - \frac{1}{2}{\text{E}}(\alpha^{2}){\mathbf{s}}^{T} {\mathbf{L}}^{T} {\mathbf{Ls}} + {\text{const}}.\\\end{aligned} $$(B1.2)

Finally, substituting Eqs. (B1.1) and (B1.2) into Eq. (B1), the logarithmic form of the analytical solution of the variational distribution, \(q({\mathbf{s}})\), can be expressed as follows:
$$ \begin{aligned}\log q({\mathbf{s}}) =\, &{\text{E}}_{{q(\nabla {\mathbf{s}})}} [\log p({\mathbf{d}}|{\mathbf{s}},{{\varvec{\upsigma}}}^{2} )] \\&+{\text{E}}_{{q(\nabla {\mathbf{s}})}} [\log p({\mathbf{s}}|\alpha^{2} )] \\= & - \frac{1}{2}\sum\limits_{k =1}^{K} \{ {\text{E}}({{\varvec{\upsigma}}}_{k}^{2}){\mathbf{s}}^{T} {\mathbf{G}}_{k}^{T} (\sum\nolimits_{{\mathbf{d}}}^{k})^{ - 1} {\mathbf{G}}_{k} {\mathbf{s}} \\&-2{\text{E}}({{\varvec{\upsigma}}}_{k}^{2} ){\mathbf{s}}^{T}{\mathbf{G}}_{k}^{T} (\sum\nolimits_{{\mathbf{d}}}^{k})^{ - 1}{\mathbf{d}}_{k} \}\\& + \left( { - \frac{1}{2}{\text{E}}(\alpha^{2}){\mathbf{s}}^{T} {\mathbf{L}}^{T} {\mathbf{Ls}}} \right) +{\text{const}}. \\= & - \frac{1}{2}\left\{ {\mathbf{s}}^{T}\sum\limits_{k = 1}^{K} {\text{E}}({{\varvec{\upsigma}}}_{k}^{2}){\mathbf{G}}_{k}^{T} (\sum\nolimits_{{\mathbf{d}}}^{k} )^{ - 1}{\mathbf{G}}_{k} {\mathbf{s}} + {\mathbf{s}}^{T}{\text{E}}(\alpha^{2} ) {\mathbf{L}}^{T} {\mathbf{Ls}}\right.\\&\left. -2\sum\limits_{k = 1}^{K} {({\text{E(}}{{\varvec{\upsigma}}}_{k}^{2}{)}{\mathbf{s}}^{T} {\mathbf{G}}_{k}^{T} (\sum\nolimits_{{\mathbf{d}}}^{k})^{ - 1} {\mathbf{d}}_{k} )} \right\} + {\text{const}}. \\= & -\frac{1}{2}\left\{ {\mathbf{s}}^{T} (\sum\limits_{k = 1}^{K}{{\text{E}}({{\varvec{\upsigma}}}_{k}^{2} ){\mathbf{G}}_{k}^{T}(\sum\nolimits_{{\mathbf{d}}}^{k} )^{ - 1} {\mathbf{G}}_{k} +{\text{E}}(\alpha^{2} )} {\mathbf{L}}^{T} {\mathbf{L}}){\mathbf{s}}\right.\\&\left.- 2\sum\limits_{k = 1}^{K}{({\text{E}}({{\varvec{\upsigma}}}_{k}^{2} ){\mathbf{s}}^{T}{\mathbf{G}}_{k}^{T} (\sum\nolimits_{{\mathbf{d}}}^{k} )^{ - 1}{\mathbf{d}}_{k} )} \right\} + {\text{const}}. \\\end{aligned}$$(B1.3)
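Since Eq. (B1.3) is quadratic in \({\mathbf{s}}\), \(q({\mathbf{s}})\) is Gaussian with precision matrix \(\sum_{k} {\text{E}}({{\varvec{\upsigma}}}_{k}^{2}){\mathbf{G}}_{k}^{T}(\sum_{{\mathbf{d}}}^{k})^{-1}{\mathbf{G}}_{k} + {\text{E}}(\alpha^{2}){\mathbf{L}}^{T}{\mathbf{L}}\), and its mean is obtained by solving the corresponding linear system. A minimal sketch of one such update, assuming a hypothetical two-patch fault with one dataset, identity data covariance, and identity smoothing operator (all numerical values are illustrative):

```python
# Variational update for q(s) read off from Eq. (B1.3):
#   precision  P = E(sigma^2) G^T G + E(alpha^2) I   (identity data covariance
#                                                     and L = I assumed here)
#   mean       m solves P m = E(sigma^2) G^T d
G = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]   # toy Green's functions (3 obs x 2 patches)
d = [1.0, 2.0, 1.6]                        # toy observations
e_sig2 = 4.0                               # current E(sigma^2), illustrative
e_alp2 = 0.1                               # current E(alpha^2), illustrative

# Assemble the 2x2 precision matrix P and right-hand side b.
gtg = [[sum(G[r][i] * G[r][j] for r in range(3)) for j in range(2)] for i in range(2)]
P = [[e_sig2 * gtg[i][j] + (e_alp2 if i == j else 0.0) for j in range(2)] for i in range(2)]
b = [e_sig2 * sum(G[r][i] * d[r] for r in range(3)) for i in range(2)]

# Solve P m = b for the posterior mean slip (Cramer's rule for the 2x2 case).
det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
mean = [(b[0] * P[1][1] - b[1] * P[0][1]) / det,
        (b[1] * P[0][0] - b[0] * P[1][0]) / det]
```

In the full algorithm this mean and covariance update of \(q({\mathbf{s}})\) alternates with the updates of \(q({{\varvec{\upsigma}}}^{2})\) and \(q(\alpha^{2})\) until the ELBO converges.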
II.
The detailed formula of the variational distribution,\(q({{\varvec{\upsigma}}}^{2} )\), of the hyperparameter,\({{\varvec{\upsigma}}}^{2}\), is derived as follows:
$$ \begin{aligned} \log q({{\varvec{\upsigma}}}^{2} ) =\, & {\text{E}}_{{q(\nabla {{\varvec{\upsigma}}}^{2} )}} [\log p({\mathbf{s}},{{\varvec{\upsigma}}}^{2} ,\alpha^{2} ,{\mathbf{d}})] \\ =\, & {\text{E}}_{{q(\nabla {{\varvec{\upsigma}}}^{2} )}} [\log p({\mathbf{d}}|{\mathbf{s}},{{\varvec{\upsigma}}}^{2} ) + \log p({{\varvec{\upsigma}}}^{2} )] \\ =\, & {\text{E}}_{{q(\nabla {{\varvec{\upsigma}}}^{2} )}} [\log p({\mathbf{d}}|{\mathbf{s}},{{\varvec{\upsigma}}}^{2} )] + {\text{E}}_{{q(\nabla {{\varvec{\upsigma}}}^{2} )}} [\log p({{\varvec{\upsigma}}}^{2} )] \\ \end{aligned} $$(B2)where \({\text{E}}_{{q(\nabla {{\varvec{\upsigma}}}^{2} )}} [\log p({\mathbf{d}}|{\mathbf{s}},{{\varvec{\upsigma}}}^{2} )]\) is given by
$$\begin{aligned} &{\text{E}}_{{q(\nabla {{\varvec{\upsigma}}}^{2} )}} [\log p({\mathbf{d}}|{\mathbf{s}},{{\varvec{\upsigma}}}^{2} )] \\&\quad={\text{E}}_{{q(\nabla {{\varvec{\upsigma}}}^{2} )}} \left[ \log \left( {\prod\limits_{k = 1}^{K} {(\sigma_{k}^{2} )^{{{{N_{k} }\mathord{\left/ {\vphantom {{N_{k} } 2}} \right. \kern-0pt} 2}}} } }\right)\right.\\&\qquad+ \log \left( \prod\limits_{k = 1}^{K} \exp [ -\tfrac{{\sigma_{k}^{2} }}{2}({\mathbf{d}}{}_{k} - {\mathbf{G}}_{k}{\mathbf{s}})^{T} \right.\\&\qquad\left.\left.(\sum\nolimits_{{\mathbf{d}}}^{k} )^{ - 1}({\mathbf{d}}_{k} - {\mathbf{G}}_{k} {\mathbf{s}})] \right)\right] + {\text{const}}{.} \\&\quad = {\text{E}}_{{q(\nabla {{\varvec{\upsigma}}}^{2} )}} \left[ \sum\limits_{k = 1}^{K}\left\{ \frac{{N_{k} }}{2}\log (\sigma_{k}^{2} ) -\tfrac{{\sigma_{k}^{2} }}{2}({\mathbf{d}}{}_{k} - {\mathbf{G}}_{k}{\mathbf{s}})^{T} \right.\right.\\&\qquad\left.\left.(\sum\nolimits_{{\mathbf{d}}}^{k} )^{ - 1}({\mathbf{d}}_{k} - {\mathbf{G}}_{k} {\mathbf{s}}) \right\}\right] + {\text{const}}{.} \\&\quad= \sum\limits_{k = 1}^{K} \left\{\frac{{N_{k} }}{2}\log (\sigma_{k}^{2} ) - \tfrac{{\sigma_{k}^{2}}}{2}{\text{E}}_{{q(\nabla {{\varvec{\upsigma}}}^{2} )}}[({\mathbf{d}}{}_{k} - {\mathbf{G}}_{k} {\mathbf{s}})^{T}\right.\\&\qquad\left.(\sum\nolimits_{{\mathbf{d}}}^{k} )^{ - 1} ({\mathbf{d}}_{k} -{\mathbf{G}}_{k} {\mathbf{s}})] \right\} + {\text{const}}{.} \\\end{aligned}$$(B2.1)

where \({\text{E}}_{{q(\nabla {{\varvec{\upsigma}}}^{2} )}} {[}({\mathbf{d}}{}_{k} -{\mathbf{G}}_{k} {\mathbf{s}})^{T} (\sum_{{\mathbf{d}}}^{k} )^{ - 1}({\mathbf{d}}_{k} - {\mathbf{G}}_{k} {\mathbf{s}}){]}\) can be expanded as follows:
$$\begin{aligned}& {\text{E}}_{{q(\nabla {{\varvec{\upsigma}}}^{2} )}} [({\mathbf{d}}{}_{k} -{\mathbf{G}}_{k} {\mathbf{s}})^{T} (\sum\nolimits_{{\mathbf{d}}}^{k} )^{ - 1}({\mathbf{d}}_{k} - {\mathbf{G}}_{k} {\mathbf{s}})]\\&\quad ={\text{E}}_{{q(\nabla {{\varvec{\upsigma}}}^{2} )}}[{\mathbf{d}}{}_{k}^{T} (\sum\nolimits_{{\mathbf{d}}}^{k} )^{ - 1}{\mathbf{d}}_{k} - 2{\mathbf{d}}{}_{k}^{T}\\&\qquad (\sum\nolimits_{{\mathbf{d}}}^{k})^{ - 1} {\mathbf{G}}_{k} {\mathbf{s}} + {\mathbf{s}}^{T}{\mathbf{G}}_{k}^{T} (\sum\nolimits_{{\mathbf{d}}}^{k} )^{ - 1}{\mathbf{G}}_{k} {\mathbf{s}}] \\ &\quad= {\mathbf{d}}{}_{k}^{T}(\sum\nolimits_{{\mathbf{d}}}^{k} )^{ - 1} {\mathbf{d}}_{k} -2{\mathbf{d}}{}_{k}^{T} (\sum\nolimits_{{\mathbf{d}}}^{k} )^{ - 1}\\&\qquad{\mathbf{G}}_{k} {\text{E}}({\mathbf{s}}) +{\text{E}}({\mathbf{s}}^{T} {\mathbf{G}}_{k}^{T}(\sum\nolimits_{{\mathbf{d}}}^{k} )^{ - 1} {\mathbf{G}}_{k} {\mathbf{s}})\\ \end{aligned}$$(B2.2)

From the expectation formula for the quadratic form \({\mathbf{Y}}^{T}{\mathbf{RY}}\), we know that \({\text{E(}}{\mathbf{Y}}^{T} {\mathbf{RY}}) =tr({\mathbf{R}}{\text{D}}({\mathbf{Y}})) +{\text{E(}}{\mathbf{Y}}{)}^{T} {\mathbf{R}}{\text{E(}}{\mathbf{Y}}{)}\), where \({\mathbf{R}}\) is a known matrix, and \({\text{E(}}{\mathbf{Y}}{)}\) and \({\text{D}}({\mathbf{Y}})\) are the expectation and covariance of the vector \({\mathbf{Y}}\), respectively. Therefore, \({\text{E(}}{\mathbf{s}}^{T}{\mathbf{G}}_{k}^{T} (\sum_{{\mathbf{d}}}^{k} )^{ - 1}{\mathbf{G}}_{k}{\mathbf{s}}{)}\) is given by
$$\begin{aligned}
\mathrm{E}(\mathbf{s}^{T}\mathbf{G}_{k}^{T}({\boldsymbol{\Sigma}}_{\mathbf{d}}^{k})^{-1}\mathbf{G}_{k}\mathbf{s}) &= tr(\mathbf{G}_{k}^{T}({\boldsymbol{\Sigma}}_{\mathbf{d}}^{k})^{-1}\mathbf{G}_{k}\,\mathrm{D}(\mathbf{s})) \\
&\quad + \mathrm{E}(\mathbf{s})^{T}\mathbf{G}_{k}^{T}({\boldsymbol{\Sigma}}_{\mathbf{d}}^{k})^{-1}\mathbf{G}_{k}\,\mathrm{E}(\mathbf{s})
\end{aligned}$$(B2.3)

Hence, the final expression of \(\mathrm{E}_{q(\nabla {\boldsymbol{\sigma}}^{2})}[\log p(\mathbf{d}\mid \mathbf{s},{\boldsymbol{\sigma}}^{2})]\) is as follows:
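The quadratic-form identity used in Eq. (B2.3) can be checked numerically. The sketch below (our illustration, not part of the paper) draws Monte Carlo samples of a Gaussian vector \(\mathbf{Y}\) and compares the sample mean of \(\mathbf{Y}^{T}\mathbf{R}\mathbf{Y}\) against the closed form \(tr(\mathbf{R}\,\mathrm{D}(\mathbf{Y})) + \mathrm{E}(\mathbf{Y})^{T}\mathbf{R}\,\mathrm{E}(\mathbf{Y})\):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3

# A known symmetric matrix R (here positive definite for convenience)
A = rng.standard_normal((n, n))
R = A @ A.T

# Moments of Y ~ N(mu, C): E(Y) = mu, D(Y) = C
mu = rng.standard_normal(n)
B = rng.standard_normal((n, n))
C = B @ B.T

# Closed form: E(Y^T R Y) = tr(R D(Y)) + E(Y)^T R E(Y)
closed = np.trace(R @ C) + mu @ R @ mu

# Monte Carlo estimate: average Y_i^T R Y_i over many samples
Y = rng.multivariate_normal(mu, C, size=200_000)
mc = np.einsum('ij,jk,ik->i', Y, R, Y).mean()

# The two agree to within sampling noise
assert abs(mc - closed) / closed < 0.05
```

This is only a sanity check of the identity; in the derivation it is applied with \(\mathbf{R} = \mathbf{G}_{k}^{T}({\boldsymbol{\Sigma}}_{\mathbf{d}}^{k})^{-1}\mathbf{G}_{k}\) and \(\mathbf{Y} = \mathbf{s}\).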
$$\begin{aligned}
&\mathrm{E}_{q(\nabla {\boldsymbol{\sigma}}^{2})}[\log p(\mathbf{d}\mid \mathbf{s},{\boldsymbol{\sigma}}^{2})] \\
&\quad = \sum_{k=1}^{K}\left\{\frac{N_{k}}{2}\log(\sigma_{k}^{2}) - \frac{\sigma_{k}^{2}}{2}\left(\mathbf{d}_{k}^{T}({\boldsymbol{\Sigma}}_{\mathbf{d}}^{k})^{-1}\mathbf{d}_{k} - 2\mathbf{d}_{k}^{T}({\boldsymbol{\Sigma}}_{\mathbf{d}}^{k})^{-1}\mathbf{G}_{k}\,\mathrm{E}(\mathbf{s})\right.\right. \\
&\qquad \left.\left. + \, tr(\mathbf{G}_{k}^{T}({\boldsymbol{\Sigma}}_{\mathbf{d}}^{k})^{-1}\mathbf{G}_{k}\,\mathrm{D}(\mathbf{s})) + \mathrm{E}(\mathbf{s})^{T}\mathbf{G}_{k}^{T}({\boldsymbol{\Sigma}}_{\mathbf{d}}^{k})^{-1}\mathbf{G}_{k}\,\mathrm{E}(\mathbf{s})\right)\right\} + \mathrm{const.}
\end{aligned}$$(B2.4)

In Eq. (B2), the calculation formula of \(\mathrm{E}_{q(\nabla {\boldsymbol{\sigma}}^{2})}[\log p({\boldsymbol{\sigma}}^{2})]\) is given by
$$\begin{aligned}
&\mathrm{E}_{q(\nabla {\boldsymbol{\sigma}}^{2})}[\log p({\boldsymbol{\sigma}}^{2})] \\
&\quad = \mathrm{E}_{q(\nabla {\boldsymbol{\sigma}}^{2})}\left[\log \left(\prod_{k=1}^{K} (\sigma_{k}^{2})^{a_{k}-1}\exp(-b_{k}\sigma_{k}^{2})\right)\right] + \mathrm{const.} \\
&\quad = \mathrm{E}_{q(\nabla {\boldsymbol{\sigma}}^{2})}\left[\sum_{k=1}^{K}\left\{(a_{k}-1)\log(\sigma_{k}^{2}) - b_{k}\sigma_{k}^{2}\right\}\right] + \mathrm{const.} \\
&\quad = \sum_{k=1}^{K}\left\{(a_{k}-1)\log(\sigma_{k}^{2}) - b_{k}\sigma_{k}^{2}\right\} + \mathrm{const.}
\end{aligned}$$(B2.5)

Finally, substituting Eqs. (B2.4) and (B2.5) into Eq. (B2), the logarithmic form of the analytical solution of the variational distribution, \(q({\boldsymbol{\sigma}}^{2})\), can be expressed as follows:
$$\begin{aligned}
\log q({\boldsymbol{\sigma}}^{2}) &= \mathrm{E}_{q(\nabla {\boldsymbol{\sigma}}^{2})}[\log p(\mathbf{d}\mid \mathbf{s},{\boldsymbol{\sigma}}^{2})] + \mathrm{E}_{q(\nabla {\boldsymbol{\sigma}}^{2})}[\log p({\boldsymbol{\sigma}}^{2})] + \mathrm{const.} \\
&= \sum_{k=1}^{K}\left\{\left(\frac{N_{k}}{2} + a_{k} - 1\right)\log(\sigma_{k}^{2}) - \left(\frac{1}{2}\mathbf{d}_{k}^{T}({\boldsymbol{\Sigma}}_{\mathbf{d}}^{k})^{-1}\mathbf{d}_{k} - \mathbf{d}_{k}^{T}({\boldsymbol{\Sigma}}_{\mathbf{d}}^{k})^{-1}\mathbf{G}_{k}\,\mathrm{E}(\mathbf{s})\right.\right. \\
&\qquad \left.\left. + \,\frac{1}{2}tr(\mathbf{G}_{k}^{T}({\boldsymbol{\Sigma}}_{\mathbf{d}}^{k})^{-1}\mathbf{G}_{k}\,\mathrm{D}(\mathbf{s})) + \frac{1}{2}\mathrm{E}(\mathbf{s})^{T}\mathbf{G}_{k}^{T}({\boldsymbol{\Sigma}}_{\mathbf{d}}^{k})^{-1}\mathbf{G}_{k}\,\mathrm{E}(\mathbf{s}) + b_{k}\right)\sigma_{k}^{2}\right\} + \mathrm{const.}
\end{aligned}$$(B2.6)
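Equation (B2.6) has the form \((\tilde{a}_{k}-1)\log(\sigma_{k}^{2}) - \tilde{b}_{k}\sigma_{k}^{2}\), i.e., \(q(\sigma_{k}^{2})\) is a Gamma density with shape \(\tilde{a}_{k} = N_{k}/2 + a_{k}\) and rate \(\tilde{b}_{k}\) given by the bracketed term. A minimal numpy sketch of this update, with function and variable names of our choosing (not from the paper), is:

```python
import numpy as np

def update_q_sigma2(d_k, G_k, Sigma_k, E_s, D_s, a_k, b_k):
    """Gamma parameters of the variational factor q(sigma_k^2).

    Reads Eq. (B2.6) as (shape - 1)*log(sigma_k^2) - rate*sigma_k^2.
    Illustrative helper under our naming conventions.
    """
    W = np.linalg.inv(Sigma_k)      # (Sigma_d^k)^{-1}
    r = d_k - G_k @ E_s             # data residual at the posterior mean E(s)
    shape = d_k.size / 2.0 + a_k    # N_k/2 + a_k
    # 1/2 r^T W r equals 1/2 d^T W d - d^T W G E(s) + 1/2 E(s)^T G^T W G E(s)
    rate = (b_k
            + 0.5 * r @ W @ r
            + 0.5 * np.trace(G_k.T @ W @ G_k @ D_s))
    return shape, rate
```

The posterior mean of the data precision then follows as `shape / rate`, which is what the iterative variational scheme would feed back into the update for \(q(\mathbf{s})\).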
III.
The detailed formula of the variational distribution, \(q(\alpha^{2})\), of the hyperparameter \(\alpha^{2}\) is derived as follows:
$$\begin{aligned}
\log q(\alpha^{2}) &= \mathrm{E}_{q(\nabla \alpha^{2})}[\log p(\mathbf{s},{\boldsymbol{\sigma}}^{2},\alpha^{2},\mathbf{d})] + \mathrm{const.} \\
&= \mathrm{E}_{q(\nabla \alpha^{2})}[\log p(\mathbf{s}\mid \alpha^{2}) + \log p(\alpha^{2})] + \mathrm{const.} \\
&= \mathrm{E}_{q(\nabla \alpha^{2})}[\log p(\mathbf{s}\mid \alpha^{2})] + \mathrm{E}_{q(\nabla \alpha^{2})}[\log p(\alpha^{2})] + \mathrm{const.}
\end{aligned}$$(B3)

where \(\mathrm{E}_{q(\nabla \alpha^{2})}[\log p(\mathbf{s}\mid \alpha^{2})]\) and \(\mathrm{E}_{q(\nabla \alpha^{2})}[\log p(\alpha^{2})]\) are given by
$$\begin{aligned}
\mathrm{E}_{q(\nabla \alpha^{2})}[\log p(\mathbf{s}\mid \alpha^{2})] &= \mathrm{E}_{q(\nabla \alpha^{2})}\left[\frac{M}{2}\log(\alpha^{2}) - \frac{\alpha^{2}}{2}\mathbf{s}^{T}\mathbf{L}^{T}\mathbf{L}\mathbf{s}\right] + \mathrm{const.} \\
&= \frac{M}{2}\log(\alpha^{2}) - \frac{\alpha^{2}}{2}\,\mathrm{E}[\mathbf{s}^{T}\mathbf{L}^{T}\mathbf{L}\mathbf{s}] + \mathrm{const.} \\
&= \frac{M}{2}\log(\alpha^{2}) - \frac{\alpha^{2}}{2}\left(tr(\mathbf{L}^{T}\mathbf{L}\,\mathrm{D}(\mathbf{s})) + \mathrm{E}(\mathbf{s})^{T}\mathbf{L}^{T}\mathbf{L}\,\mathrm{E}(\mathbf{s})\right) + \mathrm{const.}
\end{aligned}$$(B3.1)

and
$$\begin{aligned}
\mathrm{E}_{q(\nabla \alpha^{2})}[\log p(\alpha^{2})] &= \mathrm{E}_{q(\nabla \alpha^{2})}[(a_{0}-1)\log(\alpha^{2}) - b_{0}\alpha^{2}] + \mathrm{const.} \\
&= (a_{0}-1)\log(\alpha^{2}) - b_{0}\alpha^{2} + \mathrm{const.}
\end{aligned}$$(B3.2)

Finally, substituting Eqs. (B3.1) and (B3.2) into Eq. (B3), the logarithmic form of the analytical solution of the variational distribution, \(q(\alpha^{2})\), can be expressed as follows:
$$\begin{aligned}
\log q(\alpha^{2}) &= \mathrm{E}_{q(\nabla \alpha^{2})}[\log p(\mathbf{s}\mid \alpha^{2})] + \mathrm{E}_{q(\nabla \alpha^{2})}[\log p(\alpha^{2})] \\
&= \left(\frac{M}{2} + a_{0} - 1\right)\log(\alpha^{2}) - \left(\frac{1}{2}tr(\mathbf{L}^{T}\mathbf{L}\,\mathrm{D}(\mathbf{s})) + \frac{1}{2}\mathrm{E}(\mathbf{s})^{T}\mathbf{L}^{T}\mathbf{L}\,\mathrm{E}(\mathbf{s}) + b_{0}\right)\alpha^{2} + \mathrm{const.}
\end{aligned}$$(B3.3)
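As with Eq. (B2.6), Eq. (B3.3) identifies \(q(\alpha^{2})\) as a Gamma density, with shape \(M/2 + a_{0}\) and rate given by the bracketed term. A minimal sketch of that update, with \(\mathbf{L}\) denoting the smoothing (Laplacian) operator on the slip parameters and with names of our choosing:

```python
import numpy as np

def update_q_alpha2(L, E_s, D_s, a0, b0):
    """Gamma parameters of the variational factor q(alpha^2), read off
    from Eq. (B3.3). Illustrative helper under our naming; M is the
    number of slip parameters (the dimension of s)."""
    M = L.shape[1]
    LtL = L.T @ L
    shape = M / 2.0 + a0
    rate = (b0
            + 0.5 * np.trace(LtL @ D_s)          # 1/2 tr(L^T L D(s))
            + 0.5 * E_s @ LtL @ E_s)             # 1/2 E(s)^T L^T L E(s)
    return shape, rate
```

The ratio `shape / rate` gives the posterior-mean smoothing precision \(\mathrm{E}(\alpha^{2})\), which the mean-field iteration would plug back into the update for \(q(\mathbf{s})\).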
Cite this article
Sun, L., Wang, L., Xu, G. et al. A new method of variational Bayesian slip distribution inversion. J Geod 97, 10 (2023). https://doi.org/10.1007/s00190-023-01701-9