Bayes optimal estimation and its approximation algorithm for difference with and without treatment under IRSLC model

  • Regular Paper
  • Published:
International Journal of Data Science and Analytics

Abstract

We consider verifying the effect of a treatment in a setting where a response is obtained when the treatment is applied to units characterized by features. Estimating this effect is complicated by problems such as the treatment being applicable only once to each unit and the features of the units not being controllable. Conventional studies address such problems with various mathematical models. In this paper, however, we propose a different data-generative model in which there are latent classes of units with the same response, and each latent class contains units with similar features. We call this model the identical response structure latent class model (IRSLC model). Under the proposed model, we derive the Bayes optimal decision and an approximation algorithm for the difference with and without the treatment over the entire population. We conducted experiments using synthetic data generated from the model assumed by the proposed method or by a conventional method, and we compared our method with previous studies to confirm the characteristics of the proposed model.
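To make the data-generative structure concrete, the following minimal sketch (Python with NumPy) draws units from latent classes in which features cluster around a class-specific center and the response shares a class-specific structure that changes under treatment. All sizes, distributional choices, and variable names here are illustrative assumptions for exposition, not the exact specification of the IRSLC model.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes and hyperparameters (assumptions, not the paper's values).
n_units, n_features, n_classes = 500, 3, 4

mix_weights = rng.dirichlet(np.ones(n_classes))                  # class proportions
class_means = rng.normal(0.0, 2.0, (n_classes, n_features))      # feature centers per class
base_resp = rng.normal(0.0, 1.0, n_classes)                      # response level shared within a class
treat_coefs = rng.normal(0.0, 1.0, (n_classes, n_features + 1))  # per-class treatment effect (intercept + slopes)

# Latent class of each unit: units in the same class share the response
# structure and have features drawn around the same center.
z = rng.choice(n_classes, size=n_units, p=mix_weights)
x = class_means[z] + rng.normal(0.0, 0.5, (n_units, n_features))

# Each unit can receive the treatment only once, so only one of the two
# potential responses is ever observed.
w = rng.integers(0, 2, size=n_units)

# Observed response: class-level baseline, plus a class-specific linear
# treatment effect when w = 1, plus observation noise.
effect = treat_coefs[z, 0] + np.einsum("ij,ij->i", x, treat_coefs[z, 1:])
y = base_resp[z] + w * effect + rng.normal(0.0, 0.1, n_units)

data = {"features": x, "treatment": w, "response": y}

With data of this form, only one of the two potential responses is observed per unit, which is exactly the setting in which the difference with and without treatment must be estimated for the entire population.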



Acknowledgements

This work was supported in part by JSPS KAKENHI Grant Numbers JP17K06446, JP19K04914, JP19K14989, JP22K02811, and JP22K14254.

Author information

Corresponding author

Correspondence to Taisuke Ishiwatari.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Proof of Proposition 2

By substituting the approximate posterior distribution \(q^{(T)}({\varvec{z}}_k^N, {\varvec{\theta }}_k, k) = q^{(T)}(k)\, q^{(T)}({\varvec{z}}^N_k,{\varvec{\theta }}_k\mid k)\) for the posterior distribution \(p({\varvec{z}}_k^N,{\varvec{\theta }}_k,k\mid {\varvec{D}})\), we have

$$\begin{aligned} d^*({\varvec{D}})&=\sum _{k=1}^{k_{max}} \int _{\Theta _k} p({\varvec{\theta }}_k, k \mid {\varvec{D}}) \nonumber \\&\quad \quad \cdot \sum _{l_k = 1}^{k} \pi _{l_k} \left( {\varvec{s}}_{wl_k}^\top {\varvec{m}}_{l_k}+\check{s}_{wl_k,1}\right) d {\varvec{\theta }}_k \end{aligned}$$
(A1)
$$\begin{aligned}&=\sum _{k=1}^{k_{max}}\sum _{{\varvec{z}}^{N}_k\in \mathcal Z_k^{N}}\int _{\Theta _k} \sum _{l_k = 1}^{k} \pi _{l_k} \left( {\varvec{s}}_{wl_k}^\top {\varvec{m}}_{l_k}+\check{s}_{wl_k,1}\right) \nonumber \\&\quad \cdot p({\varvec{z}}^{N}_k,{\varvec{\theta }}_k,k\mid {\varvec{D}})d{\varvec{\theta }}_k \end{aligned}$$
(A2)
$$\begin{aligned}&\approx \sum _{k=1}^{k_{max}}\sum _{{\varvec{z}}^{N}_k\in \mathcal Z_k^{N}}\int _{\Theta _k} \sum _{l_k = 1}^{k} \pi _{l_k} \left( {\varvec{s}}_{wl_k}^\top {\varvec{m}}_{l_k}+\check{s}_{wl_k,1}\right) \nonumber \\&\quad \cdot q^{(T)}({\varvec{z}}^{N}_k,{\varvec{\theta }}_k,k)d{\varvec{\theta }}_k \end{aligned}$$
(A3)
$$\begin{aligned}&=\sum _{k=1}^{k_{\max }} \int \int \int \int \left\{ \sum _{l_k = 1}^{k} \pi _{l_k} ({\varvec{s}}_{wl_k}^\top {\varvec{m}}_{l_k}+\check{s}_{wl_k,1})\right\} \nonumber \\&\quad \cdot q^{(T)}\left( \check{{\varvec{S}}}_{k},\check{{\varvec{S}}}_{\textrm{wk}}\right) q^{(T)}(k) q^{(T)}(\varvec{\pi }_k)q^{(T)}(\varvec{M_k},\varvec{\Lambda _k}) \nonumber \\&\quad d \check{{\varvec{S}}}_k d \check{{\varvec{S}}}_{\textrm{wk}}d {\varvec{\pi }}_k d{\varvec{\Lambda }}_k d{\varvec{M}}_k \end{aligned}$$
(A4)
$$\begin{aligned}&= \sum _{k=1}^{k_{\max }} q^{(T)}(k) \nonumber \\&\quad \cdot \int \int \int \int \left\{ \sum _{l_k = 1}^{k} \pi _{l_k} ({\varvec{s}}_{wl_k}^\top {\varvec{m}}_{l_k}+\check{s}_{wl_k,1})\right\} \nonumber \\&\quad \cdot \textrm{Dir}\left( {\varvec{\pi }}_k\mid {\varvec{\alpha }}_k^{(T)}\right) \mathcal N\left( {\varvec{m}}_{l_k}\mid {\varvec{\mu }}_{l_k}^{(T)},(\gamma _{l_k}^{(T)}{\varvec{L}}_{l_k})^{-1}\right) \nonumber \\&\quad \cdot \mathcal W\left( {\varvec{L}}_{l_k}\mid {\varvec{W}}_{l_k}^{(T)},\nu _{l_k}^{(T)}\right) \nonumber \\&\quad \cdot \prod _{l_k=1}^k \mathcal {N}\left( \left[ \begin{array}{c}\check{{\varvec{s}}}_{l_{k}} \\ \check{{\varvec{s}}}_{wl_{k}}\end{array}\right] \mid \left[ \begin{array}{c}\check{{\varvec{\beta }}}_{l_{k}}^{(T)} \\ \check{{\varvec{\beta }}}_{wl_{k}}^{(T)}\end{array}\right] , \left( {\widetilde{{\varvec{\Lambda }}}_{l_{k}}}^{(T)}\right) ^{-1}\right) \nonumber \\&\quad d \check{{\varvec{S}}}_k d \check{{\varvec{S}}}_{\textrm{wk}}d {\varvec{\pi }}_k d{\varvec{\Lambda }}_k d{\varvec{M}}_k . \end{aligned}$$
(A5)

To calculate (A5), we define

$$\begin{aligned} \left( {\widetilde{{\varvec{\Lambda }}}_{l_{k}}}^{(T)}\right) ^{-1}: =\left[ \begin{array}{ll}\left( {\widetilde{{\varvec{\Lambda }}}_{l_{k}}}^{(T)}\right) _{(11)}^{-1} &{} \left( {\widetilde{{\varvec{\Lambda }}}_{l_{k}}}^{(T)}\right) _{(12)}^{-1} \\ \left( {\widetilde{{\varvec{\Lambda }}}_{l_{k}}}^{(T)}\right) _{(21)}^{-1} &{} \left( {\widetilde{{\varvec{\Lambda }}}_{l_{k}}}^{(T)}\right) _{(22)}^{-1}\end{array}\right] . \end{aligned}$$
(A6)
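The transition from (A5) to (A7) relies only on the standard marginalization property of a jointly Gaussian vector: with the blocks defined in (A6), integrating the joint factor of (A5) over \(\check{{\varvec{s}}}_{l_k}\) leaves the Gaussian marginal on \(\check{{\varvec{s}}}_{wl_k}\),

$$\begin{aligned} \int \mathcal {N}\left( \left[ \begin{array}{c}\check{{\varvec{s}}}_{l_{k}} \\ \check{{\varvec{s}}}_{wl_{k}}\end{array}\right] \mid \left[ \begin{array}{c}\check{{\varvec{\beta }}}_{l_{k}}^{(T)} \\ \check{{\varvec{\beta }}}_{wl_{k}}^{(T)}\end{array}\right] , \left( {\widetilde{{\varvec{\Lambda }}}_{l_{k}}}^{(T)}\right) ^{-1}\right) d \check{{\varvec{s}}}_{l_{k}} = \mathcal N\left( \check{{\varvec{s}}}_{wl_{k}}\mid \check{{\varvec{\beta }}}_{wl_{k}}^{(T)}, \left( {\widetilde{{\varvec{\Lambda }}}_{l_{k}}}^{(T)}\right) _{(22)}^{-1}\right) . \end{aligned}$$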

Then, we can rewrite (A5) as

$$\begin{aligned}&\sum _{k=1}^{k_{\max }} q^{(T)}(k) \int \int \int \int \left\{ \sum _{l_k = 1}^{k} \pi _{l_k} ({\varvec{s}}_{wl_k}^\top {\varvec{m}}_{l_k}+\check{s}_{wl_k,1})\right\} \nonumber \\&\quad \cdot \prod _{l_k=1}^k\mathcal N\left( \check{{\varvec{s}}}_{wl_{k}}\mid \check{{\varvec{\beta }}}_{wl_{k}}^{(T)}, \left( {\widetilde{{\varvec{\Lambda }}}_{l_{k}}}^{(T)}\right) _{(22)}^{-1}\right) \textrm{Dir}\left( {\varvec{\pi }}_k\mid {\varvec{\alpha }}_k^{(T)}\right) \nonumber \\&\quad \cdot \mathcal N\left( {\varvec{m}}_{l_k}\mid {\varvec{\mu }}_{l_k}^{(T)},(\gamma _{l_k}^{(T)}{\varvec{L}}_{l_k})^{-1}\right) \nonumber \\&\quad \cdot \mathcal W\left( {\varvec{L}}_{l_k}\mid {\varvec{W}}_{l_k}^{(T)},\nu _{l_k}^{(T)}\right) d \check{{\varvec{S}}}_{\textrm{wk}}d {\varvec{\pi }}_k d{\varvec{\Lambda }}_k d{\varvec{M}}_k \end{aligned}$$
(A7)
$$\begin{aligned}&=\sum _{k=1}^{k_{\max }} q^{(T)}(k) \sum _{l_k=1}^{k}\frac{\alpha _{l_k}^{(T)}}{\hat{\alpha }_k^{(T)}}\left\{ \left( {{\varvec{\beta }}}_{w{l_k}}^{(T)}\right) ^\top {\varvec{\mu }}_{l_k}^{(T)} + \check{\beta }_{w{l_k,1}}^{(T)}\right\} , \end{aligned}$$
(A8)

where the last equality uses the fact that \({\varvec{\pi }}_k\), \({\varvec{M}}_k\), and \(\check{{\varvec{S}}}_{\textrm{wk}}\) are independent under \(q^{(T)}\), so the expectation factorizes, with the Dirichlet factor contributing its mean \(\alpha _{l_k}^{(T)}/\hat{\alpha }_k^{(T)}\) for \(\pi _{l_k}\). Here \((\check{\beta }_{w{l_k,1}}^{(T)},\dots , \check{\beta }_{w{l_k,p+1}}^{(T)})\) are the elements of \(\check{{\varvec{\beta }}}_{w{l_k}}^{(T)}\) and

$$\begin{aligned} \hat{\alpha }_k^{(T)}&:= \sum _{{l_k}=1}^k\alpha _{l_k}^{(T)}, \end{aligned}$$
(A9)
$$\begin{aligned} {{\varvec{\beta }}}_{w{l_k}}^{(T)}&:=\left( \check{\beta }_{w{l_k,2}}^{(T)},\dots , \check{\beta }_{w{l_k},p+1}^{(T)}\right) . \end{aligned}$$
(A10)
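As a sanity check, the closed form (A8) can be evaluated directly once the variational parameters are available. The sketch below (Python with NumPy) assumes the parameters for each candidate number of classes k are stored as plain arrays; the container names (q_k, alpha, mu, beta_check_w) and the function name are ours and purely illustrative.

import numpy as np

def estimate_effect(q_k, alpha, mu, beta_check_w):
    """Evaluate the approximation (A8) of the Bayes optimal decision.

    q_k          : array of shape (k_max,), the weights q^(T)(k) for k = 1, ..., k_max
    alpha        : list of length k_max; alpha[k-1] has shape (k,), the Dirichlet
                   parameters alpha_{l_k}^(T)
    mu           : list of length k_max; mu[k-1] has shape (k, p), the Gaussian
                   means mu_{l_k}^(T)
    beta_check_w : list of length k_max; beta_check_w[k-1] has shape (k, p + 1),
                   the vectors (beta_check_{w l_k,1}^(T), ..., beta_check_{w l_k,p+1}^(T))
    """
    total = 0.0
    for k in range(1, len(q_k) + 1):
        a = alpha[k - 1]
        a_hat = a.sum()                      # alpha_hat_k^(T) as in (A9)
        # As in (A10), beta_{w l_k}^(T) is the last p entries of beta_check_{w l_k}^(T);
        # the first entry beta_check_{w l_k,1}^(T) plays the role of an intercept.
        intercept = beta_check_w[k - 1][:, 0]
        slope = beta_check_w[k - 1][:, 1:]
        # Inner sum over l_k of (alpha_{l_k}/alpha_hat_k) * (beta_w^T mu + beta_check_{w,1}).
        inner = (a / a_hat) * (np.einsum("lp,lp->l", slope, mu[k - 1]) + intercept)
        total += q_k[k - 1] * inner.sum()
    return total

Only the mixture weights q^(T)(k), the Dirichlet parameters, the Gaussian means \({\varvec{\mu }}_{l_k}^{(T)}\), and the means \(\check{{\varvec{\beta }}}_{wl_k}^{(T)}\) enter the estimate, reflecting that (A8) is the expectation of a quantity that is linear in each factor of \(q^{(T)}\).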

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Ishiwatari, T., Saito, S., Nakahara, Y. et al. Bayes optimal estimation and its approximation algorithm for difference with and without treatment under IRSLC model. Int J Data Sci Anal (2023). https://doi.org/10.1007/s41060-023-00468-8


  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s41060-023-00468-8

Keywords
