
Generalised linear model trees with global additive effects

  • Regular Article
  • Published in: Advances in Data Analysis and Classification

Abstract

Model-based trees are used to find subgroups in data which differ with respect to model parameters. In some applications it is natural to keep some parameters fixed globally for all observations while asking if and how other parameters vary across subgroups. Existing implementations of model-based trees can only deal with the scenario where all parameters depend on the subgroups. We propose partially additive linear model trees (PALM trees) as an extension of (generalised) linear model trees (LM and GLM trees, respectively), in which the model parameters are specified a priori to be estimated either globally from all observations or locally from the observations within the subgroups determined by the tree. Simulations show that the method has high power for detecting subgroups in the presence of global effects and reliably recovers the true parameters. Furthermore, treatment–subgroup differences are detected in an empirical application of the method to data from a mathematics exam: the PALM tree is able to detect a small subgroup of students that had a disadvantage in an exam with two versions while adjusting for overall ability effects.
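The separation of globally fixed and subgroup-specific (local) parameters described in the abstract can be illustrated with a small numerical sketch. The snippet below is a toy example, not the paper's estimation algorithm: it simulates one global prognostic effect plus a treatment effect that exists only in the subgroup defined by z > 0, and shows that a joint least-squares fit, with the split here taken as known rather than learned by a tree, recovers both the global and the local coefficients. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)          # prognostic covariate with a global effect
z = rng.normal(size=n)          # partitioning variable defining the subgroups
t = rng.integers(0, 2, size=n)  # binary treatment indicator

# true model: global slope 1.0 for x; treatment effect 2.0 only where z > 0
beta_local = np.where(z > 0, 2.0, 0.0)
y = 1.0 * x + beta_local * t + rng.normal(scale=0.5, size=n)

# joint fit given the split: one global x coefficient,
# separate treatment coefficients per subgroup
g = (z > 0).astype(float)
X = np.column_stack([np.ones(n), x, t * (1 - g), t * g])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
# coef is roughly [intercept ~0, global slope ~1.0,
#                  treatment effect ~0.0 in z<=0, ~2.0 in z>0]
```

A PALM tree additionally has to find the split itself, which is why the actual method alternates between estimating the global parameters and re-growing the tree on the remainder.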



Acknowledgements

We thank Andrea Farnham for improving the language. We are grateful to the Swiss National Fund for funding this project with Grants 205321_163456 and IZSEZ0_177091 and mobility Grant 205321_163456/2.

Author information

Corresponding author

Correspondence to Heidi Seibold.

Electronic supplementary material

Appendices

Full factorial simulation

The simulation study described in Sect. 3 takes a ceteris paribus approach and varies one simulation variable at a time while keeping the others at a standard value. We conducted an additional simulation study in which we vary all variables simultaneously, leading to \(8 \cdot 5 \cdot 2 \cdot 4 \cdot 4 \cdot 4 = 5120\) different scenarios (see Table 1). For each scenario we simulated two data sets and ran all algorithms on each. In the following we show a small selection of interesting graphics based on the simulations. For the full results of the simulation studies we refer to the online material.
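The size of the full factorial design follows directly from the factor level counts; enumerating the scenarios is a one-liner with `itertools.product`. The factor names below are placeholders, since Table 1 is not reproduced here; only the level counts (8, 5, 2, 4, 4, 4) come from the text.

```python
from itertools import product

# placeholder names; only the level counts are taken from the text
levels = [8, 5, 2, 4, 4, 4]
scenarios = list(product(*(range(k) for k in levels)))
print(len(scenarios))  # 5120 scenarios, each simulated twice
```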

Figure 8 shows the marginal results of the ARI for \({\varDelta }_\beta \), the number of predictive factors, the number of observations and quantitative versus qualitative interactions. We average over the other simulation variables and the two repetitions. For the sake of easy visualisation, we restrict the plotted variables to a few levels. Similarly, Figs. 9 and 10 show the marginal results of the proportion of correct treatment assignments and the mean absolute error in the estimated treatment effect for the number of predictive factors, \({\varDelta }_\beta \), the number of observations and quantitative versus qualitative interactions. Figure 11 shows the MAE results for \(n = 900\) and one prognostic factor, to show when LM tree 1 starts to improve (see Sect. 3.4).

Fig. 9

Proportion of observations in all trees where better treatment is correctly identified in the full factorial design with two simulated data sets per design (Question 3.3)

Fig. 10

Mean absolute difference between true and estimated treatment effect (mean absolute error, MAE) in the full factorial design with two simulated data sets per design (Question 3.4)

Fig. 11

Mean absolute difference between true and estimated treatment effect (mean absolute error, MAE) in the full factorial design with two simulated data sets per design (Question 3.4). Data restricted to scenarios with 900 observations and one prognostic factor

Figure 8 shows that PALM tree can handle simple subgroups with one predictive factor even when the number of observations is low, provided the difference in treatment effects is reasonably high. All other algorithms perform worse, with LM tree 2 and STIMA being the strongest competitors in the low-n scenarios. OTR performs reasonably well if qualitative subgroups are present. For \(n = 500\) the performance of PALM tree already improves at lower levels of \({\varDelta }_\beta \). The performance of PALM tree and LM tree 2 is very similar, and STIMA also performs well. By design, OTR ignores any non-qualitative subgroups.

When quantitative treatment subgroups exist, all methods are good at deciding the correct treatment regime (see Fig. 9), especially when the number of observations is reasonably high (\(n = 300\)). With \(n = 100\), PALM tree, LM tree 2, STIMA and even LM tree 1 still perform very well; OTR is the weakest competitor here. With few observations (\(n = 100\)), small treatment effect differences (\({\varDelta }_\beta = 0.5\)) and qualitative differences, the performance of all algorithms is close to random guessing (0.5), irrespective of the number of predictive factors. With higher \({\varDelta }_\beta \), PALM tree performs reasonably well, followed by LM tree 2, STIMA and OTR (the order depending on the number of predictive factors). For \(n = 300\) and \({\varDelta }_\beta = 0.5\), STIMA and LM tree 1 perform worst, but STIMA catches up with the other algorithms when \({\varDelta }_\beta = 1.5\), whereas LM tree 1 stays at the bottom. Section 3.3 discusses these results in the context of the star-like simulation study.
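The metric underlying Fig. 9, the proportion of observations for which the better treatment is correctly identified, is a simple agreement rate between the truly better treatment and the one a fitted tree recommends. A sketch with made-up labels:

```python
import numpy as np

# hypothetical labels per observation: 1 = treatment A is better, 0 = B is better
truly_better = np.array([1, 1, 0, 0, 1, 0, 1, 0])
recommended  = np.array([1, 0, 0, 0, 1, 1, 1, 0])  # a tree's recommendations

# proportion of observations assigned to the correct treatment regime;
# 0.5 corresponds to random guessing
prop_correct = np.mean(truly_better == recommended)
print(prop_correct)  # 0.75
```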

Section 3.4 already partly discussed Figs. 10 and 11. Figure 10 shows that across the different scenarios the MAE increases with an increasing number of predictive factors. PALM tree is among the best performers throughout. In comparison to the other algorithms, it performs particularly well in low-n qualitative scenarios with \({\varDelta }_\beta = 1.5\).
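The MAE reported in Figs. 10 and 11 is the mean absolute difference between the true and the estimated per-observation treatment effect. A minimal sketch with made-up numbers:

```python
import numpy as np

# made-up per-observation treatment effects: truth vs. a fitted tree's estimates
true_effect = np.array([2.0, 2.0, 0.0, 0.0, 2.0])
est_effect = np.array([1.8, 2.1, 0.3, 0.0, 1.7])

# mean absolute error between the two vectors (here 0.18)
mae = float(np.mean(np.abs(true_effect - est_effect)))
```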

Computation times

The computation times for all methods except STIMA are very reasonable in these applications. For a summary of computation times in the full factorial design, see Table 3. STIMA reached a maximum of 17.4 h, and almost half of its model fits took half an hour or longer.

Table 3 Quantiles of computation times per algorithm in seconds

About this article


Cite this article

Seibold, H., Hothorn, T. & Zeileis, A. Generalised linear model trees with global additive effects. Adv Data Anal Classif 13, 703–725 (2019). https://doi.org/10.1007/s11634-018-0342-1

