Coalition Feature Interpretation and Attribution in Algorithmic Trading Models

Abstract

The ability to correctly interpret a prediction model’s output is critically important in many problem domains. Accurate interpretation generates user trust in the model, provides insight into how the model may be improved, and supports understanding of the process being modeled. The absence of this capability has kept algorithmic trading from exploiting more powerful predictive models, such as XGBoost and Random Forests. Recently, the adaptation of coalitional game theory has led to consistent methods of determining feature importance for these models (SHAP). This study designs and tests a novel method of integrating the capabilities of SHAP into predictive models for algorithmic trading.
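For readers without access to the full text, the following is a minimal, hypothetical sketch of SHAP attribution for a tree ensemble, assuming the shap and scikit-learn packages and synthetic stand-in data; it illustrates the general technique named in the abstract, not the paper's actual integration method.

```python
# Hedged sketch: SHAP feature attribution for a tree ensemble, assuming the
# shap and scikit-learn packages. Synthetic data stands in for the market
# features (returns, volatility, indicators) a trading model would use.
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# TreeExplainer computes the consistent, game-theoretic (Shapley value)
# attributions for tree models that the abstract refers to as SHAP.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)  # per-sample, per-feature attributions
```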

Author information

Corresponding author

Correspondence to James V. Hansen.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Random Forests were selected for this study because their hyperparameters are considerably easier to tune than those of XGBoost. Random Forests contain adjustable hyperparameters whose values can dramatically affect performance, and this flexibility contributes to their robustness. Common hyperparameters include the number of trees in the forest, the maximum depth of each tree, and the minimum number or proportion of samples in a leaf. Random search for good hyperparameter values avoids the exhaustive enumeration of grid search, instead evaluating randomly selected hyperparameter combinations. Random Forests are also harder to overfit than XGBoost.
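As a concrete illustration of such a random search, the following is a minimal sketch assuming scikit-learn's RandomizedSearchCV on synthetic data; the search space shown is illustrative, not the study's actual configuration.

```python
# Hedged sketch: random-search tuning of a Random Forest, assuming
# scikit-learn. The hyperparameter ranges below are illustrative only.
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

param_distributions = {
    "n_estimators": randint(100, 1000),   # number of trees in the forest
    "max_depth": randint(3, 20),          # maximum depth of each tree
    "min_samples_leaf": randint(1, 10),   # minimum number of samples in a leaf
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions,
    n_iter=20,   # evaluate 20 random combinations instead of a full grid
    cv=5,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
```

Because refitting is on by default, search.best_estimator_ is a forest already retrained on the full data with the best sampled combination.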

A Random Forest is an ensemble of unpruned classification or regression trees induced from bootstrap samples of the training data, using random feature selection in the tree induction process. Prediction is made by aggregating the predictions of the ensemble (majority vote for classification, averaging for regression). Formally, an ensemble of \( K \) trees is an estimator comprised of a collection of randomized trees \( \{ h\left( {x, \theta_{k} } \right),\; k = 1, \ldots , K \} \), where the \( \theta_{k} \) are independent, identically distributed random vectors and \( x \) is an input vector. Letting \( \theta \) denote a generic random vector with the same distribution as the \( \theta_{k} \), as \( K \) goes to infinity the mean-squared generalization error of the random forest converges almost surely to that of \( E_{\theta}\left[ {h\left( {x, \theta } \right)} \right] \), thus mitigating the possibility of overfitting the model.
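In display form, this standard convergence result (restated here from the random-forest literature, not drawn from the paper's omitted figures) reads:

\[
\lim_{K \to \infty} E_{X,Y}\!\left( Y - \frac{1}{K}\sum_{k=1}^{K} h\left( X, \theta_{k} \right) \right)^{2}
= E_{X,Y}\!\left( Y - E_{\theta}\!\left[ h\left( X, \theta \right) \right] \right)^{2} \quad \text{almost surely.}
\]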

Random forests increase diversity among the classifiers by altering the feature sets over the different tree induction processes and resampling the data. The procedure to build a forest with K trees is as follows:

[Pseudocode figure not reproduced here: each of the K trees is induced from a bootstrap sample, with the per-node random feature selection step set in italics.]

The italicized step of the algorithm is where random forests depart from the standard bagging procedure. Specifically, when building a decision tree with traditional bagging, the best feature is selected at each node from the full feature set \( F \), and that set does not change over the different runs of the induction procedure. With random forests, by contrast, a different random subset of size \( g\left( {\left| F \right|} \right) \) is evaluated at each node (e.g., \( g\left( x \right) = 0.15x \) or \( g\left( x \right) = \sqrt{x} \)), with the best feature selected from this subset. This has been shown to increase diversity among the trees.
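In place of the original pseudocode figure, the following is a minimal Python sketch of this procedure, assuming scikit-learn's DecisionTreeClassifier as the base tree inducer; the max_features argument stands in for \( g\left( {\left| F \right|} \right) \) and is resampled at every node.

```python
# Hedged sketch of the forest-building loop described above, assuming
# scikit-learn trees. max_features plays the role of g(|F|): "sqrt" gives
# g(x) = sqrt(x), while a float such as 0.15 gives g(x) = 0.15x.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def build_forest(X, y, K=100, max_features="sqrt", seed=0):
    """Induce K unpruned trees, each from a bootstrap sample of (X, y)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    forest = []
    for _ in range(K):
        idx = rng.integers(0, n, size=n)  # bootstrap: draw n rows with replacement
        tree = DecisionTreeClassifier(max_features=max_features)  # random subset per node
        forest.append(tree.fit(X[idx], y[idx]))
    return forest

def predict(forest, X):
    """Aggregate by majority vote (averaging would be used for regression)."""
    votes = np.stack([tree.predict(X) for tree in forest])  # shape (K, n_samples)
    # Majority vote per sample; assumes integer class labels.
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
```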


About this article

Cite this article

Hansen, J.V. Coalition Feature Interpretation and Attribution in Algorithmic Trading Models. Comput Econ 58, 849–866 (2021). https://doi.org/10.1007/s10614-020-10053-x
