Abstract
The ability to correctly interpret a prediction model’s output is critically important in many problem domains. Accurate interpretation builds user trust in the model, provides insight into how the model may be improved, and supports understanding of the process being modeled. The absence of this capability has kept algorithmic trading from making use of more powerful predictive models, such as XGBoost and Random Forests. Recently, the adaptation of coalitional game theory has led to the development of consistent methods of determining feature importance for these models (SHAP). This study designs and tests a novel method of integrating the capabilities of SHAP into predictive models for algorithmic trading.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
Random Forests were selected for this study because their parameters are much easier to tune than those of XGBoost. Random Forests contain adjustable hyperparameters whose values can dramatically affect performance, and this flexibility contributes to their robustness. Common hyperparameters include the number of trees in the forest, the maximum depth of each tree, and the minimum number or proportion of samples in a leaf. Random search for the best hyperparameter values avoids the exhaustive enumeration of grid search, instead evaluating random subsets of hyperparameter combinations. Random Forests are also less prone to overfitting than XGBoost.
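The random search described above can be sketched with scikit-learn, which is assumed here for illustration; the candidate values below are hypothetical, not the configuration used in the study.

```python
# Sketch: random search over Random Forest hyperparameters (scikit-learn assumed;
# the search space and data are illustrative only).
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

X, y = make_regression(n_samples=200, n_features=10, random_state=0)

# Candidate values for the common hyperparameters named above.
param_distributions = {
    "n_estimators": [50, 100, 200],       # number of trees in the forest
    "max_depth": [None, 5, 10],           # maximum depth of each tree
    "min_samples_leaf": [1, 5, 0.05],     # count or proportion of samples per leaf
}

# Evaluate a random subset of combinations rather than enumerating the full grid.
search = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    param_distributions,
    n_iter=5,          # only 5 of the 27 combinations are tried
    cv=3,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
```

Increasing `n_iter` trades computation for a more thorough exploration of the space; the key saving over grid search is that the cost is fixed by `n_iter`, not by the product of the candidate-list sizes.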
A Random Forest is an ensemble of unpruned classification or regression trees induced from bootstrap samples of the training data, using random feature selection in the tree induction process. Prediction is made by aggregating (majority vote for classification, averaging for regression) the predictions of the ensemble. Formally, an ensemble of \( K \) trees is an estimator comprised of a collection of randomized trees \( \{ h\left( {x, \theta_{k} } \right), k = 1, \ldots , K \} \), where the \( \theta_{k} \) are independent identically distributed random vectors and \( x \) is an input vector. Letting \( \theta \) represent a generic random vector with the same distribution as each \( \theta_{k} \), as \( K \) goes to infinity the mean-squared generalization error of the random forest converges almost surely to that of \( E_{\theta}\left[ {h\left( {x, \theta } \right)} \right] \), thus mitigating the possibility of overfitting the model.
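The averaging behavior can be illustrated with a minimal sketch, assuming scikit-learn and synthetic data: each \( \theta_{k} \) is realized here as a bootstrap resampling, and the aggregated prediction is the mean over the \( K \) trees. (Real forests also randomize features per split; this toy version isolates the averaging step.)

```python
# Sketch: aggregate K randomized regression trees, each grown on a bootstrap
# sample (theta_k = the resampling), and measure test mean-squared error.
# Illustrative only; omits per-split feature randomization.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)
X_train, y_train, X_test, y_test = X[:200], y[:200], X[200:], y[200:]

def forest_mse(K):
    """Average the predictions of K bootstrap-trained trees; return test MSE."""
    preds = np.zeros(len(X_test))
    for k in range(K):
        idx = rng.integers(0, len(X_train), len(X_train))  # bootstrap sample
        tree = DecisionTreeRegressor(random_state=k).fit(X_train[idx], y_train[idx])
        preds += tree.predict(X_test)
    return float(np.mean((preds / K - y_test) ** 2))

# Generalization error typically falls, then stabilizes, as K grows.
mse_1, mse_50 = forest_mse(1), forest_mse(50)
print(mse_1, mse_50)
```

The stabilization as \( K \) grows mirrors the almost-sure convergence noted above: adding trees cannot overfit in the way deepening a single tree can.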
Random forests increase diversity among the classifiers by altering the feature sets over the different tree induction processes and resampling the data. The procedure to build a forest with \( K \) trees is as follows. For \( k = 1, \ldots , K \): (1) draw a bootstrap sample from the training data; (2) grow an unpruned tree on this sample, *evaluating only a random subset of the features at each node and splitting on the best feature in that subset*; (3) add the tree to the ensemble. Predictions are then aggregated across the \( K \) trees as described above.
The random feature-subset selection at each node is where random forests depart from the normal bagging procedure. Specifically, when building a decision tree using traditional bagging, the best feature is selected from the full feature set \( F \) at each node, and this set does not change over the different runs of the induction procedure. Conversely, with random forests a different random subset of size \( g\left( {\left| F \right|} \right) \) is evaluated at each node (e.g., \( g\left( x \right) = 0.15x \) or \( g\left( x \right) = \sqrt{x} \), etc.), with the best feature selected from this subset. This has been shown to increase diversity among the trees.
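In scikit-learn, which is assumed here for illustration, the subset size \( g\left( {\left| F \right|} \right) \) corresponds to the `max_features` parameter: a float gives a proportion (e.g., \( g(x) = 0.15x \)), the string `"sqrt"` gives \( g(x) = \sqrt{x} \), and `None` evaluates every feature at every node, recovering plain bagging. A minimal sketch:

```python
# Sketch: varying the per-node feature subset size g(|F|) via max_features.
# Illustrative values only; data is synthetic.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, n_features=20, random_state=0)

for mf in (0.15, "sqrt", None):  # None = all features per node (bagging-like)
    clf = RandomForestClassifier(n_estimators=50, max_features=mf, random_state=0)
    clf.fit(X, y)
    print(mf, clf.score(X, y))
```

Smaller subsets make the individual trees weaker but less correlated with one another, which is the diversity effect the passage describes.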
Hansen, J.V. Coalition Feature Interpretation and Attribution in Algorithmic Trading Models. Comput Econ 58, 849–866 (2021). https://doi.org/10.1007/s10614-020-10053-x