
Robustification of an On-line EM Algorithm for Modelling Asset Prices Within an HMM

Hidden Markov Models in Finance

Abstract

In this paper, we establish a robustification of Elliott’s on-line EM algorithm for modelling asset prices within a hidden Markov model (HMM). In this HMM framework, the parameters of the model are governed by a discrete-time Markov chain, so the parameters of the asset returns can switch between different regimes. The parameters are estimated through an on-line algorithm, which uses incoming market information and leads to adaptive optimal estimates. We robustify this algorithm step by step against additive outliers appearing in the observed asset prices, with the aim of better handling possible peaks or missing values in asset returns.


Notes

  1. Rigorously, the IF, when it exists, is the Gâteaux derivative of T in the direction of the tangent \(\delta _{x} - F\). For details, also on stronger notions like Hadamard or Fréchet differentiability, see, e.g., Fernholz [8] or [25, Chap. 1].

  2. The names OBRE and MBRE are used, e.g., in Hampel et al. [16], while OMSE is used in Ruckdeschel and Horbenko [30].

  3. Usually \(\varLambda _{\theta }\) is the logarithmic derivative of the density w.r.t. the parameter, i.e., \(\varLambda _{\theta }(x) = \partial /\partial \theta \log p_{\theta }(x)\).

References

  1. Ang, A., Bekaert, G.: International asset allocation with regime shifts. Rev. Financ. Stud. 15, 1137–1187 (2002)

  2. Cai, J.: A Markov model of switching-regime ARCH. J. Bus. Econ. Stat. 12, 309–316 (1994)

  3. Elliott, R.J.: Exact adaptive filters for Markov chains observed in Gaussian noise. Automatica 30, 1399–1408 (1994)

  4. Elliott, R.J., Hinz, J.: A method for portfolio choice. Appl. Stoch. Models Bus. Ind. 19, 1–11 (2003)

  5. Elliott, R.J., van der Hoek, J.: An application of hidden Markov models to asset allocation problems. Financ. Stoch. 1, 229–238 (1997)

  6. Elliott, R.J., Aggoun, L., Moore, J.B.: Hidden Markov Models: Estimation and Control. Applications of Mathematics, vol. 29. Springer, New York (1995)

  7. Erlwein, C., Mamon, R., Davison, M.: An examination of HMM-based investment strategies for asset allocation. Appl. Stoch. Models Bus. Ind. 27(3), 204–221 (2009)

  8. Fernholz, L.T.: Von Mises Calculus for Statistical Functionals. Lecture Notes in Statistics, vol. 19. Springer, New York (1983)

  9. Fox, A.J.: Outliers in time series. J. R. Stat. Soc. Ser. B 34, 350–363 (1972)

  10. Fraley, C., Raftery, A.E.: Model-based clustering, discriminant analysis and density estimation. J. Am. Stat. Assoc. 97, 611–631 (2002)

  11. Fraley, C., Raftery, A.E., Murphy, T.B., Scrucca, L.: mclust version 4 for R: normal mixture modeling for model-based clustering, classification, and density estimation. Technical report No. 597, Department of Statistics, University of Washington (2012)

  12. Gray, S.F.: Modeling the conditional distribution of interest rates as a regime-switching process. J. Financ. Econ. 42, 27–62 (1996)

  13. Guidolin, M., Timmermann, A.: Asset allocation under multivariate regime switching. J. Econ. Dyn. Control 31, 3503–3544 (2007)

  14. Hamilton, J.D.: A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica 57, 357–384 (1989)

  15. Hampel, F.R.: Contributions to the theory of robust estimation. Dissertation, University of California, Berkeley (1968)

  16. Hampel, F.R., Ronchetti, E.M., Rousseeuw, P.J., Stahel, W.A.: Robust Statistics: The Approach Based on Influence Functions. Wiley, New York (1986)

  17. Huber, P.J.: Robust Statistics. Wiley, New York (1981)

  18. Kohl, M.: RobLox: optimally robust influence curves and estimators for location and scale. R package version 0.8.2. http://robast.r-forge.r-project.org/ (2012)

  19. Kohl, M., Deigner, H.P.: Preprocessing of gene expression data by optimally robust estimator. BMC Bioinform. 11, 583 (2010)

  20. Kohl, M., Rieder, H., Ruckdeschel, P.: Infinitesimally robust estimation in general smoothly parametrized models. Stat. Methods Appl. 19, 333–354 (2010)

  21. Knight, F.H.: Risk, Uncertainty, and Profit. Houghton Mifflin, Boston (1921)

  22. Mamon, R., Erlwein, C., Gopaluni, B.: Adaptive signal processing of asset price dynamics with predictability analysis. Inform. Sci. 178, 203–219 (2008)

  23. Markowitz, H.: Portfolio selection. J. Financ. 7(1), 77–91 (1952)

  24. Maronna, R.A., Martin, R.D., Yohai, V.J.: Robust Statistics: Theory and Methods. Wiley, Chichester (2006)

  25. Rieder, H.: Robust Asymptotic Statistics. Springer, New York (1994)

  26. Rieder, H., Kohl, M., Ruckdeschel, P.: The cost of not knowing the radius. Stat. Methods Appl. 17(1), 13–40 (2008)

  27. Rousseeuw, P.J., Leroy, A.M.: Robust Regression and Outlier Detection. Wiley, New York (1987)

  28. Rousseeuw, P.J., van Driessen, K.: A fast algorithm for the minimum covariance determinant estimator. Technometrics 41, 212–223 (1999)

  29. Ruckdeschel, P.: Ansätze zur Robustifizierung des Kalman Filters. Bayreuther Mathematische Schriften, vol. 64. Mathematisches Inst. der Univ. Bayreuth, Bayreuth (2001)

  30. Ruckdeschel, P., Horbenko, N.: Robustness properties of estimators in generalized Pareto models. Technical report ITWM No. 182. http://www.itwm.fraunhofer.de/fileadmin/ITWM-Media/Zentral/Pdf/Berichte_ITWM/2010/bericht_182.pdf (2010)

  31. Sass, J., Haussmann, U.G.: Optimizing the terminal wealth under partial information: the drift process as a continuous time Markov chain. Financ. Stoch. 8, 553–577 (2004)

  32. Zakai, M.: On the optimal filtering of diffusion processes. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 11, 230–243 (1969)


Acknowledgements

We thank an anonymous referee and the editor for valuable comments. Financial support for C. Erlwein-Sayer from Deutsche Forschungsgemeinschaft (DFG) within the project “Regimeswitching in zeitstetigen Finanzmarktmodellen: Statistik und problemspezifische Modellwahl” (RU-893/4-1) is gratefully acknowledged.

Author information


Corresponding author

Correspondence to Christina Erlwein-Sayer.


Appendix


1.1.1 Definition: Weighted Medians and MADs

For weights \(w_{j} \geq 0\) and observations \(y_{j}\), the weighted median \(m = m(y,w)\) is defined as

$$\displaystyle{m = \mathrm{argmin}_{f}\sum _{j}w_{j}\vert y_{j} - f\vert \,.}$$

With \(y^{\prime}_{j} = \vert y_{j} - m\vert \), the scaled weighted MAD s = s(y, w) is defined as

$$\displaystyle{s = {c}^{-1}\mathrm{argmin}_{ t}\sum _{j}w_{j}\vert y^{\prime}_{j} - t\vert,}$$

where c is a consistency factor. It ensures consistent estimation of σ for Gaussian observations, i.e., \(c = \mbox{ E}\,\mathrm{argmin}_{t}\sum _{j}w_{j}\vert \vert \tilde{y}_{j}\vert - t\vert \) for \(\tilde{y}_{j}\stackrel{\mathrm{i.i.d.}}{\sim }\mathcal{N}(0,1)\). The factor c can be obtained empirically from a sufficiently large number M of simulated samples, e.g., M = 10,000, by setting \(c = \frac{1} {M}\sum _{k=1}^{M}c_{ k}\), \(c_{k} = \mathrm{argmin}_{t}\sum _{j}w_{j}\vert \vert y^{\prime\prime}_{j,k}\vert - t\vert \), \(y^{\prime\prime}_{j,k}\stackrel{\mathrm{i.i.d.}}{\sim }\mathcal{N}(0,1)\).
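The two definitions above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names `weighted_median`, `consistency_factor` and `scaled_weighted_mad` are hypothetical, the lower weighted median is returned when the minimizer is not unique, and the consistency factor is approximated by plain Monte Carlo as described in the text.

```python
import random


def weighted_median(y, w):
    """Return a minimizer of sum_j w_j * |y_j - f| (the lower weighted median)."""
    pairs = sorted((yj, wj) for yj, wj in zip(y, w) if wj > 0)
    half = 0.5 * sum(wj for _, wj in pairs)
    acc = 0.0
    for yj, wj in pairs:
        acc += wj
        if acc >= half:  # first point carrying at least half of the total weight
            return yj
    raise ValueError("need at least one positive weight")


def consistency_factor(w, n_rep=10_000, seed=0):
    """Monte Carlo estimate of c: mean weighted median of |N(0,1)| draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_rep):
        z = [abs(rng.gauss(0.0, 1.0)) for _ in w]
        total += weighted_median(z, w)
    return total / n_rep


def scaled_weighted_mad(y, w, c=None):
    """Weighted median of |y_j - m| divided by the consistency factor c."""
    m = weighted_median(y, w)
    dev = [abs(yj - m) for yj in y]
    if c is None:
        c = consistency_factor(w)  # assumption: simulate c for these weights
    return weighted_median(dev, w) / c
```

With equal weights, `weighted_median([1.0, 2.0, 3.0, 100.0], [1, 1, 1, 1])` returns `2.0`, so the single large value does not move the estimate. Since c depends only on the weights, it can be computed once and passed in whenever the same weights are reused.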

As to the finite sample breakdown point (FSBP) of the weighted median (and, at the same time, of the scaled weighted MAD), we define for each i the normalized weights \(w_{j}^{0} = w_{i,j}/\sum _{j^{\prime}}w_{i,j^{\prime}}\) and order them such that \(w_{(1)}^{0} \geq w_{(2)}^{0} \geq \ldots \geq w_{(k)}^{0}\). Then the FSBP in both cases is \({k}^{-1}\min \{j_{0} = 1,\ldots,k\,\mid \,\sum _{j=1}^{j_{0}}w_{(j)}^{0} \geq 1/2\}\), which (for equivariant estimators) can be shown to be the largest possible value. So using weighted medians and MADs, we achieve a decent degree of robustness against outliers. For example, assume we have 10 observations with weights 5 × 0.05, 3 × 0.1, 0.2, and 0.25. Then we need at least three outliers (placed at the observations with weights 0.1, 0.2, and 0.25, respectively) to produce a breakdown, since 0.25 + 0.2 + 0.1 = 0.55 ≥ 1/2 while 0.25 + 0.2 = 0.45 < 1/2.
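The breakdown-point computation can be checked numerically. The sketch below reads the threshold as half of the total normalized weight, which is what the worked example implies; this reading, the helper name `weighted_fsbp` and the small floating-point tolerance are assumptions made for illustration.

```python
def weighted_fsbp(weights, tol=1e-12):
    """Smallest share j0/k of observations whose largest normalized
    weights add up to at least one half."""
    k = len(weights)
    total = sum(weights)
    w0 = sorted((wj / total for wj in weights), reverse=True)
    acc = 0.0
    for j0, wj in enumerate(w0, start=1):
        acc += wj
        if acc >= 0.5 - tol:  # tolerance guards against rounding in the sum
            return j0 / k
    return 1.0  # not reached for valid non-negative weights


# Weights from the example in the text: 5 x 0.05, 3 x 0.1, 0.2, 0.25.
example = [0.05] * 5 + [0.1] * 3 + [0.2, 0.25]
fsbp = weighted_fsbp(example)  # three of ten observations are needed
```

For equal weights the function gives the classical value of one half for the median.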

1.1.2 Proof of Theorem 1.2

Proof.

Let us solve \(\max _{\partial \mathcal{U}}\min _{f}[\ldots ]\) first, which amounts to \(\min _{\partial \mathcal{U}}\mbox{ E}_{\mathrm{re}}[\vert \mbox{ E}_{\mathrm{re}}[{Y }^{\mathrm{id}}\vert {Y }^{\mathrm{re}}]{\vert }^{2}]\). For a fixed element \({P}^{{Y }^{\mathrm{di}} }\), assume a dominating σ-finite measure μ, i.e., \(\mu \gg {P}^{{Y }^{\mathrm{di}} }\), \(\mu \gg {P}^{{Y }^{\mathrm{id}} }\); this gives us a μ-density q(y) of \({P}^{{Y }^{\mathrm{di}} }\). Determining the joint (real) law \({P}^{{Y }^{\mathrm{id}},{Y }^{\mathrm{re}} }(d\tilde{y},dy)\) as

$$\displaystyle{ P({Y }^{\mathrm{id} } \in A,{Y }^{\mathrm{re} } \in B) =\int I_{A}(\tilde{y})I_{B}(y)[(1 - r)I(\tilde{y} = y) + rq(y)]\,{p}^{{Y }^{\mathrm{id}} }(\tilde{y})\,\mu (d\tilde{y})\mu (dy)\,, }$$
(1.47)

we deduce that μ(dy)-a.e.

$$\displaystyle{ \mbox{ E}_{\mathrm{re}}[{Y }^{\mathrm{id} }\vert {Y }^{\mathrm{re} } = y] = \frac{rq(y)\mbox{ E}{Y }^{\mathrm{id}} + (1 - r)y{p}^{{Y }^{\mathrm{id}} }(y)} {rq(y) + (1 - r){p}^{{Y }^{\mathrm{id}} }(y)} =: \frac{a_{1}q(y) + a_{2}(y)} {a_{3}q(y) + a_{4}(y)}\,. }$$
(1.48)

Hence we have to minimize

$$\displaystyle{F(q):=\int \frac{\vert a_{1}q(y) + a_{2}(y){\vert }^{2}} {a_{3}q(y) + a_{4}(y)} \,\,\mu (dy)}$$

in \(M_{0} =\{ q \in L_{1}(\mu )\,\vert \;q \geq 0,\;\int q\,d\mu = 1\}\). To this end, we note that F is convex on the non-void, convex cone \(M =\{ q \in L_{1}(\mu )\,\vert \;q \geq 0\}\) so, for some \(\tilde{\rho }\geq 0\), we may consider the Lagrangian

$$\displaystyle{L_{\tilde{\rho }}(q):= F(q) +\tilde{\rho }\int q\,d\mu }$$

for some positive Lagrange multiplier \(\tilde{\rho }\). Pointwise minimization in y of \(L_{\tilde{\rho }}(q)\) gives

$$\displaystyle{q_{s}(y) = \frac{1 - r} {r} (\vert D(y)\vert \big/s\, - 1)_{+}\,\,{p}^{{Y }^{\mathrm{id}} }(y)}$$

for some constant \(s = s(\tilde{\rho }) = {(\,\vert \mbox{ E}{Y }^{\mathrm{id}}{\vert }^{2} +\tilde{\rho } /r)}^{1/2}\). Pointwise in y, \(q_{s}\) is antitone and continuous in s ≥ 0 and \(\lim _{s\rightarrow 0[\infty ]}q_{s}(y) = \infty [0]\), hence by monotone convergence,

$$\displaystyle{H(s) =\int q_{s}(y)\,\mu (dy)}$$

too, is antitone and continuous and \(\lim _{s\rightarrow 0[\infty ]}H(s) = \infty [0]\). So by continuity, there is some \(\rho \in (0,\infty )\) with H(ρ) = 1. On \(M_{0}\), \(\int q\,d\mu = 1\), but \(\hat{q}_{\rho } = q_{s=\rho } \in M_{0}\) and is optimal on \(M \supset M_{0}\); hence it also minimizes F on \(M_{0}\). In particular, we get representation (1.26) and note that, independently from the choice of μ, the least favorable \(P_{0}^{{Y }^{\mathrm{di}} }\) is dominated according to \(P_{0}^{{Y }^{\mathrm{di}} } \ll {P}^{{Y }^{\mathrm{id}} }\), i.e., non-dominated \({P}^{{Y }^{\mathrm{di}} }\) are even easier to deal with.
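The pointwise minimization step can be made explicit. The following is a sketch under the assumption (not restated in this appendix) that \(D(y) = y - \mbox{ E}{Y }^{\mathrm{id}}\); the abbreviations \(m = \mbox{ E}{Y }^{\mathrm{id}}\), \(p = {p}^{{Y }^{\mathrm{id}} }(y)\) and \(u = rq(y) + (1 - r)p\) are introduced here for illustration only. With these, the numerator in (1.48) equals \(mu + (1 - r)pD(y)\), and since \(q = (u - (1 - r)p)/r\), the integrand of the Lagrangian at y becomes

$$\displaystyle{\frac{\vert mu + (1 - r)pD{\vert }^{2}} {u} + \frac{\tilde{\rho }} {r}\big(u - (1 - r)p\big) =\Big (\vert m{\vert }^{2} + \frac{\tilde{\rho }} {r}\Big)u + 2(1 - r)p\,{m}^{\top }D + \frac{{(1 - r)}^{2}{p}^{2}\vert D{\vert }^{2}} {u} -\frac{\tilde{\rho }(1 - r)p} {r} \,.}$$

This expression is convex in u > 0. Setting its derivative in u to zero gives \(u = (1 - r)p\,\vert D\vert /s\) with \({s}^{2} = \vert m{\vert }^{2} +\tilde{\rho } /r\), and the constraint q ≥ 0, i.e., \(u \geq (1 - r)p\), turns this into

$$\displaystyle{u =\max \big ((1 - r)p\,\vert D\vert /s,\;(1 - r)p\big),\qquad q_{s}(y) = \frac{u - (1 - r)p} {r} = \frac{1 - r} {r} (\vert D(y)\vert /s - 1)_{+}\,p\,.}$$

Plugging \(q_{\rho }\) back into (1.48) yields, under the same assumption,

$$\displaystyle{\mbox{ E}_{\mathrm{re}}[{Y }^{\mathrm{id}}\vert {Y }^{\mathrm{re}} = y] = m + D(y)\min (1,\rho /\vert D(y)\vert )\,,}$$

that is, the ideal conditional mean with its deviation from m clipped at length ρ.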

As a next step we show that

$$\displaystyle{ \max \nolimits _{\partial \mathcal{U}}\min \nolimits _{f}[\ldots ] =\min \nolimits _{f}\max \nolimits _{\partial \mathcal{U}}[\ldots ] }$$
(1.49)

To this end we first verify (1.25), determining \(f_{0}(y)\) as \(f_{0}(y) = \mbox{ E}_{\mathrm{re};\hat{P}}[X\vert {Y }^{\mathrm{re}} = y]\). Writing a sub/superscript “re; P” for evaluation under the situation generated by \(P = {P}^{{Y }^{\mathrm{di}} }\) and \(\hat{P}\) for \(P_{0}^{{Y }^{\mathrm{di}} }\), we obtain the risk for general P as

$$\displaystyle\begin{array}{rcl} \mathrm{MSE}_{\mathrm{re;}\,P}[f_{0}({Y }^{\mathrm{re},\,P })]& =& (1 - r)\mbox{ E}_{\mathrm{id}}\vert {Y }^{\mathrm{id} } - f_{0}({Y }^{\mathrm{id} }){\vert }^{2} + r\mbox{ tr}\text{Cov}{Y }^{\mathrm{id} } + \\ & & \quad + r\,\mbox{ E}_{P}\min (\vert D({Y }^{\mathrm{di;},q }){\vert }^{2}{,\rho }^{2})\,. {}\end{array}$$
(1.50)

This is maximal for any P that is concentrated on the set \(\big\{\,\vert D({Y }^{\mathrm{di;},q})\vert >\rho \,\big\}\), which is true for \(\hat{P}\). Hence (1.49) follows, as for any contaminating P

$$\displaystyle{\mathrm{MSE}_{\mathrm{re;}\,P}[f_{0}({Y }^{\mathrm{re;} \,P })] \leq \mathrm{MSE}_{\mathrm{re;}\,\hat{P}}[f_{0}({Y }^{\mathrm{re;} \,\hat{P} })]\,.}$$

Finally, we pass over from \(\partial \mathcal{U}\) to \(\mathcal{U}\): Let \(f_{r}\), \(\hat{P}_{r}\) denote the components of the saddle-point for \(\partial \mathcal{U}(r)\), as well as ρ(r) the corresponding Lagrange multiplier and \(w_{r}\) the corresponding weight, i.e., \(w_{r} = w_{r}(y) =\min (1,\rho (r)\,/\,\vert D(y)\vert )\). Let R(f, P, r) be the MSE of procedure f at the SO model \(\partial \mathcal{U}(r)\) with contaminating \({P}^{{Y }^{\mathrm{di}} } = P\). As can be seen from (1.26), ρ(r) is antitone in r; in particular, as \(\hat{P}_{r}\) is concentrated on \(\{\,\vert D(Y )\vert \geq \rho (r)\,\}\), which for r ≤ s is a subset of \(\{\,\vert D(Y )\vert \geq \rho (s)\,\}\), we obtain

$$\displaystyle{R(f_{s},\hat{P}_{s},s) = R(f_{s},\hat{P}_{r},s)\qquad \mbox{ for}\;r \leq s\,.}$$

Note that \(R(f_{s},P,0) = R(f_{s},Q,0)\) for all P, Q (hence passage to \(\tilde{R}(f_{s},P,r) = R(f_{s},P,r) - R(f_{s},P,0)\) is helpful) and that

$$\displaystyle{ \mbox{ tr}\text{Cov}{Y }^{\mathrm{id} } = \mbox{ E}_{\mathrm{id}}\Big[\mbox{ tr}\text{Cov}_{\mathrm{id}}[{Y }^{\mathrm{id} }\vert {Y }^{\mathrm{id} }] + \vert D({Y }^{\mathrm{id} }){\vert }^{2}\Big]\,. }$$
(1.51)

Abbreviate \(\bar{w}_{s}({Y }^{\mathrm{id}}) = 1 - {(1 - w_{ s}({Y }^{\mathrm{id}}))}^{2} \geq 0\) to see that

$$\displaystyle\begin{array}{rcl} & & \tilde{R}(f_{s},P,r) = r\Big\{\mbox{ E}_{\mathrm{id}}\Big[\vert D({Y }^{\mathrm{id} }){\vert }^{2}\bar{w}_{ s}({Y }^{\mathrm{id} })\Big] + \mbox{ E}_{P}\min {(\vert D({Y }^{\mathrm{id} })\vert,\rho (s))}^{2}\,\Big\} \leq {}\\ & &\leq r\Big\{\mbox{ E}_{\mathrm{id}}\Big[\vert D({Y }^{\mathrm{id} }){\vert }^{2}\bar{w}_{ s}({Y }^{\mathrm{id} })\Big] +\rho {(s)}^{2}\,\Big\} =\tilde{ R}(f_{ s},\hat{P}_{r},r) <\tilde{ R}(f_{s},\hat{P}_{s},s)\,. {}\\ \end{array}$$

Hence the saddle-point extends to \(\mathcal{U}(r)\); in particular the maximal risk is never attained in the interior \(\mathcal{U}(r)\setminus \partial \mathcal{U}(r)\). (1.28) follows by plugging in the results. □ 
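To make the objects in the proof concrete, the following Python sketch works through a scalar special case. It is an illustration under stated assumptions, not the authors' code: \({Y }^{\mathrm{id}}\) is taken as standard normal, D(y) is assumed to equal y minus the ideal mean (so D(y) = y here), and all function names are hypothetical. In this case H(s) has a closed form, ρ solves H(ρ) = 1 by bisection, and the conditional mean (1.48) evaluated at the least favorable density reduces to a clipped version of y.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def upper_tail(x):
    """P(Z > x) for Z ~ N(0,1)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def H(s, r):
    """Mass of q_s for Y^id ~ N(0,1), D(y) = y: integral of (|y|/s - 1)_+ phi(y)."""
    return (1.0 - r) / r * 2.0 * (phi(s) / s - upper_tail(s))


def clipping_height(r, lo=1e-8, hi=50.0, n_iter=200):
    """Solve H(rho) = 1 by bisection, using that H is decreasing in s."""
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if H(mid, r) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def q_least_favorable(y, rho, r):
    """Least favorable contaminating density q_rho in this special case."""
    return (1.0 - r) / r * max(abs(y) / rho - 1.0, 0.0) * phi(y)


def cond_mean(y, r, q_y, p_y, mean=0.0):
    """Right-hand side of (1.48) for a scalar observation y."""
    return (r * q_y * mean + (1.0 - r) * y * p_y) / (r * q_y + (1.0 - r) * p_y)


def f0(y, rho, mean=0.0):
    """Clipped estimator mean + D(y) * min(1, rho / |D(y)|)."""
    d = y - mean
    if d == 0.0:
        return mean
    return mean + d * min(1.0, rho / abs(d))


r = 0.1
rho = clipping_height(r)
```

Evaluating `cond_mean(3.0, r, q_least_favorable(3.0, rho, r), phi(3.0))` reproduces `f0(3.0, rho)`, which equals `rho` because the observation is clipped. Calling `clipping_height` for increasing r gives decreasing values, in line with ρ(r) being antitone in r.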


Copyright information

© 2014 Springer Science+Business Media New York

About this chapter

Cite this chapter

Erlwein-Sayer, C., Ruckdeschel, P. (2014). Robustification of an On-line EM Algorithm for Modelling Asset Prices Within an HMM. In: Mamon, R., Elliott, R. (eds) Hidden Markov Models in Finance. International Series in Operations Research & Management Science, vol 209. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-7442-6_1
