Robustification of an On-line EM Algorithm for Modelling Asset Prices Within an HMM

Erlwein-Sayer, Christina; Ruckdeschel, Peter

doi:10.1007/978-1-4899-7442-6_1

Part of the book series: International Series in Operations Research & Management Science (ISOR, volume 209)

Abstract

In this paper, we establish a robustification of Elliott’s on-line EM algorithm for modelling asset prices within a hidden Markov model (HMM). In this HMM framework, the parameters of the model are governed by a Markov chain in discrete time, so the parameters of the asset returns can switch between different regimes. The parameters are estimated through an on-line algorithm, which utilizes incoming information from the market and leads to adaptive optimal estimates. We robustify this algorithm step by step against additive outliers appearing in the observed asset prices, with the rationale of better handling possible peaks or missing values in asset returns.

Notes

1. In mathematical rigor, the IF, when it exists, is the Gâteaux derivative of $T$ in the direction of the tangent $\delta_x - F$. For details, also on stronger notions like Hadamard or Fréchet differentiability, see, e.g., Fernholz [8] or [25, Chap. 1].
2. The names OBRE and MBRE are used, e.g., in Hampel et al. [16], while OMSE is used in Ruckdeschel and Horbenko [30].
3. Usually $\varLambda_{\theta}$ is the logarithmic derivative of the density w.r.t. the parameter, i.e., $\varLambda_{\theta}(x) = \partial/\partial\theta\,\log p_{\theta}(x)$.
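
To make Notes 1 and 3 concrete, the display below spells out the directional derivative behind the influence function and evaluates the score in one standard case. The contamination size $\varepsilon$ and the Gaussian location model with known variance $\sigma^{2}$ are illustrative assumptions and are not taken from the chapter.

$$\mathrm{IF}(x;T,F) = \lim_{\varepsilon \downarrow 0}\frac{T\big((1-\varepsilon)F + \varepsilon\,\delta_{x}\big) - T(F)}{\varepsilon}, \qquad \varLambda_{\theta}(x) = \frac{\partial}{\partial\theta}\log p_{\theta}(x) = \frac{x-\theta}{\sigma^{2}} \quad \text{for } p_{\theta} = \mathcal{N}(\theta,\sigma^{2}).$$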

References

1. Ang, A., Bekaert, G.: International asset allocation with regime shifts. Rev. Financ. Stud. 15, 1137–1187 (2002)
2. Cai, J.: A Markov model of switching-regime ARCH. J. Bus. Econ. Stat. 12, 309–316 (1994)
3. Elliott, R.J.: Exact adaptive filters for Markov chains observed in Gaussian noise. Automatica 30, 1399–1408 (1994)
4. Elliott, R.J., Hinz, J.: A method for portfolio choice. Appl. Stoch. Models Bus. Ind. 19, 1–11 (2003)
5. Elliott, R.J., van der Hoek, J.: An application of hidden Markov models to asset allocation problems. Financ. Stoch. 1, 229–238 (1997)
6. Elliott, R.J., Aggoun, L., Moore, J.B.: Hidden Markov Models: Estimation and Control. Applications of Mathematics, vol. 29. Springer, New York (1995)
7. Erlwein, C., Mamon, R., Davison, M.: An examination of HMM-based investment strategies for asset allocation. Appl. Stoch. Models Bus. Ind. 27(3), 204–221 (2009)
8. Fernholz, L.T.: Von Mises Calculus for Statistical Functionals. Lecture Notes in Statistics, vol. 19. Springer, New York (1983)
9. Fox, A.J.: Outliers in time series. J. R. Stat. Soc. Ser. B 34, 350–363 (1972)
10. Fraley, C., Raftery, A.E.: Model-based clustering, discriminant analysis and density estimation. J. Am. Stat. Assoc. 97, 611–631 (2002)
11. Fraley, C., Raftery, A.E., Murphy, T.B., Scrucca, L.: mclust version 4 for R: normal mixture modeling for model-based clustering, classification, and density estimation. Technical report No. 597, Department of Statistics, University of Washington (2012)
12. Gray, S.F.: Modeling the conditional distribution of interest rates as a regime-switching process. J. Financ. Econ. 42, 27–62 (1996)
13. Guidolin, M., Timmermann, A.: Asset allocation under multivariate regime switching. J. Econ. Dyn. Control 31, 3503–3544 (2007)
14. Hamilton, J.D.: A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica 57, 357–384 (1989)
15. Hampel, F.R.: Contributions to the theory of robust estimation. Dissertation, University of California, Berkeley (1968)
16. Hampel, F.R., Ronchetti, E.M., Rousseeuw, P.J., Stahel, W.A.: Robust Statistics: The Approach Based on Influence Functions. Wiley, New York (1986)
17. Huber, P.J.: Robust Statistics. Wiley, New York (1981)
18. Kohl, M.: RobLox: optimally robust influence curves and estimators for location and scale. R package version 0.8.2. http://robast.r-forge.r-project.org/ (2012)
19. Kohl, M., Deigner, H.P.: Preprocessing of gene expression data by optimally robust estimator. BMC Bioinform. 11, 583 (2010)
20. Kohl, M., Rieder, H., Ruckdeschel, P.: Infinitesimally robust estimation in general smoothly parametrized models. Stat. Methods Appl. 19, 333–354 (2010)
21. Knight, F.H.: Risk, Uncertainty, and Profit. Houghton Mifflin, Boston (1921)
22. Mamon, R., Erlwein, C., Gopaluni, B.: Adaptive signal processing of asset price dynamics with predictability analysis. Inform. Sci. 178, 203–219 (2008)
23. Markowitz, H.: Portfolio selection. J. Financ. 7(1), 77–91 (1952)
24. Maronna, R.A., Martin, R.D., Yohai, V.J.: Robust Statistics: Theory and Methods. Wiley, Chichester (2006)
25. Rieder, H.: Robust Asymptotic Statistics. Springer, New York (1994)
26. Rieder, H., Kohl, M., Ruckdeschel, P.: The cost of not knowing the radius. Stat. Methods Appl. 17(1), 13–40 (2008)
27. Rousseeuw, P.J., Leroy, A.M.: Robust Regression and Outlier Detection. Wiley, New York (1987)
28. Rousseeuw, P.J., van Driessen, K.: A fast algorithm for the minimum covariance determinant estimator. Technometrics 41, 212–223 (1999)
29. Ruckdeschel, P.: Ansätze zur Robustifizierung des Kalman Filters. Bayreuther Mathematische Schriften, vol. 64. Mathematisches Inst. der Univ. Bayreuth, Bayreuth (2001)
30. Ruckdeschel, P., Horbenko, N.: Robustness properties of estimators in generalized Pareto models. Technical report ITWM No. 182. http://www.itwm.fraunhofer.de/fileadmin/ITWM-Media/Zentral/Pdf/Berichte_ITWM/2010/bericht_182.pdf (2010)
31. Sass, J., Haussmann, U.G.: Optimizing the terminal wealth under partial information: the drift process as a continuous time Markov chain. Financ. Stoch. 8, 553–577 (2004)
32. Zakai, M.: On the optimal filtering of diffusion processes. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 11, 230–243 (1969)

Acknowledgements

We thank an anonymous referee and the editor for valuable comments. Financial support for C. Erlwein-Sayer from Deutsche Forschungsgemeinschaft (DFG) within the project “Regimeswitching in zeitstetigen Finanzmarktmodellen: Statistik und problemspezifische Modellwahl” (RU-893/4-1) is gratefully acknowledged.

Author information

Authors and Affiliations

Department of Financial Mathematics, Fraunhofer ITWM, Fraunhofer-Platz 1, D-67663, Kaiserslautern, Germany
Christina Erlwein-Sayer & Peter Ruckdeschel

Corresponding author

Correspondence to Christina Erlwein-Sayer.

Editor information

Editors and Affiliations

Dept of Statistical & Actuarial Sciences, University of Western Ontario, London, Ontario, Canada
Rogemar S. Mamon
School of Mathematics, University of Adelaide, Adelaide, Australia
Robert J. Elliott

Appendix

1.1.1 Definition: Weighted Medians and MADs

For weights $w_{j} \geq 0$ and observations $y_{j}$, the weighted median $m = m(y,w)$ is defined as

$$\displaystyle{m = \mathrm{argmin}_{f}\sum _{j}w_{j}\vert y_{j} - f\vert \,.}$$

With $y^{\prime}_{j} = \vert y_{j} - m\vert $, the scaled weighted MAD s = s(y, w) is defined as

$$\displaystyle{s = {c}^{-1}\mathrm{argmin}_{ t}\sum _{j}w_{j}\vert y^{\prime}_{j} - t\vert,}$$

where $c$ is a consistency factor that ensures consistent estimation of $\sigma$ in the case of Gaussian observations, i.e., $c = \mathrm{E}\,\mathrm{argmin}_{t}\sum_{j}w_{j}\big\vert \vert \tilde{y}_{j}\vert - t\big\vert$ for $\tilde{y}_{j}\stackrel{\mathrm{i.i.d.}}{\sim}\mathcal{N}(0,1)$. The factor $c$ can be obtained empirically by averaging over a sufficiently large number $M$ of simulated samples, e.g., $M = 10{,}000$, setting $c = \frac{1}{M}\sum_{k=1}^{M}c_{k}$, $c_{k} = \mathrm{argmin}_{t}\sum_{j}w_{j}\big\vert \vert y^{\prime\prime}_{j,k}\vert - t\big\vert$, $y^{\prime\prime}_{j,k}\stackrel{\mathrm{i.i.d.}}{\sim}\mathcal{N}(0,1)$.
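
The two definitions above translate directly into code. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, the weighted median is taken as the smallest minimizer of the weighted absolute deviation (any point between two tied minimizers would do equally well), and the consistency factor is approximated by plain Monte Carlo with a fixed seed.

```python
import numpy as np


def weighted_median(y, w):
    """Smallest minimizer f of sum_j w_j * |y_j - f| (weights w_j >= 0, not all zero)."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    order = np.argsort(y)
    y_sorted, w_sorted = y[order], w[order]
    cum = np.cumsum(w_sorted)
    # First sorted observation whose cumulative weight reaches half the total weight.
    idx = np.searchsorted(cum, 0.5 * cum[-1])
    return float(y_sorted[idx])


def consistency_factor(w, n_rep=10_000, seed=0):
    """Monte Carlo estimate of c: mean weighted median of |Z_j|, Z_j i.i.d. N(0, 1)."""
    w = np.asarray(w, dtype=float)
    rng = np.random.default_rng(seed)
    draws = np.abs(rng.standard_normal((n_rep, w.size)))
    return float(np.mean([weighted_median(row, w) for row in draws]))


def scaled_weighted_mad(y, w, c):
    """Scaled weighted MAD: weighted median of |y_j - m|, divided by c."""
    y = np.asarray(y, dtype=float)
    m = weighted_median(y, w)
    return weighted_median(np.abs(y - m), w) / c


if __name__ == "__main__":
    # Illustrative data only: ten observations, one gross outlier, unequal weights.
    w = np.array([0.05] * 5 + [0.1] * 3 + [0.2, 0.25])
    y = np.array([0.1, -0.3, 0.2, 0.0, 50.0, -0.1, 0.4, -0.2, 0.3, 0.05])
    c = consistency_factor(w)
    print(weighted_median(y, w), scaled_weighted_mad(y, w, c))
```

Because $c$ depends only on the weights, it can be computed once per weight vector and reused for every new batch of observations.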

As to the finite sample breakdown point (FSBP) of the weighted median (and, at the same time, of the scaled weighted MAD), we define the normalized weights $w_{j}^{0} = w_{i,j}/\sum_{j^{\prime}}w_{i,j^{\prime}}$ and, for each $i$, the ordered weights $w_{(j)}^{0}$ such that $w_{(1)}^{0} \geq w_{(2)}^{0} \geq \ldots \geq w_{(k)}^{0}$. Then the FSBP in both cases is ${k}^{-1}\min \{j_{0} = 1,\ldots,k\,\mid \,\sum_{j=1}^{j_{0}}w_{(j)}^{0} \geq 1/2\}$, which (for equivariant estimators) can be shown to be the largest possible value. So using weighted medians and MADs, we achieve a decent degree of robustness against outliers. For example, assume we have 10 observations with weights 5 × 0.05, 3 × 0.1, 0.2, and 0.25. Then we need at least three outliers (placed at the observations with weights 0.1, 0.2, and 0.25, respectively) to produce a breakdown, since 0.25 + 0.2 + 0.1 = 0.55 ≥ 1/2 while 0.25 + 0.2 = 0.45 < 1/2.
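
The breakdown count in the ten-observation example can be checked numerically. The sketch below assumes that the threshold on the cumulated normalized weights is one half, which is the reading consistent with that example. The function name is hypothetical, and a small tolerance guards against floating-point rounding at the threshold.

```python
import numpy as np


def fsbp_weighted_median(w):
    """Return (FSBP, j0): j0 is the smallest number of largest normalized
    weights whose sum reaches 1/2, and FSBP = j0 / k for k observations."""
    w = np.asarray(w, dtype=float)
    w0 = np.sort(w / w.sum())[::-1]          # normalized weights, largest first
    cum = np.cumsum(w0)
    j0 = int(np.argmax(cum >= 0.5 - 1e-12)) + 1
    return j0 / w.size, j0


# Weights from the example: 5 x 0.05, 3 x 0.1, 0.2, 0.25.
weights = [0.05] * 5 + [0.1] * 3 + [0.2, 0.25]
bp, n_outliers = fsbp_weighted_median(weights)
print(bp, n_outliers)
```

With equal weights the same function returns the familiar breakdown point of the ordinary median, so unequal weights can only lower the number of outliers needed.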

1.1.2 Proof of Theorem 1.2

Proof.

Let us solve $\max_{\partial \mathcal{U}}\min_{f}[\ldots ]$ first, which amounts to $\min_{\partial \mathcal{U}}\mathrm{E}_{\mathrm{re}}\big[\vert \mathrm{E}_{\mathrm{re}}[Y^{\mathrm{id}}\vert Y^{\mathrm{re}}]\vert^{2}\big]$. For a fixed element $P^{Y^{\mathrm{di}}}$, assume a dominating σ-finite measure μ, i.e., $\mu \gg P^{Y^{\mathrm{di}}}$, $\mu \gg P^{Y^{\mathrm{id}}}$; this gives us a μ-density $q(y)$ of $P^{Y^{\mathrm{di}}}$. Determining the joint (real) law $P^{Y^{\mathrm{id}},Y^{\mathrm{re}}}(d\tilde{y},dy)$ as

$$\displaystyle{ P({Y }^{\mathrm{id} } \in A,{Y }^{\mathrm{re} } \in B) =\int I_{A}(\tilde{y})I_{B}(y)[(1 - r)I(\tilde{y} = y) + rq(y)]\,{p}^{{Y }^{\mathrm{id}} }(\tilde{y})\,\mu (d\tilde{y})\mu (dy)\,, }$$

(1.47)

we deduce that μ(dy)-a.e.

$$\displaystyle{ \mbox{ E}_{\mathrm{re}}[{Y }^{\mathrm{id} }\vert {Y }^{\mathrm{re} } = y] = \frac{rq(y)\mbox{ E}{Y }^{\mathrm{id}} + (1 - r)y{p}^{{Y }^{\mathrm{id}} }(y)} {rq(y) + (1 - r){p}^{{Y }^{\mathrm{id}} }(y)} =: \frac{a_{1}q(y) + a_{2}(y)} {a_{3}q(y) + a_{4}(y)}\,. }$$

(1.48)

Hence we have to minimize

$$\displaystyle{F(q):=\int \frac{\vert a_{1}q(y) + a_{2}(y){\vert }^{2}} {a_{3}q(y) + a_{4}(y)} \,\,\mu (dy)}$$

in $M_{0} =\{ q \in L_{1}(\mu )\,\vert \;q \geq 0,\;\int q\,d\mu = 1\}$. To this end, we note that $F$ is convex on the non-void, convex cone $M =\{ q \in L_{1}(\mu )\,\vert \;q \geq 0\}$, so we may consider the Lagrangian

$$\displaystyle{L_{\tilde{\rho }}(q):= F(q) +\tilde{\rho }\int q\,d\mu }$$

for some positive Lagrange multiplier $\tilde{\rho }$. Pointwise minimization in $y$ of $L_{\tilde{\rho }}(q)$ gives

$$\displaystyle{q_{s}(y) = \frac{1 - r} {r} \big(\vert D(y)\vert /s - 1\big)_{+}\,{p}^{Y^{\mathrm{id}}}(y)}$$

for some constant $s = s(\tilde{\rho }) = {(\,\vert \mathrm{E}\,Y^{\mathrm{id}}\vert^{2} +\tilde{\rho }/r)}^{1/2}$. Pointwise in $y$, $q_{s}$ is antitone and continuous in $s \geq 0$, with $\lim_{s\rightarrow 0}q_{s}(y) = \infty$ and $\lim_{s\rightarrow \infty }q_{s}(y) = 0$; hence, by monotone convergence,

$$\displaystyle{H(s) =\int q_{s}(y)\,\mu (dy)}$$

too, is antitone and continuous, with $\lim_{s\rightarrow 0}H(s) = \infty$ and $\lim_{s\rightarrow \infty }H(s) = 0$. So by continuity, there is some $\rho \in (0,\infty )$ with $H(\rho ) = 1$. On $M_{0}$, $\int q\,d\mu = 1$, so $L_{\tilde{\rho }}$ and $F$ differ only by the constant $\tilde{\rho }$ there; since $\hat{q}_{\rho } = q_{s=\rho } \in M_{0}$ is optimal on $M \supset M_{0}$, it also minimizes $F$ on $M_{0}$. In particular, we get representation (1.26) and note that, independently of the choice of μ, the least favorable $P_{0}^{Y^{\mathrm{di}}}$ is dominated according to $P_{0}^{Y^{\mathrm{di}}} \ll P^{Y^{\mathrm{id}}}$, i.e., non-dominated $P^{Y^{\mathrm{di}}}$ are even easier to deal with.
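
The existence argument for ρ (antitone, continuous H with limits ∞ and 0) is exactly what makes a bisection search work. The sketch below illustrates this in a toy setting that is an assumption, not part of the proof: a scalar ideal observation $Y^{\mathrm{id}} \sim \mathcal{N}(0,1)$ and $D(y) = y$ (the chapter defines $D$ in the main text, which is not reproduced here). Under these assumptions $H(s) = \frac{1-r}{r}\,\mathrm{E}(\vert Z\vert /s - 1)_{+}$ has a closed form in terms of the standard normal density and tail. The function names are hypothetical.

```python
import math


def H(s, r):
    """H(s) = (1 - r) / r * E(|Z| / s - 1)_+ for Z ~ N(0, 1), s > 0, 0 < r < 1.

    Uses E(|Z| - s)_+ = 2 * (phi(s) - s * (1 - Phi(s))).
    """
    phi = math.exp(-0.5 * s * s) / math.sqrt(2.0 * math.pi)
    upper_tail = 0.5 * math.erfc(s / math.sqrt(2.0))   # 1 - Phi(s)
    expected_excess = 2.0 * (phi - s * upper_tail)
    return (1.0 - r) / r * expected_excess / s


def solve_rho(r, lo=1e-8, hi=20.0, tol=1e-12):
    """Bisection for H(rho) = 1; relies on H being decreasing in s."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if H(mid, r) > 1.0:
            lo = mid      # H still too large, so the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


for r in (0.05, 0.1, 0.2):
    print(r, solve_rho(r))
```

In this toy setting the computed clipping height decreases as the contamination radius r grows, in line with the antitonicity of ρ(r) used later in the proof.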

As next step we show that

$$\displaystyle{ \max \nolimits _{\partial \mathcal{U}}\min \nolimits _{f}[\ldots ] =\min \nolimits _{f}\max \nolimits _{\partial \mathcal{U}}[\ldots ] }$$

(1.49)

To this end we first verify (1.25), determining $f_{0}(y)$ as $f_{0}(y) = \mathrm{E}_{\mathrm{re};\hat{P}}[X\vert Y^{\mathrm{re}} = y]$. Writing a sub/superscript “re; P” for evaluation under the situation generated by $P = P^{Y^{\mathrm{di}}}$ and $\hat{P}$ for $P_{0}^{Y^{\mathrm{di}}}$, we obtain the risk for general $P$ as

$$\displaystyle\begin{array}{rcl} \mathrm{MSE}_{\mathrm{re;}\,P}[f_{0}({Y }^{\mathrm{re;}\,P })]& =& (1 - r)\,\mathrm{E}_{\mathrm{id}}\vert {Y }^{\mathrm{id}} - f_{0}({Y }^{\mathrm{id}}){\vert }^{2} + r\,\mathrm{tr}\,\mathrm{Cov}\,{Y }^{\mathrm{id}} \\ & & {} + r\,\mathrm{E}_{P}\min (\vert D({Y }^{\mathrm{di;},q}){\vert }^{2},{\rho }^{2})\,. \end{array}$$

(1.50)

This is maximal for any P that is concentrated on the set $\big\{\,\vert D({Y }^{\mathrm{di;},q})\vert >\rho \,\big\}$, which is true for $\hat{P}$. Hence (1.49) follows, as for any contaminating P

$$\displaystyle{\mathrm{MSE}_{\mathrm{re;}\,P}[f_{0}({Y }^{\mathrm{re;} \,P })] \leq \mathrm{MSE}_{\mathrm{re;}\,\hat{P}}[f_{0}({Y }^{\mathrm{re;} \,\hat{P} })]\,.}$$

Finally, we pass over from $\partial \mathcal{U}$ to $\mathcal{U}$: Let $f_{r}$, $\hat{P}_{r}$ denote the components of the saddle-point for $\partial \mathcal{U}(r)$, $\rho (r)$ the corresponding Lagrange multiplier, and $w_{r}$ the corresponding weight, i.e., $w_{r} = w_{r}(y) =\min (1,\rho (r)\,/\,\vert D(y)\vert )$. Let $R(f,P,r)$ be the MSE of procedure $f$ at the SO model $\partial \mathcal{U}(r)$ with contaminating ${P}^{{Y }^{\mathrm{di}} } = P$. As can be seen from (1.26), $\rho (r)$ is antitone in $r$; in particular, as $\hat{P}_{r}$ is concentrated on $\{\vert D(Y)\vert \geq \rho (r)\}$, which for $r \leq s$ is a subset of $\{\vert D(Y)\vert \geq \rho (s)\}$, we obtain

$$\displaystyle{R(f_{s},\hat{P}_{s},s) = R(f_{s},\hat{P}_{r},s)\qquad \mbox{ for}\;r \leq s\,.}$$

Note that $R(f_{s},P,0) = R(f_{s},Q,0)$ for all $P$, $Q$ (hence passage to $\tilde{R}(f_{s},P,r) = R(f_{s},P,r) - R(f_{s},P,0)$ is helpful) and that

$$\displaystyle{ \mbox{ tr}\text{Cov}{Y }^{\mathrm{id} } = \mbox{ E}_{\mathrm{id}}\Big[\mbox{ tr}\text{Cov}_{\mathrm{id}}[{Y }^{\mathrm{id} }\vert {Y }^{\mathrm{id} }] + \vert D({Y }^{\mathrm{id} }){\vert }^{2}\Big]\,. }$$

(1.51)

Abbreviate $\bar{w}_{s}({Y }^{\mathrm{id}}) = 1 - {(1 - w_{ s}({Y }^{\mathrm{id}}))}^{2} \geq 0$ to see that

$$\displaystyle\begin{array}{rcl} & & \tilde{R}(f_{s},P,r) = r\Big\{\mbox{ E}_{\mathrm{id}}\Big[\vert D({Y }^{\mathrm{id} }){\vert }^{2}\bar{w}_{ s}({Y }^{\mathrm{id} })\Big] + \mbox{ E}_{P}\min {(\vert D({Y }^{\mathrm{id} })\vert,\rho (s))}^{2}\,\Big\} \leq {}\\ & &\leq r\Big\{\mbox{ E}_{\mathrm{id}}\Big[\vert D({Y }^{\mathrm{id} }){\vert }^{2}\bar{w}_{ s}({Y }^{\mathrm{id} })\Big] +\rho {(s)}^{2}\,\Big\} =\tilde{ R}(f_{ s},\hat{P}_{r},r) <\tilde{ R}(f_{s},\hat{P}_{s},s)\,. {}\\ \end{array}$$

Hence the saddle-point extends to $\mathcal{U}(r)$; in particular the maximal risk is never attained in the interior $\mathcal{U}(r)\setminus \partial \mathcal{U}(r)$. (1.28) follows by plugging in the results. □
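
The weight $w_{r}(y) =\min (1,\rho (r)/\vert D(y)\vert )$ used in the last step of the proof is a simple clipping rule, sketched below. The function name is hypothetical, the input is taken to be the already computed norms $\vert D(y)\vert$, and the clipping height ρ is treated as given rather than solved for.

```python
import numpy as np


def clipping_weights(d_norm, rho):
    """w = min(1, rho / |D|) elementwise, with w = 1 wherever |D| <= rho.

    d_norm : array-like of non-negative norms |D(y)|
    rho    : positive clipping height
    """
    d_norm = np.atleast_1d(np.asarray(d_norm, dtype=float))
    w = np.ones_like(d_norm)
    large = d_norm > rho                 # only these entries are downweighted
    w[large] = rho / d_norm[large]
    return w


d = np.array([0.0, 0.5, 1.0, 2.0, 4.0])
w = clipping_weights(d, rho=1.0)
print(w, w * d)
```

By construction the weighted norm $w\,\vert D\vert$ never exceeds ρ, which is what bounds the contribution of any single contaminated observation.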

About this chapter

Cite this chapter

Erlwein-Sayer, C., Ruckdeschel, P. (2014). Robustification of an On-line EM Algorithm for Modelling Asset Prices Within an HMM. In: Mamon, R., Elliott, R. (eds) Hidden Markov Models in Finance. International Series in Operations Research & Management Science, vol 209. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-7442-6_1

DOI: https://doi.org/10.1007/978-1-4899-7442-6_1
Published: 03 April 2014
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4899-7441-9
Online ISBN: 978-1-4899-7442-6
eBook Packages: Business and Economics, Business and Management (R0)
