A geometric treatment of time-varying volatilities

In this article, we propose a new framework for addressing multivariate time-varying volatilities. By employing methods of differential geometry, our model respects the geometric structure of the covariance space, i.e., symmetry and positive definiteness, in a way that is independent of any local coordinate parametrization. Its parsimonious specification makes it particularly suitable for large dimensional systems. Simulation studies suggest that our model embraces much of the nonlinear behaviour of the covariance dynamics. Applied to the US and the UK stock markets, the model performs well, especially when applied to risk measurement. In a broad context, our framework presents a new approach treating nonlinear properties observed in the financial market, and numerous areas of application can be further considered.


Introduction
Since the introduction of the ARCH model by Engle (1982), time-varying volatility models have played an important role in finance and have been successfully applied to various financial problems. From a portfolio viewpoint, it is natural to extend the GARCH-type models to a multivariate system, as first done by Bollerslev et al. (1988). Since their work, various types of multivariate time-varying volatility models have flourished. Some of the notable examples are the BEKK model of Engle and Kroner (1995), the DCC model of Engle (2002), and the matrix exponential GARCH model of Kawakatsu (2006). For more comprehensive references, we refer the reader to the survey papers by Bauwens et al. (2006) and Silvennoinen and Teräsvirta (2009).
The areas of application of time-varying volatility models are extensive; potentially all the areas where covariance dynamics comes into play, such as asset pricing and portfolio optimization, can be considered. It was first applied to asset pricing by Bollerslev et al. (1988), and later by Santis and Gerard (1998) and Hafner and Herwartz (1998). Cross relations in volatilities of several markets were studied by Kearney and Patton (2000) and Karolyi (1995), and Lien and Tse (2002) computed time-varying hedge ratios using a multivariate GARCH model. Wang and Chen (2007) explore the dynamics between the spot market and its derivative markets using a multivariate GARCH-M model.
Despite the wide areas of application, development of multivariate GARCH models has been neither as active nor as rapid as univariate GARCH models. Perhaps the most obvious reason for this is the proverbial ''curse of dimensionality'': because there are nðn þ 1Þ=2 elements in the covariance matrix of an n-variable system and the number of parameters to be estimated, in the most general case, will be of Oðn 4 Þ, the computational burden rapidly becomes untenable for even moderately large n.
Another important-and more subtle-difficulty in designing a time series model of a covariance matrix is the need to preserve its geometric structure, i.e., positive definiteness and symmetry. While symmetry can be easily guaranteed, ensuring that a covariance matrix remains positive definite in a mathematically consistent way (we illuminate on this point further below) is not necessarily straightforward. Earlier treatments have often been ad hoc and not taken the underlying geometric structure of the positive definite matrices into account. For example, the vech()-type model proposed by Bollerslev et al. (1988) takes the crude and obvious approach of transforming the covariance matrix into a column vector, and specifying the vector dynamics accordingly; in this case positive definiteness is ensured only after imposing the relevant algebraic constraints on the vector parameters. Some other recent models address the positive definiteness issue by imposing other sets of assumptions: see, for example, Tse and Tsui (2002) and Engle (2002). The well-known BEKK model of Engle and Kroner (1995) and its variants ensure positive definiteness by assuming a quadratic form for each term in the dynamics equation.
More recently, in work that is particularly relevant to ours, Kawakatsu (2006) extends the exponential GARCH model of Nelson (1991) to a multivariate system, and assumes a vech()-type time series model for the log of the covariance matrix. Since the matrix exponential of a symmetric matrix is always symmetric and positive definite, the geometric structure of the covariance matrix is preserved in this model without additional constraints. The exponential map, it turns out, plays a particularly important role in the geometric characterization of matrix groups, and the work of Kawakatsu (2006) is a promising first step that implies the important fundamental connections between geometry and the positive definite matrices.
Why is taking geometry into account important? This point can perhaps be best illustrated via an example of a point moving on the surface of a sphere. In principle, one could describe the motion of the point using standard three-dimensional Cartesian coordinates x 2 R, with further imposition of the constraint that length of x be one, i.e., kxk ¼ 1. Imposing the constraint is cumbersome, and it is more convenient to use a set of local twodimensional coordinates for the sphere; spherical coordinates immediately come to mind. However, there are many alternative choices of two-dimensional coordinates that parametrize the sphere, e.g., stereographic projection. In principle, there should be no reason to prefer one choice of local coordinates over another, as long as essential geometric quantities like length of the curve, and the area of a patch, are preserved. In other words, it is important to distinguish geometric properties from those that are coordinate-dependent. Any meaningful analysis must take into account the geometric properties of the underlying space, and preserve them in a coordinate-invariant way.
Likewise, the space of covariance matrices is not a vector space, but a curved space. More precisely, it has the structure of a differentiable manifold that we denote PðnÞ. PðnÞ is a differentiable Riemannian manifold and this fact motivates us to specify the dynamics of the covariance matrix using an appropriate set of local coordinates, and also taking care to ensure that the geometric structure of PðnÞ is always respected. In fact, there is welldeveloped machinery to specify continuous-time dynamics on differentiable manifolds, and we adopt this as the starting point for developing a general multivariate volatility model.
Beyond the aesthetic appeal of describing the dynamics on a curved space in an intrinsic fashion, however, the most critical reason, that also has a profound effect on computational results, is that ignoring the geometric structure more often than not will lead to arbitrary results without physical meaning. To illustrate this with an example pointed out by Fletcher and Joshi (2004), one can define a linear average of a collection of positive definite symmetric matrices by taking their sum and dividing by the number of elements. This procedure ensures that the resulting average is also positive-definite. However, such linear averages fail to interpolate natural properties. The linear average of matrices of the same determinant, for example, can result in a matrix with a larger determinant. This issue is even worse for second-order statistics. The standard principal component analysis as formulated on vector spaces is no longer valid, since the ''straight lines'' corresponding to the eigenmodes of variation do not remain on the space P(n)-standard vector space principal component analysis fails to preserve positive-definiteness.
The aforementioned limitations of the traditional approaches can be overcome using methods of differential geometry. With a proper Riemannian metric, a ''straight line'' (geodesic) connecting two covariance matrices that remains on PðnÞ can be defined, and the distance can be measured. The notion of geodesic can then be utilized to define statistics on PðnÞ, such as mean and covariance. A version of principal component analysis for covariance matrices can also be developed. By employing these tools from differential geometry, the objective of the paper is to propose a new framework for modelling covariance dynamics from a differential geometry perspective, and develop multivariate volatility models under the framework. Our models preserve symmetry and positive definiteness of the covariance matrix as it evolves, without imposing any ad hoc restrictions. We demonstrate benefits of the framework via simulation studies and compare our models with existing ones in empirical studies. Both simulation and empirical studies suggest that our models perform well and possess properties desired for covariance dynamics.
The remainder of the paper is organized as follows. Section 2 reviews the geometry of P(n), including its Riemannian structure, evaluation of geodesics in a given direction and also between two arbitrary points, and its natural extension to the computation of intrinsic sample mean and covariance on P(n). Section 3 constructs differential equations on P(n), with a focus on linear and quadratic equations-these are chosen for their ability to model diverse correlation phenomena observed in various multivariate time series data, without sacrificing analytic and computational tractability. From the differential equations, we develop a geometrically well-defined multivariate volatility model. We also propose a variation of the model by employing the principal component analysis. In Sect. 4, the characteristics of our model are illustrated via simulation analysis and its performance is assessed through empirical studies. We conclude in Sect. 5 with a discussion of open problems and future extensions of this geometric approach to time-varying volatility.

Geometry of P(n)
In this section, geometric properties of the covariance space are briefly introduced. More complete description can be found in Fletcher et al. (2003), Fletcher and Joshi (2004), Moakher (2005), Lenglet et al. (2006).
The covariance space P(n) is defined as P(n) is a differentiable manifold whose tangent space at a point P 2 PðnÞ can be identified with n Â n symmetric matrices, SðnÞ. A basis for SðnÞ can be constructed in the usual way, i.e., the basis element E ij 2 SðnÞ, where i ! j, is a symmetric matrix whose ij and ji elements are one, and the remaining elements zero. A Riemannian structure can be constructed via the Riemannian metric given by hX; Yi P ¼ trðP À1 XP À1 YÞ, X; Y 2 SðnÞ. In terms of this metric, the length of a curve PðtÞ 2 PðnÞ, a t b, is given by This notion of length is invariant not only under reparametrization of [a, b], but also under congruent transformations of the form MPM > , where M is any fixed element in the general linear group, GL(n). Using the fact that P(n) is a complete space (i.e., the geodesics are well-defined for all t), the minimal geodesic cðtÞ : ½0; 1 ! ½A; B connecting two points A; B 2 PðnÞ is given by where GG > ¼ A; G 2 GL þ ðnÞ, the identity component of GL(n), i.e., a subgroup of GL(n) with positive determinants. The tangent vector of the geodesic at A is defined by the Riemannian log map The inverse of the Riemannian log map, the Riemannian exponential map is also defined. Given an element X 2 SðnÞ, the minimal geodesic emanating from some A 2 PðnÞ in the direction of the tangent vector X can be computed as follows: Defining the distance between A and B in the usual way by the length of the above minimal geodesic, we have where jj Á jj is the Frobenius norm, and k 1 ; . . .; k n are the eigenvalues of the matrix AB À1 . Since AB À1 is symmetric positive-definite, the eigenvalues of AB À1 are all positive, and log k i is well defined for each i. Note also that dðA; cðtÞÞ ¼ t dðA; BÞ.
With the above metric structure on P(n), we now discuss sample means and covariances on P(n). A widely used formula for the sample mean of N symmetric positive-definite matrices fP 1 ; . . .; P N g is the arithmetic mean 1 N P N i¼1 P i . While the arithmetic mean clearly lies in P(n), it has a number of undesirable properties: e.g., the arithmetic mean of matrices with an equal determinant can have a larger determinant. We therefore focus on the intrinsic mean, defined as arg min P2PðnÞ For the case of two points, the intrinsic mean is simply the midpoint of the minimal geodesic. For arbitrary N, the above intrinsic mean is unique on P(n). Fletcher and Joshi (2004) provides a simple steepest descent algorithm for numerically obtaining the intrinsic mean, using the gradient of (7) given by P N i¼1 logð PP À1 i Þ. An intrinsic notion of sample covariance on P(n) can also be defined via the Riemannian structure of P(n). Given N elements, P 1 ; . . .; P N , and the intrinsic mean P, the covariance matrix relative to P is defined by where X i is the tangent vector at P of the geodesic connecting P and P i , i.e., Based on these mean and covariance formulas, a generalized normal distribution on P(n) can be constructed by taking the curvature into account; see Lenglet et al. (2006) for details.
The usual principal component analysis defined on vector spaces is no longer valid on P(n). Instead, principal geodesic analysis (PGA), a version of principal component analysis on P(n), can be defined by utilizing the notions of geodesic and intrinsic mean. PGA allows us to identify the principal axes of variation of covariance matrices, which then can be used to develop a parsimonious covariance dynamics model. Details of PGA is provided in ''Appendix''.

Geometrically well defined volatility models
Assume that an n-variable system, r t is governed by the following dynamics equation: where H t 2 PðnÞ is the covariance matrix of e t . We assume a constant mean and focus on the covariance dynamics; generalization to a time-varying mean model should be straightforward. The covariance matrix of the shock is assumed to have dynamics of the form where F t is a time-varying n Â n symmetric matrix which depends on the information set at t. As dH t is the differential of H t , it is defined in the tangent space SðnÞ, and it suffices for F t to be symmetric. P(n) is a Riemannian symmetric space that is geodesically complete, and as such the minimal geodesics provide a natural way of discretizing general differential equations on P(n). Using the Riemannian exponential map defined in (5), H t can be obtained by the formula Another class of dynamics we consider for H t is based on the assumption that H t is mapped by F t from a constant covariance matrix H 1 , i.e., In the event that H 1 is an identity matrix, this model nests the matrix exponential GARCH model by Kawakatsu (2006). In (11), F t is the tangent vector of the geodesic connecting H tÀ1 and H t , and it is the tangent vector of the geodesic connecting H 1 and H t in (12). Various specifications of F t can be considered. The remainder of the section is devoted to developing standard models for F t .

Geometric GARCH models
We assume that F t depends only on the current and past covariances and shocks. In this case, a natural extension of the univariate GARCH model would yield the equation We can also construct a quadratic form In Eqs. (13) and (14), A p ; S p and B q are n Â n matrices. In the event that B q are zero, we have a standard matrix Riccati equation in H, with well-known methods for obtaining solutions: The functional form of F t can be easily extended to accommodate phenomena observed in the market, e.g., asymmetric volatility effect as reported in Kroner and Ng (1998). Asymmetric effect of shocks on F t is modeled by augmenting Eq. (13) with terms of g t ¼ je t j À e t : F t could be more generally specified by transforming it to a vector as in Kawakatsu (2006), but the above specification is sufficient to capture the essential features of covariance dynamics. The proposed model is geometrically well-defined: the covariance matrix remains in PðnÞ as it evolves without further restrictions on the dynamics, as long as the differential dH t is an element of SðnÞ. We call this specification of time-varying volatilities Geometric GARCH, or simply GGARCH model.

PCA based specifications
Another formulation of F t we consider is based on the principal component analysis. The PCA can be applied in two ways depending on the class of the covariance dynamics. For the covariance dynamics defined in (11), the usual PCA can be applied to the tangent vectors of the geodesics connecting H tÀ1 and H t . In conjunction with the dynamics equation in (12), the PGA can be applied to the geodesics connecting H 1 , and H t . In either case, if the variation of the covariance matrix can be approximated by K\nðn þ 1Þ=2 principal directions, fV k g, F t can be written in the form where a kt are time-varying factor loadings. We specify a kt as a scalar function of the lagged terms of the covariance matrix and residuals, fH tÀp ; e tÀq g. In particular, the following functional form is considered.
where A p , B q , and C q are n Â n upper triangular matrices. Another method worth consideration is to use the distance function. For example, the following specification can be considered.
where a k;p , b k;q , and c k;q are nonnegative scalar coefficients. 1 By defining the factor loadings as the distance between the current covariance matrix and a function of the residuals, this specification has a better geometric interpretation. If K is sufficiently small, the PCA-based specifications ensure a parsimonious representation of the covariance dynamics.
Although the principal directions can, in principle, be obtained by applying PCA or PGA to a series of covariance matrices, covariance matrices are unobservable and need to be estimated. A good proxy would be a realized covariance obtained from high frequency data. If high frequency data is not available, sample covariance matrix calculated from a sub-sample could be used.

Parsimonious representations
The GGARCH models involve Oðn 2 Þ number of parameters in the most general case and unless some restrictions are imposed, the models will be numerically untenable for a large dimensional system. An obvious and popular practice is to restrict the coefficient matrices to diagonal matrices or scalars. Table 1 compares the number of parameters in each GGARCH specification and in some existing models. Note that, while BEKK and DCC models have the number of parameters of Oðn 2 Þ regardless of the degree of restriction, diagonal or more restricted versions of the GGARCH model have parameter numbers of O(n) or a constant. This is because the covariance dynamics is defined in the tangent space and there is no constant term in the dynamics equation. A constant term in the tangent space would imply that the covariance matrix tends to move toward a certain direction regardless of its current position, which is not intuitive. Empirical studies also support this: while a constant matrix plays a dominant role in BEKK and DCC models, it turns out to be insignificant in GGARCH models. This gives our model a huge benefit when applied to large dimensional systems.

Empirical studies
In this section, we illustrate the characteristics of the proposed models via simulations. They are then applied to a set of real market data and compared with widely used models, i.e., BEKK and DCC. Although the case studies introduced in this section are far from exhaustive, they convey some important lessons about our model and shed light on the future directions of research.

Properties of the covariance space
Riemannian exponential map serves as a building block of our covariance dynamics models and it is important to understand its properties in order to correctly specify the tangent vector F t . For illustration purposes, we consider a simple bivariate system; this facilitates visualization and is still sufficient to understand the basic properties.

The shortest path
Consider two covariance matrices be a covariance matrix on the geodesic connecting H 0 and H 1 . The geodesic is visualized in Fig. 1. In the figure, the first panel plots the trajectory of h 11 and h 12 , and the second panel plots the distance between HðtÞ and Hðt þ 0:1Þ for t ¼ 0; . . .; 0:9, with Hð0Þ ¼ H 0 and Hð1Þ ¼ H 1 . The solid lines represent the geodesic and the dotted lines represent the linear interpolation. Interestingly, the trajectory of the variance term is convex and that of the covariance term is slightly concave, contrary to the naive linear interpolation that yields straight lines for both variance and covariance terms. This is because, under the Riemmanian metric, the distance between two covariance matrices increases exponentially as one matrix approaches singularity, i.e., perfect correlation, whereas HðtÞ moves at a constant speed as shown in Fig. 1b: Convex h 11 and concave h 12 implies that the correlation increases more rapidly at the beginning and more slowly later compared to the case of linear interpolation. This results in the equal distance between two adjacent points along the geodesic, contrary to the increasing distance between two adjacent points along the linear interpolation line. This is a desired property of covariance dynamics since, for example, an increase in correlation from 0.0 to 0.5 must be more likely than an increase from 0.5 to 1.0. In fact, for H 1 ¼ h0:5 0:5h " # , the minimum distance between H 0 and H 1 is achieved when h is near 1.275, whereas it would be achieved when h ¼ 1 under a conventional metric such as Frobenius norm. This suggests that an increase in covariance is likely to involve an increase in variance more under the Riemannian metric compared to an Euclidean metric, which is consistent with the fact that the correlations are much more persistent than the variances. This is also supported by the empirical analysis demonstrated later in this section, where the first principal component of covariance matrix variation turns out to be simultaneous changes of the variance and covariance.

Riemmanian exponential map
We examine the movement of a covariance matrix when it is repeatedly subject to a tangent vector. This is to assess the effect of repeated shocks in one direction. Suppose the covariance matrix has the initial value F 1 implies a shock only on the first variable, while F 2 (F 3 ) implies shocks of an equal size in the same (opposite) direction. The results are displayed in Figs. 2, 3, and 4. In each figure, r 1 , r 2 , and q respectively denote the standard deviations of the two variables and their correlation coefficient. When H t is subject to F 1 , r 1 increases while q decreases. This is intuitive as independent shocks would imply reduced correlation. Note that r 1 increases at a reduced pace when it is high. This is desirable because when the variance is high, the same shock would have a smaller impact. As F 2 implies shocks in the same direction, the correlation increases when H t is subject to F 2 , and for the same reason, it decreases when  Fig. 4 Movement of H t subject to F 3 . r 1 , r 2 , and q respectively denote the SD of the two variables and their correlation coefficient H t is subject to F 3 . However, the correlation decreases more rapidly, as we would expect from the actual movements of covariance matrices.
While Riemmanian exponential map carries desired properties of covariance dynamics, the behavior of the covariance matrix is not entirely clear due to the nonlinearity of the matrix exponential. This makes specifying the tangent vector and calibrating the model tricky: e.g., one may want to make correlations more persistent than variances. Although the GGARCH model does not allow independent modelling of variances and correlations, one workaround is to incorporate the GGARCH model into the DCC framework, i.e., estimate variances using a univariate GARCH model and estimate correlations using the GGARCH model.
We have investigated various characteristics of the covariance space and related them to some of the features observed in the market. Based on the promising simulation results, we believe that, by correctly specifying the tangent vector, our model can be successfully employed in many financial application domains in which volatilities are best modeled as time-varying.

Global market correlation
In this section, we apply the GGARCH models to a set of real market data and compare them with the BEKK and DCC models. We choose two stock market indexes, S&P500 and FTSE100, both obtained from Thomson Reuters database. Daily index values are collected from October 1, 2003 to September 30, 2013, from which daily log returns are calculated. The sample mean and covariance matrix of the index returns are reported in Table 2. The two markets are highly correlated during the sample period, and have a similar level of mean and variance.

Test models
We consider six GGARCH specifications and compare them with two BEKK specifications (scalar and diagonal coefficient matrices) and a DCC specification. The most general form of the BEKK model was also considered but excluded as the estimation results were too sensitive to initial values and therefore unreliable. In all specifications, the covariance matrix is assumed to be a function of only the first lags of the covariance matrix and the residuals. As our primary interest lies in the covariance dynamics, we assume a constant mean return for simplicity. The six GGARCH specifications are described below. For the details of the BEKK and DCC models, please refer to Engle and Kroner (1995) and Engle (2002), respectively.
where A, B, and C are scalar.
where A, B, and C are symmetric, and is the element-wise matrix product.
where A, B, and C are arbitrary n Â n matrices.
• GGARCH PCA DIAG where A k , B k , and C k are diagonal matrices.
where A k , B k , and C k are upper triangular matrices.
All the models are estimated via Quasi-Maximum Likelihood Estimation (QMLE). 2 The PCA-based GGARCH models require a time series of covariance matrices in order to apply PCA prior to model estimation. We generate a covariance matrix time series by calculating sample covariance from the minimum size (two, in our case) subsample at each time t. We tested larger subsamples but the results were qualititively similar. PCA results are reported in Table 3. The first component explains almost 80% of the change and the first two components combined together explain more than 95% of the change. Based on this, we choose K ¼ 2. Brief inspection of the eigenvectors reveals an interesting observation: The first component is related to simultaneous changes of the variances and covariance, the second component is related to independent changes of the variances, and finally the third component is related to an independent change of the covariance. The first component implies that the principal variation of the covariance matrix is due to simultaneous changes of the variances while the correlation remaining still.

Estimation and diagnosis
Estimation and diagnosis results are reported in Tables 4, 5, 6 and 7. The models are ranked in each table based on their performance: ***, **, and * represent the best three performing models in descending order. Table 4 compares the log-likelihood value of each model. The gain in the log-likelihood value from increased model flexibility seems trivial. For example, the difference in the loglikelihood values of GGARCH SCALAR and GGARCH LINEAR is mere 0.3%, even though the former has only 3 parameters, whereas the latter has 9. This means that a model selection criteria such as Bayesian Information Criteria (BIC) would favor the scalar  version within each model family. The difference between model families is also small. Still, the DCC and BEKK models achieve higher log-likelihood values than the GGARCH models. Table 5 reports the results of the autocorrelation test proposed by Ledoit et al. (2003). If a model is correctly specified, normalized residuals should be serially uncorrelated. The test results are in favor of the DCC and BEKK models. The null hypothesis of zero autocorrelation cannot be rejected in these models. Among other models, only the GGARCH LINEAR passes the test at 5% level.
A minimum variance portfolio composed of the two market indexes is obtained from the covariance estimate of each model. The portfolio is rebalanced every 22 business days and held until the next rebalancing day. The results are reported in Table 6. It seems difficult to distinguish the models based on the portfolio optimization results. Portfolio composition is similar across the models and the standard deviation of the portfolio return is almost indistinguishable. This is mainly due to the high correlation and the similar sizes of the variances of the two market returns, as reported in Table 2. Nevertheless, subtle differences among the models can still be observed: The GGARCH family allocate consistently   The first row indicates portfolio composition between S&P500 and FTSE100, and the second row indicates probability level. The figures are probability of loss exceeding VaR lower weights on S&P500 compared to the BEKK and DCC models. Considering the fact that the volatility of S&P500 is higher overall, and the objective is to minimize variance, the allocations by the GGARCH family seem more plausible. Finally, we calculate Value-at-Risk (VaR) of three portfolios at three different probability levels. Three portfolios are constructed by mixing the two market indexes at 20: 80,50:50,and 80:20 ratios,and VaR is computed at 95,99,and 99.9% probability levels. Assuming normality of the returns, VaR at time t is given by where a is the probability level, z a is the z-value at a, and w is the portfolio weights. VaR is computed everyday and the probability of loss exceeding VaR is obtained by comparing the VaR with the actual loss of the portfolio over the sample period. The results are reported in Table 7. The values in the table are the probability of loss exceeding VaR. If VaR is correctly estimated, the probabilityt will converge to the probability level a. The models are ranked based on the overall performance across portfolios and probability levels. It turns out that all the models underestimate VaR, except the GGARCH PCA DIAG for 50:50 portfolio at 95% level. Nevertheless, the GGARCH family appear to estimate VaR more accurately, and PCA-based GGARCH models, in particular, perform best.
We have evaluated the proposed models using various diagnostic tools. The results are mixed and do not consistently support any particular model. This could be because the model specifications considered here are not correct, or more fundamentally, our geometric framework has limitations in modelling covariance dynamics. It will remain inconclusive until more exhaustive research is undertaken. Nevertheless, our models demonstrate clear benefits over the existing models: While the BEKK and DCC models perform better in the in-sample tests, GGARCH models are better performers in the out-of-sample tests, i.e., asset allocation and risk measurement, which are more important in practical applications. Remarkably, this is achieved with fewer model parameters.

Conclusion
In this paper, we have proposed a new framework for addressing covariance dynamics from a geometric perspective using differential geometry tools developed by Fletcher et al. (2003), Fletcher et al. (2003), and Moakher (2005) among others. Our framework preserves the geometric structure of the covariance matrix without any arbitrary restrictions by respecting the inherent geometric features of the covariance matrix. It also possesses the desired nonlinear nature of the covariance dynamics observed in the market. Based on our geometric framework, we derived two types of covariance dynamics models, a general geometric GARCH (GGARCH) model, and a PCA-based geometric GARCH (GGARCH PCA) model. The major benefits of our model is (1) it allows us to specify the dynamics in an intuitive manner; (2) the nonlinear nature of the covariance dynamics is naturally implied; (3) a parsimonious specification can be easily achieved. The last feature is particularly important as it makes the model suitable for large dimensional systems.
Simulation studies reveal potential benefits of our model by showing that it captures many well-known properties of the covariance dynamics. Empirical studies, however, show mixed results and do not fully support our model against the widely recognized BEKK and DCC models. Nevertheless, it is encouraging that our model outperforms these models when forecasting matters, as evidenced in the asset allocation and VaR estimation tests. Different specifications of the dynamics within our framework could improve the overall performance.
We have primarily focused on introducing a new framework and the underlying mathematical concepts from a theoretical perspective, and demonstrated its potential benefits through case studies. More comprehensive empirical studies and comparison analysis are in order: numerous areas of application can be further considered, e.g., credit risk modelling, asset pricing, and portfolio optimization among others. Particularly intriguing is the possibility of a conditional CAPM model within the context of our framework. Econometric theories in the covariance space, e.g., conditions for stationarity, have yet to be established. Extensions of the fundamental models proposed in this paper can also be considered: different specifications of the tangent vector is of particular interest. Finally, a multivariate stochastic volatility model can be specified in the same framework by adding a diffusion term to the continuous-time dynamics. We expect that the geometric framework proposed here can serve as a building block for developing various covariance dynamics models, and more broadly contribute to building a new paradigm for economic modelling.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Appendix: Principal geodesic analysis
Principal component analysis (PCA) finds a linear subspace in which the variability of the data is best described. Similarly, principal geodesic analysis (PGA) seeks a submanifold that best represents the variability of the data in a Riemannian manifold. Fletcher and his colleagues investigate the PGA on a Lie group (Fletcher et al. 2003) and also on the space PðnÞ (Fletcher and Joshi 2004). They first generalize three important concepts in PCA: variance, subspace, and projection, and develop an algorithm for PGA. For completeness, we briefly summarize the process; a more concrete derivation can be found in one of the references cited above.
First, the variance of a set of data on a Riemannian manifold is defined as the expected value of the squared Riemannian distance from the mean: Given a sample of N data, the sample variance is defined as Secondly, the manifold counterpart of the linear subspace in the Euclidean space is defined.
A submanifold H of a manifold M is said to be geodesic at P 2 H, if all geodesics of H passing through P are also geodesics of M. Since a submanifold geodesic at P preserves the distance from P, submanifolds geodesic at the mean can be regarded as the generalization of the linear subspaces.
We now define the projection operator to a manifold. Fletcher and Joshi (2004) Projection onto a geodesic submanifold at P can be approximated in the tangent space to the mean, T P M. If V 1 ; . . .; V K form an orthonormal basis for T P H, then the projection operator can be approximated by the formula With the concepts defined above, it can now be shown that PGA can be performed by applying PCA to the tangent space of the manifold, T l M. The algorithm for PGA on P(n) is summarized below.
• Given P 1 ; . . .; P N 2 PðnÞ, • Calculate the intrinsic mean of fP i g and denote it P. • Compute the tangent vectors at P of the geodesics connecting P and P i : • Apply PCA to fX i g, i.e., find eigenvectors and eigenvalues, fv k ; k k g of SðnÞ are the principal directions and k k are the corresponding variances.
Each observation P i can be reproduced by the formula