Robustness and sensitivity analyses of rough Volterra stochastic volatility models

In this paper, we analyze the robustness and sensitivity of various continuous-time rough Volterra stochastic volatility models in relation to the process of market calibration. Model robustness is examined from two perspectives: the sensitivity of option price estimates and the sensitivity of parameter estimates to changes in the option data structure. The following sensitivity analysis consists of statistical tests to determine whether a given studied model is sensitive to changes in the option data structure based on the distribution of parameter estimates. Empirical study is performed on a data set consisting of Apple Inc. equity options traded on four different days in April and May 2015. In particular, the results for RFSV, rBergomi and α\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\alpha $$\end{document}RFSV models are provided and compared to the results for Heston, Bates, and AFSVJD models.


Introduction
In the field of mathematical finance, the stochastic volatility (SV) models are widely used to analyze derivative securities such as options.The SV models do not only assume that the asset price follows a specific stochastic process but also that the instantaneous volatility of asset returns is of random nature.
The origin of these models goes back to the paper by Hull and White (1987) however the SV models became particularly popular thanks to the model by Heston (1993), in which the volatility is modeled by the mean-reverting square root process.This model became popular among both practitioners and academics.
Although many other SV models have been proposed since then, it seems that none of them can be considered to be the universally best market practice approach.Some models may perform well when calibrated to real market data with complex volatility surfaces but at the same time, they can suffer from over-fitting or they might not be robust to the changes in the option data structure as it is described by Pospíšil, Sobotka, and Ziegler (2019).Moreover, a model with a good fit to an implied volatility surface might not be in-line with the observed properties of the corresponding realized volatility time series.
One severe limitation of the classical SV models might be for example the independence of increments of the driving Brownian motion.This motivated Comte and Renault (1998), Comte, Coutin, and Renault (2012), and independently for example Alòs, León, and Vives (2007) to consider the fractional Brownian motion (fBm) as the driving process since the fBm is a generalization of the Brownian motion which allows correlation of increments dependent on the so-called Hurst index H ∈ (0, 1).For H > 1/2, the increments are positively correlated and we say that the process has long memory.For H < 1/2, the increments are negatively correlated and we speak about short memory or more recently about "rough regime".Gatheral, Jaisson, and Rosenbaum (2018) showed empirically that H < 1/2 by estimating it from the realized volatility time series of major stock indexes and argues that the rough fractional stochastic volatility (RFSV) model is more consistent with the reality.
In this paper, we consider the αRFSV model recently introduced by Merino, Pospíšil, Sobotka, Sottinen, and Vives (2021).This model unifies and generalizes the RFSV model (α = 1) and the rBergomi model (α = 0).For pricing of a European call, we employ Monte-Carlo (MC) simulations using the Cholesky method equipped with the control variate variance reduction technique as it is suggested by Matas and Pospíšil (2021).
We then calibrate the model to a real market dataset and analyze its robustness to the changes in option data structure (options of different combinations of strikes and expiration dates may be available for trading on different days) using the methodology proposed by Pospíšil, Sobotka, and Ziegler (2019) which is based on data bootstrapping.In this paper, the authors showed that pricing using the classical SV models such as Heston (1993) and Bates (1996) models is highly sensitive to changes in option data structure.More robust results were obtained for the longmemory approximative fractional SV model, but not for all considered datasets.Then, a natural question arises: can the RFSV models perform better?Apparently, the answer is yes as we show in this paper.Since the RFSV models belong to a wider class of the rough Volterra processes, the presented methodology is applicable to this wider class as well.
The structure of the paper is the following.In Section 2, we introduce the pertinent rough Volterra stochastic volatility models.In Section 3, we describe the methodology, in particular the calibration of considered models to real market data.We describe the methodology of option data bootstrapping, as well as the details of the robustness and sensitivity analyses.In Section 4, we summarize the obtained calibration results by comparing all the models in terms of variation in model parameters and in bootstrapped option model prices.We also test the roughness parameter and the parameter α for significance.Then, we provide the results of the sensitivity analysis fulfilled by a Monte Carlo filtering technique, testing whether a given studied model is sensitive to the changes in the option data structure when being calibrated.We conclude all the obtained results in Section 5.

Volterra volatility process
Let W = (W t , t ≥ 0) be a standard Wiener process defined on a probability space (Ω, F , Q) and let F W = (F W t , t ≥ 0) be the filtration generated by W .We consider a general Volterra volatility process defined as where where K(t, s) is a kernel such that for all t > 0 and By r(t, s) we denote the autocovariance function of Y t and by r(t) the variance In particular we will model volatility as the exponential Volterra volatility process where (Y t , t ≥ 0) is the Gaussian Volterra process (2) satisfying assumptions (A1) and (A2), r(t) is its autocovariance function (3), and σ 0 > 0, ξ > 0 and α ∈ [0, 1] are model parameters.
A very important example of Gaussian Volterra processes is the standard fractional Brownian motion (fBm) B H t (the exponent H has the meaning of index, not power) where K(t, s) is a kernel that depends also on the Hurst parameter H ∈ (0, 1).Recall that the autocovariance function of B H t is given by and in particular r(t) := r(t, t) = t 2H , t ≥ 0. Nowadays, the most precise Volterra representation of fBm is the one by Molchan and Golosov (1969) where .
To understand the connection between Molchan-Golosov and other representations of fBm such as the original Mandelbrot and Van Ness (1968) representation, we refer readers to the paper by Jost (2008).
There are various methods to simulate the fractional Brownian motion numerically.We often divide these methods into two classes: exact methods and approximate methods (Dieker 2002).We focus on more accurate exact methods that usually exploit the covariance function (6) of the fBm to simulate exactly the fBm (the output of the method is a sampled realization of the fBm) without the necessity to treat the complicated Volterra kernel.In particular, the Cholesky method use a covariance matrix to generate the fBm from two independent normal samples.Despite its higher computational complexity, this method has already proved to be the most suitable for simulation of the volatility models (Matas and Pospíšil 2021).

Rough Volterra volatility models
Let S = (S t , t ∈ [0, T ]) be a strictly positive asset price process under a market chosen risk neutral probability measure Q that follows the stochastic dynamics: where S 0 is the current spot price, r ≥ 0 is the all-in interest rate, W t and W t are independent standard Wiener processes defined on a probability space (Ω, F , Q) and ρ ∈ [−1, 1] represents the correlation between W t and W t .Let F W and F W be the filtrations generated by W and W respectively and let F := F W ∨F W .The stochastic volatility process σ t is a square-integrable Volterra process assumed to be adapted to the filtration generated by W and its trajectories are assumed to be a.s.càdlàg and strictly positive a.e.exponential Volterra volatility process satisfies these properties).
For convenience we let X t = ln S t , t ∈ [0, T ], and consider the model Recall that Z := ρW + 1 − ρ 2 W is a standard Wiener process.
In this paper we will study the αRFSV model firstly introduced by Merino, Pospíšil, Sobotka, Sottinen, and Vives (2021).In this model, the volatility is modelled as the exponential Volterra process with fBm, i.e.
where σ 0 > 0, ξ > 0 and α ∈ [0, 1] are model parameters together with H < 1/2 that guarantees the rough regime.For α = 0 we get the RFSV model (Gatheral, Jaisson, and Rosenbaum 2018), for α = 1 the rBergomi model (Bayer, Friz, and Gatheral 2016).While both cases of the αRFSV model are more likely to replicate the stylized facts of volatility (Gatheral, Jaisson, and Rosenbaum 2018) even by using relatively small number of parameters (σ 0 , ξ, ρ, H), the issue is the non-markovianity of the model.Because of this, we cannot derive any semi-closed form solution using the standard Itô calculus nor the Heston's framework.Therefore, to price even vanilla options, we have to rely on Monte-Carlo (MC) simulations.For these purposes, a modified Cholesky method will be used together with the control variate variance reduction technique as it was described by Matas and Pospíšil (2021).
We close this section by mentioning, that there exists yet another pricing approach that takes advantages of the so-called approximation formula derived by Merino, Pospíšil, Sobotka, Sottinen, and Vives (2021).This formula can be used either as a standalone fast approximation or together with the MC simulations to speed up the calibration tasks.However, in this paper, we will focus on robustness and sensitivity analyses based on pricing approaches that are as accurate as possible and this can be achieved currently only by the MC simulations that use an exact simulation technique for the fBm.

Methodology
In this section, we describe the methodology of calibration of the rough Volterra models to real market data and we focus on the robustness and sensitivity analysis.

Calibration to market data
Model calibration constitutes a way to estimate model parameters from available market data.The alternative approach suggests estimating the parameters directly from time series data such as for example Gatheral, Jaisson, and Rosenbaum (2018) did for the Hurst parameter.We understand model calibration as the problem of estimating the model parameters by fitting the model to market data with pre-agreed accuracy.
Mathematically, we express the calibration problem as an optimization problem where is the observed market price of the ith option, i = 1, . . ., N , with time to maturity T i and of strike price K i , while w i is a weight and C Θ (T i , K i ) denotes the option price computed under the model with a vector of parameters Θ.For αRFSV, we have Θ = [σ 0 , ρ, H, ξ, α].
In fact, the representation of the calibration problem in ( 12) is a non-linear weighted least squares problem.To obtain a reasonable output, we have to assume that the market prices are correct, i.e., there is no inefficiency in the prices, which is usually not the case, especially for options being further ITM or OTM.To fix this, let us assume that the more an option is traded, the more accurate the price is.We can then weight the importance of a given option in the least squares problem by the traded volume of the given option.However, there is also another, and in fact a more convenient and popular way to implement such weights.We can get the information of uncertainty about the price of an option from its bid-ask spread.The greater the bid-ask spread, the more uncertainty (and usually less trading volume) there is about the price.Therefore, we will use a function of bid and ask prices for the weights: Based on the empirical results (Mrázek, Pospíšil, and Sobotka 2016), we will consider only the case g(x) = 1/x 2 .
Because the objective function is non-linear, we cannot solve the problem analytically as in the case of standard linear regression.Hence, we revert to iterative numerical optimizers.
For the minimization of ( 12), we use the MATLAB function lsqnonlin() that implements an interior trust region algorithm, described by Coleman and Li (1996).The algorithm assumes, among other things, that the target function is convex.However, we cannot even show the convexity of the target function since we have no analytical expression to describe it.Therefore, if the algorithm ends up in a local minimum, it is not guaranteed that it is the global minimum.
In fact, the target function can have more than one local minimum (the source is the nonlinearity of the model price function).To determine the initial point for gradient-based lsqnonlin(), we use another MATLAB function ga() that implements a genetic algorithm minimization approach.It deploys a predefined number1 of initial points across the domain of the function and then, each point serves as an initial condition for minimization that is performed for a pre-defined number of steps.Based on the genetic rules of random mutation, crossbreeding, and preservation of the fittest, the most successful points are preserved, perturbated by a random mutation, and crossbred among themselves.This approach (Mrázek, Pospíšil, and Sobotka 2016) has been shown to produce sound results.
To measure the quality of the fit of a calibrated model, we use the following metrics.Having N options in the data set, we denote C mkt i the market price of the ith option and Ci the estimated price of the ith option based on the calibrated model.Denoting S 0 the spot price, the first metric is the average relative fair value (ARFV) and the second one is the maximum relative fair value (MRFV).They can be expressed as It is worth to mention that these measures offer a better error understanding than the originally used average absolute relative error (AARE) and maximum absolute relative error (MARE)

Robustness analysis
We calibrate the αRFSV model the way described in the previous section to a real market dataset.
In the ideal hypothetical case, all combinations of strikes and times to maturity for a given option would be available, i.e., we would have a continuous price surface to which we would calibrate a selected model.However, in reality, we have only a finite number of different options available to trade and moreover, the combinations of strikes and times to maturities (we call this option data structure) changes and even the number of combinations itself changes over time.Therefore, the obtained coefficient estimates can differ, should the model calibration be sensitive to the option data structure.
In this paper, we understand robustness as the property of a model that conveys the sensitivity of the model being calibrated to changes in the option structure.To study the robustness of the αRFSV model, we use the methodology suggested by Pospíšil, Sobotka, and Ziegler (2019).Therefore, our results of the robustness analysis of the αRFSV model are comparable with those of the Heston, Bates, and the approximate fractional stochastic volatility jump diffusion (AFSVJD) model, presented in the referenced paper2 .
To analyze robustness, we have to simulate the changes in the option structure.To do this, we employ bootstrapping of a given option structure.Bootstrapping is a technique when random samples are selected with replacement from the initial dataset.For example, to bootstrap the data set (X 1 , X 2 , . . ., X 6 ), we need to generate uniformly distributed random integers from {1, 2, . . ., 6}.Suppose the realization is {2, 3, 5, 4, 4, 3}.Then, the obtained bootstrapped sample is (X 2 , X 1 , X 5 , X 4 , X 4 , X 3 ).
Mathematically, an option structure is the set of all the combinations of strikes K and times to maturity T available for trading in a given day.Having market data consisting of N options, the set X = {(K i , T i ), i = 1, . . ., N } is the option structure for the given day where each option has the market price By bootstrapping X in total of M times, we obtain M new option structures X 1 , . . ., X M .Then each X j , together with the option prices from the initial dataset assigned to the corresponding combinations of strikes and times to maturities, produces bootstrapped sample B j .Next, we calibrate the model separately to each B j and obtain estimates of the model parameters and model prices.Let us denote Θj the parameter estimates obtained from the bootstrapped sample B j , and Cj = [ Cj 1 , . . ., Cj N ], where Cj i = Cj i (K i , T i ), is the vector of corresponding model prices.Having the results of the calibrations from B 1 , . . ., B M , we can compute the bootstrap estimates of the parameters and models prices.The bootstrap estimate of a parameter is the mean across all the estimated parameters: and the bootstrap estimate of a model price of the ith option is Next, we look at the variance of the errors of the price estimates of the ith option Cj i − C mkt i .However, to be able to better compare the variances among different options, we normalize the error.Then, let us denote the variance of the normalized errors of the ith option.It is also useful to examine the bootstrap relative error (BRE) for the ith option: We analyze variation in coefficients visually by plotting a scatter plot matrices.Denoting d the number of model coefficients being calibrated, the scatter plot matrix is a d × d matrix, where histograms for each coefficient are on the diagonal, and 2D scatter plots of corresponding values of coefficients elsewhere.Hence, from a scatter plot matrix, we get a grasp of the distributions of coefficients and also whether there is any dependence between pairs of coefficients and variation in the estimates.

Sensitivity analysis
In this paper, we use a similar method to carry out a sensitivity analysis introduced by Pospíšil, Sobotka, and Ziegler (2019) based on the ideas of Saltelli, Ratto, Andres, Campolongo, Cariboni, Gatelli et al. (2008).In short, we aim to test whether the αRFSV model is sensitive to changes in option structure through a given parameter.
In our context, we chose the following Monte-Carlo filtering technique3 : To each vector of calibrated model parameters obtained from the bootstrapped data, we calculate the average relative fair value (ARFV) as a quality measure for the calibrated model fit.Then, we separate the calibrated models into three groups: (I) the calibrated models with the corresponding values of the ARFV up to the third octile, (II) the models with the ARFV between the third and the fifth octile, and (III) the models with the ARFV above the fifth octile.Next, for each calibrated parameter, we compare the distribution of the parameter estimates corresponding to models from group (I) with the distribution from group (III).We use the Kolmogorov-Smirnov test for the comparison.The null hypothesis is that the parameter estimates from group (I) comes from the same distribution as those from group (III).

Numerical results
In this section, we present our results of calibrations and robustness and sensitivity analyses of the αRFSV model.We used the same real market dataset as in the paper by Pospíšil, Sobotka, and Ziegler (2019) where Heston, Bates, and the AFSVJD models were analyzed.Consequently, the results are directly comparable.

Data description
We use a real market dataset that consists of market prices of call options on Apple Inc. stock (NASDAQ: AAPL) quoted on four days of 2015: 04/01, 04/15, 05/01, 05/15.Naturally, the combinations of strikes and times to maturity of the options (the option data structure) change over time.There are 113 options in the option chain on the first day.The second day, the total number of different options rises to 158, the next day to 201, and the last day decreases to 194.
For convenience we visualize the data from May, 15, in Figure 1, in order to give some perspective.For each listed call option with the strike K and time to maturity T , a disk is plotted with center in (K, T ).The diameter of the disk relates to the price of the option.

Calibration routine
In order to calibrate the αRFSV model, we use the Cholesky method with the modified turbocharging method introduced by Matas and Pospíšil (2021) and we follow the recommendation given there to employ at least P = 150, 000 paths discretized by n = 4 × 252 per interval [0, 1], so the pricing method assures sufficient accuracy.Then, we are able to price any option with T < T max from the paths, which were already simulated, by truncating them to corresponding interval [0, T ].
We tried to use different weights 4 for the target function ( 12), but the best results were obtained for the weight type which aligns with the results by Mrázek, Pospíšil, and Sobotka (2016) for other SV models.For that reason, we present only the results for this type of weights.To compare weighted prices with the market prices, see Figure 2 and 1.  16).The positions of disks are given by the combinations of the strikes K on the x-axis and the maturities T on the y-axis of the options listed at the time.The diameter of each disk relates to the corresponding weighted close price.Compare with Figure 1.

Overall calibration
First, we summarize the results of the calibrations to the market data.Then, we compare the results to those obtained by Pospíšil, Sobotka, and Ziegler (2019) where different SV models were analyzed using similar methods and the identical dataset.For convenience of the comparison, we adopt Table 1  For the MATLAB function ga(), we set the number of initial points on 150 and the number of iterations on 5, as more than 5 did not make much significant improvement.For lsqnonlin(), which is ran after ga() and which further minimizes its output, we set the tolerance on the value of the target function on 10 −6 and the tolerance on the norm of the difference between two subsequent points on 10 −7 .Although the global optimization part is heavily time consuming, it is crucial in the situations when no initial guess is available to be used for the local optimization part that is significantly faster for obvious reasons.The whole procedure takes just a couple of minutes on a personal computer and no supercomputing power is necessary.The bounds of the coefficients considered for the calibration are summarized in Table 2.While the bounds σ 0 > 0 and ρ > −1 are naturally arising from the definition of the αRFSV model, the upper bound for σ 0 and the bounds for ξ were determined based on several test calibrations such that they provided a suitable area for the genetic algorithm while not limiting the calibration procedure in finding the global maximum.The upper bound ρ ≤ −0.05 is based on the recommendation given in Matas and Pospíšil (2021) and the range for the Hurst parameter was set as 0.05 ≤ H ≤ 0.25 which is, according to Bennedsen, Lunde, and Pakkanen (2022), a common range for H based on estimates for 2000 different equities.Table 3 presents the results of the overall calibration procedure for the studied models.We can see that for the first two days, rBergomi provides better fits than RFSV.For the next two days, the situation reverses and the fits obtained by RSFV are superior to those obtained by rBergomi.An interesting result is that the αRFSV model that unifies the two models by introducing a new parameter α that fulfills the role of the weight between the models fits the data in the most consistent way.We discovered that for the first two days, the parameter α is closer to 1 which corresponds to the rBergomi model and the next two days, α leans towards 0 thus the RFSV model.However, the αRFSV model does not always provide the best fit.
Comparing the values of AARE in Table 3 to those obtained for other SV models tabulated in Table 1, we can observe that the rough models do not provide better fits than the SV models.
Nevertheless, rough models, as we will see in the next sections, are much more robust compared to the SV models.
To illustrate the difference in the fit of two different models on one day, we visualized the model prices, the market prices, and the BREs in Figure 3.We chose April 01 because there is the biggest difference in the AARE of rBergomi (top row) and RFSV (bottom row).Above the K − T plane we plotted the market and model prices together (left side) and the corresponding BREs (right side).On this day, the rBergomi model provides much better fit than the RFSV model.

Parameter significance testing
We also test for parameter significance.We are particularly interested whether the parameters H and α have any affect on the model fit, i.e., whether the fit of the αRFSV model is better/worse when H (resp. α) is being calibrated compared to the model with fixed H (resp. α).We consider the fixed value of H being 1/2, thus it constituted a model with the volatility process being driven just by the Bm instead of the fBm.To test the significance of α, we compare the fit of the αRFSV model with the rBergomi model which corresponds to α = 1.
If we had a deterministic pricing formula, we could simply calibrate the models and compare the fit directly.But since the pricing involves randomness (Monte Carlo simulations), we need to conduct a statistical test to decide whether the difference in the fit measured by the ARFV is significant.We thus conducted 100 simulations to that resulted in different prices and thus a sample of different values of the ARFV for a given calibrated model.We then used the two-sample t-test to compare the mentioned pairs of models.We first used this method to compare the rBergomi model with H being calibrate and with H fixed to 1/2.For all the four days, rBergomi provided significantly better fit.Then we compared the αRFSV model with the rBergomi model which corresponds to α = 1.Again, the null hypothesis was rejected for all the days.It is worth to mention that all the p-values were smaller than 0.001.

Robustness analysis
To analyze the robustness of the studied models, we ran calibrations on 200 bootstrapped samples as described in Subsection 3.2 and examined errors and the variation in the prices and coefficients.
For the initial points for the bootcalibrations, we chose the parameters estimated by the overall calibration (Table 4) while keeping the other calibration procedure parameters the same as before.
First, we examine the errors of prices and its variation with respect to the changing option structure.Then, we analyze the variation in the model parameter estimates.Finally, we summarize the results in a table and quantitatively compare the studied models to the Heston, Bates, and AFSVJD models earlier analyzed by Pospíšil, Sobotka, and Ziegler (2019).

Prices -errors and variation
We examined how the model prices obtained from bootcalibrations differ from the market prices by calculating bootstrap relative errors BRE i using (15) and the variation price estimates by calculating the variances of absolute relative errors V i using (14).
Figure 4 depicts the values of BRE i and V i for the option structure on April 1 produced by the αRFSV model (top row) and the rBergomi model (bottom row).For both models, the largest errors and variation of errors concentrates on the right side relative to the spot price.This is expected since deep OTM options have zero intrinsic value and hence all the value comes from the time component associated with the probability of profitable exercise at expiration which is naturally more difficult to model.Comparing the results of two models on this particular day, rBergomi provides much better fit than αRFSV.The variation in rBergomi errors is also smaller which indicates that the model is less sensitive to the changes in the option structure.
Complete results for all the three models for each day are available in similar format in Appendix of the thesis Matas (2021).By comparing all the figures, we saw the same pattern: larger errors for the OTM option.However, an interesting result is that the αRFSV model has the smallest variation (sometimes by even more than two orders of magnitude) of the errors for all the days which suggests that the αRFSV model may the most robust of the three models.
Figure 4: Option data structure from April 1, where the diameter of the size of a disk depicts the bootstrap relative error BRE i (15) (left) and the variance V i (14) (right) corresponding to an option of a given combination of K (x-axis) and T (y-axis).The top row belongs to the RFSV model and the bottom one to rBergomi.

Variability of estimations of model coefficients
In order to analyze the variability of the coefficient estimates obtained from the bootcalibrations, we plot and examine scatterplot matrices.
Figure 5 illustrates the model coefficient estimates of the αRFSV model obtained from the bootcalibrations.Since the 5-dimensional parameter space is visualized as a matrix of 2D scatter plots, we can visually examine any patterns between the parameter estimates, while the histograms on the diagonal can provide some insight on the distributions of the parameter estimates.We can observe that there are no visible patterns and the distributions are symmetric with positive kurtosis which are both good properties for estimates.Also, notice that the variation around the bootstrap estimate is of a very small order of magnitude.
In fact, the scatter plot matrix in Figure 5 is qualitatively identical to the scatter plot matrices for other models on any day.There are no visible patterns suggesting dependency between any two coefficient estimates, all the distribution are symmetric with positive kurtosis, and the variation around bootstrap estimates is remarkably low, especially compared to similar scatter plot matrices for the Heston, Bates, and AFSVJD models presented by Pospíšil, Sobotka, and Ziegler (2019).

Summary of robustness analysis and comparison to SV models
Finally, to quantitatively compare the robustness analysis results of the three studied models to the Heston, Bates, and AFSVJD, we organized the results into Table 5 that provides results from May 15, but results from the remaining three days are qualitatively very similar and area vailable on request.The tablessummarize the variation both in absolute relative errors of prices and in the coefficient estimates.To quantify the variance in coefficient estimates by a single number for each model, we calculated the relative (normalized by average) inter-quartile ranges (IQRs) for each model coefficient from the 150 bootcalibrations and then calculated average and maximum from the relative IQRs.The variation in both AREs and coefficient estimates is smaller for rough models by several orders of magnitude compared to the non-rough SV models.Based on these results, we conclude that the rough models are more robust than Bates, Heston, and AFSVJD models.

Sensitivity analysis
We conducted the sensitivity analysis as described in Subsection 3.3, i.e., for each parameter in a given day and for a given model, we tested the null hypothesis that the distribution of the parameter estimates corresponding to the 3/8 "worst" bootcalibrations is the same as the distribution of the parameter estimates belonging to the 3/8 "best" bootcalibrations, using the KS test.
The KS test did not reject the null hypothesis for any of the parameter-model-day combination.That indicates that the studied models are not sensitive to changes in the option structure when being calibrated.Although considerable variation in the values of the αRFSV is still prevalent, the results of the sensitivity analysis suggest that the variation comes mainly from the changes in the option structure, independently of the parameter estimates.
Figure 5: A scatter plot matrix of the parameter estimates from bootcalibrations of the αRFSV model for May 1.The red stars represent the bootstrap estimates (13), while the black crosses represent the estimates from the overall calibration.

Conclusion
Initially, we compared the goodness of fit among the rough SV models we examined, as well as with the Heston, Bates, and AFSVJD models, using the average absolute relative error as the measure.Our findings indicated that none of the fractional SV models demonstrated superior performance for the used data sets.However, the αRFSV model displayed the most consistent results.Notably, when RFSV exhibited better performance for a particular data set, the α parameter tended to be closer to 0. Conversely, when the rBergomi model provided a better fit, the α parameter was closer to 1. Nevertheless, our comparison of the average absolute relative error indicated that the rough models we investigated did not exhibit superiority over the Heston, Bates, and AFSVJD models.
Then we presented the parameter estimates of the overall calibrations and tested the parameters H and α for significance.The two-sample t-test confirmed that both the parameters are very statistically significantly when calibrated for all the four data sets used.
Next, we analyzed the robustness of the rough SV models based on plots of BRE, variances of absolute relative errors across the bootstrapped data sets, and the scatter plot matrices of the parameter estimates.While the BREs were higher for the OTM option in all cases, an interesting results was that the αRFSV had the smallest variation (sometimes by more than two orders of magnitude) of the errors for all days compared to the two other studied models.The scatterplot matrices revealed that there are no patterns suggesting that any pair of parameter estimates would be dependent and the variance of the estimates turned out to be remarkably small, especially compared to the standard SV models.We also provided a table that summarizes the robustness analysis results by quantifying the variation in parameter estimates by several standard statistics for all the studied models (both rough and standard SV).Based on these results, we concluded that the rough models are much more robust than Bates, Heston, and AFSVJD models.
Lastly, we tested the sensitivity of the models to the changes in the option structure when being calibrated.We used a Monte Carlo filtering technique and the KS test.The statistical procedure did not show that the fit of a given model is significantly sensitive to the changes in the option structure.Consequently, we concluded that the persisting variability in the errors originates from changes in the option structure, regardless of the estimated parameters.
During the process of writing the paper, several additional questions and issues arose.Regarding calibration, we could estimate some of the model coefficients from time series, e.g., the Hurst parameter H can be estimated by the method proposed in Gatheral, Jaisson, and Rosenbaum (2018), or the coefficient ρ can be estimated as the correlation between stock price returns and realized volatility changes.We could then analyze the robustness of the models for such cases in a similar fashion.Another possibility is to try a different approach for the calibration itself.Instead of the deterministic gradient-based trust region approach of lsqnonlin(), we could employ a stochastic approximation approach or even the deep-learning method developed by Horvath, Muguruza, and Tomas (2021).
In the paper by Merino, Pospíšil, Sobotka, Sottinen, and Vives (2021), an approximation of the option price in the αRFSV model was derived and numerical experiments therein propose a promising hybrid calibration scheme which combines the approximation formula alongside MC simulations.Since the aim of this paper was to study the model as accurately as possible, we avoided the usage of approximation formula in our robustness and sensitivity analyses tests, however, repeating the same experiments with the usage of approximation formula should be straightforward.

Figure 1 :
Figure1: Call option data structure for AAPL dataset from May, 15, 2015.The positions of disks are given by the combinations of the strikes K on the x-axis and the maturities T on the y-axis of the options listed at the time.The diameter of each disk relates to the corresponding close price.

Figure 2 :
Figure 2: Example of the options call data structure for AAPL call option prices from May, 15, 2015, weighted by (16).The positions of disks are given by the combinations of the strikes K on the x-axis and the maturities T on the y-axis of the options listed at the time.The diameter of each disk relates to the corresponding weighted close price.Compare with Figure1.

Figure 3 :
Figure 3: The market and model prices (left) and the corresponding BREs (right) for the rBergomi model (top row) and the RFSV model (bottom row) on April 01.

Table 1 :
from the referenced paper as Table1.It contains AAREs of the calibrated Heston, Bates, and AFSVJD models.Lastly, we test the significance of the H and α parameters.Average (AARE) of overall calibrations of the Heston, Bates, and AFSVJD models for the same dataset as we use, reprint of(Pospíšil, Sobotka, andZiegler 2019, Table 1).

Table 2 :
Lower and upper bounds for the model coefficients we considered for the overall calibration.

Table 4 :
The overall calibration results.

Table 5 :
Comparison of robustness analysis for different SV models; dataset 05/15.