Implied volatility surfaces: a comprehensive analysis using half a billion option prices

This study delves into the critical aspect of accurately estimating single stock volatility surfaces, a task indispensable for option pricing, risk management, and empirical asset pricing. Utilizing a comprehensive dataset consisting of half a billion daily price observations for options on 499 US individual stocks and the S&P 500, the research investigates the accuracy of diverse methods for constructing volatility surfaces. The comparative evaluation of the three-dimensional kernel smoother by OptionMetrics (IvyDB US file and data reference manual, version 5.2, Rev. 01/27/2022, Computer software manual, New York, 2022), the semi-parametric spline by Figlewski (in: Robert F. Engle (ed) Estimating the implied risk neutral density. Volatility and time series econometrics: Essays in honor, Oxford University Press, Oxford, 2008), and a refined one-dimensional kernel smoother reveals the distinct superiority of the latter. This method consistently outperforms its counterparts across all moneyness, maturity, and liquidity categories, with markedly lower error metrics. The study further uncovers significant distortions in the extraction of Bakshi et al. (Rev Financ Stud 16:101–143, 2003) moments and skewness spanning induced by the noise-infused three-dimensional kernel smoother, which could potentially mislead derivative pricing and trading decisions. The findings offer valuable insights to traders, risk managers, investors, and researchers, suggesting a robust, one-size-fits-all method for crafting more accurate and less noisy volatility predictions. The research advances our understanding of option-implied information, its extraction, and broader implications for financial markets.


Introduction
The accurate estimation of the implied volatility surface of stock options is paramount for option pricing, risk management, and financial research.Existing methods, often tested on highly liquid index options, demonstrate limitations when applied to single stock options.To address this, our study utilizes an extensive dataset of half a billion daily price observations for options on 499 US individual stocks and the S&P 500.Our analysis identifies the one-dimensional kernel smoother as the most reliable and accurate method for constructing daily volatility surfaces.
These findings bear significant implications for a spectrum of stakeholders, including traders, risk managers, investors, and academics.By adopting the onedimensional kernel smoother, they can estimate volatility surfaces with higher precision, leading to improved decision-making and analysis.This research accentuates the importance of a robust method that delivers accurate predictions across diverse options, regardless of their maturity, moneyness, and liquidity variations.
The research contrasts the one-dimensional kernel smoother with the threedimensional kernel smoother by OptionMetrics (2022) and the spline method by Figlewski (2008).The one-dimensional kernel smoother consistently outshines the other methods, delivering the lowest leave-one-out cross-validation root-meansquared errors (RMSE) across different volume buckets, maturity, and moneyness categories.
Moreover, our research uncovers the disruptive effects of noise in the volatility surface construction, particularly from the three-dimensional kernel smoother.Such distortions profoundly impact the Bakshi et al. (2003) moments and skewness spanning regressions, especially for short-term options and during extreme volatility events.These distortions potentially mislead derivative pricing and trading decisions, emphasizing the urgency for a more reliable volatility surface construction method. 1  This study significantly enhances the existing literature in several ways.Firstly, we extend the investigation of volatility surface construction methods to include single stock options of varied liquidity, moving beyond the typically studied liquid index options. 2This extension gains relevance in the current era of big data and machine learning, where single stock options offer forward-looking insights into firm-specific information.
It is important to note that all of the methods examined are agnostic to both the time and the specific firm under consideration.As such, we omit a firm and time index in the following formulas, which are generalizable to different times and single firm combinations.

One-dimensional kernel smoother
In this section, a description of the one-dimensional kernel smoother employed in our approach is provided.
We consider the input volatility surface for a firm and day, denoted by { i, } i∈[1,N ], ∈T .Here, N denotes the total number of observed option-implied moneyness levels for a maturity option and T stands for the set of observed maturities.We denote with m i, ∈ {m i, } i∈ [1,N ], ∈T an ascendingly ordered set of observed option-implied moneyness levels for a specific observed option maturity . 4oneyness m i, is defined as where K and F stand for the strike and the single stock's forward price, respectively.We denote the observed option-implied volatility for m i, as i, .
We let K min and K max denote the smallest and largest observed strike of a -matu- rity option, i.e.
Our goal is to generate a volatility surface for normalized moneyness levels ranging from [−10, 4] .Specifically, we express the normalized moneyness mi, as a func- tion of both the moneyness m i, and the at-the-money implied volatility ATM for the observed option panel at a given maturity .Mathematically, this reads: Initially, we employ Ulrich and Walther (2020)'s one-dimensional kernel smoother to compute i, i∈ [1,N ] with a target moneyness step size of 0.01.Next, we utilize linear regressions to estimate the missing implied volatility-normalized moneyness pairs for our target interval of mi, ∈ [−10, 4] based on the three left and rightmost implied volatility-moneyness pairs.Lastly, we smooth the resulting volatility surface using the same one-dimensional kernel smoother.The one-dimensional kernel smoother takes the following form: (1) 1 3 Implied volatility surfaces: a comprehensive analysis using… where 1d (m j ) is the smoothed -period implied volatility for a target moneyness m j , m i, and i, are observed (and linearly extrapolated) moneyness and option-implied volatility pairs, N represents the total number of input implied volatilities and k (x) is an unnormalized Gaussian kernel function, defined as where h is a maturity-specific bandwidth parameter, defined as where S is the price of the underlying stock.

Three-dimensional kernel smoother
In estimating single firm volatility surfaces, OptionMetrics (2022) employs a threedimensional kernel smoother which includes a maturity dimension , Put/Call flag I j ∈ {0, 1} , and option delta Δ j , with the moneyness dimension being replaced by Δ j .OptionMetrics (2022) uses options with vegas of ≥ 0.5 as input to the three-dimen- sional kernel smoother, producing an implied volatility surface with option deltas in the range of [0.1, 0.9].This range of deltas has been extended compared to Option-Metrics (2016) to address the finding in Ulrich and Walther (2020) that a delta range of [0.2, 0.8] is insufficient for accurately capturing left tail risk in index options.
The three-dimensional kernel smoother takes the form where 3d is the implied volatility smoothed using the three-dimensional kernel smoother, and Δ represent the option's vega and delta, respectively, and is the observed implied volatility with a specific delta, maturity, and call or put flag.The three-dimensional Gaussian kernel takes the form with the respective bandwidth parameters h 1 ∶= 0.05, h 2 ∶= 0.005, h 3 ∶= 0.001.

Spline
The final method we examine is the smoothing spline developed by Figlewski (2008).This semi-parametric approach smoothes the observed implied volatilitymoneyness pairs using an nth-order polynomial with one knot point located at a moneyness of one.
To enhance flexibility, we adopt an adaptive approach by setting the order of the smoothing spline n to vary with N .Specifically, we set the order to increase gradu- ally as the number of observations increases, in order to mitigate overfitting.Concretely, we set the order as follows: The data pre-processing method employed is based on Figlewski (2008).Specifically, the spline is fitted to out-of-the-money call and put options, while smoothing is applied to at-the-money volatilities in order to mitigate noise.The smoothing is performed using the following method: with where * (K i, ) stands for the smoothed implied volatility of observed at-the-money options that is later used in the spline smoothing.We address the potential impact of the minimum amount of input data on the model's performance by reporting results separately for days with a minimum number of inputs of 4 and 5, corresponding to N ≥ 4 and N ≥ 5 , respectively.
The parameters of the spline are estimated by minimizing the least squares of the non in-the-money subset of observed implied volatilities As with the one-dimensional kernel, the spline is fitted for each maturity separately.The resulting smoothed volatilities of the spline are denoted as spline i, .

Validation
We employ the leave-one-out cross-validation technique to validate the accuracy of the three state-of-the-art implied volatility surfaces.This technique is a standard tool in machine learning for assessing the generalization performance of predictive models on unseen data (Geisser, 1993;Kohavi, 1995).Our cross-validation procedure involves sequentially leaving out one observation from a day's option price ( 9) 1 3 Implied volatility surfaces: a comprehensive analysis using… observations and using the remaining data to predict its value, conditional on one of the three methodologies.The iterative nature of this procedure enables us to evaluate the forecast accuracy of each method when predicting the left-out price, based on the information contained in the other prices.We perform two prediction tasks: one that excludes the leftmost and rightmost points and another that includes all observations.This enables us to evaluate the sensitivity of each method to extrapolation-induced overfitting.
Consistent with the literature, we examine implied volatilities rather than prices to evaluate the predictive performance of the three state-of-the-art methods.For each method M ∈ {1d, 3d, spline N ≥4 } and observed option panel, we calculate the root-mean-squared error (RMSE) and the mean absolute error (MAE), and where in order to highlight that the error statistics are calculated across all time periods and all firms, we explicitly incorporate the time index t and a firm index j ∈ [1, 500] .Here, N T represents the total number of time points.As we have 499 firms and the S&P 500, we consider j ∈ [1, 500] .To report the RMSE and MAE for different moneyness and maturity combinations, we adjust the formulas by considering the relevant subsets.

Data
In this section, we describe the data used in our analysis and explain how we calculate single firm dividend yields and the risk-free rate.Implied volatility surfaces: a comprehensive analysis using…

Option prices
In this section, we provide an overview of the option price data used in our analysis, consisting of 373,825,605 daily option prices for 499 S&P 500 constituents and 14,267,068 prices for options on the S&P 500 between January 2004 and July 2019, sourced from CBOE.For our analysis, we use daily quote data and consider the mid price.Table 1 summarizes the input data broken down into moneyness and maturity buckets.The data is categorized into short-, medium-, and long-term based on the remaining time to maturity: 1 to 29, 30 to 365 days, and longer, respectively.Moneyness is classified as at-the-money (ATM), left-tail, and right-tail if the moneyness is between 0.95 and 1.05, below 0.95, and above 1.05, respectively.On average, there were 31.96strikes and 6.23 maturities traded per day and single firm, This table presents the cross-validation MAE and RMSE for S&P 500 options across various maturity and moneyness combinations.The maturity section categorizes options based on remaining maturities, with short maturity options having less than 30 days, medium maturity options having 30 to 365 days, and long maturity options having more than 365 days.The moneyness axis, represented by m = K F , is divided into left, center, and right sections, with the dividing lines at 0.95 and 1.05.The error figures are given in annualized implied volatility units.The Spline* column reports the results for spline with N ≥ 5 data points per fitted line instead of N ≥ 4 .Cells in bold highlight the best performing method per maturity-moneyness bucket.The data covers the period from Jan 2004 to Jul 2019   1 shows that the majority of the input data concentrates in the left tail, with approximately 58% (8,340,544 observations) for index options and 46% (172,918,617 observations) for single firm options.The disparity in the relative proportion of call options between the index and single firm option markets is significant, with right tail options representing only The maturity breakdown in Table 1 shows that medium-term options make up the majority of quoted prices, comprising approximately 60% for both index and single firm options.However, there is a notable difference in the proportion of short-term options, with only 10% classified as short-term for index options, while for single stock options, this figure rises to 27%.
Our option panel data analysis reveals that for index options, the majority of quoted prices are medium to long-term at-the-money (ATM) and out-of-the-money (OTM) put options, while for single stock options, the largest share of quoted prices are short to medium-term OTM put and call options.We apply standard filters to remove duplicates and option prices that do not fulfill basic no-arbitrage restrictions and drop options for which implied volatility did not converge.

Risk-free rate, dividend yields and implied volatility
In this section, we provide a description of how we calculate the risk-free rate, dividend yields, and implied volatilities.We follow the methodology suggested by van Binsbergen et al. (2021) to extract the risk-free rate term structure from index options.This approach utilizes the put-call parity relationship for index options with the same maturity and has become the standard in option pricing research.Similar approaches have also been employed by Ulrich et al. (2023), Golez and Jackwerth (2022), and OptionMetrics (2022) to compute the risk-free rate.
Calculating the dividend yield is a challenging task, and no one-size-fits-all solution exists.We choose to work with one at-the-money call and put price pair for each traded maturity to derive the dividend yield (Hull, 2018).Any measurement error in the dividend yield is equally reflected in all the methods and does not introduce any bias in our comparative analysis.
Lastly, we utilize the approach proposed by Barone-Adesi and Whaley (1987) to convert option prices with early exercise features into implied volatilities.Barone-Adesi and Whaley (1987) extends the Black-Scholes model for an early exercise premium and is known for its favorable trade-off of computation speed and accuracy.
Fig. 3 Boxplots for cross-validation squared prediction errors for single stock options, one-dimensional kernel smoother, by industry sector.This figure displays Boxplots of the average cross-validation squared errors with inter-and extrapolation across single stock options by GICS industry sector for the one-dimensional kernel smoother method.The maturity section categorizes options based on remaining maturities, with short maturity options having less than 30 days, medium maturity options having 30 to 365 days, and long maturity options having more than 365 days.The moneyness buckets, color-coded in the plot, are represented by m = K F , and divides the volatility smile into left, center, and right sections, with the dividing lines at 0.95 and 1.05.The volume buckets 0.25, 0.5, 0.75 stand for low, medium and high volume trading days.The data covers the period from Jan 2004 to Jul 2019 ▸ 1 3 Implied volatility surfaces: a comprehensive analysis using… Similar to the dividend yield calculation, any measurement error in the inversion process is evenly spread across all methods. 5Fig. 3 (continued) 5 We have conducted robustness tests on a subset of the data and found that our results are not sensitive to the choice of pricing model.Other approaches commonly employed in the literature include different variations of Binomial trees as well as Ju and Zhong (1999), Geske and Johnson (1984), Kim (1990), Bunch and Johnson (1992), Barone-Adesi (2005), Zhu (2006), among many others.

3
Implied volatility surfaces: a comprehensive analysis using…

Findings
In this section, we present the findings of our analysis.We begin by summarizing the cross-validation error statistics for different specifications, including one where we do not evaluate extrapolation to the left and right most observations.To better understand the cross-validation errors, we record the maximum, standard deviation, 95% quantiles, and the entire error distribution using boxplots.We disaggregate the results for different moneyness, maturity, and volume buckets.
To study the relationship between accuracy and trading volume, we introduce volume buckets.We collect all single stock option volumes for each day and calculate the 0.25 ( q 0.25 ) and 0.75 ( q 0.75 ) volume quantiles.For each day t and single firm i, we classify its daily volume into a low, medium, or high volume bucket, depending on whether the actual volume is (i) below q 0.25 , (ii) between q 0.25 and q 0.75 , or (iii) larger than q 0.75 .
We then present the results for S&P 500 options and compare them to Ulrich and Walther (2020).This is followed by an analysis of single stock options.

S&P 500
In this section, we present our analysis of three methods used to construct implied volatility surfaces for S&P 500 index options.Our focus is on the short-, medium-, and long-term options, and we report our results using leave-one-out cross-validation root-mean-squared errors (RMSE) and mean absolute errors (MAE) in panel (a) of Table 2.
Our findings indicate that the one-dimensional kernel smoother is the only method that produces acceptable, small prediction errors across all moneyness and maturity buckets for S&P 500 index options.Specifically, its short-term RMSE for left, center, and right tail options are 0.0090, 0.0029, and 0.0075, respectively.In contrast, the three-dimensional kernel smoother of OptionMetrics (2022) and the spline exhibit strong evidence of overfitting and noise in their smoothed volatility surfaces.Specifically, their respective RMSEs for the same maturity and moneyness buckets are 0.5566, 0.1181, 0.2690, and 3.7805, 0.0489, and 0.4600.
Moreover, we find that the three-dimensional kernel smoother and the spline exhibit a concerning level of noise in their smoothed implied volatilities for shortterm options.For instance, for the class of short-term out-of-the-money put options, both methods produce noise in implied volatility forecasts that is 61.84 and 420 times larger than prediction noise arising from the one-dimensional kernel smoother.
Our results hold true for medium and long-term options as well, as indicated by the RMSE of predicted versus realized implied volatility.Specifically, the threedimensional kernel smoother of OptionMetrics (2022) and the semi-parametric spline exhibit RMSEs that are approximately ten times larger than those of the onedimensional kernel smoother.Our findings are consistent with previous research by Ulrich and Walther (2020) on the RMSEs of index options in terms of the magnitude of errors and the relative ranking of RMSEs.
Panel (b) of Table 2 confirms that our performance measures are not affected by extrapolation into the left and right tails of the observed implied volatility distribution.The results from both the RMSE and MAE in panels (a) and (b) of Table 2 confirm the superiority of the one-dimensional kernel smoother, which exhibits the smallest prediction errors for S&P 500 index options.
Boxplots for the RMSE of all methods and moneyness/maturity buckets are displayed in Fig. 1, with the results indicating that at-the-money options have the lowest RMSE across all methods.However, the semi-parametric spline exhibits severe difficulties in accurately predicting implied volatility for short-term left tail options.
In conclusion, we strongly advise against using the semi-parametric spline and the three-dimensional kernel smoother of OptionMetrics (2022) for constructing implied volatility surfaces of index options.The one-dimensional kernel smoother emerges as the superior choice, producing negligible noise that falls comfortably within typical bid-ask price bounds.
In the following analysis, we turn our attention to assessing the efficacy of the examined methods in predicting implied volatility for stock options.

Single firms
In this section, we focus on evaluating the performance of each method in predicting implied volatility for single stock options.To our knowledge, there is a gap in the literature in terms of examining the quality of implied volatility surfaces for individual stock options.Therefore, we break down the RMSE results for single stock options into three-dimensional volume/maturity/moneyness buckets to further assess the performance of each method for high and low volume option contracts.
Panel (a) of Table 3 shows the average RMSEs and MAEs for single stock options.The results indicate that even the best-performing method incurs twice as high RMSEs when applied to single stock options on low volume days.For short maturity options in high volume days, the average RMSE for left, center, and right-tail options is 0.0329, 0.0152, and 0.0350, respectively.These respective RMSEs increase to 0.0612, 0.0380, and 0.0879 when looking at low volume single stock options while keeping all else equal.Furthermore, the RMSE Fig. 4 Boxplots for cross-validation squared prediction errors for single stock options, three-dimensional kernel regression, by industry sector.This figure displays Boxplots of the average cross-validation squared errors with inter-and extrapolation across single stock options by GICS industry sector for the three-dimensional kernel smoother method.The maturity section categorizes options based on remaining maturities, with short maturity options having less than 30 days, medium maturity options having 30 to 365 days, and long maturity options having more than 365 days.The moneyness buckets, color-coded in the plot, are represented by m = K F , and divides the volatility smile into left, center, and right sections, with the dividing lines at 0.95 and 1.05.The volume buckets 0.25, 0.5, 0.75 stand for low, medium and high volume trading days.The data covers the period from Jan 2004 to Jul 2019 ▸ 1 3 Implied volatility surfaces: a comprehensive analysis using… 1 3 of even the highest volume bucket is roughly four times larger than the overall RMSE of index options, indicating that illiquidity is a significant concern for single stock options and introduces significant noise into their implied volatility surfaces.We then compare how different methods perform in coping with the noise induced by illiquidity in single stock implied volatilities.The semi-parametric spline exhibits catastrophic RMSEs for short-dated options, with RMSEs for high volume left-tail, center, and right-tail options being 0.1787, 15.4456, and 36.8436, respectively. These RMSEs are five, one thousand, and one thousand times higher than those of the best-performing one-dimensional kernel method.The OptionMetrics (2022) three-dimensional kernel smoother performs better than the semi-parametric spline, but its RMSE for high volume short-dated options (left, center, right) is still 13, 20, and 12 times higher than those of the one-dimensional kernel smoother.This result highlights that building option-implied volatility surfaces using the threedimensional kernel smoother cannot be recommended.
Figure 2 indicates that the squared prediction error diminishes as the volume of traded options increases.In particular, the one-dimensional kernel smoother exhibits an exponential decrease in error as volume increases, suggesting that the benefits of additional information from option trading volumes are large.In contrast, the semi-parametric spline performs poorly for low volume short-maturity left-tail single stock options, indicating that the spline is less effective in incorporating information on low volume days into the volatility surface.The three-dimensional kernel smoother aligns with the trend of the one-dimensional smoother, but its forecast errors are significantly larger.
In summary, we find that the one-dimensional kernel smoother outperforms the other two methods examined in this study, across all different volume/maturity/ moneyness buckets, in terms of accuracy in computing implied volatility surfaces for both index and single stock options.Our recommendation is to use the onedimensional kernel smoother in future research that requires low RMSE volatility surfaces.Our findings support the notion that simplicity often pays off in terms of prediction accuracy and generalizability to unseen data.

Industry clusters
In a final step, we group single stocks into industries based on the Global Industry Classification Standard (GICS), which includes eleven sectors: Energy, Materials, Industrials, Consumer Discretionary, Consumer Staples, Health Care, Financials, Information Technology, Communication Services, Utilities, and Real Estate.To explore the industry-specific effects on our findings, we examine the Boxplots for the one-dimensional (Fig. 3), three-dimensional (Fig. 4), and spline methodologies (Fig. 5) across the 11 industries.The results in Fig. 3 reveal that the one-dimensional kernel smoother's performance remains unaltered by industry, exemplifying its robustness across various sectors.Conversely, Fig. 4 highlights that the threedimensional kernel smoother persistently generates larger errors in comparison to the one-dimensional smoother throughout all 11 industries, further supporting our conclusion that the one-dimensional kernel smoother is the superior method for computing volatility surfaces.
We also observe in Fig. 5 that the spline method produces varying results for different industries.Some industries exhibit small errors similar to the one-dimensional kernel smoother, while others have significantly larger errors.These industry-specific differences suggest that the semi-parametric spline method may not be reliable for single stock options analysis, particularly in certain industries.As such, we advise caution when using the spline method and recommend that researchers benchmark their results against those obtained using the one-dimensional kernel smoother.

Lessons from the tails
To complete the analysis, we have a closer look at the (i) standard deviation of the prediction error, (ii) the 95% quantile of the absolute prediction error and (iii) the maximum absolute prediction error.These values are displayed in Table 4.As previously, we aggregate prediction errors across all 499 firms and display results for different volume/maturity/moneyness buckets.
The standard deviation of the prediction error, which can be viewed as the volatility of the distribution of prediction errors, is lower for the one-dimensional kernel smoother than for the three-dimensional kernel smoother or spline.In the high volume and short maturity options bucket, the prediction error volatility for the left-tail region is 0.0229, 0.3515, and 0.0999 for the one-dimensional kernel smoother, three-dimensional kernel smoother, and spline, respectively.The one-dimensional kernel smoother performs well regardless of which part of the implied volatility surface is predicted, while the three-dimensional kernel smoother and spline have difficulties predicting the tails.In summary, the realized standard deviation of single stock implied volatility prediction errors is excessive for all methods except the one-dimensional kernel smoother.
In terms of the 95% quantile of the absolute prediction error, the one-dimensional kernel smoother outperforms the other two methods, with respective values of 0.0370, 0.6297, and 0.2359 for high volume short maturity left-tail implied volatilities.This implies that the worst 5% of mispricing for a short maturity out-of-the-money put option is around 4% (implied volatility units) for the onedimensional kernel smoother, while it is 63% for the three-dimensional kernel smoother of OptionMetrics (2022) and 24% for the spline.The three-dimensional kernel smoother performs significantly worse on days when trading reveals more information, while both the one-dimensional kernel smoother and spline tend to exhibit lower 95% error quantiles as volume increases.

Fig. 5
Boxplots for cross-validation squared prediction errors for single stock options, Spline N ≥4 , by industry sector.This figure displays Boxplots of the average cross-validation squared errors with inter-and extrapolation across single stock options by GICS industry sector for the spline method.The maturity section categorizes options based on remaining maturities, with short maturity options having less than 30 days, medium maturity options having 30 to 365 days, and long maturity options having more than 365 days.The moneyness buckets, color-coded in the plot, are represented by m = K F , and divides the volatility smile into left, center, and right sections, with the dividing lines at 0.95 and 1.05.The volume buckets 0.25, 0.5, 0.75 stand for low, medium and high volume trading days.The plot reports the results for spline with N ≥ 4 data points per fitted line.The data covers the period from Jan 2004 to Jul 2019 ▸ 1 3 The average maximum absolute prediction error across all single stock options shows that the one-dimensional kernel smoother outperforms the other two methods by several magnitudes.The maximum absolute implied volatility prediction error for the one-dimensional kernel smoother ranges from 0.7004 to 2.7663,  while the respective values for the three-dimensional kernel smoother are 3.1467 to 52.1571 and 0.2746 to 109.5617 for the spline.
In summary, the one-dimensional kernel smoother is the most reliable method for constructing single stock implied volatility surfaces.The three-dimensional kernel smoother of OptionMetrics ( 2022) and the semi-parametric spline of Figlewski (2008) are both prone to prediction errors with large second and higher moments, suggesting that they may be too complex or unrealistic.We therefore recommend academics and practitioners to use the one-dimensional kernel smoother for constructing end-of-day volatility surfaces.

Further implications for single stock option-implied information
The extraction of meaningful economic insights from implied volatility surfaces can be compromised by the presence of uninformative noise.This is particularly relevant when evaluating the efficacy of option-implied risk measures.In this section, we investigate Bakshi et al. (2003)'s implied moments within the cross-section of implied volatility.Additionally, we extend our analysis to replicate the skewness spanning regressions presented in Bakshi et al. (2003) Table 6, utilizing an expanded dataset that includes a larger number of firms and a longer time span compared to Bakshi et al. (2003).Through these exercises, we aim to demonstrate the potential effects that different methods for constructing volatility surfaces could have on the extraction of information implied by options.

Impact of method selection on Bakshi et al. (2003) moments
Our analysis, as visualized in Fig. 6, underscores the importance of the methodology adopted for constructing the volatility surface.Both the three-dimensional kernel smoother and the spline method inject substantial uninformative noise, resulting in significant distortions in the Bakshi et al. (2003) moments.These distortions are particularly acute for short-term options and in instances of extreme market situations.
For short-term options across varying liquidity days, the distortions become even more pronounced for options-implied skewness and kurtosis, with the three-dimensional kernel smoother drastically underpredicting kurtosis and depicting skewness close to zero.This infers that the implied risk-neutral density for this method resembles a Gaussian distribution, a misleading interpretation when compared to the Fig. 6 Single stock (Bakshi et al., 2003) moments for different volatility surfaces.This figure displays Boxplots of model-free option-implied (Bakshi et al., 2003) moments for the one-dimensional kernel smoother method, the three-dimensional kernel smoother method and spline with N ≥ 3 data points per fitted line.The volume buckets stand for low, medium and high volume trading days.All options are of short maturity with less than 30 days to expiration.The data covers the period from Jan 2004 to Jul 2019 ▸ non-Gaussian return densities implied by the more accurate one-dimensional kernel smoother and, to some extent, the spline method.
This discrepancy raises concerns about the efficacy of the three-dimensional kernel smoother in accurately reflecting the properties of the implied volatility surface.This misrepresentation could lead to significant misconceptions about risk, thus affecting derivative pricing and trading decisions.Our results strongly suggest that, in the context of the Bakshi et al. (2003) moments, the spline method and the onedimensional kernel smoother provide more reliable insights compared to the threedimensional kernel smoother, further emphasizing the need for accurate methodology selection in volatility surface construction.

Noise in skewness spanning
Our skewness spanning regressions (depicted in Fig. 7) corroborate the findings of Bakshi et al. (2003) Table 6: the skewness of the index is more pronounced than that of single stocks.Nonetheless, our results also highlight that the choice of method to construct the volatility surface dramatically affects these observations.
Employing the one-dimensional kernel smoother, we are able to accurately detect significant mapping coefficients, aligning with Bakshi et al. (2003)'s original results.However, when the three-dimensional kernel smoother is applied, the added noise drastically diminishes the spanning R 2 and spanning t-statistics, effectively render- ing them close to zero across firms (see Fig. 8).Such a result conceals the inherent patterns within the data, thus impairing the accurate extraction of option-implied information.
More worryingly, this discrepancy underlines the risk of potential misinterpretations in derivative pricing and trading decisions, which could be led astray by the distortions introduced by the three-dimensional kernel smoother.In this context, our findings argue powerfully for the adoption of more accurate methodologies, like the one-dimensional kernel smoother, for robust volatility surface construction and reliable skewness spanning regressions.This figure displays Histogram plots for statistics of skewness spanning regressions, given by SKEW n (t) = 0 + 1 × SKEW m (t) + n (t) , where SKEW n and SKEW m denote the risk-neutral skewness of a single stock n and the market, for the one-dimensional kernel smoother method.All options are of short maturity with less than 30 days to expiration.The data covers the period from Jan 2004 to Jul 2019 ▸

Conclusion
Our research underscores the robust superiority of the one-dimensional kernel smoother in constructing accurate volatility surfaces, outperforming both the widely utilized three-dimensional kernel smoother of OptionMetrics ( 2022) and the semiparametric spline of Figlewski (2008) across all tested error metrics.The difference becomes particularly stark when considering the worst 5% absolute implied volatility prediction error for short-maturity out-of-the-money put options: less than 4% for the one-dimensional kernel smoother, compared to 24% and 63% for the spline and three-dimensional kernel smoother, respectively.
This study also reveals the profound impact of method choice on the extraction of option-implied information.The use of the three-dimensional kernel smoother introduces significant distortions in Bakshi et al. (2003) moments, particularly for short-term options and during extreme volatility events.These distortions can potentially mislead derivative pricing and trading decisions, underscoring the urgency of adopting a more accurate and reliable method for constructing daily implied volatility surfaces across all moneyness and maturity ranges.
Further, by extending our investigation to single stock options and incorporating a considerably larger dataset than earlier studies, our research provides a comprehensive evaluation of methodologies for constructing accurate volatility surfaces in the era of abundant data.This contribution deepens our understanding of optionimplied information, its extraction process, and the far-reaching implications for broader financial markets, offering valuable insights for practitioners and academics alike.
Fig. 8 Statistics of skewness spanning regressions by methodology.This figure displays Histogram plots for statistics of skewness spanning regressions, given by SKEW n (t) = 0 + 1 × SKEW m (t) + n (t) , where SKEW n and SKEW m denote the risk-neutral skewness of a single stock n and the market, for the onedimensional kernel smoother method, the three-dimensional kernel smoother method and spline with N ≥ 3 data points per fitted line.All options are of short maturity with less than 30 days to expiration.The data covers the period from Jan 2004 to Jul 2019 ▸ 1 3 Implied volatility surfaces: a comprehensive analysis using…

Fig. 1
Fig. 1 Boxplots for cross-validation squared prediction errors for S&P 500 options.This figure displays Boxplots of the cross-validation squared errors with inter-and extrapolation for S&P 500 options.The maturity section categorizes options based on remaining maturities, with short maturity options having less than 30 days, medium maturity options having 30 to 365 days, and long maturity options having more than 365 days.The moneyness buckets, color-coded in the plot, are represented by m = K F , and divides the volatility smile into left, center, and right sections, with the dividing lines at 0.95 and 1.05.The volume buckets 0.25, 0.5, 0.75 stand for low, medium and high volume trading days.The Spline plot reports the results for spline with N ≥ 4 data points per fitted line.The data covers the period from Jan 2004 to Jul 2019

Fig. 2
Fig. 2 Boxplots for cross-validation squared prediction errors for single stock options.This figure displays Boxplots of the average cross-validation squared errors with inter-and extrapolation across single stock options.The maturity section categorizes options based on remaining maturities, with short maturity options having less than 30 days, medium maturity options having 30 to 365 days, and long maturity options having more than 365 days.The moneyness buckets, color-coded in the plot, are represented by m = K F , and divides the volatility smile into left, center, and right sections, with the dividing lines at 0.95 and 1.05.The volume buckets 0.25, 0.5, 0.75 stand for low, medium and high volume trading days.The Spline plot reports the results for spline with N ≥ 4 data points per fitted line.The data covers the period from Jan 2004 to Jul 2019

Fig. 7
Fig. 7Statistics of skewness spanning regressions for one-dimensional kernel smoother.This figure displays Histogram plots for statistics of skewness spanning regressions, given by SKEW n (t) = 0 + 1 × SKEW m (t) + n (t) , where SKEW n and SKEW m denote the risk-neutral skewness of a single stock n and the market, for the one-dimensional kernel smoother method.All options are of short maturity with less than 30 days to expiration.The data covers the period from Jan 2004 to Jul 2019

Table 1
Summary statistics of data for cross-validating volatility surface construction methods

Table 2
Cross-validation prediction errors for implied volatilities of S&P 500 options

Table 3
Cross-validation prediction errors for implied volatilities of single stock options

Table 3 (
This table presents the cross-validation MAE and RMSE for single stock options across various maturity and moneyness combinations.The maturity section categorizes options based on remaining maturities, with short maturity options having less than 30 days, medium maturity options having 30 to 365 days, and long maturity options having more than 365 days.The moneyness axis, represented by m = K F , is divided into left, center, and right sections, with the dividing lines at 0.95 and 1.05.The error figures are given in annualized implied volatility units.The Spline* column reports the results for spline with N

Table 4
Tails of the cross-validation prediction error for single stock implied volatilities This table presents selected statistics of the cross-validation forecast error of different methods (1d-, 3d-kernel smoother and spline) for single stock option implied volatilities across various volume, maturity and moneyness combinations.Reported quantities are 95% quantile of the absolute forecast error, the maximum absolute forecast error and the standard deviation of the forecast error.The maturity section categorizes options based on remaining maturities, with short maturity options having less than 30 days, medium maturity options having 30 to 365 days, and long maturity options having more than 365 days.The moneyness axis, represented by m