A machine learning approach to univariate time series forecasting of quarterly earnings

We propose our quarterly earnings prediction (QEPSVR) model, which is based on epsilon support vector regression (ε-SVR), as a new univariate model for quarterly earnings forecasting. This follows the recommendations of Lorek (Adv Account 30:315–321, 2014. https://doi.org/10.1016/j.adiac.2014.09.008), who notes that although the model developed by Brown and Rozeff (J Account Res 17:179–189, 1979) (BR ARIMA) is advocated as still being the premier univariate model, it may no longer be suitable for describing recent quarterly earnings series. We conduct empirical studies on recent data to compare the predictive accuracy of the QEPSVR model to that of the BR ARIMA model under a multitude of conditions. Our results show that the predictive accuracy of the QEPSVR model significantly exceeds that of the BR ARIMA model under 24 out of the 28 tested experiment conditions. Furthermore, significance is achieved under all conditions considering short forecast horizons or limited availability of historic data. We therefore advocate the use of the QEPSVR model for firms performing short-term operational planning, for recently founded companies and for firms that have restructured their business model.


Introduction
The quarterly earnings reported by a company are an accounting figure of great significance. Quarterly earnings can be used to track performance in the context of management and debt contracts (Dechow et al. 1998), and are reflective of corporate governance (Chen et al. 2015). Isidro and Dias (2017) also show that earnings are strongly related to stock returns in volatile market conditions, while Zoubi et al. (2016) find that disaggregated earnings better explain variation in stock returns. Furthermore, differences between forecasted and actual earnings have been used to calculate a firm's market premium (Dopuch et al. 2008).
The prediction of future quarterly earnings using univariate statistical models has been the subject of extensive research. Lorek and Willinger (2011) and Lorek (2014) claim that the autoregressive integrated moving average (ARIMA) model proposed by Brown and Rozeff (1979), denoted by BR ARIMA, is the premier univariate statistical model for the prediction of quarterly earnings. Its functional form is

$$Y_q = Y_{q-4} + \phi\,(Y_{q-1} - Y_{q-5}) + \varepsilon_q - \theta\,\varepsilon_{q-4},$$

where $Y_q$ is the earnings of a company for a quarter $q$, $\phi$ is an autoregressive parameter, $\theta$ is a moving average parameter and $\varepsilon_q$ is the disturbance term at $q$.
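To make the recursion behind this functional form concrete, the one-step-ahead BR ARIMA forecast can be sketched in plain Python/NumPy. This is an illustrative sketch, not the paper's implementation: the function name is ours, and initializing the first disturbances to zero is an assumption we make for simplicity.

```python
import numpy as np

def br_arima_one_step(y, phi, theta):
    """One-step-ahead forecast from the BR ARIMA form
    Y_q = Y_{q-4} + phi*(Y_{q-1} - Y_{q-5}) + e_q - theta*e_{q-4}.
    Early disturbances are initialized to zero (our assumption)."""
    n = len(y)
    e = np.zeros(n)  # disturbance terms
    for q in range(5, n):
        # residual implied by the observed series under (phi, theta)
        e[q] = (y[q] - y[q - 4]) - phi * (y[q - 1] - y[q - 5]) + theta * e[q - 4]
    # forecast of the next quarter sets its own disturbance to zero
    return y[n - 4] + phi * (y[n - 1] - y[n - 5]) - theta * e[n - 4]
```

With $\phi = \theta = 0$ the forecast collapses to a seasonal random walk, i.e. the value observed four quarters earlier.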
While the BR ARIMA model has been praised for its predictive accuracy on historic data, Lorek (2014) expresses concerns about its ability to describe more recent quarterly earnings. This is because research suggests that the time series properties of quarterly earnings have changed significantly since the development of the BR ARIMA model four decades ago. For instance, Baginski et al. (2003) note an apparent decline in the persistence of quarterly earnings from 1967 to 2001. This could be explained by the increased prevalence of high-tech firms (Kwon and Yin 2015). Lorek and Willinger (2008) also find that for an increasing number of companies, the quarterly earnings series no longer exhibit any significant seasonality. Furthermore, Klein and Marquardt (2006) highlight an increasing frequency of negative quarterly earnings, which according to Hayn (1995) disrupts autocorrelation patterns. Lorek (2014) therefore advocates research towards developing a new univariate model for the prediction of quarterly earnings.
In this paper we introduce a new model based on epsilon support vector regression (ε-SVR) (Smola and Schölkopf 2004; Vapnik 1995), termed the quarterly earnings prediction (QEP SVR) model. We choose ε-SVR as a supervised learning algorithm because of (1) the guarantee of finding a globally optimal solution, (2) a sparse solution space, meaning only some training points contribute to the solution, and (3) the use of dot products to facilitate efficient computation of nonlinear solutions (Thissen et al. 2003). Furthermore, ε-SVR has been shown to generally perform well in many real-world applications (Hastie et al. 2009). Initial trials were also conducted using random forests (Breiman 2001), gradient boosting (Friedman 2001), and Gaussian processes (Rasmussen and Williams 2006) as base supervised learners for the QEP SVR model. However, utilizing ε-SVR yielded the highest predictive accuracy.
Our QEP SVR model retains the univariate character of the BR ARIMA model, meaning all predictive features are derivable from historic quarterly earnings series. Unlike the BR ARIMA model, however, the QEP SVR model is fitted using the historic quarterly earnings of multiple firms. The objective of this paper is to analyze and compare the predictive accuracy of the QEP SVR model to that of the state-of-the-art BR ARIMA model under a multitude of conditions.
The rest of the paper proceeds as follows. Section 2 familiarizes the reader with the basics of ε-SVR. Section 3 then describes the QEP SVR model in detail. This is followed by an explanation of the research method for comparing the QEP SVR model to the BR ARIMA model in Sect. 4. The experimental results are discussed in Sect. 5, while Sect. 6 offers concluding remarks.

Epsilon support vector regression (ε-SVR)
ε-SVR (Smola and Schölkopf 2004; Vapnik 1995) is an integral part of the QEP SVR model; its key ideas are introduced in this section. Consider training data of the form $\{(x_1, y_1), \ldots, (x_n, y_n)\} \subset X \times \mathbb{R}$, where $X$ is the input space (e.g. $X = \mathbb{R}^d$), i.e. $x_i \in X$ and $y_i \in \mathbb{R}$ for $i = 1, \ldots, n$. The goal of ε-SVR is to fit a model $f(x)$ to the training data such that (1) the deviation of each training target $y_i$ from $f(x_i)$ is at most ε, and (2) $f(x)$ is as flat as possible. The procedure for achieving this is introduced first for cases where $f(x)$ is linear, followed by an extension to nonlinear models.
A linear function is stipulated as

$$f(x) = \langle w, x \rangle + b,$$

where $w \in X$, $b \in \mathbb{R}$, and $\langle w, x \rangle$ denotes the dot product of $w$ and $x$. For this function, flatness refers to $w$ being small. ε-SVR fits a linear function to training data by solving the following convex optimization problem:

$$\min \; \tfrac{1}{2}\lVert w \rVert^2 \quad \text{subject to} \quad \begin{cases} y_i - \langle w, x_i \rangle - b \le \varepsilon \\ \langle w, x_i \rangle + b - y_i \le \varepsilon, \end{cases}$$

where $\lVert w \rVert^2 = \langle w, w \rangle$. Since satisfying these constraints may be infeasible, slack variables $\xi_i$ and $\xi_i^*$ are introduced:

$$\min \; \tfrac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*) \quad \text{subject to} \quad \begin{cases} y_i - \langle w, x_i \rangle - b \le \varepsilon + \xi_i \\ \langle w, x_i \rangle + b - y_i \le \varepsilon + \xi_i^* \\ \xi_i, \xi_i^* \ge 0. \end{cases}$$

Here, deviations only contribute to the total error when $|f(x_i) - y_i| > \varepsilon$. The trade-off between attaining a flat $f$ and not allowing deviations greater than ε is controlled by the constant $C > 0$. However, the above optimization problem is usually solved in its dual formulation. This involves the construction of a Lagrange function, $L$:

$$L = \tfrac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*) - \sum_{i=1}^{n} (\eta_i \xi_i + \eta_i^* \xi_i^*) - \sum_{i=1}^{n} \alpha_i (\varepsilon + \xi_i - y_i + \langle w, x_i \rangle + b) - \sum_{i=1}^{n} \alpha_i^* (\varepsilon + \xi_i^* + y_i - \langle w, x_i \rangle - b),$$

where the variables $\alpha_i, \alpha_i^*, \eta_i, \eta_i^* \ge 0$ are Lagrange multipliers. To find the optimal solution, the partial derivatives of $L$ with respect to the variables $w$, $b$, $\xi_i$ and $\xi_i^*$ are set to 0:

$$\partial_b L = \sum_{i=1}^{n} (\alpha_i^* - \alpha_i) = 0, \quad \partial_w L = w - \sum_{i=1}^{n} (\alpha_i - \alpha_i^*)\, x_i = 0, \quad \partial_{\xi_i} L = C - \alpha_i - \eta_i = 0, \quad \partial_{\xi_i^*} L = C - \alpha_i^* - \eta_i^* = 0.$$

The substitution of these equations into $L$ gives the dual optimization problem, in which the multipliers $\eta_i$ and $\eta_i^*$ vanish:

$$\max \; -\tfrac{1}{2} \sum_{i,j=1}^{n} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)\, \langle x_i, x_j \rangle - \varepsilon \sum_{i=1}^{n} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{n} y_i\, (\alpha_i - \alpha_i^*) \quad \text{subject to} \quad \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) = 0 \;\text{ and }\; \alpha_i, \alpha_i^* \in [0, C].$$

The optimization problem can therefore be stipulated in terms of dot products between the training data. This observation is key for the subsequent extension of ε-SVR to nonlinear $f(x)$. Since rearranging $\partial_w L$ gives $w = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*)\, x_i$, it follows that $f(x)$ can be rewritten as a linear combination of the training inputs $x_i$:

$$f(x) = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*)\, \langle x_i, x \rangle + b.$$

ε-SVR is also capable of fitting nonlinear $f(x)$ to training data. This requires a mapping of the input space $X$ to another space, $X^*$. The dual optimization problem specified above could then be solved for the modified training data $\{(x_1^*, y_1), \ldots, (x_n^*, y_n)\} \subset X^* \times \mathbb{R}$. However, computing dot products in $X^*$ may be computationally infeasible. Instead, a kernel function $K$ can be used to compute the dot products $\langle x_i^*, x_j^* \rangle$ specified in the optimization problem from $X$ directly, avoiding the explicit transformation $X \mapsto X^*$. The dual optimization problem is thus rewritten as

$$\max \; -\tfrac{1}{2} \sum_{i,j=1}^{n} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)\, K(x_i, x_j) - \varepsilon \sum_{i=1}^{n} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{n} y_i\, (\alpha_i - \alpha_i^*) \quad \text{subject to} \quad \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) = 0 \;\text{ and }\; \alpha_i, \alpha_i^* \in [0, C],$$

where flatness of $f(x)$ is maximized in $X^*$ rather than in $X$. The fitted nonlinear function is stipulated as follows:

$$f(x) = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*)\, K(x_i, x) + b.$$

Finally, the variable $b$ can be estimated using the Karush-Kuhn-Tucker conditions (Karush 1939; Kuhn and Tucker 1951), which are not discussed further.
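Once the dual coefficients $(\alpha_i - \alpha_i^*)$ and $b$ have been estimated by a solver, evaluating the fitted nonlinear function is a direct sum over the training points. The following NumPy sketch (function names are ours, and the coefficients are assumed to be already estimated) mirrors the final equation with a squared exponential kernel:

```python
import numpy as np

def rbf_kernel(a, b, gamma):
    # squared exponential kernel K(a, b) = exp(-gamma * ||a - b||^2)
    return np.exp(-gamma * np.sum((np.asarray(a) - np.asarray(b)) ** 2))

def svr_predict(x, support_vectors, dual_coef, b, gamma):
    # f(x) = sum_i (alpha_i - alpha_i^*) * K(x_i, x) + b
    return sum(d * rbf_kernel(sv, x, gamma)
               for sv, d in zip(support_vectors, dual_coef)) + b
```

Because the solution is sparse, only training points with a nonzero dual coefficient (the support vectors) contribute to this sum.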

The QEP SVR model
The notation used in the description of the QEP SVR model is as follows. Firstly, quarterly earnings series are assigned relative integer indices. For example, if $Y_q$, $q \in \mathbb{Z}$, denotes the fourth quarter (Q4) earnings for 2015 in a series $Y$, then $Y_{q-5}$ denotes the Q3 earnings for 2014. The unmodified quarterly earnings series are referred to as original (orig) series. However, differenced (diff) and quarterly differenced (qdiff) series are also considered in order to expose quarter-by-quarter and quarter-to-quarter earnings relationships, respectively (Lorek and Willinger 2011). Their derivation from orig series is shown below:

$$\mathrm{diff}_q = \mathrm{orig}_q - \mathrm{orig}_{q-1}, \qquad \mathrm{qdiff}_q = \mathrm{orig}_q - \mathrm{orig}_{q-4}.$$
This means that for a continuous orig series of length n , the derived diff and qdiff series have lengths n − 1 and n − 4 , respectively.
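The derivation of the diff and qdiff series can be sketched in a few lines of NumPy (the function name is ours):

```python
import numpy as np

def derive_series(orig):
    orig = np.asarray(orig, dtype=float)
    diff = orig[1:] - orig[:-1]    # lag-1 differences, length n - 1
    qdiff = orig[4:] - orig[:-4]   # lag-4 (seasonal) differences, length n - 4
    return diff, qdiff
```

For a series of length $n$, the returned arrays have lengths $n - 1$ and $n - 4$, matching the statement above.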
At a high level, the QEP SVR model can be thought of as a sequence of data manipulation steps. Given a historic quarterly earnings series, the QEP SVR model first extracts several explanatory variables (features). Next, a heuristic is used to estimate the predictive power of each feature; only the top features are selected. The retained features are subsequently scaled to normalize their range. An ε-SVR model is then applied to the scaled features, yielding a (scaled) one-step-ahead quarterly earnings prediction. Finally, depending on the configuration of the ε-SVR model, this prediction is scaled back into the range of the original series. This gives the true one-step-ahead quarterly earnings prediction.
The BR ARIMA model and the QEP SVR model always return one-step-ahead predictions when evaluated. We therefore obtain multi-step-ahead predictions via a series of one-step-ahead predictions. For example, making a two-step-ahead prediction for a historic series of length AvailableSeriesLength (ASL) ∈ ℕ⁺, $\mathrm{orig}_{-ASL}, \ldots, \mathrm{orig}_{-1}$, requires (1) evaluating a model for $\mathrm{orig}_{-ASL}, \ldots, \mathrm{orig}_{-1}$ to obtain the one-step-ahead prediction of $\mathrm{orig}_0$, denoted by $\mathrm{orig}_0^*$, and then (2) evaluating the same model again for $\mathrm{orig}_{-ASL+1}, \ldots, \mathrm{orig}_0^*$, yielding the desired two-step-ahead prediction of $\mathrm{orig}_1$, denoted by $\mathrm{orig}_1^*$. Note that the model parameters are the same in (1) and (2), i.e. the BR ARIMA and QEP SVR models are not refitted during multi-step-ahead predictions.
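The iterative scheme above can be sketched generically, treating the fitted model as an arbitrary callable that maps a window of the last ASL observations to a one-step-ahead prediction (a sketch; the function name is ours):

```python
def multi_step_forecast(series, one_step_model, steps):
    """Iterate a fixed one-step-ahead model: each prediction is appended
    to the window and the oldest observation is dropped, without refitting."""
    window = list(series)
    preds = []
    for _ in range(steps):
        p = one_step_model(window)
        preds.append(p)
        window = window[1:] + [p]  # slide the window forward by one quarter
    return preds
```

The same loop serves both the BR ARIMA and the QEP SVR models, since only the one-step callable differs.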
The QEP SVR model has two operations: fit and predict. Fitting is the process of estimating the QEP SVR model's parameters from a set of historic quarterly earnings series, while the predict operation uses the model parameters to make one-step-ahead quarterly earnings forecasts. We describe the QEP SVR model in the context of these two operations for the remainder of this section, hereby refining the high-level approach outlined in the previous paragraph.
We define both the fitting and prediction operations of the QEP SVR model in pseudocode. The notation i[c] refers to an item i in a collection c. Furthermore, the notation x ← e denotes the assignment of the value of an expression e to a variable x. All referenced functions are prefixed by the name of the module they belong to. There are four modules, Extraction, Selection, Scaling and SVR, each of which encapsulates a specific type of data manipulation. While the predict operation is explained first, some functions and parameters referenced by both operations are explained more closely in the subsequent description of the fit operation.
The input origSeriesCollection of the predict operation is a set of m continuous orig series. All series in origSeriesCollection are of equal length AvailableSeriesLength (ASL) ∈ ℕ⁺ and have the form $\mathrm{orig}_{-ASL}, \ldots, \mathrm{orig}_{-1}$. The first step of the predict operation is to extract the diff and qdiff series corresponding to each of the m orig series in the input. This is done via the fTransform function of the Extraction module, yielding an output matrix $X_1 \in \mathbb{R}^{m \times (3 \cdot ASL - 5)}$. Each of the m rows of $X_1$ contains the $3 \cdot ASL - 5$ elements of the orig, diff and qdiff series for a single input series. Since the elements of each row are aligned, the column indices of $X_1$ are the set

$$\{\mathrm{orig}_{-ASL}, \ldots, \mathrm{orig}_{-1},\ \mathrm{diff}_{-(ASL-1)}, \ldots, \mathrm{diff}_{-1},\ \mathrm{qdiff}_{-(ASL-4)}, \ldots, \mathrm{qdiff}_{-1}\}.$$

These indices are also referred to as features.
The next step is to remove columns from $X_1$ corresponding to features with low predictive power. This is done using the transform function of the Selection module. It requires a model parameter, selectedFeatures, that specifies the set of k features to retain. Note that all model parameters in the modelParams collection are determined using the QEP SVR model's fit operation. Subsequently, each of the remaining columns in $X_2 \in \mathbb{R}^{m \times k}$ is scaled by the transform function of the Scaling module. This produces a scaled matrix, $X_3 \in \mathbb{R}^{m \times k}$. The scaling of each column is controlled by the model parameter scaleParamsX.
Next, the predict function of the SVR module uses the model parameter svrParams to estimate the value of a specific target variable for each row of $X_3$. The result, $Y_1 \in \mathbb{R}^{m \times 1}$, is then subject to inverse scaling by the invTransform function of the Scaling module to produce $Y_2 \in \mathbb{R}^{m \times 1}$. Note that for a given matrix X and constant scaling parameters P, the application of Scaling.invTransform to the output of Scaling.transform(X, P) (and vice versa) yields the original matrix, X.
The final step of the predict operation is the application of the Extraction module's invTTransform function to $Y_2$. This ensures that the output, $Y_3 \in \mathbb{R}^{m \times 1}$, consists of one-step-ahead predictions for the m input orig series in origSeriesCollection, since the SVR.predict function may have returned (scaled) predictions for the corresponding diff or qdiff series instead. The Extraction.invTTransform function is notified of this through the model parameter targetVar, and uses the previously determined $X_1 \in \mathbb{R}^{m \times (3 \cdot ASL - 5)}$ to compute the following:

$$Y_3 = \begin{cases} Y_2 & \text{if targetVar is } \mathrm{orig}_0 \\ Y_2 + X_1^{\mathrm{orig}_{-1}} & \text{if targetVar is } \mathrm{diff}_0 \\ Y_2 + X_1^{\mathrm{orig}_{-4}} & \text{if targetVar is } \mathrm{qdiff}_0, \end{cases}$$

where the notation $X_1^{\mathrm{orig}_{-q}}$ denotes the column of $X_1$ corresponding to the feature $\mathrm{orig}_{-q}$.

The fit operation determines model parameters for a specified orig series length, ASL, for which one-step-ahead forecasts are to subsequently be made via the predict operation. Fitting is controlled by hyperparameters (hyperParams). As for the predict operation, the input origSeriesCollection is a set of continuous orig series. For the fit operation however, each series must have a length of at least ASL + 1 and lengths are not required to be equal.
The first step of the fit operation is to decompose the series in origSeriesCollection. Conceptually, this is done by sliding a window of length ASL + 1 along each input series, one step at a time. For example, given ASL = 6, a single input series $\mathrm{orig}_1, \ldots, \mathrm{orig}_8$ would be split into two series: $\mathrm{orig}_1, \ldots, \mathrm{orig}_7$ and $\mathrm{orig}_2, \ldots, \mathrm{orig}_8$. Assume this yields a total of s series, each of the form $\mathrm{orig}_{-ASL}, \ldots, \mathrm{orig}_0$. Next, the first ASL elements of each of these series are assigned to a row of the matrix $X_1 \in \mathbb{R}^{s \times ASL}$, while the last element of each series is assigned to the corresponding row in $Y_1 \in \mathbb{R}^{s \times 1}$. This decomposition is performed by the Extraction.windowTransform function.
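The sliding-window decomposition for a single series can be sketched as follows (a sketch of the idea, not the paper's code; the function name is ours):

```python
def window_transform(series, asl):
    # slide a window of length asl + 1 along the series, one step at a time
    windows = [series[i:i + asl + 1] for i in range(len(series) - asl)]
    X1 = [w[:asl] for w in windows]  # first asl elements -> feature rows
    Y1 = [w[asl] for w in windows]   # last element -> target
    return X1, Y1
```

A series of length 8 with ASL = 6 yields exactly the two windows described in the example above.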
The fTransform function of the Extraction module also accepts orig series in matrix form (in addition to set form, as in the predict operation). It uses $X_1$ to create a feature matrix $X_2 \in \mathbb{R}^{s \times (3 \cdot ASL - 5)}$. The target matrix $Y_2 \in \mathbb{R}^{s \times 1}$ is computed by the Extraction.tTransform function using $X_2$, $Y_1$ and the hyperparameter targetVar as follows:

$$Y_2 = \begin{cases} Y_1 & \text{if targetVar is } \mathrm{orig}_0 \\ Y_1 - X_2^{\mathrm{orig}_{-1}} & \text{if targetVar is } \mathrm{diff}_0 \\ Y_1 - X_2^{\mathrm{orig}_{-4}} & \text{if targetVar is } \mathrm{qdiff}_0, \end{cases}$$

where $X_2^{\mathrm{orig}_{-q}}$ denotes the column of $X_2$ corresponding to the feature $\mathrm{orig}_{-q}$. Note that since targetVar is also required as a model parameter by the predict operation, it is added to the modelParams collection returned by the fit operation.
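The target transform (and, by symmetry, its inverse used in the predict operation) reduces to a case distinction on targetVar. A minimal sketch, with our own naming and a dictionary standing in for the relevant columns of $X_2$:

```python
def t_transform(x_cols, y1, target_var):
    # derive the training target from the one-step-ahead earnings y1;
    # x_cols maps feature names such as "orig_-1" to (columns of) X_2
    if target_var == "orig_0":
        return y1
    if target_var == "diff_0":
        return y1 - x_cols["orig_-1"]
    if target_var == "qdiff_0":
        return y1 - x_cols["orig_-4"]
    raise ValueError("unknown target variable: " + target_var)
```

The inverse transform of the predict operation adds the same columns back instead of subtracting them.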
The next step is to choose the $k \in \mathbb{N}^+$ features in $X_2$ with the highest estimated explanatory power for predicting $Y_2$. This is performed by the fit function of the Selection module. A heuristic for explanatory power is the mutual information score (Cover and Thomas 1991) of a feature with the target variable. It is defined as

$$I(A; B) = \int \int P_{(A,B)}(a, b) \, \log \frac{P_{(A,B)}(a, b)}{P_A(a)\, P_B(b)} \, da \, db,$$

where $A$ and $B$ are continuous random variables, $P_{(A,B)}$ denotes the joint probability density function of $A$ and $B$, and $P_A$ and $P_B$ are the marginal probability density functions of $A$ and $B$, respectively. Intuitively, mutual information describes the degree of uncertainty reduction in $A$ through knowledge of $B$. A higher mutual information score therefore suggests greater explanatory power. Once the set of features selectedFeatures is determined, Selection.transform removes all unimportant features from $X_2$ to produce $X_3 \in \mathbb{R}^{s \times k}$.
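The paper does not specify which estimator of mutual information is used for continuous data; one simple option is a histogram-based estimate, sketched below with our own function names:

```python
import numpy as np

def mutual_info(a, b, bins=8):
    # Histogram-based estimate of I(A; B): discretize both variables,
    # then sum p(a,b) * log(p(a,b) / (p(a) * p(b))) over non-empty cells.
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    pab = joint / joint.sum()
    pa = pab.sum(axis=1, keepdims=True)  # marginal of A
    pb = pab.sum(axis=0, keepdims=True)  # marginal of B
    nz = pab > 0
    return float(np.sum(pab[nz] * np.log(pab[nz] / (pa @ pb)[nz])))

def select_features(X, y, k, bins=8):
    # rank columns of X by estimated mutual information with y; keep the top k
    scores = [mutual_info(X[:, j], y, bins) for j in range(X.shape[1])]
    return sorted(np.argsort(scores)[-k:].tolist())
```

A feature strongly related to the target should receive a noticeably higher score than an independent one.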
Subsequently, the columns of $X_3$ and $Y_2$ are scaled. The type of scaling is determined by the scaleTypeX and scaleTypeY hyperparameters, respectively. These are either set to none, Gaussian or quantile Gaussian. The fit function of the Scaling module determines the scaling parameters scaleParamsX and scaleParamsY, which are specific to the chosen scaling type (scaleParamsX contains scaling parameters for each of the k columns of $X_3$). The Scaling module's transform function infers the scaling type from the scaling parameters and applies the scaling to produce $X_4 \in \mathbb{R}^{s \times k}$ and $Y_3 \in \mathbb{R}^{s \times 1}$.
If the scaling type is none, the scaling parameters are an empty set. Applying this type of scaling has no effect on the input matrix. In the case of Gaussian scaling, the scaling parameters are the sample mean and sample standard deviation of each column. The transform function of the Scaling module uses these parameters to replace each element of an input matrix $X$ with the z-score

$$z_{i,j} = \frac{x_{i,j} - \bar{x}_j}{s_j},$$

where $x_{i,j}$ is the element in the $i$th row and $j$th column of $X$, and $\bar{x}_j$ and $s_j$ are the sample mean and sample standard deviation of the $j$th column of $X$, respectively. Conversely, calling the Scaling.invTransform function (referenced in the predict operation) with Gaussian scaling parameters has the effect of replacing each element $x_{i,j}$ with $s_j x_{i,j} + \bar{x}_j$. Since Gaussian scaling depends on sample means, it is susceptible to outliers. Quantile Gaussian scaling uses a deterministic rank-based inverse normal transformation (Beasley et al. 2009), which is much more robust in the presence of outliers. The scaling parameters are a set of functions $Q_j$, where $Q_j : [0, 1] \mapsto \mathbb{R}$ is a quantile function for the $j$th column of the matrix to be scaled. In this case, applying Scaling.transform replaces each element $x_{i,j}$ of an input matrix $X$ with $\Phi^{-1}(Q_j^{-1}(x_{i,j}))$, where $Q_j^{-1}$ is the inverse quantile function for the $j$th column and $\Phi^{-1}$ is the inverse cumulative distribution function of a standard normal distribution. The application of Scaling.invTransform replaces each $x_{i,j}$ with $Q_j(\Phi(x_{i,j}))$.

After the completion of scaling, the svrParams model parameter is determined using the fit function of the SVR module. The SVR module performs ε-SVR, which was introduced in Sect. 2. In this case $X_4$ and $Y_3$ are considered as training data of the form $\{(X_4^{(i)}, Y_3^{(i)}) \mid i = 1, \ldots, s\}$, where the notation $X_4^{(i)}$ denotes the $i$th row of $X_4$. ε-SVR fits a model $f : \mathbb{R}^k \mapsto \mathbb{R}$ to this training data, i.e. it learns the mapping from each row of $X_4$ to the corresponding element of $Y_3$. In the context of the QEP SVR model, $f$ has the functional form

$$f(X) = \sum_{i=1}^{s} (\alpha_i - \alpha_i^*)\, K(X_4^{(i)}, X) + b,$$

where $X \in \mathbb{R}^k$, $\alpha_i$ and $\alpha_i^*$ are Lagrange multipliers associated with the $i$th training point and $b$ is a constant. The kernel function $K$ is chosen to be the squared exponential (Rasmussen and Williams 2006), stipulated as

$$K(X, X') = \exp\!\left(-\frac{\lVert X - X' \rVert^2}{2\sigma^2}\right),$$

where $X, X' \in \mathbb{R}^k$, $\lVert X - X' \rVert$ denotes the Euclidean norm of $X - X'$, and the hyperparameter $\sigma \in \mathbb{R}$ is included in the hyperParams collection provided as an input to the fit operation. As specified in Sect. 2, the value of $b$ can be estimated using the Karush-Kuhn-Tucker conditions (Karush 1939; Kuhn and Tucker 1951), while the Lagrange multipliers are found by solving the reformulated dual optimization problem:

$$\max \; -\tfrac{1}{2} \sum_{i,j=1}^{s} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)\, K(X_4^{(i)}, X_4^{(j)}) - \varepsilon \sum_{i=1}^{s} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{s} Y_3^{(i)} (\alpha_i - \alpha_i^*) \quad \text{subject to} \quad \sum_{i=1}^{s} (\alpha_i - \alpha_i^*) = 0 \;\text{ and }\; \alpha_i, \alpha_i^* \in [0, C].$$

Recall that the constant $C > 0$ controls the trade-off between attaining a flat $f$ and not allowing deviations greater than ε. Both ε and $C$ are hyperparameters that must be passed to the fit operation. The set of Lagrange multipliers $\{\alpha_i, \alpha_i^* \mid i = 1, 2, \ldots, s\}$, the training points $\{X_4^{(i)} \mid i = 1, 2, \ldots, s\}$ and the constant $b$ are assigned to the collection svrParams. Finally, selectedFeatures, scaleParamsX, scaleParamsY, svrParams as well as the hyperparameter targetVar are returned by the fit operation as the set of model parameters required by the predict operation of the QEP SVR model.
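The Gaussian scaling and its inverse form an exact round trip, which is what allows the predict operation to map scaled predictions back to the original range. A minimal NumPy sketch (function names are ours):

```python
import numpy as np

def gaussian_fit(X):
    # scaling parameters: per-column sample mean and standard deviation
    return X.mean(axis=0), X.std(axis=0, ddof=1)

def gaussian_transform(X, params):
    mean, std = params
    return (X - mean) / std          # replace each element with its z-score

def gaussian_inv_transform(Z, params):
    mean, std = params
    return Z * std + mean            # undo the z-scoring
```

Applying the inverse transform to the output of the transform (with the same parameters) recovers the original matrix, as required of the Scaling module.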

Research method
At any point during their practical application, the BR ARIMA model and the QEP SVR model are in one of two phases: development or operation. When in operation, a model can be evaluated. Evaluating a model means passing it a continuous historic orig series, $\mathrm{orig}_{-ASL}, \ldots, \mathrm{orig}_{-1}$, for which it returns a one-step-ahead prediction of $\mathrm{orig}_0$, denoted by $\mathrm{orig}_0^*$. As mentioned previously, both models take an iterative approach to making multi-step-ahead predictions, during which a model is not refitted. Model parameters therefore cannot be modified during operation.
The development phase consists of any activities that prepare a model for operation. In the case of the BR ARIMA model, the values of $\phi$ (the autoregressive parameter) and $\theta$ (the moving average parameter) are firm-specific. This means that they are estimated from the single orig series for which predictions are to be made (Lorek and Willinger 2011). Once $\phi$ and $\theta$ are determined, the BR ARIMA model is immediately evaluated to obtain predictions. Conversely, the hyperparameters and model parameters of the QEP SVR model are estimated from the historic data of a collection of firms. The model can then be evaluated for multiple firms before entering another development phase.
The predictive accuracy of the QEP SVR model is compared to that of the BR ARIMA model under different experiment conditions. A condition is described by two variables: available series length (ASL) and prediction steps (PS). ASL is the length of the series for which predictions are made during operation, while PS specifies how many quarters into the future earnings should be predicted (i.e. given $\mathrm{orig}_{-ASL}, \ldots, \mathrm{orig}_{-1}$, the element $\mathrm{orig}_{PS-1}$ is to be predicted). A total of 28 experiment conditions, (ASL, PS) ∈ {6, …, 12} × {1, …, 4}, are considered. The chosen measure of predictive accuracy is mean absolute percentage error (MAPE) (Makridakis et al. 1982). It is defined as

$$\mathrm{MAPE} = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{A_i - F_i}{A_i} \right|,$$

where $n$ is the number of predictions, and $F_i$ and $A_i$ denote the forecasted and actual quarterly earnings, respectively. As in Lorek and Willinger (2011), prediction errors exceeding 100 percent are truncated to 100 percent. This is done to avoid the effects of explosive prediction errors.
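The truncated MAPE can be sketched directly from the definition (the function name is ours):

```python
import numpy as np

def truncated_mape(actual, forecast, cap=100.0):
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    errors = np.abs((actual - forecast) / actual) * 100.0
    # truncate explosive percentage errors at the cap before averaging
    return float(np.mean(np.minimum(errors, cap)))
```

The truncation matters: a single near-zero actual earnings figure would otherwise dominate the average.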
The data used for the experiment consists of the orig quarterly earnings series of 117 companies across the German DAX, MDAX, SDAX and TecDAX stock market indices. Each series contains 24 consecutive quarterly earnings from Q1 2012 to Q4 2017, giving a total of 2808 quarterly earnings. Table 1 summarizes the distribution of quarterly earnings and yearly book value of total assets of the companies in the experiment data. All values are in millions of Euros, to the nearest hundred thousand.
The main industries of companies in the experiment data are highlighted in Table 2. Industry classification is performed according to the International Standard Industrial Classification of All Economic Activities (ISIC) system (United Nations, 2008).
The MAPE values of the QEP SVR model are calculated for all considered experiment conditions by executing GetQepErrors. This operation implements tenfold cross-validation. The input S is therefore a set of 10 disjoint subsets (folds), $S_1, \ldots, S_{10}$, of the 117 series in the experiment data. Each fold is of roughly equal size. The superscript notation (a : b) slices all series in a set by removing earnings at quarters before a and after b. All MAPE values are calculated for predictions of quarterly earnings in predYear only.
Figure 1 illustrates four steps of how GetQepErrors makes predictions for predYear = 2016 at ASL = 8 and fold = 10, assuming each fold $S_i$ were to consist of a single series. In the first step, one to four-step-ahead predictions are made for $S_{10}$, while only a one-step-ahead prediction can be made for $S_{10}$ in the fourth step. The hyperparameters required by the fit operation of the QEP SVR model are determined once for all experiment conditions. This is done by minimizing the MAPE values for predictions of quarterly earnings in 2016:

optimalHyperParams ← argmin_hyperParams GetQepErrors(S, 2016, hyperParams).

The set of optimal (lowest) MAPE values found during hyperparameter optimization are referred to as validation errors:

validationErrs ← GetQepErrors(S, 2016, optimalHyperParams).

However, a more accurate estimate of the QEP SVR model's predictive accuracy is obtained by using the optimized hyperparameters to make predictions for quarterly earnings in 2017. This is because the data for 2017 is not observed during hyperparameter optimization. These errors are referred to as testing errors and are obtained as follows:

testingErrs ← GetQepErrors(S, 2017, optimalHyperParams).

The MAPE values for the BR ARIMA model are also calculated for predictions of quarterly earnings in 2016 and 2017. These are obtained by calling GetBrArimaErrors(experimentData, 2016) and GetBrArimaErrors(experimentData, 2017), where experimentData denotes the set of all 117 orig series. The BR.fit function determines the values of $\phi$ and $\theta$ by minimizing the sum of squared disturbance terms. MAPE values are compared on a per-company basis. This means that the errors returned by GetBrArimaErrors and GetQepErrors are aggregated such that there are 117 MAPE values for the BR ARIMA model and the QEP SVR model under each of the 28 experiment conditions. Hypothesis testing is then performed to assess if the 117 MAPE values calculated for the QEP SVR model are significantly lower than those calculated for the BR ARIMA model under each condition, i.e. if the QEP SVR model has a significantly higher predictive accuracy than the BR ARIMA model. Since this involves paired samples, the paired t test (Kim 2015) and the Wilcoxon signed-rank test (Wilcoxon 1945) are used. In both cases, the one-tailed test is considered.
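The paired t statistic underlying this comparison is computed from the per-company differences in MAPE. A minimal sketch (our naming; obtaining the one-tailed p-value additionally requires the Student t distribution with n − 1 degrees of freedom, which is omitted here):

```python
import numpy as np

def paired_t_statistic(mape_br, mape_qep):
    # one-tailed paired t statistic on per-company MAPE differences;
    # positive values favor the QEP_SVR model (lower errors than BR ARIMA)
    d = np.asarray(mape_br, dtype=float) - np.asarray(mape_qep, dtype=float)
    return float(d.mean() / (d.std(ddof=1) / np.sqrt(len(d))))
```

The Wilcoxon signed-rank test replaces the mean difference with signed ranks of the same per-company differences, which makes it robust to non-normal error distributions.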
For a given condition, let $\mu_{BR}$ and $\mu_{QEP}$ denote the mean of the 117 MAPE values calculated for the BR ARIMA and QEP SVR models, respectively. Similarly, let $M_{BR}$ and $M_{QEP}$ denote the median of the 117 MAPE values for each model. The null and alternative hypotheses of the paired t test, $H_0^T$ and $H_1^T$, are stated as

$$H_0^T: \mu_{QEP} \ge \mu_{BR}, \qquad H_1^T: \mu_{QEP} < \mu_{BR},$$

while the null and alternative hypotheses of the Wilcoxon signed-rank test, $H_0^W$ and $H_1^W$, are

$$H_0^W: M_{QEP} \ge M_{BR}, \qquad H_1^W: M_{QEP} < M_{BR}.$$

Results
The hyperparameter values that minimize the validation errors are shown in Table 3. Table 4 shows the results of comparing the validation and testing errors of the QEP SVR model to the corresponding prediction errors of the BR ARIMA model under each of the 28 experiment conditions.
Table 4 shows that 51 out of the 56 p-values calculated using the paired t test lie below a level of 0.05, while 52 out of the 56 p-values calculated using the Wilcoxon signed-rank test are below 0.05. We assume statistical significance under a condition if the p-values for a statistical test lie below 0.05 for predictions in both 2016 and 2017. The results in Table 4 therefore provide evidence for the rejection of $H_0^T$ and $H_0^W$ in favor of $H_1^T$ and $H_1^W$, respectively, under 24 of the 28 experiment conditions. The four conditions, (ASL, PS), with insufficient evidence for rejection are (9, 3), (9, 4), (11, 4) and (12, 4).
The 24 significant conditions include all those where PS ∈ {1, 2} (i.e. having short forecast horizons). This leads to the first result: the predictive accuracy of the QEP SVR model significantly exceeds that of the BR ARIMA model for short forecast horizons. This means the QEP SVR model is particularly suitable for companies considering short-term operational planning.

The features $\mathrm{orig}_{-4}$ and $\mathrm{diff}_{-3}$ have the highest mean selection probabilities and have a selection probability of at least 0.95 across all conditions. After becoming available for selection at ASL = 8, the feature $\mathrm{qdiff}_{-4}$ is selected with a probability of at least 0.88 under all conditions with ASL ≥ 8. The feature $\mathrm{qdiff}_{-1}$ is never selected for ASL ≥ 10. Only 7 different features have selection probabilities exceeding 0.1, despite the pool of available features increasing from 13 (when ASL = 6) to 31 (when ASL = 12). The fact that 6 out of these 7 features are already available for selection at ASL = 6 suggests that increasing ASL may have little effect on the predictive accuracy of the QEP SVR model.

Conclusion
Following the recommendations of Lorek (2014), we propose our QEP SVR model as a new univariate statistical model for the prediction of quarterly earnings. Empirical evidence shows that under 24 out of 28 tested conditions, the predictive accuracy of the QEP SVR model is significantly higher than that of the state-of-the-art BR ARIMA model. Furthermore, the significant conditions include all those considering one and two-step-ahead predictions (PS ∈ {1, 2}), as well as those for which only limited historic data is available (ASL ∈ {6, 7, 8}). The experimental results therefore advocate using the QEP SVR model instead of the BR ARIMA model for short-term operational planning, for recently founded companies and for companies that have recently made fundamental changes to their business model.
Since the hyperparameters and model parameters of the QEP SVR model are determined from the historic data of multiple companies, further research is needed to understand how the choice of these companies affects predictive accuracy. Factors of interest include the industry, size, and diversity of companies, as well as their relationship to those companies for which predictions are to be made. Other areas of research include studying the effect of condition-specific hyperparameter optimization and exploring methods for combining the QEP SVR model with other forecasting methods. As an example, predictions of the QEP SVR

Fig. 1 How the GetQepErrors operation makes predictions

Table 1 Distribution of quarterly earnings and book value of total assets

Table 3
Probabilities of features being selected (i.e. a probability of 1 implies that a feature is always selected) across all predictions of quarterly earnings in 2016 and 2017. Features are arranged from left to right in descending order of mean selection probability across all ASL values.