Regression Analysis and Estimating Regression Models

Guerard, John B.; Saxena, Anureet; Gultekin, Mustafa

doi:10.1007/978-3-030-43547-9_12

Abstract

A forecast is merely a prediction about the future values of data. Financial forecasts span a broad range of areas, and each of the forecasts is of interest to a number of people and departments in a firm. A sales manager may wish to forecast sales (either in units sold or revenues generated). This prediction is of interest to the operations (manufacturing) department in order to predict the materials and time needed to create the product. The corporate financial officer is interested in the amount of cash required to support the projected level of sales and how much available cash inflow he can eventually expect to pay financial costs, cover expansion programs, and provide cash payouts to investors. In short, good forecasting underlies the construction of an operational cash budget.

Notes

1. See chapter “Financing Current Operations and the Cash Budget.”
2. The reader is referred to an excellent statistical reference, such as Irwin Miller and J.E. Freund, Probability and Statistics for Engineers (Englewood Cliffs, NJ: Prentice-Hall, 1965).
3. Cochrane, D. and G.H. Orcutt. 1949. “Application of Least Squares Regression to Relationships Containing Autocorrelated Error Terms,” Journal of the American Statistical Association 44: 32–61.
4. The reader is referred to C.T. Clark and L.L. Schkade, Statistical Analysis for Administrative Decisions (Cincinnati: South-Western Publishing Company, 1979) for an excellent treatment of this topic.

References

Beaton, A. E., & Tukey, J. W. (1974). The fitting of power series, meaning polynomials, illustrated on band-spectroscopic data. Technometrics, 16, 147–185.
Belsley, D. A., Kuh, E., & Welsch, R. E. (1980). Regression diagnostics: Identifying influential data and sources of collinearity. New York: Wiley. Chapter 2.
Burns, A. F., & Mitchell, W. C. (1946). Measuring business cycles. New York: NBER.
Efron, B., Hastie, T., Johnstone, I., & Tibshirani, R. (2004). Least angle regression. The Annals of Statistics, 32, 407–499.
Guerard Jr., J. B. (2001). A note on the forecasting effectiveness of the U.S. leading economic indicators. Indian Economic Review, 36, 251–268.
Guerard Jr., J. B. (2004). The forecasting effectiveness of the U.S. leading economic indicators: Further evidence and initial G7 results. In P. Dua (Ed.), Business cycles and economic growth: An analysis using leading indicators (pp. 174–187). New York: Oxford University Press.
Guerard, J. B., Jr., & Schwartz, E. (2007). Quantitative corporate finance. New York: Springer.
Guerard, J. B., Xu, G., & Wang, Z. (2019). Portfolio and Investment Analysis with SAS®: Financial Modelling Techniques for Optimization. Cary, NC: SAS Press.
Hastie, T., Tibshirani, R., & Friedman, J. (2016). The elements of statistical learning: Data mining, inference, and prediction (2nd ed., 11th printing). New York: Springer.
Huber, P. J. (1973). Robust regression: Asymptotics, conjectures, and Monte Carlo. The Annals of Statistics, 1, 799–821.
Johnston, J. (1972). Econometric methods (2nd ed.). New York: McGraw-Hill.
Levanon, G., Manini, J.-C., Ozyildirim, A., Schaitkin, B., & Tanchua, J. (2015). Using financial indicators to predict turning points in the business cycle: The case of the leading economic index for the United States. International Journal of Forecasting, 31, 419–438.
Maronna, R. A., Martin, R. D., Yohai, V. J., & Salibian-Barrera, M. (2019). Robust statistics: Theory and practice (with R). New York: Wiley.
Mitchell, W. C. (1913). Business cycles. New York: Burt Franklin reprint.
Mitchell, W. C. (1927). Business cycles: The problem and its setting. New York: National Bureau of Economic Research.
Mitchell, W. C. (1951). What happens during business cycles: A progress report. New York: NBER.
Moore, G. H. (1961). Business cycle indicators, 2 volumes. Princeton: Princeton University Press.
Zarnowitz, V. (1992). Business cycles: Theory, history, indicators, and forecasting. Chicago: University of Chicago Press.

Author information

Authors and Affiliations

McKinley Capital Management, LLC, Anchorage, AK, USA
John B. Guerard Jr.
McKinley Capital Management, LLC, Stamford, CT, USA
Anureet Saxena
Kenan-Flagler Business School, University of North Carolina Chapel Hill, Chapel Hill, NC, USA
Mustafa Gultekin

Appendices

Appendix 1: Least Angle Regression

Efron et al. (2004) introduce LAR by discussing automatic model-building algorithms, including forward selection, all subsets, and backward elimination. One can measure goodness of fit in terms of predictive accuracy, but we use a different criterion: how well model subsets perform in terms of portfolio geometric means using out-of-sample variable weights. LAR is a variation of forward selection. Forward selection chooses the variable x_j1 with the largest absolute correlation with the response variable, y, and performs a simple linear regression of y on x_j1. The regression produces a residual vector orthogonal to x_j1, which is now treated as the response, or dependent variable. One projects the other predictor variables orthogonally to x_j1 and repeats the selection process. After K steps, this produces a set of predictor variables, x_j1, x_j2, x_j3, …, x_jK, used to construct a K-parameter linear model. Hastie et al. (2016) state that forward selection is an aggressive fitting technique that can be overly greedy, eliminating at the second step useful predictors that are correlated with x_j1.

Forward stagewise regression is a cautious version of forward selection and is geometrically related to LASSO. The response is centered to have mean zero, and each covariate is standardized to have mean zero and unit length, or

$$ {\sum}_{i=1}^n{y}_i=0,\kern0.75em {\sum}_{i=1}^n{x}_{ij}=0,\kern0.5em {\sum}_{i=1}^n{x}_{ij}^2=1,\kern0.5em j=1,\dots, m. $$

(37)
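The standardization in Eq. (37) can be illustrated with a minimal sketch. It assumes the covariates are held as a list of columns, and the function name `standardize` and the toy data are hypothetical, not taken from the chapter.

```python
import math

def standardize(columns, y):
    """Center y; center each covariate column and scale it to unit length."""
    n = len(y)
    y_mean = sum(y) / n
    y_centered = [v - y_mean for v in y]
    x_std = []
    for col in columns:
        mean = sum(col) / n
        centered = [v - mean for v in col]
        norm = math.sqrt(sum(v * v for v in centered))
        if norm == 0.0:
            raise ValueError("a constant column cannot be scaled to unit length")
        x_std.append([v / norm for v in centered])
    return x_std, y_centered

# Hypothetical toy data: two covariates observed over four periods.
x_cols = [[1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 4.0, 3.0]]
y_raw = [1.0, 3.0, 2.0, 6.0]
x_std, y_c = standardize(x_cols, y_raw)
```

After this step each column sums to zero and its squares sum to one, which is the condition written in Eq. (37).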

A prediction vector, $ \hat{\mu} $, can be written in matrix form:

$$ \hat{\mu}={\sum}_{j=1}^m{x}_j{\hat{\beta}}_j=X\hat{\beta} $$

(38)

The total squared error is

$$ S\left(\hat{\beta}\right)={\left\Vert y-\hat{\mu}\right\Vert}^2={\sum}_{i=1}^n{\left({y}_i-{\hat{\mu}}_i\right)}^2 $$

(39)

If $ T\left(\hat{\beta}\right) $ is the absolute norm of $ \hat{\beta} $, then

$$ T\left(\hat{\beta}\right)={\sum}_{j=1}^m\left|{\hat{\beta}}_j\right| $$

(40)
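As a small numerical illustration of Eqs. (38) to (40), the sketch below evaluates the prediction vector, the total squared error S, and the absolute norm T for a given coefficient vector. The function names and the toy numbers are hypothetical.

```python
def predict(columns, beta):
    """Prediction vector mu = X beta, Eq. (38), with X stored as a list of columns."""
    n = len(columns[0])
    return [sum(b * col[i] for col, b in zip(columns, beta)) for i in range(n)]

def squared_error(y, columns, beta):
    """Total squared error S(beta), Eq. (39)."""
    mu = predict(columns, beta)
    return sum((yi - mi) ** 2 for yi, mi in zip(y, mu))

def absolute_norm(beta):
    """Absolute (L1) norm T(beta), Eq. (40)."""
    return sum(abs(b) for b in beta)

# Hypothetical toy data with three observations and two covariates.
columns = [[1.0, 0.0, -1.0], [0.0, 1.0, -1.0]]
y = [2.0, -1.0, -1.0]
beta = [1.5, -0.5]

s_value = squared_error(y, columns, beta)  # residuals are 0.5, -0.5, 0.0
t_value = absolute_norm(beta)
```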

The LASSO chooses $ \hat{\beta} $ by minimizing $ S\left(\hat{\beta}\right) $ subject to a bound on $ T\left(\hat{\beta}\right) $. The LASSO tends to shrink the OLS coefficients toward zero; the shrinkage often improves predictive accuracy, and the LASSO is parsimonious, in that only a subset of the covariates have non-zero values of $ {\hat{\beta}}_j $. Stagewise, or forward stagewise, linear regression is an iterative process that begins with $ \hat{\mu}=0 $ and builds the regression function in successive small steps. If $ \hat{\mu} $ is the current stagewise estimate, let $ c\left(\hat{\mu}\right) $ be the vector of current correlations

$$ \hat{c}=c\left(\hat{\mu}\right)={X}^T\left(y-\hat{\mu}\right) $$

(41)

Then the next step of the stagewise algorithm is taken in the direction of the greatest current correlation

$$ \hat{j}=\underset{j}{\arg \max}\left|{\hat{c}}_j\right|\kern1em \mathrm{and}\kern1em \hat{\mu}\to \hat{\mu}+\upvarepsilon \cdot \operatorname{sign}\left({\hat{c}}_{\hat{j}}\right)\cdot {x}_{\hat{j}}. $$

(42)

The “big” choice $ \upvarepsilon =\left|{\hat{c}}_{\hat{j}}\right| $ corresponds to the classic forward selection technique, which can be overly greedy. The stagewise procedure instead uses a “small” choice of ε and reaches the final model through many small steps.
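The stagewise update in Eqs. (41) and (42) can be sketched as follows. This is a minimal illustration that assumes standardized covariates stored as columns; the step size, the step limit, the stopping rule (stop once the largest absolute correlation falls below ε), and the toy data are assumptions made here, not part of the chapter.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def forward_stagewise(columns, y, eps=0.01, n_steps=2000):
    """Forward stagewise regression with a fixed small step size eps."""
    m, n = len(columns), len(y)
    beta = [0.0] * m
    mu = [0.0] * n
    for _ in range(n_steps):
        resid = [yi - mi for yi, mi in zip(y, mu)]
        c = [dot(col, resid) for col in columns]    # current correlations, Eq. (41)
        j = max(range(m), key=lambda k: abs(c[k]))  # greatest current correlation
        if abs(c[j]) < eps:                         # assumed stopping rule
            break
        sign = 1.0 if c[j] > 0 else -1.0
        beta[j] += eps * sign                       # small step, Eq. (42)
        mu = [mi + eps * sign * xij for mi, xij in zip(mu, columns[j])]
    return beta, mu

# Hypothetical orthonormal, mean-zero covariates; y = 2*x1 + 0.5*x2 exactly.
x1 = [0.5, 0.5, -0.5, -0.5]
x2 = [0.5, -0.5, 0.5, -0.5]
y = [1.25, 0.75, -0.75, -1.25]
beta_hat, mu_hat = forward_stagewise([x1, x2], y)
```

In this toy case the coefficients approach 2 and 0.5 only after a few hundred steps of size 0.01, which illustrates why the procedure is described as cautious.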

LAR is a version of the stagewise procedure designed to accelerate the calculations. All coefficients start at zero. Like forward selection, LAR first finds the predictor most correlated with the response.

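LAR takes the largest step possible in the direction of this predictor until some other predictor has as much correlation with the current residual. LAR then proceeds in a direction equiangular between the two predictors until a third predictor becomes equally correlated with the residual and enters the “most correlated” set. It then proceeds equiangularly among the three predictors, that is, along the “least angle direction,” until a fourth predictor enters, and so on. LAR builds the regression model one covariate per step, so that after K steps only K of the $ {\hat{\beta}}_j\mathrm{s} $ are non-zero. In the case of two covariates, x₁ and x₂, where x₁ is the more correlated with the response, the first step from $ {\hat{\mu}}_0=0 $ is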

$$ {\hat{\mu}}_1={\hat{\mu}}_0+{\hat{\gamma}}_1{x}_1 $$

(43)

LAR chooses $ {\hat{\gamma}}_1 $ such that the residual $ {\overline{y}}_2-{\hat{\mu}}_1 $ is equally correlated with x₁ and x₂, where $ {\overline{y}}_2 $ is the projection of y onto the linear space spanned by x₁ and x₂. Efron et al. (2004) demonstrate that $ {\overline{y}}_2-{\hat{\mu}}_1 $ bisects the angle between x₁ and x₂, so that $ {c}_1\left({\hat{\mu}}_1\right)={c}_2\left({\hat{\mu}}_1\right) $. The next LAR estimate is

$$ {\hat{\mu}}_2={\hat{\mu}}_1+{\hat{\gamma}}_2{u}_2, $$

(44)

where u₂ is the unit vector lying along the bisector.

Subsequent LAR steps are taken along equiangular vectors. The covariate vectors are assumed to be linearly independent.

As in the stagewise procedure, LAR begins at $ {\hat{\mu}}_0=0 $ and builds $ \hat{\mu} $ in successive steps. If $ {\hat{\mu}}_A $ is the current estimate, then $ \hat{c}={X}^T\left(y-{\hat{\mu}}_A\right) $ is the vector of current correlations. The active set A is the set of indices corresponding to the covariates with the greatest absolute current correlation:

$$ \hat{C}=\underset{j}{\max}\left\{\left|{\hat{c}}_j\right|\right\}\kern1em \mathrm{and}\kern1em A=\left\{j:\left|{\hat{c}}_j\right|=\hat{C}\right\} $$

(45)

Let $ {s}_j=\operatorname{sign}\left\{{\hat{c}}_j\right\} $ for j ∈ A. For A a subset of the indices {1, 2, …, m}, define the matrix X_A = (…, s_jx_j, …), j ∈ A, and $ {g}_A={X}_A^T{X}_A $. Calculate $ {A}_A={\left({1}_A^T{g}_A^{-1}{1}_A\right)}^{-0.5} $, where 1_A is a vector of ones whose length equals the size of A.

Define the equiangular vector

$$ {u}_A={X}_A{w}_A $$

(46)

where $ {w}_A={A}_A{g}_A^{-1}{1}_A $. Compute the inner product vector a ≡ X^Tu_A. The next LAR step update is

$$ {\hat{\mu}}_{A+}={\hat{\mu}}_A+\hat{\gamma}{u}_A $$

where

$$ \hat{\gamma}={\underset{j\notin A}{\min}}^{+}\left\{\frac{\hat{C}-{\hat{c}}_j}{A_A-{a}_j},\frac{\hat{C}+{\hat{c}}_j}{A_A+{a}_j}\right\} $$

(47)

and min⁺ indicates that the minimum is taken over positive components only. Thus, $ \hat{\gamma} $ is the smallest positive value of γ such that a new index j joins the active set, giving A₊ = A ∪ {j}. The new maximum absolute correlation is $ {\hat{C}}_{+}=\hat{C}-\hat{\gamma}{A}_A $.
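A single LAR update following Eqs. (45) to (47) can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the helper names, the numerical tolerances, the small Gaussian-elimination solver, and the toy data are all hypothetical. The active set is recomputed from the current correlations at every call, and when no inactive covariate remains the step length is taken as the maximum correlation divided by A_A, which moves the estimate to the least squares fit on the active covariates.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def solve(g, b):
    """Solve g z = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    aug = [row[:] + [bi] for row, bi in zip(g, b)]
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(aug[r][i]))
        aug[i], aug[p] = aug[p], aug[i]
        for r in range(i + 1, k):
            f = aug[r][i] / aug[i][i]
            for col in range(i, k + 1):
                aug[r][col] -= f * aug[i][col]
    z = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(aug[i][j] * z[j] for j in range(i + 1, k))
        z[i] = (aug[i][k] - tail) / aug[i][i]
    return z

def lar_step(columns, y, mu, tol=1e-12):
    """One LAR update from the current estimate mu; returns (mu_new, active, gamma)."""
    m, n = len(columns), len(y)
    resid = [yi - mi for yi, mi in zip(y, mu)]
    c = [dot(col, resid) for col in columns]            # current correlations
    c_max = max(abs(cj) for cj in c)                    # maximum in Eq. (45)
    active = [j for j in range(m) if abs(c[j]) >= c_max - 1e-9]
    xa = [[math.copysign(1.0, c[j]) * v for v in columns[j]] for j in active]
    g = [[dot(u, v) for v in xa] for u in xa]           # g_A = X_A^T X_A
    g_inv_one = solve(g, [1.0] * len(active))
    a_a = 1.0 / math.sqrt(sum(g_inv_one))               # A_A
    w = [a_a * z for z in g_inv_one]
    u_a = [sum(w[k] * xa[k][i] for k in range(len(active))) for i in range(n)]  # Eq. (46)
    a = [dot(col, u_a) for col in columns]
    candidates = []
    for j in range(m):
        if j in active:
            continue
        for num, den in ((c_max - c[j], a_a - a[j]), (c_max + c[j], a_a + a[j])):
            if abs(den) > tol and num / den > tol:
                candidates.append(num / den)            # positive terms of Eq. (47)
    gamma = min(candidates) if candidates else c_max / a_a
    mu_new = [mi + gamma * ui for mi, ui in zip(mu, u_a)]
    return mu_new, active, gamma

# Hypothetical orthonormal, mean-zero covariates; y = 2*x1 + 0.5*x2 exactly.
x1 = [0.5, 0.5, -0.5, -0.5]
x2 = [0.5, -0.5, 0.5, -0.5]
y = [1.25, 0.75, -0.75, -1.25]
mu1, active1, gamma1 = lar_step([x1, x2], y, [0.0] * 4)
mu2, active2, gamma2 = lar_step([x1, x2], y, mu1)
```

In this toy case the first step moves along x1 alone until both correlations are equal, and the second step moves along the bisector of x1 and x2 and reproduces y, so two LAR steps do the work that forward stagewise needs many small steps to approximate.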

Appendix 2: The US Leading Economic Indicators

Let us follow The Conference Board components and their definitions, as of November 29, 2019:

BCI-01 Average Weekly Hours, Manufacturing

The average hours worked per week by production workers in manufacturing industries tend to lead the business cycle because employers usually adjust work hours before increasing or decreasing their workforce.

BCI-05 Average Weekly Initial Claims for Unemployment Insurance

The number of new claims filed for unemployment insurance is typically more sensitive than either total employment or unemployment to overall business conditions, and this series tends to lead the business cycle. It is inverted when included in the leading index; the signs of the month-to-month changes are reversed, because initial claims increase when employment conditions worsen (i.e., layoffs rise and new hirings fall).

BCI-08 Manufacturers’ New Orders, Consumer Goods and Materials (in 1982 $)

These goods are primarily used by consumers. The inflation-adjusted value of new orders leads actual production because new orders directly affect the level of both unfilled orders and inventories that firms monitor when making production decisions. The Conference Board deflates the current dollar orders data using price indexes constructed from various sources at the industry level and a chain-weighted aggregate price index formula.

BCI-130 ISM New Order Index

This index reflects the levels of new orders from customers. As a diffusion index, its value reflects the number of participants reporting increased orders during the previous month compared to the number reporting decreased orders. When the index has a reading of greater than 50, it is an indication that orders have increased during the past month. This index, therefore, tends to lead the business cycle. ISM new orders are based on a monthly survey conducted by the Institute for Supply Management (formerly known as the National Association of Purchasing Management). The Conference Board takes the normalized value of this index as a measure of its contribution to the LEI.

BCI-33 Manufacturers’ New Orders, Non-defense Capital Goods Excl. Aircraft (in 1982 $)

This index, combined with orders for aircraft (in inflation-adjusted dollars), is the producers’ counterpart to BCI-08.

BCI-29 Building Permits, New Private Housing Units

The number of residential building permits issued is an indicator of construction activity, which typically leads most other types of economic production.

BCI-19 Stock Prices, 500 Common Stocks

The Standard & Poor’s 500 stock index reflects the price movements of a broad selection of common stocks traded on the New York Stock Exchange. Increases (decreases) of the stock index can reflect both the general sentiments of investors and the movements of interest rates, which is usually another good indicator for future economic activity.

BCI-107 Leading Credit Index™

This index consists of six financial indicators: 2-year swap spread (real time); LIBOR 3-month less 3-month treasury bill yield spread (real time); debit balances in margin accounts at broker-dealers (monthly); AAII Investors Sentiment Bullish (%) less Bearish (%) (weekly); Senior Loan Officers C&I loan survey, bank tightening credit to large and medium firms (quarterly); and Security Repurchases (quarterly) from the Total Finance-Liabilities section of the Federal Reserve’s flow of funds report. Because of these financial indicators’ forward-looking content, the LCI leads economic activity.

BCI-129 Interest Rate Spread, 10-Year Treasury Bonds Less Federal Funds

The spread or difference between long and short rates is often called the slope of the yield curve. This series is constructed using the 10-year treasury bond rate and the federal funds rate, an overnight interbank borrowing rate. It is felt to be an indicator of the stance of monetary policy and general financial conditions because it rises (falls) when short rates are relatively low (high). When it becomes negative (i.e., short rates are higher than long rates and the yield curve inverts), its record as an indicator of recessions is particularly strong.

BCI-125 Avg. Consumer Expectations for Business and Economic Conditions

This index reflects changes in consumer attitudes concerning future economic conditions and, therefore, is the only indicator in the leading index that is completely expectations-based. It is an equally weighted average of consumer expectations of business and economic conditions using two questions, Consumer Expectations for Economic Conditions 12-months ahead from Surveys of Consumers conducted by Reuters/University of Michigan, and Consumer Expectations for Business Conditions 6-months ahead from Consumer Confidence Survey by The Conference Board. Responses to the questions concerning various business and economic conditions are classified as positive, negative, or unchanged.
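Two of the transformations described in the component definitions, the sign reversal applied to initial claims (BCI-05) and the interest rate spread (BCI-129), can be illustrated with a short sketch. The function names and all numbers are hypothetical, and simple month-to-month differences are assumed here; The Conference Board's actual change and standardization calculations are not specified in this appendix.

```python
def inverted_changes(series):
    """Month-to-month changes with the sign reversed, as described for BCI-05."""
    return [-(curr - prev) for prev, curr in zip(series, series[1:])]

def interest_rate_spread(ten_year, fed_funds):
    """10-year treasury rate less the federal funds rate, as described for BCI-129."""
    return [long_rate - short_rate for long_rate, short_rate in zip(ten_year, fed_funds)]

def inverted_curve_periods(spread):
    """Indices of periods in which the spread is negative (short rates above long rates)."""
    return [t for t, value in enumerate(spread) if value < 0]

# Hypothetical initial claims, in thousands: a rise in claims gives a negative signal.
claims = [220.0, 230.0, 225.0, 260.0]
claim_signal = inverted_changes(claims)

# Hypothetical rates, in percent.
ten_year = [2.5, 2.0, 1.75, 1.5]
fed_funds = [1.5, 2.0, 2.25, 1.0]
spread = interest_rate_spread(ten_year, fed_funds)
flagged = inverted_curve_periods(spread)
```

With these made-up inputs the claims signal turns negative in the months when claims rise, and only the third period is flagged as an inverted yield curve.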

Regression Appendix 3

Table 5 Regression Appendix 3: Regression Diagnostics of Table 3

About this chapter

Cite this chapter

Guerard, J.B., Saxena, A., Gultekin, M. (2021). Regression Analysis and Estimating Regression Models. In: Quantitative Corporate Finance. Springer, Cham. https://doi.org/10.1007/978-3-030-43547-9_12

DOI: https://doi.org/10.1007/978-3-030-43547-9_12
Published: 22 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-43546-2
Online ISBN: 978-3-030-43547-9
eBook Packages: History, History (R0)