Applied Time-Series Analysis in Marketing

  • Reference work entry
Handbook of Market Research

Abstract

Time-series models constitute a core component of marketing research and are applied to solve a wide spectrum of marketing problems. This chapter covers traditional and modern time-series models with applications in extant marketing research. We first introduce basic concepts and diagnostics, including stationarity tests (the augmented Dickey-Fuller test of unit roots) and autocorrelation plots via the autocorrelation function (ACF) and partial autocorrelation function (PACF). We then discuss single-equation time-series models such as autoregressive (AR), moving average (MA), and autoregressive moving average (ARMA) models with and without exogenous variables. Multiple-equation dynamic systems, including vector autoregressive (VAR) models together with generalized impulse response functions (GIRFs) and generalized forecast error variance decomposition (GFEVD), are then discussed in detail. Other relevant models such as generalized autoregressive conditional heteroskedasticity (GARCH) models are also covered. Finally, a case study accompanied by data and R code is provided to demonstrate detailed estimation steps of the key models covered in this chapter.

References

  • Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In B. N. Petrov & F. Csáki (Eds.), 2nd international symposium on information theory, Tsahkadsor, Armenia, USSR, September 2-8, 1971 (pp. 267–281). Akadémiai Kiadó: Budapest.

  • Bollerslev, T., Engle, R. F., & Wooldridge, J. M. (1988). A capital asset pricing model with time-varying covariances. Journal of Political Economy, 96(1), 116–131.

  • Bronnenberg, B. J., Mahajan, V., & Vanhonacker, W. R. (2000). The emergence of market structure in new repeat-purchase categories: The interplay of market share and retailer distribution. Journal of Marketing Research, 37(1), 16–31.

  • Colicev, A., Malshe, A., Pauwels, K., & O’Connor, P. (2018). Improving consumer mindset metrics and shareholder value through social media: The different roles of owned and earned media. Journal of Marketing, 82(1), 37–56.

  • Colicev, A., Kumar, A., & O’Connor, P. (2019). Modeling the relationship between firm and user generated content and the stages of the marketing funnel. International Journal of Research in Marketing, 36(1), 100–116.

  • De Haan, E., Wiesel, T., & Pauwels, K. (2016). The effectiveness of different forms of online advertising for purchase conversion in a multiple-channel attribution framework. International Journal of Research in Marketing, 33(3), 491–507.

  • Dekimpe, M. G., & Hanssens, D. M. (1995). The persistence of marketing effects on sales. Marketing Science, 14(1), 1–21.

  • Deleersnyder, B., Geyskens, I., Gielens, K., & Dekimpe, M. G. (2002). How cannibalistic is the Internet channel? A study of the newspaper industry in the United Kingdom and the Netherlands. International Journal of Research in Marketing, 19(4), 337–348.

  • Engle, R. F., Lilien, D. M., & Robins, R. P. (1987). Estimating time varying risk premia in the term structure: The ARCH-M model. Econometrica: Journal of the Econometric Society, 55(2), 391–407.

  • Engle, R. F., & Granger, C. W. (1987). Co-integration and error correction: Representation, estimation, and testing. Econometrica: Journal of the Econometric Society, 55(2), 251–276.

  • Esteban-Bravo, M., Vidal-Sanz, J. M., & Yildirim, G. (2017). Can retail sales volatility be curbed through marketing actions? Marketing Science, 36(2), 232–253.

  • Fischer, M., Shin, H. S., & Hanssens, D. M. (2016). Brand performance volatility from marketing spending. Management Science, 62(1), 197–215.

  • Franses, P. H., & Van Dijk, D. (1996). Forecasting stock market volatility using (non-linear) Garch models. Journal of Forecasting, 15(3), 229–235.

  • Franses, P. H., & Van Dijk, D. (2000). Non-linear time series models in empirical finance. Cambridge: Cambridge University Press.

  • Granger, C. W. (1969). Investigating causal relations by econometric models and cross-spectral methods. Econometrica: Journal of the Econometric Society, 37(3), 424–438.

  • Hannan, E. J., & Quinn, B. G. (1979). The determination of the order of an autoregression. Journal of the Royal Statistical Society: Series B: Methodological, 41, 190–195.

  • Hanssens, D. M., & Pauwels, K. H. (2016). Demonstrating the value of marketing. Journal of Marketing, 80(6), 173–190.

  • Ilhan, B. E., Kübler, R. V., & Pauwels, K. H. (2018). Battle of the brand fans: Impact of brand attack and defense on social media. Journal of Interactive Marketing, 43, 33–51.

  • Kireyev, P., Pauwels, K., & Gupta, S. (2016). Do display ads influence search? Attribution and dynamics in online advertising. International Journal of Research in Marketing, 33(3), 475–490.

  • Kwiatkowski, D., Phillips, P. C., Schmidt, P., & Shin, Y. (1992). Testing the null hypothesis of stationarity against the alternative of a unit root. Journal of Econometrics, 54(1–3), 159–178.

  • Lütkepohl, H. (2005). New introduction to multiple time series analysis. Berlin/New York: Springer Science & Business Media.

  • Nijs, V. R., Srinivasan, S., & Pauwels, K. (2007). Retail-price drivers and retailer profits. Marketing Science, 26(4), 473–487.

  • Pauwels, K. (2004). How dynamic consumer response, competitor response, company support, and company inertia shape long-term marketing effectiveness. Marketing Science, 23(4), 596–610.

  • Pauwels, K. H. (2017). Modern (multiple) time series models: The dynamic system. In Advanced methods for modeling markets (pp. 115–148). Cham: Springer.

  • Pauwels, K., Silva-Risso, J., Srinivasan, S., & Hanssens, D. M. (2004). New products, sales promotions, and firm value: The case of the automobile industry. Journal of Marketing, 68(4), 142–156.

  • Pauwels, K., Demirci, C., Yildirim, G., & Srinivasan, S. (2016). The impact of brand familiarity on online and offline media synergy. International Journal of Research in Marketing, 33(4), 739–753.

  • Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6(2), 461–464.

  • Slotegraaf, R. J., & Pauwels, K. (2008). The impact of brand equity and innovation on the long-term effectiveness of promotions. Journal of Marketing Research, 45(3), 293–306.

  • Srinivasan, S., Vanhuele, M., & Pauwels, K. (2010). Mind-set metrics in market response models: An integrative approach. Journal of Marketing Research, 47(4), 672–684.

  • Srinivasan, S., Rutz, O. J., & Pauwels, K. (2016). Paths to and off purchase: Quantifying the impact of traditional marketing and online consumer activity. Journal of the Academy of Marketing Science, 44(4), 440–453.

  • Van Dieijen, M., Borah, A., Tellis, G. J., & Franses, P. H. (2019). Big data analysis of volatility spillovers of brands across social media and stock markets. Industrial Marketing Management, 88, 465.

  • Yildirim, G., Wang, W., & Deleersnyder, B. (2020). Market turbulence following a major new product introduction: Is it really so bad? Working paper.

Author information

Correspondence to Gokhan Yildirim.

Appendix

Software Application

The purpose of this software application section is to show how analysts and modelers tackle real-world marketing problems using the time-series models covered in this chapter, such as ARIMA and VAR. We use R (https://www.r-project.org/), a free, open-source environment for statistical computing and data analysis that is widely adopted by academics and industry practitioners for data-rich problems.

Let’s start by walking into the following scenario:

  • ABC company operates in the kitchen appliance industry in an emerging market. The company so far has focused all its marketing efforts on offline flyer advertising and online Google AdWords. However, recent performance reports showed that ABC’s sales have not been reaching the management’s expectations.

  • The CMO, in preparation for a meeting with the CEO and CFO, is keen to know what ABC’s sales will look like in the next quarter. Further, he/she wonders to what extent marketing expenditures are effective in driving up sales. The CMO is also curious whether there is any potential for ABC to optimize its current marketing budget allocation.

As the director of the marketing analytics department, you are presented with ABC’s historical weekly sales and marketing expenditure on flyer advertising and Google AdWords advertising over a time span of 122 weeks (i.e., 122 observations). Having met with the CMO, you make a summary of the questions to be answered and your action plans as follows:

  (a) What would be the forecast of demand for the next 12 weeks?

    We are going to predict future sales using two approaches, ARIMA and multiple linear regression (MLR), and compare their estimation and prediction results.

  (b) What drives sales in the long run? What is the contribution of each marketing action to sales (i.e., return on marketing investment)?

    We are going to estimate a VAR model and perform FEVD to evaluate the relative importance of AdWords, flyers, and sales inertia in determining long-run sales.

  (c) To what extent do AdWords and flyers impact sales in the short versus long run?

    We are going to perform IRF analysis to evaluate the short- and long-run elasticities of each marketing action.

  (d) How should ABC allocate its marketing budget between AdWords and flyers to get the best result?

    We are going to use the long-run elasticities of AdWords and flyer marketing to determine the optimal resource allocation scheme for ABC.

Data Visualizations

Here is a brief preview of the first 10 rows of our dataset:

figure a

As you can observe, the firm has so far kept a relatively stable expenditure on AdWords (around 900 each week), while its expenditure on flyers has fluctuated much more. For example, during the first 10 weeks, the firm spent 17134.21 in week 2 and 12079.39 in week 7, and nothing in the remaining 8 weeks. To get a feel for the data patterns, it is good practice to inspect them visually by plotting sales, flyer expenditure, and Google AdWords expenditure, respectively, using ts.plot.

figure b
figure c
figure d
figure e
figure f
figure g

ARIMA Modeling

As elaborated in this chapter, we are going to estimate and predict sales using ARIMA following the procedure summarized below:

  • Perform unit root tests to check for nonstationary variables and take differences of the variables that are evolving.

  • Plot ACF and PACF to determine the order of lags and hence the specification of ARIMA.

  • Split the data into training and testing sets, estimate an ARIMA model using the training set, and predict sales for the testing set.

Additionally, we will estimate a multiple linear regression (MLR) model using the training set and predict sales for the testing set, so that we can compare the performance of the ARIMA and MLR methods.

Log Transformation

First, the time-series plots show a high level of data turbulence (or volatility), which, if not treated properly, can lead to misleading model results and interpretations. It is typical practice to take the logarithm of each variable as a preliminary step to dampen this variation:

figure h

Note that we add 1 to each variable before taking the log to avoid log(0), which equals negative infinity.
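As a minimal base-R sketch of this transformation (not the chapter's own listing), the snippet below applies log(x + 1) to the flyer spending pattern quoted above for the first 10 weeks. The variable names are hypothetical, and log1p and expm1 are used as assumed stand-ins for log(x + 1) and its inverse.

```r
# Flyer spending for the first 10 weeks as described in the text:
# non-zero only in weeks 2 and 7
flyer <- c(0, 17134.21, 0, 0, 0, 0, 12079.39, 0, 0, 0)

# log(x + 1) keeps zero-spend weeks finite, because log(0 + 1) = 0
l_flyer <- log(flyer + 1)

# log1p() is the numerically stable base-R equivalent of log(x + 1)
same_as_log1p <- isTRUE(all.equal(l_flyer, log1p(flyer)))

# expm1() computes exp(y) - 1 and therefore reverts the transformation
flyer_back <- expm1(l_flyer)

print(round(l_flyer, 3))
print(round(flyer_back, 2))
```

The same inverse, exp(y) − 1, is what reverts log-scale forecasts to the original sales scale later in the appendix.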

Stationarity Tests

It is critical for analysts to make sure that the data series being modeled are all stationary (instead of evolving) in order to obtain reliable model results. As introduced in the chapter, there are multiple tests for series stationarity, including the ADF, KPSS, and Phillips-Perron tests, which can be executed using the R functions adf.test, kpss.test, and pp.test, respectively. In this section, we demonstrate the procedure using the ADF test. Under the ADF test, the null and alternative hypotheses are:

  • H0: The data is not stationary

  • H1: The data is stationary

Note that for adf.test and pp.test, we can reject the null hypothesis that the variable is not stationary (i.e., with a unit root) if the p-value is smaller than a certain significance level; yet kpss.test works in the opposite way, i.e., the null hypothesis is that the series is stationary without a unit root.

To check for stationarity, we first need to let R know that weekly sales and AdWords and flyer expenditures are time series using the ts function, and then perform the ADF test using the adf.test function:

figure i

Test results inform us that LAd series (i.e., log-transformed AdWords spending) is evolving with p-value greater than 0.05. We need to take the first-difference of this series to make it stationary. Note that once we first-difference a log-transformed series, the interpretation will be different: now the series refers to growth of weekly AdWords spending, rather than AdWords spending itself.

To take the first-difference of a series, we use R function diff:

figure j
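To illustrate why a first-differenced log series is read as growth, here is a small base-R sketch. The spending values are hypothetical (chosen near the "around 900 each week" level mentioned earlier) and are not the chapter's data; the variable names are also assumptions.

```r
# Hypothetical weekly AdWords spending (illustrative values only)
adwords <- c(900, 909, 918, 890, 905)

# Log-transform with the +1 offset used in this appendix
l_ad <- log(adwords + 1)

# First difference: diff() returns a series one observation shorter
dl_ad <- diff(l_ad)

# Week-on-week growth rate computed directly, for comparison
growth <- adwords[-1] / adwords[-length(adwords)] - 1

# The two columns are close whenever week-on-week changes are small
print(round(cbind(dl_ad, growth), 4))
```

Note that differencing drops one observation, which matters when counting how many data points remain for estimation.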

Now we can perform ADF test again to make sure that all variables are stationary:

figure k

Stationarity test results suggest that the three variables (with AdWords spending first-differenced) are now all stationary. Note that to construct the ARIMA model we need only the sales series, while all series are needed for the MLR and, later, the VAR model.

ACF and PACF for Order of Lags

Once a time series has been stationarized, a systematic way to determine the order of lags of the autoregressive (AR) and moving average (MA) components of an ARIMA model is to plot and inspect the ACF and PACF. Here we use the R function ggtsdisplay, which generates (i) the plot of the series over time, (ii) the ACF plot, and (iii) the PACF plot simultaneously and automatically:

figure l

Here on the ACF and PACF plots, the dashed horizontal lines represent the critical region (95% confidence level) for the lags. The lag order of the AR and MA component is identified by the number of lags where the PACF and ACF plot displays a clear cutoff, respectively. Here we find both ACF and PACF have a cutoff at lag 1, indicating that we should probably take a lag order of 1 for both MA and AR components. Further, given that we did not take the difference of the sales series, the final specification of our model is ARIMA (1,0,1), or ARMA (1,1).
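The cutoff logic can be reproduced numerically with base R's acf and pacf, as a rough sketch. The series below is simulated with arima.sim under assumed ARMA(1,1) coefficients, purely for illustration; it is not ABC's sales series, and the seed and lag window are arbitrary choices.

```r
set.seed(1)
n <- 121

# Simulated stationary series with assumed AR(1) and MA(1) coefficients
x <- arima.sim(model = list(ar = 0.5, ma = 0.3), n = n)

# Sample ACF (dropping lag 0, which is always 1) and sample PACF up to lag 12
acf_vals  <- as.vector(acf(x, lag.max = 12, plot = FALSE)$acf)[-1]
pacf_vals <- as.vector(pacf(x, lag.max = 12, plot = FALSE)$acf)

# Approximate 95% band under the white-noise null, i.e., the dashed lines
bound <- 1.96 / sqrt(n)

# Lags whose sample (partial) autocorrelation falls outside the band
sig_acf  <- which(abs(acf_vals) > bound)
sig_pacf <- which(abs(pacf_vals) > bound)

print(round(bound, 3))
print(sig_acf)
print(sig_pacf)
```

With 12 lags and a 95% band, an isolated spike at a high lag can appear by chance, so the visual judgment of a "clear cutoff" described above still applies.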

Construct and Estimate an ARIMA Model

Splitting the Data

To estimate sales and examine the model's predictive power, we cannot use the entire dataset to construct our model. Instead, we need to split the series into training (in-sample) and testing (out-of-sample) sets. To do this, we apply the commonly adopted 80–20 scheme: the first 80% of the observations form the training set and the remaining 20% the testing set. Of our 122 weekly observations, one is lost when the AdWords series is first-differenced, leaving 121 usable observations; we therefore use the first 96 observations as the training set and the remaining 25 as the testing set.
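A chronological split of this kind can be sketched in base R as follows. The example assumes 121 usable observations, the count implied by the 96 + 25 split quoted above, and uses a placeholder series rather than the actual sales data.

```r
# Assumption: 121 usable weekly observations (96 training + 25 testing)
n_obs   <- 121
n_train <- floor(0.8 * n_obs)   # first 80% of the observations
n_test  <- n_obs - n_train      # remaining observations

# Placeholder series standing in for log sales
y <- ts(seq_len(n_obs))

# window() keeps the time index, so the split stays in chronological order
train <- window(y, end = n_train)
test  <- window(y, start = n_train + 1)

print(c(train = length(train), test = length(test)))
```

Unlike cross-sectional data, the observations are not shuffled: the testing set must come strictly after the training set in time.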

figure m

Train the Model

ACF and PACF suggest that we estimate an ARIMA (1,0,1) model. To estimate the model using the training set, we use R function Arima:

figure n

Furthermore, you may estimate ARIMA models with different specifications and compare model fit (e.g., AIC and BIC) to pick the best specification. There are also R functions, such as auto.arima, that automatically pick the specification with the lowest information criterion. However, analysts should keep in mind that they are the constructors of their models and should make the final decision on which model to estimate, taking into account managerial and strategic factors that model diagnostics cannot capture. For example, some firms might operate within a certain cycle and would want to evaluate sales using a specific order of lags.
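The comparison of candidate specifications can be sketched with base R's arima function, used here as a stand-in for the Arima function named above. The training series is simulated under assumed coefficients and an assumed mean of 10, so the numbers it prints are illustrative and say nothing about ABC's data.

```r
set.seed(42)

# Simulated stand-in for the 96-observation training series (log scale)
y_train <- arima.sim(model = list(ar = 0.5, ma = 0.3), n = 96) + 10

# Candidate specifications: ARIMA(1,0,1), ARIMA(1,0,0), ARIMA(0,0,1)
fit_101 <- arima(y_train, order = c(1, 0, 1))
fit_100 <- arima(y_train, order = c(1, 0, 0))
fit_001 <- arima(y_train, order = c(0, 0, 1))

# Lower AIC/BIC indicates a better trade-off between fit and parsimony
ic <- data.frame(
  model = c("ARIMA(1,0,1)", "ARIMA(1,0,0)", "ARIMA(0,0,1)"),
  AIC   = c(AIC(fit_101), AIC(fit_100), AIC(fit_001)),
  BIC   = c(BIC(fit_101), BIC(fit_100), BIC(fit_001))
)
print(ic)

# 12-step-ahead forecast with an approximate 95% interval
fc    <- predict(fit_101, n.ahead = 12)
lower <- fc$pred - 1.96 * fc$se
upper <- fc$pred + 1.96 * fc$se
```

AIC and BIC can disagree because BIC penalizes extra parameters more heavily, which is one reason the final choice is left to the analyst.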

Estimate a Multiple Regression Model

In addition to ARIMA, given a dataset as such, it is also very common for modelers to adopt multiple linear regression method and estimate a linear model to fit and predict sales. This is because MLR allows us to incorporate other exogenous factors, while ARIMA typically only involves the endogenous variable itself.

Again, we need to use the training set for model estimation. To do this, we use the lm function in R, which stands for “linear model.”

figure o

Interpreting the results briefly, a 1% increase in lagged sales is associated with a 0.39% increase in current sales, and a 1% increase in flyer spending with a 0.01% increase in sales. The coefficient of “DLAd” is statistically insignificant.

Validation Set Assessment: ARIMA Versus MLR

Now it is time for us to contrast model performance using ARIMA and MLR method and predict sales using the testing set. We further plot the actual sales and predicted sales using ARIMA and MLR method, respectively, on the same graph for contrast.

figure p
figure q

From the graph, both models mimic (to a certain extent) the pattern of actual sales in the testing set. To determine which method is more accurate, we can calculate and compare the root-mean-square error (RMSE) of both predictions.
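For reference, the RMSE is the square root of the mean squared difference between actual and predicted values. A minimal base-R sketch follows; the helper name rmse and all numbers are hypothetical and unrelated to the chapter's results.

```r
# Root-mean-square error between actual and predicted values
rmse <- function(actual, predicted) {
  sqrt(mean((actual - predicted)^2))
}

# Hypothetical log-sales values for a short test window (illustrative only)
actual     <- c(10.2, 10.5, 10.1, 10.4)
pred_arima <- c(10.3, 10.4, 10.2, 10.3)
pred_mlr   <- c(10.5, 10.1, 10.4, 10.0)

# The model with the lower RMSE tracks the test data more closely
print(c(ARIMA = rmse(actual, pred_arima), MLR = rmse(actual, pred_mlr)))
```

Because both models here predict log sales, their RMSEs are on the same scale and can be compared directly.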

figure r

Comparing prediction accuracy using both methods, we find that ARIMA managed to capture the dynamics of ABC’s sales better, since its RMSE (0.33) is slightly lower than that of MLR (0.40).

To improve the model prediction accuracy of MLR, several other factors should be considered, for example, the weekly price of ABC, and weekly sales of ABC’s core competitors. Modelers can incorporate additional variables into the model depending on data availability. Furthermore, answering the first question raised by the CMO, we can predict sales of ABC for the next 3 months (12 weeks) based on our ARIMA model. Here we plot both predicted sales and 95% confidence intervals in the graph below:

figure s

Reverting the log-transformed predicted values back, we get the predicted sales for the next quarter (i.e., 12 weeks) as below:

figure t

VAR Model Steps

Estimating a VAR Model

We are able to set up our VAR model relatively easily since we have already performed model diagnostics on series stationarity through unit root tests in section “Testing for Evolution Versus Stationarity.” Taking all three variables as endogenous variables, we estimate a VAR model consisting of (log-transformed) weekly sales, lagged Google AdWords expenditure, and flyer expenditure using VAR function. We then summarize the results in the table below.

figure u
figure v

From the VAR output, we find that:

  • Direct effects: AdWords (0.402) and flyer (0.003) both have positive direct impact on sales.

  • Carryover effects: past AdWords (−0.154), flyer advertising (−0.276), and sales (0.310) all exert impact on their current values, respectively.

  • Feedback effects: sales have positive feedback effect on flyer (0.298) and ad spending (0.310), while negative feedback effect on online AdWords spending (−0.062).

Note that due to the limited sample size and data variation in our sample, some of the coefficients seem statistically insignificant. However, to see the effect of AdWords and flyer advertising on revenues over time (e.g., immediate and long-term effects), it is more rigorous to refer to results from impulse response function (IRF) analysis.

After the estimation, it is always good practice to check the residuals for normality and autocorrelation. If there is any sign of misspecification, you may need to check whether anomalies such as outliers or structural breaks are present. Here we plot the residuals and inspect their mean.

figure w

We observe that the residuals seem to vary randomly around zero, with a mean of zero.

Forecast Error Variance Decomposition

Referring to the second question raised by the CMO, we perform FEVD analysis to evaluate and visualize the relative importance or contribution of flyers, AdWords, and sales inertia using R function fevd:

figure x

The table above indicates that ABC’s sales are quite “sticky” in the sense that past sales (LSales) account for more than 90% of the variation in current sales. Offline marketing seems to play a more important role than online AdWords for ABC. The table above corresponds to the bottom panel of the graph below.

figure y

IRF Analysis

Responding to the third question raised by the CMO, we perform IRF analysis to evaluate the short- and long-run elasticities of ABC’s flyer and AdWords marketing using the irf function in R. (In this section we use the orthogonalized impulse response function for estimation. In the R environment, implementing GIRF estimation requires estimating a Bayesian VAR model instead (see the package “bvartools”), which is beyond the scope of this chapter.)

figure z
figure aa
figure ab

IRF plots help us visualize when the peak effects occur. On the plots, the solid line refers to the IRF coefficients, while the dashed lines refer to the lower and upper bounds of the IRF coefficients’ confidence interval. It seems that increasing flyer spending can cause an immediate boost in sales; in contrast, it takes longer for spending on AdWords to have a positive impact on sales. Moreover, we can observe that these impacts all decay fast and get close to zero over time, mostly within 6 periods (weeks).
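To make the mechanics behind the VAR, FEVD, and orthogonalized IRF steps more concrete, the sketch below rebuilds them in base R for a VAR(1). It is not the chapter's code: the data are simulated from assumed coefficients, the lag order of 1 and the variable ordering (DLAd, LFlyer, LSales) are assumptions, and the dedicated VAR, fevd, and irf functions named in the text additionally provide confidence intervals.

```r
set.seed(7)
n <- 121
nm <- c("DLAd", "LFlyer", "LSales")

# Assumed VAR(1) coefficients used only to simulate stationary data
A_true <- matrix(c(0.3, 0.1, 0.0,
                   0.0, 0.2, 0.0,
                   0.2, 0.1, 0.3), nrow = 3, byrow = TRUE)
Y <- matrix(0, n, 3, dimnames = list(NULL, nm))
for (t in 2:n) {
  Y[t, ] <- as.vector(A_true %*% Y[t - 1, ]) + rnorm(3, sd = 0.1)
}

# A VAR(1) can be estimated by OLS, one equation per variable;
# lm() with a matrix response fits all three equations at once
Y_now <- Y[-1, ]
Y_lag <- Y[-n, ]
fit   <- lm(Y_now ~ Y_lag)
A_hat <- t(coef(fit)[-1, ])   # rows: equation, columns: lagged variable
Sigma <- crossprod(residuals(fit)) / (nrow(Y_now) - 4)  # residual covariance

# Orthogonalized IRF: Theta_h = A^h P, with P the lower-triangular
# Cholesky factor of Sigma (ordering: DLAd, LFlyer, LSales)
P <- t(chol(Sigma))
horizon <- 8
irf_arr <- array(NA_real_, dim = c(horizon, 3, 3))
A_pow <- diag(3)
for (h in seq_len(horizon)) {
  irf_arr[h, , ] <- A_pow %*% P   # [period, responding variable, shock]
  A_pow <- A_hat %*% A_pow
}

# FEVD for sales at the 8-period horizon: share of forecast error
# variance attributable to each orthogonalized shock
sales <- 3
fevd_sales <- colSums(irf_arr[, sales, ]^2) / sum(irf_arr[, sales, ]^2)
names(fevd_sales) <- nm
print(round(fevd_sales, 3))
```

The Cholesky ordering encodes an assumption about which variables can react within the same week: placing sales last lets both marketing variables affect sales contemporaneously, but not the reverse.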

Immediate and Long-Term Effects

In order to compute the immediate and long-term effects, we need to evaluate the significance of each IRF coefficient. If the t-statistic of an IRF coefficient is greater than 1 (t>1), we treat it as significant and keep the value of that coefficient; otherwise, we treat the coefficient as zero. (Here we follow previous research (e.g., Slotegraaf and Pauwels 2008) in setting the criterion at t>1. You may apply the t>2 rule if you would like to evaluate coefficient significance at the 5% significance level.) To calculate the t-statistic, we need to derive the standard error (se) of each coefficient from its 95% confidence interval, since lower bound = β − 1.96 × se and upper bound = β + 1.96 × se, so that se = (upper bound − lower bound)/(2 × 1.96). We then calculate the t-statistic as t = β/se.

Based on the above computations, the first period impact is called the immediate effect while the cumulative effect over 8 periods is called the long-run effect.
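The computation just described can be sketched in base R as follows. The response values and interval bounds are hypothetical numbers made up for illustration, the column names mirror the response/lower/upper labels mentioned in this section, and the use of the absolute t-statistic (so that significant negative responses would also be kept) is an assumption that goes slightly beyond the t>1 wording.

```r
# Hypothetical IRF output for one shock over 8 periods (illustrative only)
irf_tab <- data.frame(
  period   = 1:8,
  response = c(0.080, 0.045, 0.020, 0.010, 0.004, 0.002, 0.001, 0.000),
  lower    = c(0.010, 0.005, -0.030, -0.035, -0.030, -0.025, -0.020, -0.020),
  upper    = c(0.150, 0.085, 0.070, 0.055, 0.038, 0.029, 0.022, 0.020)
)

# Symmetric 95% interval: upper - lower = 2 * 1.96 * se
irf_tab$se     <- (irf_tab$upper - irf_tab$lower) / (2 * 1.96)
irf_tab$t_stat <- irf_tab$response / irf_tab$se

# t > 1 rule: keep significant coefficients, set the others to zero
irf_tab$kept <- ifelse(abs(irf_tab$t_stat) > 1, irf_tab$response, 0)

immediate <- irf_tab$kept[1]    # first-period (immediate) effect
long_run  <- sum(irf_tab$kept)  # cumulative effect over the 8 periods

print(round(irf_tab, 3))
print(c(immediate = immediate, long_run = long_run))
```

In this made-up example only the first two periods pass the t>1 rule, so the long-run effect is the sum of those two responses.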

Now we make a table in R to summarize IRF coefficients and their confidence intervals. You will see in the output that response means the response value at a particular period (there are 8 periods in total), lower and upper refer to the lower and upper bound of the corresponding confidence intervals, respectively.

figure ac

Now we apply the t>1 rule to determine coefficient significance and calculate long-term elasticities of AdWords and flyer advertising spending.

figure ad
figure ae
figure af

After applying the t>1 rule, we find that AdWords advertising has a significant and positive impact on revenues in the second period, while flyer advertising has a significant and positive impact on revenues in the first and second periods. More specifically, after adding up the significant coefficients over time to get the long-term elasticities for both types of advertising, we can say that:

  • A 1% increase in AdWords advertising spending growth (note that we first-differenced the series) will increase the firm’s revenues by 0.04% in the long run.

  • A 1% increase in flyer advertising spending will increase the firm’s revenues by 0.12% in the long run.

Optimal Allocation Between AdWords and Flyer Spending

Finally, we can respond to the final question from the CMO regarding ABC’s budget allocation. To do this, we may first take a look at the current budget allocation of ABC. We just need to review the dataset and calculate the total amount of money that the firm has spent on AdWords and flyers, respectively. Then we create a pie chart to visualize the current budget allocation of the firm.

figure ag
figure ah
figure ai

We can see that the firm is currently putting far more resources on flyers, since it spends 85% of its budget on it and only 15% on online AdWords.

For the optimal marketing budget allocation, we need to retrieve the impact of AdWords and flyer spending from IRF analysis. More specifically, we will calculate the optimal allocation for each marketing channel using the following formula:

$$ \mathrm{Optimal\ Allocation}_i=\frac{\eta_i}{\sum_{j=1}^{I}\eta_j} $$

where η_i is the elasticity of marketing tool i and I is the number of marketing tools.

As an example, for AdWords spending, we will calculate it as follows:

$$ \mathrm{Optimal\ Allocation}_{\mathrm{AdWords}}=\frac{\eta_{\mathrm{AdWords}}}{\eta_{\mathrm{AdWords}}+\eta_{\mathrm{Flyers}}} $$

Let’s do this in R now:

figure aj
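As a rough check of the formula, the base-R sketch below plugs in the rounded long-run elasticities reported earlier (0.04 for AdWords and 0.12 for flyers) and the current 15%/85% split. Because these inputs are rounded, the resulting shares come out near, but not identical to, the chapter's reported allocation, which is presumably computed from unrounded elasticities.

```r
# Rounded long-run elasticities as reported in the text
eta <- c(AdWords = 0.04, Flyers = 0.12)

# Optimal share of each channel: its elasticity over the sum of elasticities
optimal <- eta / sum(eta)

# Current allocation as reported in the text
current <- c(AdWords = 0.15, Flyers = 0.85)

# Shares in percent, and the shift implied by moving to the optimal split
print(round(100 * optimal))
print(round(100 * (optimal - current)))
```

The shares always sum to one, so the rule reallocates a fixed budget rather than changing its size.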

Having figured out the optimal budget allocation between AdWords and flyer, we can now create another pie chart so that we can compare:

figure ak
figure al
figure am

The optimal budget allocation is that the firm should actually spend less of its marketing budget on flyer advertising (77%, instead of 85%), and more on Google AdWords advertising (23% instead of 15%). Contrasting the optimal and actual budget allocation of the firm, it is quite obvious that currently, the firm is underestimating the power of online marketing through AdWords and overemphasizing the importance of offline flyers.

We can see that without analyzing resource allocation, a firm can be quite far away from what it “should” do. Looking at the optimal budget allocation is quite critical in managers’ decision-making, since utilizing the constrained resource more wisely can potentially make a big difference to firm performance (e.g., revenues).

On a final note, this section talks about the allocation when the sales performance is taken into consideration. Brand managers may pursue different KPIs as well, such as market share, profits, and brand liking. With different KPIs pursued by the brand manager, the allocation would be different. Moreover, instead of keeping the budget the same and reallocating it, the brand manager may want to increase the budget. In such a case, the dynamics between marketing input and financial performance would be altered, leading to different optimal allocation.

To conclude, we responded to the questions raised by ABC’s CMO regarding demand forecasting, marketing effectiveness, and budget allocation using ARIMA (and MLR) and VAR (with FEVD and IRF) methods. We hope that our readers can gain a better understanding of the materials covered in this chapter by referring to this application exercise.

Copyright information

© 2022 Springer Nature Switzerland AG

Cite this entry

Wang, W., Yildirim, G. (2022). Applied Time-Series Analysis in Marketing. In: Homburg, C., Klarmann, M., Vomberg, A. (eds) Handbook of Market Research. Springer, Cham. https://doi.org/10.1007/978-3-319-57413-4_37
