Abstract
Detecting causation in observational data is a difficult task. Causal relationships may be found by identifying the causative direction, coupling delay, and causal chain linkages from time series. Three issues must be addressed when inferring causality from time series data: resilience to noisy time series, computational efficiency, and seamless causal inference from high-dimensional data. This research aims to provide empirical evidence on the relationship between Marvel Cinematic Universe (MCU) movies and Marvel comic book sales using Fourier Transforms and the cross-correlation of two time series. The study, the first of its kind, provides some evidence on whether the trend of declining comic book sales and increasing movie audiences will be disrupted in the post-COVID world.
1 Introduction
Movies based on comic books have become some of Hollywood’s most popular over the past two decades. However, due in part to a speculative bubble that ended in a crash and in part to the development of digital media, comic book sales have fallen from their high levels in the 1990s. The huge success of the superhero genre on the big screen should, in theory, translate to more interest in the comic book source material. This study focuses on determining the extent to which this link exists.
Fourier Transforms and the cross-correlation of two time series are used in this study to evaluate and understand the total monthly profits of comic books and comic book movies. The focus is on films from the Marvel Cinematic Universe (MCU), which features comic book characters such as Iron Man, Thor, Captain America, and Spider-Man. These films are fairly faithful adaptations of comic book characters and plots, and they are among the most profitable and successful comic book films ever made, with global box-office receipts of over 22 billion dollars (USD) [1]. Now that Marvel and DC regularly top the box office with their superhero films, their influence merits attention. By building cinematic universes, they have released a steadily growing number of superhero movies and drawn increasing numbers of viewers to theaters [2, 3].
The study is guided by the following question: To what degree do MCU film releases impact Marvel comic book sales? On one hand, one can anticipate that, on a wide scale, the frequency of Marvel adaptations throughout the MCU’s active period should result in a generally increasing trend in Marvel comic book sales. On the other hand, variations in comic book sales around the general trend should be linked to the release of individual MCU films [4]. In terms of time series, the question is essentially about the similarity of two signals. However, because correlation does not imply causality, the findings (if any) are limited to whether the hypotheses are consistent with the data; they cannot by themselves confirm or refute a causal link.
One of the major incentives for social scientists to use data to detect behavior patterns and uncover intriguing correlations is the quest for causation [5]. The identification of causal effects, also known as causal inference, aims to identify the underlying mechanisms that cause changes in a phenomenon [6, 7]. While the availability of large datasets and high-performance computers allows for innovative data analysis using causal inference, only a few studies in the field of regional studies use causal inference methodologies. Understanding regional or country-wide phenomena with these advancements, on the other hand, offers vital insights into society and helps us to better monitor global trends [8,9,10]. What have been the highs and lows in MCU movie and Marvel Comics sales in the United States over the last two decades? One may not know the causes of these anomalies, but by understanding the causes of phenomena that may have gone unnoticed previously, scientists may be able to better predict the consequences of a pattern change, provide solutions for impending problems, or be prepared for a coming paradigm shift.
2 Data
This project takes data from two sources. The comic book sales figures are taken from the Comichron repository, an internet database featuring monthly comic book sales figures. The movie earnings are collected from the Universal Machine Learning (ML) repository. For all-time domestic movie sales, Box Office Mojo, an internet database giving various data related to movies, including box-office earnings over time, is used.
The two time series under consideration are the total profits of all Marvel comics and the total domestic (USA) earnings of all MCU movies; in other words, the study examines the overall connection between Marvel movies and Marvel comics. The rationale behind this decision is that sales of individual comic book series (for example, The Amazing Spider-Man) are unpredictable and fluctuate depending on a variety of factors, including the writer and artist. Publisher-wide sales numbers, on the other hand, can be expected to be less influenced by such factors. The data sets span January 2008 through December 2019, nearly bookending the MCU to date.
Figure 1 displays the total profits of Marvel Comics during the MCU’s history (including the years 2008 and 2019).
The MCU dataset (purple trace, right panel) displays a continuous zero signal punctuated by strong peaks coinciding with movie releases, as one might anticipate. The comics data, on the other hand, displays a wide, complex pattern with high-frequency oscillations overlaid. The first step will be to filter the data in order to determine whether or not there is a trend.
3 Experiments
3.1 Filtering
The data is filtered to make the trend more obvious. First, the trend is estimated with a least-squares fit to a degree 5 polynomial. The estimated trend is then subtracted, leaving just the variations around it. Figure 2 shows the raw data, trendline, and detrended data.
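The detrending step can be sketched as follows. This is a minimal illustration, assuming NumPy and a synthetic stand-in series; the function name `detrend_polynomial` and the sample data are hypothetical and are not taken from the study.

```python
import numpy as np

def detrend_polynomial(series, degree=5):
    """Return (trend, residual) from a least-squares polynomial fit."""
    y = np.asarray(series, dtype=float)
    t = np.arange(len(y), dtype=float)  # month index 0, 1, 2, ...
    # Polynomial.fit rescales t internally, which keeps a degree-5 fit
    # numerically well conditioned.
    poly = np.polynomial.Polynomial.fit(t, y, deg=degree)
    trend = poly(t)
    return trend, y - trend

# Synthetic stand-in for 144 months of sales (NOT the paper's data).
months = np.arange(144)
rng = np.random.default_rng(0)
sales = (20.0 + 0.05 * months
         + np.sin(2 * np.pi * months / 12)
         + rng.normal(0.0, 0.5, size=144))

trend, residual = detrend_polynomial(sales)
```

Here `residual` plays the role of the detrended data, and `trend` is the polynomial that is later added back after smoothing.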
The data is filtered using the Fourier Transform of the time series f, which is defined as:
\[\hat{f}_k = \sum_{n=0}^{N-1} f_n \, e^{-2\pi i k n / N}, \qquad k = 0, 1, \dots , N-1 \tag{1}\]
in which N is the length of the time series. The Fourier Transform can be seen in Fig. 3.
The amplitude of each peak approximately corresponds to the strength of the ‘component’ of the signal at that frequency. The largest-amplitude peaks lie at frequencies below 0.1 month\(^{-1}\) in absolute value. To smooth the data, all Fourier components with frequencies larger than that value are therefore set to zero, truncating the Fourier Transform. The smoother trace displayed in Fig. 4 is obtained by applying an inverse Fourier Transform and adding the polynomial trend back in.
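The truncation described above amounts to a simple low-pass filter in the frequency domain. The sketch below assumes NumPy's FFT routines and monthly sampling; the function name `lowpass_fft` and the two-tone test signal are hypothetical, chosen only so that the effect of the 0.1 month\(^{-1}\) cutoff is easy to see.

```python
import numpy as np

def lowpass_fft(residual, cutoff=0.1, sample_spacing=1.0):
    """Zero all Fourier components with |frequency| > cutoff (cycles/month)."""
    x = np.asarray(residual, dtype=float)
    spectrum = np.fft.fft(x)
    freqs = np.fft.fftfreq(len(x), d=sample_spacing)
    # The mask is symmetric in +/- frequency, so the inverse stays real.
    spectrum[np.abs(freqs) > cutoff] = 0.0
    return np.fft.ifft(spectrum).real

# Synthetic test signal (NOT the paper's data): a slow 24-month cycle
# plus a fast 3-month cycle over 144 months.
months = np.arange(144)
slow = np.sin(2 * np.pi * months / 24)        # 1/24 ~ 0.042 per month, kept
fast = 0.5 * np.sin(2 * np.pi * months / 3)   # 1/3  ~ 0.333 per month, removed
signal = slow + fast

smooth = lowpass_fft(signal, cutoff=0.1)
```

In the workflow described in the text, the polynomial trend would be added to `smooth` afterwards to obtain a trace like the one in Fig. 4.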
There is a generally increasing trend over the roughly ten-year window of investigation, as one would predict based on the polynomial fit (and from eyeballing the raw data). The rise appears to be related to the number of MCU releases, but rather than going into detail about the quantitative elements, this research focuses on the correlation analysis.
3.2 Correlation
Having considered the broad effect of rising MCU releases on the Marvel comics data, the analysis now turns to the fluctuations around the general trend. The cross-correlation between time series f and g is used to understand how these oscillations connect to the MCU signal:
\[C_{fg}(\tau ) = \sum _{n} f_n \, g_{n+\tau } \tag{2}\]
Equation (2) essentially yields the correlation between the two time series when g is shifted by an amount \(\tau \). When \(C_{fg}\) is graphed as a function of \(\tau \), the peaks indicate the extent to which g is connected to f when g is shifted by \(\tau \).
For the MCU earnings, the same approach of removing the trend illustrated in Fig. 2 is applied. The resulting detrended data is then normalized by dividing by the entry of maximum absolute value, which makes the two series comparable in magnitude and gives more meaningful cross-correlation values. The cross-correlation is calculated for the two normalized datasets. The cross-correlation between the MCU and Marvel Comics profits time series is shown in Fig. 5. The \(\tau \)-axis is set up so that a positive value of \(\tau \) corresponds to a shift of \(\tau \) in the comics time series (in the positive direction).
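The normalization and cross-correlation steps can be illustrated with a short sketch. It assumes NumPy and the convention \(C_{fg}(\tau ) = \sum _n f_n g_{n+\tau }\); which of the paper's two series plays the role of f or g, and the sign of the lag axis, are assumptions here. The helper names and the impulse data are hypothetical.

```python
import numpy as np

def normalize_peak(x):
    """Divide a series by its largest absolute entry."""
    x = np.asarray(x, dtype=float)
    return x / np.max(np.abs(x))

def cross_correlation(f, g):
    """Return (lags, C) with C[tau] = sum_n f[n] * g[n + tau]."""
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    n = len(f)
    lags = np.arange(-(n - 1), n)
    # np.correlate(a, v, "full")[k] = sum_n a[n + k] * v[n], so passing
    # (g, f) matches the assumed convention above.
    return lags, np.correlate(g, f, mode="full")

# Synthetic impulses (NOT the paper's data): g repeats f six samples later.
f_raw = np.zeros(144)
f_raw[[10, 40, 90]] = 300.0
g_raw = np.zeros(144)
g_raw[[16, 46, 96]] = 2.0

lags, corr = cross_correlation(normalize_peak(f_raw), normalize_peak(g_raw))
peak_lag = int(lags[np.argmax(corr)])
```

With these inputs the largest value of `corr` sits at the lag where the two impulse trains line up, which is the kind of peak the text looks for in Fig. 5.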
4 Results and Discussion
4.1 Results
In terms of answering the guiding question, the findings have several implications. First, the pattern seen in Fig. 4 shows a definite rise over time, which is at least consistent with the predictions. With the broad trend established by the filtering process, attention turns to the fluctuations around it. Here, the results of the correlation (Fig. 5) are noisy and unsatisfactory in several respects. One peak does stand out, at \(\tau \approx 6\) months. This peak implies that the comic book data from 6 months before a release is connected to the MCU data, which makes sense because the advertising campaign for big-budget movies like those in the MCU would begin around this time, with the release of numerous ‘trailers’ and television ads. The findings might therefore imply that the start of this campaign has a greater impact on comic book sales than the actual release of the movie. There is also a peak around \(\tau = 0\), which corresponds to the film’s release month, but it blends in with the background noise of the plot.
Of course, correlation does not indicate causation. At the very least, the correlation data provided above is consistent with the hypothesis that MCU films influence Marvel Comics sales.
4.2 Control
It helps to have a ‘control’ dataset to make assertions more believable. If the theory is true, sales statistics for publishers other than Marvel at \(\tau \approx 6\) months should not show the same pattern. While DC Comics is a more apparent choice, it would not be a good control because its comics are in the same genre as Marvel’s. The results of the same analysis performed on the monthly sales statistics for IDW are presented in Fig. 6.
In the IDW data, the peak at \(\tau \approx 5\) is comparable to the noise, and there are several stronger peaks near \(\tau \approx -55\). Because the control shows no distinct peak near six months, it can be argued that the correlation peak discussed in the Correlation section is ‘real’, in the sense that it reflects the impact of comic book movie advertising on comic book sales.
4.3 Future Directions
The analyses in the preceding sections leave a number of issues to be addressed in future research. To begin with, noise in the data makes it difficult to separate the peaks found in the cross-correlation from the background. While the study of the ‘control’ dataset in the preceding section strengthens the assertions, the connection remains weak.
Second, the fact that the time axis is only accurate to the month level creates a resolution issue. Higher-resolution (daily) comics revenue statistics might reveal more compelling patterns, since a surge in comic book sales may last less than a month and therefore be lost in the monthly graphs. Time series data of finer granularity thus needs to be considered for future use cases.
Of course, the number of variables that affect comic book statistics is another large source of inaccuracy. This study ignores issues such as seasonal oscillations (summer is a high-earning season for comic books), inflation, and so on. These factors could be modeled in a more complex study; the yearly oscillation, in particular, might be eliminated using a notch filter.
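As a rough illustration of the notch idea, the sketch below removes a yearly cycle by zeroing a narrow band of Fourier components around 1/12 month\(^{-1}\). This frequency-domain notch is a simple stand-in for a properly designed notch filter, not a method used in the study; the function name `notch_fft`, the band half-width, and the synthetic data are all assumptions.

```python
import numpy as np

def notch_fft(x, notch_freq=1.0 / 12.0, half_width=0.005, sample_spacing=1.0):
    """Zero Fourier components within half_width of notch_freq (cycles/month)."""
    x = np.asarray(x, dtype=float)
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=sample_spacing)
    spectrum[np.abs(freqs - notch_freq) <= half_width] = 0.0
    return np.fft.irfft(spectrum, n=len(x))

# Synthetic series (NOT the paper's data): a 36-month cycle plus a yearly one.
months = np.arange(144)
slow = np.cos(2 * np.pi * months / 36)
yearly = 0.8 * np.sin(2 * np.pi * months / 12)
signal = slow + yearly

cleaned = notch_fft(signal)
```

With 144 monthly samples the frequency bins are 1/144 month\(^{-1}\) apart, so the default half-width removes only the single bin at the yearly frequency; real data with a less regular seasonal pattern would likely need a wider band.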
5 Conclusion
The findings in this study do not conclusively prove or disprove the hypothesis, nor do they provide a persuasive answer to the central question concerning the impact of comic book movies on comic book sales. Nonetheless, the study finds some suggestive evidence that could motivate further research into the subject. First, filtering the monthly comic book sales data reveals a clear visual relationship between the broad growth in comic book sales throughout the ten-year research window and the release of MCU films during the same time. Second, the cross-correlation results suggest that the build-up to the release of MCU films (perhaps advertising campaigns) might be quantitatively connected to comic book sales.
The findings of this research point to numerous project possibilities, with two primary paths standing out. First, instead of focusing just on Marvel, it would be interesting to conduct a similar study comparing all comic book films with all superhero comic books; one could discover an even stronger link. Second, one might take the reverse approach and compare the sales of certain comic book series to specific film franchises (for example, Avengers comic books vs. the Avengers film series). The link would most likely be weak for popular comic book series (readership may ‘saturate’ at a certain point). However, movies can be expected to boost interest in the comics for minor characters who are granted movie franchises (e.g. Doctor Strange, perhaps).
Overall, this research demonstrates a creative use of Fourier filtering and correlation techniques. The impact of big-budget blockbusters on source material is, of course, broader than comic books and is an important topic in itself.
References
1. Forchini, P.: Movie discourse: Marvel and DC studios compared. In: The Routledge Handbook of Approache Analysis, pp. 183–201. Routledge (2020)
2. Ricker, A.: Call it Science: Biblical Studies, Science Fiction, and the Marvel Cinematic Universe (2021)
3. Park, J.-H., Kim, H.-N.: A study on the success factors of Marvel game using Marvel IP. In: Proceedings of the Korean Society of Computer Information Conference, pp. 155–158. Korean Society of Computer Information (2021)
4. Shi, C., Yu, X., Ren, Z.: How the Avengers Assembled? Analysis of Marvel Hero Social Network. arXiv preprint arXiv:2109.12900 (2021)
5. Liu, L., Wang, Y., Xu, Y.: A practical guide to counterfactual estimators for causal inference with time-series cross-sectional data. arXiv preprint arXiv:2107.00856 (2021)
6. Wauchope, H.S., et al.: Evaluating impact using time-series data. Trends Ecol. Evol. 36(3), 196–205 (2020)
7. Huang, Y., Fu, Z., Franzke, C.L.E.: Detecting causality from time series in a machine learning framework. Chaos Interdisc. J. Nonlinear Sci. 30(6), 063116 (2020)
8. Weichwald, S., Jakobsen, M.E., Mogensen, P.B., Petersen, L., Thams, N., Varando, G.: Causal structure learning from time series: large regression coefficients may predict causal links better in practice than small p-values. In: NeurIPS Competition and Demonstration, pp. 27–36. PMLR (2020)
9. Cliff, O.M., Novelli, L., Fulcher, B.D., Shine, J.M., Lizier, J.T.: Exact inference of linear dependence between multiple autocorrelated time series. arXiv preprint arXiv:2003.03887 (2020)
10. Lim, B., Zohren, S.: Time-series forecasting with deep learning: a survey. Philos. Trans. R. Soc. A 379(2194), 20200209 (2021)
Acknowledgement
All that I am, or ever hope to be, I owe to my angel mother.
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2022 The Author(s)
About this paper
Cite this paper
Asesh, A. (2022). Causal Inference - Time Series. In: Biele, C., Kacprzyk, J., Kopeć, W., Owsiński, J.W., Romanowski, A., Sikorski, M. (eds) Digital Interaction and Machine Intelligence. MIDI 2021. Lecture Notes in Networks and Systems, vol 440. Springer, Cham. https://doi.org/10.1007/978-3-031-11432-8_4
DOI: https://doi.org/10.1007/978-3-031-11432-8_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-11431-1
Online ISBN: 978-3-031-11432-8
eBook Packages: Intelligent Technologies and Robotics (R0)