1 Introduction

Movies based on comic books have become some of Hollywood’s most popular over the past two decades. However, due in part to a speculative bubble that ended in a crash (footnote 1) and in part to the rise of digital media, comic book sales have fallen from their high levels in the 1990s. The huge success of the superhero genre on the big screen should, in theory, translate into more interest in the comic book source material. This study aims to determine the extent to which this link exists.

Fourier Transforms and the cross-correlation of two time series are used in this study to evaluate and interpret the total monthly profits of comic books and comic book movies. The focus is on films from the Marvel Cinematic Universe (MCU), which includes films starring comic book characters such as Iron Man, Thor, Captain America, and Spider-Man. These films are fairly faithful adaptations of comic-book characters and plots, and they are among the most profitable and successful comic-book films ever, with global box-office receipts of over 22 billion dollars (USD) [1]. Marvel and DC now regularly top the box office with their superhero films, which makes them worth examining. By building cinematic universes, they have released more and more superhero movies, bringing an increasing number of people into theaters [2, 3].

The study is guided by the following question: To what degree do MCU film releases impact Marvel comic book sales? On one hand, one can anticipate that, on a wide scale, the frequency of Marvel adaptations throughout the MCU’s active period should result in a general increasing trend in Marvel comic-book sales. On the other hand, variations in comic-book sales around the general trend should be linked to the release of individual MCU films [4]. In terms of time series, the question is essentially about the similarity of two signals. However, because correlation does not imply causation, the findings (if any) are limited to assessing whether the hypotheses are consistent with the data, which may or may not be enough to confirm or refute them.

One of the major incentives for social scientists to use data to detect behavior patterns and uncover intriguing correlations is the quest for causation [5]. The identification of causal effects, also known as causal inference, aims to identify the underlying mechanisms that cause changes in a phenomenon [6, 7]. While the availability of large datasets and high-performance computers allows for innovative data analysis using causal inference, only a few studies in the field of regional studies use causal inference methodologies. Understanding regional or country-wide phenomena with these advancements, on the other hand, offers vital insights into society and helps us to better monitor global trends [8,9,10]. What have been the highs and lows in MCU movie and Marvel Comics sales in the United States over the last two decades? The causes of these anomalies may not be known, but by understanding the causes of phenomena that previously went unnoticed, scientists may be able to better predict the consequences of a change in pattern, provide solutions for impending problems, or prepare for a coming paradigm shift.

2 Data

This project takes data from two sources. The comic book sales figures are taken from the repository of Comichron (footnote 2), an internet database featuring monthly comic book sales figures. The movie earnings data is collected from the Universal Machine Learning (ML) repository. For all-time domestic movie sales, Box Office Mojo (footnote 3), an internet database giving various data related to movies, including box-office earnings over time, is used.

The two time series under consideration are the total profits of all Marvel comics and the total domestic (USA) earnings of all MCU movies; in other words, the study examines the overall connection between Marvel movies and Marvel comics. The rationale behind this decision is that sales of individual comic book series (for example, The Amazing Spider-Man) are unpredictable and fluctuate depending on a variety of factors, including the writer and artist. Publisher-wide sales numbers, on the other hand, can be expected to be less influenced by such factors. The data sets span the months of January 2008 through December 2019, nearly bookending the MCU to date.

Figure 1 displays the total profits of Marvel Comics during the MCU’s history (2008 through 2019, inclusive).

Fig. 1. Left panel: Marvel Comics profits as a function of time t, in months since January 2008. Right panel: total Marvel Cinematic Universe domestic box-office profits as a function of time t.

The MCU dataset (purple trace, right panel) displays a continuous zero signal punctuated by strong peaks coinciding with movie releases, as one might anticipate. The comics data, on the other hand, displays a wide, complex pattern with high-frequency oscillations overlaid. The first step will be to filter the data in order to determine whether or not there is a trend.

3 Experiments

3.1 Filtering

Filtering the data is done to make the trend more obvious. First, a least-squares fit to a degree 5 polynomial is used to estimate the trend. The predicted trend is then subtracted, leaving just the variations around it. Figure 2 shows the raw data, trendline, and detrended data.

Fig. 2. Left panel: raw comic book revenues and a degree 5 polynomial fit. Right panel: the ‘detrended’ time series obtained by subtracting the polynomial trendline from the raw data.
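The detrending step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the function name `detrend_polynomial`, the rescaling of the month index to [-1, 1] (done only to keep the fit well conditioned), and the synthetic `sales` series are all hypothetical.

```python
import numpy as np

def detrend_polynomial(values, degree=5):
    """Fit a least-squares polynomial trend and return (trend, residual)."""
    values = np.asarray(values, dtype=float)
    # Assumption: rescale the month index to [-1, 1] for numerical stability.
    t = np.linspace(-1.0, 1.0, values.size)
    coeffs = np.polynomial.polynomial.polyfit(t, values, degree)
    trend = np.polynomial.polynomial.polyval(t, coeffs)
    return trend, values - trend

# Synthetic stand-in for 144 monthly values (Jan 2008 to Dec 2019); not real data.
months = np.arange(144)
rng = np.random.default_rng(0)
sales = (20.0 + 0.05 * months
         + 2.0 * np.sin(2 * np.pi * months / 12)
         + rng.normal(0.0, 0.5, months.size))

trend, detrended = detrend_polynomial(sales, degree=5)
```

By construction, `trend + detrended` reproduces the input series, so the trend can be added back after any later filtering.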

The data is filtered using the Fourier Transform of the time series f, which is defined as:

$$\begin{aligned} F_k = \varDelta t \sum _{j=0}^{N-1}f_j \mathrm e^{-\mathrm i2 \pi j k/N}, \end{aligned}$$
(1)

in which N is the length of the time series and \(\varDelta t\) is the sampling interval (one month here). The Fourier Transform can be seen in Fig. 3.

Fig. 3. Amplitude of the Fourier Transform of the Marvel Comics earnings time series as a function of frequency.
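A minimal Python sketch of the discrete transform in Eq. (1) is given here, assuming the standard convention in which the exponent is \(-\mathrm i 2\pi jk/N\) (the convention used by NumPy) and a monthly sampling interval \(\varDelta t = 1\). The function name `fourier_transform` and the synthetic signal are illustrative and not taken from the study.

```python
import numpy as np

def fourier_transform(f, dt=1.0):
    """Return (frequencies, F_k) with F_k = dt * sum_j f_j * exp(-2*pi*i*j*k/N)."""
    f = np.asarray(f, dtype=float)
    # Frequencies are in cycles per month when dt is given in months.
    freqs = np.fft.fftfreq(f.size, d=dt)
    return freqs, dt * np.fft.fft(f)

# Synthetic check: a pure 12-month oscillation sampled monthly for 144 months.
t = np.arange(144)
signal = np.cos(2 * np.pi * t / 12)

freqs, F = fourier_transform(signal)
# The largest amplitude should sit at |frequency| = 1/12 per month.
peak_freq = abs(freqs[np.argmax(np.abs(F))])
```

Plotting `np.abs(F)` against `freqs` would give an amplitude spectrum of the kind shown in Fig. 3.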

The amplitude at each frequency approximately corresponds to the strength of the ‘component’ of the signal with that frequency, and the largest peaks lie at frequencies below 0.1 month\(^{-1}\) in absolute value. As a result, any Fourier components with frequencies larger than that value are set to zero to smooth out the data, thus truncating the Fourier Transform. The smoother trace displayed in Fig. 4 is obtained by applying an inverse Fourier Transform and adding back the polynomial trend.

Fig. 4. Result of truncating the Fourier Transform to filter the Marvel Comics data.
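The truncation described above amounts to a simple low-pass filter, which might be sketched as follows. This is an illustrative sketch rather than the study's implementation: the function name `lowpass_fft` and the synthetic test signal are assumptions, while the cutoff of 0.1 month\(^{-1}\) is the value quoted in the text.

```python
import numpy as np

def lowpass_fft(values, cutoff=0.1, dt=1.0):
    """Zero all Fourier components with |frequency| > cutoff and invert."""
    values = np.asarray(values, dtype=float)
    freqs = np.fft.fftfreq(values.size, d=dt)
    spectrum = np.fft.fft(values)
    # Positive and negative frequencies are removed symmetrically,
    # so the inverse transform stays real up to rounding error.
    spectrum[np.abs(freqs) > cutoff] = 0.0
    return np.fft.ifft(spectrum).real

# Synthetic example: a slow oscillation plus a fast one that should be removed.
t = np.arange(144)
slow = np.sin(2 * np.pi * t / 48)   # about 0.021 per month, below the cutoff
fast = np.sin(2 * np.pi * t / 4)    # 0.25 per month, above the cutoff
smoothed = lowpass_fft(slow + fast, cutoff=0.1)
```

In the workflow described in the text, the filter would be applied to the detrended series, and the polynomial trend would then be added back to the result.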

There is a general increasing trend over the roughly ten-year window of investigation, as one would predict based on the polynomial fit (and from eyeballing the raw data). The rise appears to be related to the number of MCU releases, but rather than quantifying this relationship in detail, this research focuses on the correlation analysis.

3.2 Correlation

Having considered the broad effect of rising MCU releases on the Marvel comics data, the analysis now turns to the fluctuations around the general trend. The cross-correlation between time series f and g is used to understand how these oscillations relate to the MCU signal:

$$\begin{aligned} C_{fg}(\tau ) = \int _{-\infty }^\infty \mathrm dt\, f^*(t)g(t+\tau ). \end{aligned}$$
(2)

Eq. (2) essentially yields the correlation between the two time series when g is shifted by an amount \(\tau \). When \(C_{fg}\) is plotted as a function of \(\tau \), the peaks indicate the shifts \(\tau \) at which g is most strongly correlated with f.

For the MCU earnings, the same approach of removing the trend illustrated in Fig. 2 is used. Each detrended series is then normalized by dividing by its maximum (absolute value) entry, which makes the two series comparable in magnitude and gives more meaningful cross-correlation values. The cross-correlation is calculated for the two normalized datasets. The cross-correlation between the MCU and Marvel Comics profits time series is shown in Fig. 5. The \(\tau \)-axis is set up so that a positive value of \(\tau \) corresponds to a shift of \(\tau \) in the comics time series (in the positive direction).

Fig. 5. Cross-correlation of the MCU earnings dataset and the Marvel Comics (MC) dataset as a function of \(\tau \), the shift between the time series. Peaks indicate regions of high correlation.
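For sampled data, the integral in Eq. (2) becomes a sum over months, \(C_{fg}(\tau ) = \sum _t f(t)\,g(t+\tau )\) for real-valued series. The sketch below shows one way this could be computed with NumPy. The function names `normalize` and `cross_correlation`, the assignment of f to the movie series and g to the comics series, and the synthetic spike data are assumptions made for illustration only.

```python
import numpy as np

def normalize(values):
    """Divide a series by its maximum absolute entry."""
    values = np.asarray(values, dtype=float)
    return values / np.max(np.abs(values))

def cross_correlation(f, g):
    """Return (lags, C) with C[tau] = sum_t f(t) * g(t + tau) for real series."""
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    if f.size != g.size:
        raise ValueError("both series must have the same length")
    n = f.size
    lags = np.arange(-(n - 1), n)
    # np.correlate(a, v, "full")[i] = sum_n a[n + k] * v[n], with k = i - (len(v) - 1).
    return lags, np.correlate(g, f, mode="full")

# Synthetic example: g is a copy of f delayed by 6 months (not real data).
f = np.zeros(144)
f[[10, 50, 90]] = 2.0
g = np.roll(f, 6)

lags, corr = cross_correlation(normalize(f), normalize(g))
best_lag = lags[np.argmax(corr)]
```

In this synthetic case the largest value of `corr` occurs at the lag equal to the imposed delay. With real, noisy data the peaks are far less clean, which is why a comparison against a control dataset is useful.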

4 Results and Discussion

4.1 Results

The findings bear on the guiding question in several ways. For one thing, the pattern seen in Fig. 4 indicates a definite rise with time, which is at least consistent with the predictions. This clear increasing tendency, revealed by the filtering process, motivates an investigation of the fluctuations around it. However, the results of the correlation (Fig. 5) are noisy and unsatisfactory in various respects. One peak stands out at \(\tau \approx 6\) months. This peak implies that comic book sales roughly 6 months before a release are connected to the MCU data, which makes sense because the advertising campaign for big-budget movies like those in the MCU would begin around this time, with the publication of numerous ‘trailers’ and television ads. As a result, the findings might imply that the start of this process has a greater impact on comic book sales than the actual release of the movie. There is also a peak around \(\tau = 0\), which corresponds to the film’s release month, but it blends in with the background noise of the plot.

Of course, correlation does not indicate causation. At the very least, the correlation data provided above is consistent with the hypothesis that MCU films influence Marvel Comics sales.

4.2 Control

It helps to have a ‘control’ dataset to make these assertions more credible. If the theory is true, sales statistics for publishers other than Marvel should not show the same pattern at \(\tau \approx 6\) months. While DC Comics is the more obvious choice, it would not be a good control because its comics are in the same genre as Marvel’s. The results of the same analysis performed on the monthly sales statistics for IDW are presented in Fig. 6.

Fig. 6. Cross-correlation of the MCU earnings dataset and the IDW Comics earnings dataset as a function of \(\tau \), the shift between the time series. Peaks indicate regions of high correlation.

In the IDW data, the peak at \(\tau \approx 5\) is comparable in height to the surrounding noise, and there are several stronger peaks near \(\tau \approx -55\). As a result, it can be asserted that the correlation peak found for Marvel Comics in Section 3.2 is ‘real’: it reflects the impact of comic book movie advertising on comic book sales.

4.3 Future Directions

The analyses provided in the preceding sections leave a number of issues to be addressed in future research. To begin with, noise in the data makes it difficult to separate the peaks found in the cross-correlation from the background. While the study of the ‘control’ dataset in the preceding section strengthens the assertions, the connection remains weak.

Second, the fact that the time axis is only accurate to the month level creates a resolution issue. Higher-resolution (daily) comics revenue statistics might reveal more compelling patterns, as it is possible that a surge in comic book sales lasts less than a month, causing it to be lost in the monthly graphs. Thus, time series data with finer granularity needs to be considered for future use cases.

Of course, the number of variables that impact comic book statistics is another big source of inaccuracy. This study ignores issues like seasonal oscillations (summer is a high-earning season for comic books), inflation, and so on. These factors could probably be modeled in a more complex study; the yearly oscillation, in particular, might be eliminated using a notch filter.
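As a rough illustration of the notch-filter idea, the sketch below removes a narrow band of frequencies around one cycle per 12 months directly in the Fourier domain. This is a crude frequency-domain notch chosen for simplicity, not a designed digital notch filter and not something carried out in this study; the function name `notch_fft`, the band half-width, and the synthetic signal are all hypothetical.

```python
import numpy as np

def notch_fft(values, period=12.0, half_width=0.005, dt=1.0):
    """Zero Fourier components within half_width of the frequency 1/period."""
    values = np.asarray(values, dtype=float)
    # rfft keeps only non-negative frequencies of a real-valued series.
    freqs = np.fft.rfftfreq(values.size, d=dt)
    spectrum = np.fft.rfft(values)
    spectrum[np.abs(freqs - 1.0 / period) <= half_width] = 0.0
    return np.fft.irfft(spectrum, n=values.size)

# Synthetic example: a yearly oscillation superimposed on a slower 36-month one.
t = np.arange(144)
seasonal = np.sin(2 * np.pi * t / 12)
slower = np.sin(2 * np.pi * t / 36)
filtered = notch_fft(seasonal + slower)
```

With 144 monthly samples the frequency bins are spaced 1/144 month\(^{-1}\) apart, so the assumed half-width of 0.005 removes only the single bin at 1/12 month\(^{-1}\). A real analysis would need to choose the width with care, since the seasonal pattern need not be a pure sinusoid.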

5 Conclusion

The findings in this study do not conclusively prove or disprove the hypothesis, nor do they provide a persuasive answer to the central question concerning the impact of comic book movies on comic book sales. Nonetheless, they offer some credible evidence that should spur further research into the subject. First, filtering the monthly comic book sales data indicates a clear visual relationship between the broad growth in comic book sales throughout the ten-year research window and the release of MCU films during the same time. Second, the cross-correlation results imply that the build-up to the release of MCU films (perhaps advertising campaigns) might be quantitatively connected to comic book sales.

The findings of this research point to numerous follow-up projects, with two primary paths standing out. First, instead of focusing just on Marvel, it would be interesting to conduct a similar study comparing all comic book films with all superhero comic books; one might discover an even stronger link. Second, one might take the reverse approach and compare the sales of certain comic book series to specific film franchises (for example, the Avengers comic book vs. the Avengers film series). The link would most likely be weak for popular comic book series (readership may ‘saturate’ at a certain point). However, movies can be expected to boost interest in the comics for minor characters who are granted movie franchises (e.g. Doctor Strange, perhaps).

Overall, this research demonstrates a creative use of Fourier transform, filtering, and correlation techniques. The impact of big-budget blockbusters on their source material is, of course, broader than comic books and is an important topic in itself.