Charting closed-loop collective cultural decisions: From book best sellers and music downloads to Twitter hashtags and Reddit comments

Charts are used to measure relative success for a large variety of cultural items. Traditional music charts have been shown to follow self-organizing principles with regard to the distribution of item lifetimes, the on-chart residence times. Here we examine if this observation also holds for (a) music streaming charts, (b) book best-seller lists, and (c) social network activity charts, such as Twitter hashtags and the number of comments Reddit postings receive. We find that charts based on the active production of items, like commenting, are more likely to be influenced by external factors, in particular by the 24-hour day-night cycle. External factors are less important for consumption-based charts (sales, downloads), which can be explained by a generic theory of decision-making. In this view, humans aim to optimize the information content of the internal representation of the outside world, which is logarithmically compressed. Further support for information maximization is argued to arise from the comparison of hourly, daily and weekly charts, which allows one to gauge the importance of decision times with respect to the chart compilation period.


Introduction
Isaac Asimov's science fiction 'Foundation' trilogy is based on the presumption that it may be possible to develop a framework which would allow one to model and predict the political and economic development of human societies as such (Elkins, 1976; Phillips and Zyglidopoulos, 1999). At the core of the framework in question, 'psychohistory', lies the hypothesis that it could be possible to describe collective decision-making by rigorous laws, regardless of what a specific individual decides to do. Individual preferences would average out when populations are large. On a decisively less galactic level, a legitimate question regards the statistics of cultural phenomena produced by large numbers of decisions. Prime candidates in this regard are rankings measuring the economic success of cultural goods, like music albums, books and movies. Here we will show that certain traits of the New York Times best-seller list and the streaming charts compiled on a daily basis by Spotify, a streaming portal, are congruent with the predictions of an information-theoretical theory of human decision-making.
It has been recognized that charts, in particular music charts, like the US-based Billboard charts, provide access to the study of an extended range of socio-cultural developments. Examples are the evolution of harmonic and timbral properties of popular music (Mauch et al., 2015), the influence of gender and race on chart success (Lafrance et al., 2018), and how technological progress influences cultural diversity and, respectively, concentration processes (Ordanini and Nunes, 2016). Other studies investigated the correlation of acoustic features (Interiano et al., 2018) and repetitive lyrics (Nunes et al., 2015) with market success, the interplay between popularity and significance in popular music (Monechi et al., 2017), and whether there

Information theory of human decision-making
On a statistical level, human decision-making may be modeled using an information-theoretical ansatz (Gros et al., 2012). Starting from the same basic ansatz as Schneider and Gros (2019), we present here a modified derivation. From an epistemological point of view, we stress that our ansatz is intended as a possible explanatory framework. Similarly, the data analysis presented further below is not claimed to prove the correctness of the information-theoretical ansatz; it indicates, however, that it is a valid contender. From a generalized perspective our ansatz is neutral (Leroi et al., 2020), in the sense that it is based not on semantic, but on statistical considerations.

Decision-making maximizing information
The brain processes and stores information. Given its finite size, information needs to be selected and compressed (Marois and Ivanoff, 2005). A suitable measure for the amount of information encoded by the probability distribution p(s) of a quantity s is given by the Shannon entropy H[p],
$$ H[p] = -\int p(s) \log p(s)\, ds = -\langle \log p \rangle, \qquad (1) $$
where ⟨A⟩ denotes the expectation value of the function A = A(s), see e.g. Gros (2015). We assume that the overall goal of human decisions is to maximize information, in particular the information generated by decision-making processes. As a proxy for the impact of actions, e.g. to buy a cultural item, we examine the statistical properties of the corresponding chart. Statistically, information maximization is equivalent to maximizing the entropy H[p]. This goal can be achieved in practice only when respecting a given set of constraints. Real-world actions typically need to factor in the amount of effort and time involved, as well as the uncertainty of the outcome, viz the variance. On a statistical level these two constraints are given by the mean and the variance, which are equivalent to the first and the second moment of the probability distribution in question, ⟨s⟩ and ⟨s²⟩. The constrained maximum entropy distribution is consequently obtained by maximizing the objective function
$$ \Phi[p] = H[p] - a \langle s \rangle - b \langle s^2 \rangle, \qquad (2) $$
where a and b are suitable Lagrange multipliers. Using standard variational calculus (Gros, 2015), the distribution function maximizing Φ[p] is found to be a Gaussian,
$$ p(s) \propto e^{-a s - b s^2} \propto e^{-(s-\mu)^2/(2\sigma^2)}, \qquad (3) $$
with mean µ = −a/(2b) and variance σ² = 1/(2b), where normalization factors have been suppressed.
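The relation between the Lagrange multipliers and the moments of the Gaussian can be checked numerically. The following is a minimal sketch, assuming the unnormalized density exp(−a s − b s²) with b > 0; the function name, the integration grid and the values of a and b are illustrative choices, not taken from the analysis.

```python
import numpy as np

def gaussian_moments(a, b, s_min=-50.0, s_max=50.0, n=200_001):
    """Mean and variance of p(s) ~ exp(-a*s - b*s**2), by summation on a uniform grid."""
    s = np.linspace(s_min, s_max, n)
    weights = np.exp(-a * s - b * s**2)
    weights /= weights.sum()          # normalize the discrete weights
    mean = np.sum(weights * s)
    var = np.sum(weights * (s - mean) ** 2)
    return mean, var

# Illustrative (hypothetical) multipliers; b must be positive for normalizability.
a, b = 1.5, 0.4
mean, var = gaussian_moments(a, b)
print(mean, -a / (2 * b))   # numerical mean vs. mu = -a/(2b)
print(var, 1 / (2 * b))     # numerical variance vs. sigma^2 = 1/(2b)
```

The two printed pairs should agree to within the accuracy of the grid, which illustrates that a fixes the mean only in combination with b.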

Human brains compress information logarithmically
When making decisions, which is the distribution function p(s) that is to be maximized statistically using the entropy H[p]? For the case of book and music charts, a prime candidate is the lifetime distribution p(L), which measures the probability that a given item, a book or a music album, remains listed for a period L, the lifetime. While listed, books and albums receive increased attention, which implies that the listing period constitutes a visible effect of the individual decision to buy a specific cultural item.
From the cognitive perspective, the brain is confronted with two contrasting demands: to store incoming information as faithfully as possible, covering at the same time the extended orders of magnitude characterizing physical stimuli, like sound and light intensity, as well as time scales. An efficient solution for this conundrum is to store information on a compressed scale, for instance logarithmically. This is indeed the case, as expressed by the Weber-Fechner law, which states that the brain discounts sensory stimuli (Hecht, 1924), numbers (Nieder and Miller, 2003;Dehaene, 2003) and time logarithmically (Howard, 2018). It is not a coincidence that we use logarithmic scales, lumen and decibel, to measure light and sound intensities.

Maximizing compressed information
When performing an operation, like information maximization, the brain uses internal states that we denote here by s. These states are in general related logarithmically to outside quantities, as discussed in the previous section. For the case of the lifetime distribution p(L) we have
$$ s = \log(L), \qquad p(L)\, dL = p(s)\, ds, \qquad (4) $$
which relates the observable distribution function p(L) with its internal representation, the probability density p(s). Using ds/dL = 1/L and the maximum entropy probability density p(s), as given by (3), one finds
$$ p(L) \propto \frac{1}{L}\, e^{-a \log(L) - b \log^2(L)} \qquad (5) $$
$$ \propto L^{-(a+1)}\, e^{-b \log^2(L)} \propto e^{-(\log(L)-\tilde\mu)^2/(2\tilde\sigma^2)}, \qquad (6) $$
with a log-mean μ̃ = −(a+1)/(2b) and a log-variance of σ̃² = 1/(2b). p(L) corresponds to a log-normal distribution whenever b > 0, and to a power law if b vanishes. Power laws, which are frequently observed (Marković and Gros, 2014), are hence a natural statistical outcome of human activities whenever the variance is either small or not taken into account.
There are two reasons why the variance may be of secondary importance. Firstly, uncertainties may be small, viz fluctuations around the mean may be negligible. In this case the b-term in (5) is numerically small, becoming important only for exceedingly large or small lifetimes L. A second, more general argument is that one has to invest comparatively more resources to sample the variance than just the mean, in particular when correlations are present (Broersen, 1998). If time is a scarce resource, only the mean can be obtained reliably from a time-dependent functionality, such as the lifetime distribution of cultural items. It follows that power laws are present when individual decision times are shorter than the typical chart listing duration, with log-normal distributions emerging for extended decision times.
Table 1: Parameters of lifetime distributions. Listed are the linear and quadratic parameters a and b from Eq. (5). Note that a + 1 corresponds to the exponent of the power law contribution, see (6). The corresponding quadratic curves in log-log space are shown in Figure 2 and
The probability density p(L) is normalizable only for b ≥ 0 if arbitrary large lifetimes L are allowed. Real-life time intervals are however finite, which makes a negative b also viable. The values for a + 1 and b obtained throughout this study for different music and social network activity charts are discussed in the next section.
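In log-log space the lifetime distribution derived above is a parabola, log p = const − (a + 1) s − b s² with s = log L, so that a + 1 and b can be read off from a quadratic least-squares fit. The following is a minimal sketch of such a fit on synthetic data; the function name and the parameter values are hypothetical, and an unweighted fit via `numpy.polyfit` is an assumption, not necessarily the procedure used for the tables.

```python
import numpy as np

def fit_log_quadratic(lifetimes, density):
    """Fit log p = c - (a+1)*s - b*s**2 with s = log L; returns (a_plus_1, b)."""
    s = np.log(np.asarray(lifetimes, dtype=float))
    y = np.log(np.asarray(density, dtype=float))
    c2, c1, _c0 = np.polyfit(s, y, deg=2)   # coefficients, highest power first
    return -c1, -c2

# Synthetic, noise-free data with known (hypothetical) parameters.
L = np.arange(1, 101, dtype=float)
a_plus_1_true, b_true = 1.8, 0.15
p = L ** (-a_plus_1_true) * np.exp(-b_true * np.log(L) ** 2)

a_plus_1, b = fit_log_quadratic(L, p)
print(a_plus_1, b)
```

A fitted b close to zero indicates a power law with exponent a + 1, while b > 0 corresponds to a log-normal shape. For real data, bins with zero counts would have to be excluded before taking the logarithm.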

Results
The information-theoretical approach to the statistics of human decision-making implies a characteristic functional dependency for the distribution of the lifetime of cultural items. We tested this dependency for book and album sales charts, for music download charts, as well as for social media comments. Data retrieval and processing are described in Sect. 4. To visualize the possible p(L) distribution functions, figures with log-log representations of lifetime distributions include quadratic fits that correspond to Eq. (3). However, these fits are not claimed to correspond to statistically validated models, as discussed, e.g., by Clauset et al. (2009). The respective parameters are listed in Tables 1 and 2.

Chart diversity
An intuitive measure for the turnover rate of charts is the chart diversity d = N_a/N_s, where N_a is the number of unique titles, here per year, normalized by the number of available slots N_s (for the whole year). Figure 1 shows the evolution of the chart diversity of the New York Times best-seller list, in comparison with the top 10 and top 100 Billboard album charts. Billboard introduced streaming data into the chart in 2014 but still published a sales chart. This is why the line splits in 2014, with the chart including streaming shown as a dotted line.
The chart diversities of the New York Times best-seller list and of the Billboard album charts have evolved similarly over the course of the last three to four decades. This fact is quite remarkable, given that individual music albums and fiction books differ substantially concerning their respective consumption times. Generally one needs substantially longer to read a novel than to listen to an album. Their key characteristics are, on the other hand, similar. The average number of weeks a novel spent on the New York Times best-seller list, w̄ ≈ 1/d (where d is the chart diversity), shrank from w̄ ≈ 1/0.1 = 10 weeks in the 1980s to about w̄ ≈ 1/0.3 ≈ 3 in 2010. The equivalent numbers for the top-10 Billboard album charts are w̄ ≈ 1/0.1 = 10 in the 1980s and w̄ ≈ 1/0.4 = 2.5 in 2010.
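The chart diversity and the resulting estimate for the mean number of weeks on the chart can be illustrated with a small toy example. This is a minimal sketch, assuming that a year of charts is given as a list of weekly title lists; the function name and the toy data are hypothetical.

```python
def chart_diversity(weekly_charts):
    """Chart diversity d = N_a / N_s: unique titles divided by available slots."""
    unique_titles = set()
    slots = 0
    for chart in weekly_charts:
        unique_titles.update(chart)
        slots += len(chart)
    if slots == 0:
        raise ValueError("no chart slots available")
    return len(unique_titles) / slots

# Toy example: four weekly top-3 charts (hypothetical titles).
charts = [["A", "B", "C"], ["A", "B", "D"], ["A", "E", "D"], ["F", "E", "D"]]
d = chart_diversity(charts)   # 6 unique titles / 12 slots
mean_weeks = 1 / d            # average number of weeks per title
print(d, mean_weeks)
```

In this toy example the six titles occupy twelve slots, so that each title is listed for two weeks on average, in agreement with the estimate 1/d.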
The New York Times best-seller list and Billboard album charts are also similar with respect to the dynamics of titles that eventually make it to the top, the number-one titles. For this there are two possible courses. Either a title enters the chart directly at the top, in which case its number-one position is guaranteed. Alternatively, a given title starts lower and works its way up to the top over the course of several weeks or months. The probability for a number-one title to take the first course, P_one, is shown in Figure 1. In the 1970s and 1980s close to no number-one best seller entered the New York Times best-seller list at the top. This changed dramatically in the following three to four decades. From the late 2000s onward, close to all number-one novels, about 95%, started charting right at the top. This is similar to the evolution of the Billboard album charts. A small difference is that the New York Times best-seller list started the transition about five years prior to the Billboard album chart, with the evolution taking however about ten years longer. In general the Billboard top 10 and top 100 charts differ only marginally.
The changes in P_one are also reflected in the time it takes on average to become number one. Before 1980, when close to all number-one titles had to work their way up to the top, this process took on average five to eight weeks. Today number-one novels as well as albums become number one in under a week on average, since almost all number-one titles start as such.
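The probability that a number-one title debuts at the top can be estimated directly from a chronological sequence of ranked charts. The following is a minimal sketch, assuming that each chart is a list ordered by rank; the function name and the toy data are hypothetical, and titles already present on the first chart are treated as debuts, which is a simplification.

```python
def debut_at_top_fraction(weekly_charts):
    """Fraction of number-one titles whose first chart appearance was at rank 1.

    weekly_charts: chronological list of ranked title lists (index 0 = rank 1).
    """
    first_rank = {}      # title -> rank at first appearance
    number_ones = set()  # titles that reached rank 1 at least once
    for chart in weekly_charts:
        for rank, title in enumerate(chart, start=1):
            first_rank.setdefault(title, rank)
        if chart:
            number_ones.add(chart[0])
    if not number_ones:
        raise ValueError("no number-one titles found")
    debuted_at_top = sum(1 for title in number_ones if first_rank[title] == 1)
    return debuted_at_top / len(number_ones)

# Toy example: A and E debut at the top, B climbs from rank 2 to rank 1.
charts = [["A", "B", "C"], ["B", "A", "D"], ["E", "B", "D"]]
p_one = debut_at_top_fraction(charts)
print(p_one)
```

For real data one would evaluate this quantity per year or per decade, and discard titles whose entry into the chart predates the observation window.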

Musical chart lifetimes
In Figure 2 the lifetime distributions of daily and weekly music album charts are presented, namely for the Billboard album charts, which are published on a weekly basis, and for the respective Spotify download charts.
Table 2: New York Times best-seller lifetime distribution parameters. Listed are the linear and quadratic parameters a and b, respectively, for the log-normal distribution Eq. (5), as shown in Figure 4. Note that the prefactor of the linear term is a + 1, see (6). The uncertainties refer to the margin of error given a 95% confidence level.
Comparing the functional dependency of past (1980-1982) and present-day (2017-2019) Billboard album lifetimes, one observes a transition from a log-normal distribution to a power law, as parameterized by (3). An equivalent transition is seen between daily and weekly Spotify charts. A caveat here is the outlier that can be observed for the Spotify daily charts at a lifetime of exactly one week. The origin of this outlier is unknown; it may possibly be caused by weekly algorithmic influences, such as maintenance periods. It is in any case remarkable that the lifetime distributions presented in Figure 2 change on a functional level, either with time or when the charting period is increased. We argue that a separation of time scales causes both transitions.
Traditionally, music albums were bought in music shops, which involved a personal trip, and hence a comparatively long execution time. Long decision and execution times imply, as argued in Sect. 2.3, that the variance of the album lifetime distribution is relevant, which takes consequently the form of a log-normal distribution. The typical time needed to acquire a music album dropped however substantially below one week when online shopping started to become relevant in the 1990s. Given that one week corresponds to the chart frequency, this development implies that the variance of the album lifetime distribution lost its relevance. The desire to maximize information, the basis of the here discussed theory of statistical decision-making, leads in this case to power laws.
We believe that an analogous argument explains why daily and weekly Spotify album charts show log-normal and power law distributions, respectively. On average, it is presumably not a matter of only a few hours to listen to and to appreciate music albums, which may contain a substantial number of titles, but of a few days. This time scale, several days, lies between the frequencies of daily and weekly streaming charts, which would explain why daily and weekly album charts are respectively log-normal and power law distributed. Whereas the Billboard album charts are sales-based, the respective single charts are compiled predominantly on the basis of airtime statistics, for which the number of times songs are played by radio stations is counted. A direct comparison of album and single charts is in this case not possible (Schneider and Gros, 2019). The equivalent caveat does not hold for the Spotify charts, for which both album and single charts are based on streaming counts.
In Figure 3 the lifetime distributions of daily and weekly Spotify single charts are shown. Both are close to power law distributions with only small quadratic contributions. The overall process, to listen to a music sample, to decide to stream, and to actually do it, can be assumed to take only a few minutes for individual songs, but substantially longer in the case of albums. The observation that already daily single download charts show indications for power law behavior, with respect to the distribution of song lifetimes, is hence consistent with our basic framework.

New York Times book best sellers
In Figure 4 we present the probability that a given title stays for a certain number of weeks on the fiction best-seller list of the New York Times. In contrast to the top-100 music charts discussed further above, only the top ten titles of the week are included in the NYT best-seller list. In order to compensate for the comparatively limited database, we averaged the lifetime distribution over consecutive ten-year periods. Substantial scattering of the data can be observed nevertheless.
Included in Figure 4 are quadratic fits, −(a + 1)s − bs², to the log-log representation of the lifetime distribution (compare Eq. (3)). It is notable that the distribution of book lifetimes evolves from being convex in log-log scale in the 50s and 60s to being concave, starting from the 80s. The turning point, in the 70s, roughly matches a shallow minimum in the chart diversity, as shown in Figure 1. It is presently unclear what drives this interesting phenomenon. The presence of a finite second-order component, b ≠ 0, indicates in any case that the individual time scales for buying and reading books have not dropped below one week, the charting frequency. A power law appears in the 70s, when the lifetime distribution transits from convex to concave. The power laws seen in the distribution of musical charts emerge in contrast in the concave region, see
Figure 4: The lifetime distribution of books on the New York Times best-seller list. Shown is the average over ten-year periods. Note that only ten books are listed weekly; the data is hence comparatively sparse. In a log-log presentation, as shown, a long-term evolution from convex to concave is observed.
for the two power laws, for the late-state Billboard lifetime distributions and for the book lifetimes in the 70s.

Dynamics of Reddit comments
New data items, like posts or images, are at the core of social media activity. The response of other users, in the form of comments or likes, will determine the popularity of social media posts and with this the likelihood of additional likes and comments.
In this context we analyzed the statistics of Reddit post commenting, using publicly available data (Reddit Database, 2020). Note that Reddit is a discussion website organized around collections of user-created discussion boards for individual subjects called "subreddits". Users can submit posts such as text posts, links, images and videos to a subreddit. Posts are then voted up or down and commented on. While upvoting can be done directly on the post overview screen while scrolling through, users have to actively click on the post in order to comment. The most popular posts from each subreddit are also shown on the front page on login. There are multiple sorting options that determine which posts are the most popular. The old default was "Hot", which was based on the number of upvotes (log-weighted) and the submission time (recent posts prioritized). In 2018, "Best" was introduced as the new default, which tries in addition to show new content, including weighting based on comment activity. Comment data is publicly available (Reddit Database, 2020), including precise timestamps; vote counts are however available only as monthly averages. The technical aspects of our data analysis are described in Sect. 4.3.
For book and album charts the ranking criteria are the number of sales and downloads within the respective chart periods. In analogy, we used the number of comments for Reddit posts. In Figure 5 the chart diversity and the respective lifetimes for 10- and 60-min Reddit top-100 charts are shown. Within the time window analyzed, 2013-2015, a downward trend in chart diversity is observed, with an equivalent increase in chart lifetimes. This is in contrast to the long-term trend for book and album charts that can be observed in Figure 1.
The distributions of 10- and 60-min chart lifetimes presented in Figure 5 are non-monotonic. On a coarse level, the probability to observe a certain lifetime drops in a power law-like fashion, which is however interrupted by a pronounced local maximum. The maxima correspond to characteristic time scales of about 20-26 hours, for both the 10- and the 60-min charts, which suggests that the intrinsic 24-hour day-night activity cycle may be involved. Users may sleep on a post or comment of the day, in order to revisit it the next morning. A lesson learned from the Reddit database is then that the presence of characteristic time scales may interfere strongly with the otherwise operative feedback loop between posts and comments.

Twitter Hashtags
We use a publicly available corpus of Twitter hashtag statistics to study the residence time of hashtags in hourly, daily and weekly top-50 charts. See Sect. 4.4 for data source and handling. Hashtags and Reddit comments are both examples of active user contributions, albeit with a key difference. Adding a hashtag to a tweet can be presumed to be on the average less time-consuming than composing an entire Reddit comment. We may hence expect the trend towards self-organization to be more pronounced for Twitter hashtag charts, than for Reddit comments.
In Figure 3 the lifetime distributions of Twitter hashtags for hourly, daily and weekly top-50 charts are presented. With respect to the Reddit data shown in Figure 5, one observes that the 24-hour activity cycle is now substantially less pronounced. Instead of a peak, the dominant feature of the lifetime distribution of the hourly Twitter hashtag charts is a kink. To be precise, only a weak local maximum occurs at about 18h for the hourly charts, together with a 30% drop at 24h. For daily and weekly charts no anomaly is observed. Fitting the Twitter hashtag data with the maximum entropy distribution (3), we decided to split the 1h lifetime data in two parts, below and above the kink. Within the log-log representation, one finds that the quadratic contribution tends to become smaller for longer charting periods, viz when going from 1h to daily and weekly charts. This observation complies with the argumentation laid out in Sect. 2.3, namely that lifetime distributions become more power law-like when the time scale of the individual activities is substantially smaller than the charting periods.

Data sources and processing

Book and album charts
The New York Times book best sellers (New York Times Best Sellers, 2020), the Spotify music streaming charts (Spotify Charts, 2020) and the Billboard sales-based music charts (Billboard Charts, 2020) have been obtained from public internet sources. We also examined the statistics of the set of Twitter hashtags compiled by Lorenz-Spreen et al. (2019).
The algorithm used for the compilation of the Billboard charts has been adjusted over time, with a major update in 2014/15. At that point the traditional sales-based ranking was substituted by a ranking based on a multi-metric consumption rate, which includes weighted song streaming. This update, which took effect at the end of 2014, affects the chart statistics profoundly.
For data analysis we used statistically relevant binning, which consists, as explained in Sect. 4.5, of adjusting bin sizes dynamically such that a minimal number N^(data)_min of data points per bin is obtained.

New York Times Best-seller List
In contrast to Billboard magazine, the New York Times does not publish a full history of its best-seller list. Only recent lists are published on the official New York Times website (New York Times Best Sellers, 2020). Consequently, the analysis presented here relies on a republication of the data by Hawes Publications (2020), which collected and republished the weekly best-seller list from the 1950s until today. Since the length of the best-seller list changed repeatedly, all analyses above consider a top-10 ranking.

Reddit data analysis
Reddit data was downloaded from the Reddit Database (2020). Due to the large data size, we restricted our analysis to two six-month periods: the first half of 2013, which contains 19,050,122 posts and 187,385,130 comments, and the second half of 2015, which contains 37,038,895 posts and 347,236,797 comments. Comments on posts receiving fewer than 1000 comments over the observation period were removed from the data, since these posts would not show up in the charts analyzed. The cleaned comment data was separated into time slices based on the comment timestamps and aggregated by post IDs (identification labels). In this way charts for consecutive 10-minute and 60-minute slices were generated, based on the number of comments received. Post lifetimes were evaluated from the respective top-100 charts.
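The slicing and aggregation steps described above can be sketched as follows. This is a minimal illustration and not the original analysis code; it assumes that comments are available as (timestamp, post ID) pairs, and it counts the lifetime of a post as the total number of slices in which the post is charted, which is an assumption. All names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict

def top_n_charts(comments, slice_seconds, top_n):
    """Build one chart per time slice, ranked by the number of comments per post.

    comments: iterable of (unix_timestamp, post_id) pairs.
    Returns {slice_index: [post_id, ...]} with at most top_n entries per slice.
    """
    counts = defaultdict(Counter)
    for timestamp, post_id in comments:
        counts[int(timestamp // slice_seconds)][post_id] += 1
    return {
        index: [post_id for post_id, _ in counter.most_common(top_n)]
        for index, counter in sorted(counts.items())
    }

def chart_lifetimes(charts):
    """Number of slices in which each post appears on the chart."""
    lifetimes = Counter()
    for chart in charts.values():
        lifetimes.update(chart)
    return dict(lifetimes)

# Toy example: 10-minute slices (600 s) and a top-1 chart.
comments = [(0, "p1"), (10, "p1"), (20, "p2"),
            (700, "p1"), (710, "p3"), (720, "p3"), (1300, "p3")]
charts = top_n_charts(comments, slice_seconds=600, top_n=1)
lifetimes = chart_lifetimes(charts)
print(charts)
print(lifetimes)
```

For the full data set one would use `top_n=100` and slices of 600 or 3600 seconds, after removing the comments of posts with fewer than 1000 comments.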

Twitter Hashtags
The hourly rankings of the 50 most common hashtags were gathered by Mønsted (2019). This dataset, which has been used to study the acceleration of collective attention (Lorenz-Spreen et al., 2019), contains the 50 most common hashtags out of a 10% sample of all tweets, which were gathered every hour during the period from 2013 through 2016. For every hashtag the dataset also shows the number of times it appeared during the last hour in the 10% sample. This additional information allows one, modulo the top-50 cut-off, to compile charts with longer observation intervals, e.g. daily or weekly charts.
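Compiling charts with longer observation intervals from the hourly counts may be sketched as follows. This is a minimal illustration under the assumption that the data is available as (hour index, hashtag, count) records; hashtags outside the hourly top 50 contribute no counts, which is the cut-off caveat mentioned above. The function name and the toy data are hypothetical.

```python
from collections import Counter, defaultdict

def aggregate_charts(hourly_records, hours_per_period, top_n=50):
    """Sum hourly hashtag counts over longer periods and rank them.

    hourly_records: iterable of (hour_index, hashtag, count) triples.
    Returns {period_index: [hashtag, ...]}, ranked by summed counts.
    """
    totals = defaultdict(Counter)
    for hour, hashtag, count in hourly_records:
        totals[hour // hours_per_period][hashtag] += count
    return {
        period: [tag for tag, _ in counter.most_common(top_n)]
        for period, counter in sorted(totals.items())
    }

# Toy example: daily top-2 charts from hourly counts.
records = [(0, "#a", 5), (0, "#b", 3), (1, "#b", 4), (1, "#c", 1),
           (24, "#c", 9), (25, "#a", 2)]
daily = aggregate_charts(records, hours_per_period=24, top_n=2)
print(daily)
```

Weekly charts follow with `hours_per_period=168`.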

Adaptive binning
For the generation of probability distribution functions from collected data one needs to group events together into bins. When sets of bins with predefined width are used, the number of events may vary strongly from one bin to another. It is hence advantageous, for statistically relevant binning, to adjust the width of the individual bins dynamically, until a minimum of N^(data)_min data points per bin has been reached. The binning procedure is finished once this is no longer possible. Comparing results obtained for different N^(data)_min, as done in Figure 5, allows one to gauge the accuracy.
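One possible realization of such an adaptive binning is sketched below. It is a minimal illustration and not necessarily the procedure used for the figures; it assumes integer-valued lifetimes, widens each bin in unit steps until it holds at least n_min data points, and discards trailing points that can no longer fill a bin. The function name and the toy data are hypothetical.

```python
import numpy as np

def adaptive_bins(values, n_min):
    """Contiguous bins [left, right) holding at least n_min points each.

    Returns a list of (left, right, density), where the density is the
    fraction of all data points per unit bin width.
    """
    values = np.asarray(values)
    if values.size == 0:
        return []
    unique, counts = np.unique(values, return_counts=True)
    total = int(counts.sum())
    bins = []
    left, accumulated = int(unique[0]), 0
    for value, count in zip(unique, counts):
        accumulated += int(count)
        if accumulated >= n_min:
            right = int(value) + 1
            bins.append((left, right, accumulated / (total * (right - left))))
            left, accumulated = right, 0
    return bins  # leftover points (accumulated < n_min) are dropped

# Toy example: lifetimes with a sparse tail.
lifetimes = [1] * 6 + [2] * 3 + [3] * 2 + [5] + [8]
bins = adaptive_bins(lifetimes, n_min=3)
for left, right, density in bins:
    print(left, right, density)
```

In the toy example the first two bins have unit width, whereas the third bin extends from 3 to 6 in order to collect three data points; the single event at lifetime 8 is not binned.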

Discussion
Consumption charts, like sales and streaming charts, show remarkable similarities between different cultural products, e.g. when comparing music albums and literature. The evolution of the respective chart diversities, as well as the probability that a number-one title debuts as such, follow parallel courses for these two cultural goods. Similarly, one can fit the distribution of on-chart residence times, the respective lifetimes, in all cases by a generalized log-normal distribution that can be shown to maximize the information content of the lifetime distribution, to the extent that it is stored in the brain. Taking the long-term perspective, one notices distinct features for the evolution of the lifetime distribution of the New York Times best-seller list, and for album sales charts.
For music albums (Schneider and Gros, 2019), the lifetime distribution evolves from a log-normal distribution, as observed before the 1990s, to a power law, as observed nowadays. In this context it is important to notice that power laws belong to the class of generalized log-normal distributions, albeit with a vanishing quadratic term in the exponent. The information-theoretical arguments presented in this study for the occurrence of log-normal distributions suggest that a transition from fully log-normal to a power law takes place when uncertainties, viz the variance, fade into secondary importance. This phenomenon is related to the time needed on average for decision-making, given that it takes longer to gauge variances than means.
In societies, some time scales remain constant, others are subject to a secular acceleration process (Rosa, 2013). An example of a non-changing time scale is the period of typically 4-5 years between general elections, which contrasts with an accelerating opinion dynamics (Gros, 2017). Of relevance for the present study is the charting period of one week, which has not changed since the inception of classical music charts and book best-seller lists. In contrast to the charting period, the decision time to buy an album is likely to have fallen substantially with the rise of the internet. Here we have argued that this development is reflected in the respective chart statistics. It is less clear if the same holds for the time people need to read a book.
Modern streaming charts can be used as a test bed for the relative time scale hypothesis. The available high temporal resolution allows narrowing down the time individuals need to decide to listen to an album, and then to actually do so. The lifetime distributions of Spotify album charts show that this time scale is nowadays between one week and one day. For a single song the decision can be made comparatively faster, which leads for both weekly and daily single charts to power law lifetime distributions.
Activity charts, such as Twitter hashtags or Reddit postings, differ in certain aspects from consumption charts. Human activities are subject to a day-night cycle, which shows up for the case of commenting on Reddit as a resurgence after around 24 hours. Previous-day posts are likely to be revisited after a night of sleep, as evident in Figure 5. This phenomenon, the 24h-cycle, will mask underlying power laws, if existent. The Twitter charts provide evidence for an equivalent occlusion mechanism, see Figure 3.
The lifetime distribution of hourly hashtag charts is characterized by a peak at 18 hours, together with a subsequent drop at 24 hours. Charts averaging over one or more full days, daily and weekly charts, are represented instead by smooth distributions. One finds indications for power laws with minor quadratic corrections. These results suggest that power laws appear ubiquitously for long-enough charting periods. In conclusion, we believe that the data examined supports the basic hypothesis presented here, namely that aggregated human decision processes exhibit the pronounced statistical features characteristic of compressed information maximization.
Beyond information maximization, we found marked changes at the top of cultural charts, which started with the rise of the internet in the 1990s. This phenomenon regards the route to become a number-one hit, which took substantially longer in the past. We believe that this observation deserves further investigation, which would however transcend our present framework.