Methods for quantifying effects of social unrest using credit card transaction data
Societal unrest and similar events are important for societies, but it is often difficult to quantify their effects on individuals, which hinders timely and effective policy-making in emergencies, in particular during localized social shocks such as protests. Traditionally, effects are assessed through economic indicators or surveys with relatively low temporal and spatial resolutions. In this work, we compute two behavioral indexes, based on credit card transaction data, for measuring the economic effects of a series of protests on consumer actions and personal consumption. Using data from a metropolitan area in an OECD country, we show that protests affect consumers’ shopping frequency and spending, but in noticeably different ways. The effects show strong temporal and spatial patterns, vary between neighborhoods and customers of different socio-demographic characteristics as well as between merchants of different categories, and suggest interesting subtleties in purchase behavior such as displaced or delayed shopping activities. Our method can generally serve for the real-time monitoring of the effects of major social shocks or events on the urban economy and consumer sentiment, providing high-resolution and cost-effective measurement tools to complement traditional economic indicators.
Keywords: Social shocks; Economic effect; Consumer behavior; Spatiotemporal pattern; Credit card transaction
The routine of daily life is punctuated by extraordinary public events, ranging from major sports and cultural events to terror attacks or disasters. Some events are local, and some are national or even wider; some affect only part of the population, while others affect the entire population. Although they differ from natural events and disasters, certain social events, such as a protest, a riot, a large police action, or a bomb explosion, can still have characteristics of natural emergencies, with fatalities and damage to property. However, when such a localized social shock occurs, we have mostly qualitative or at best retrospective survey data about its effect on the surrounding community. Consequently, it is impossible to know how well authorities managed the side-effects of the event, and thus difficult to develop effective event policies for use by police and municipal authorities.
In particular, understanding the effects of major social shocks or events on the economy and consumer behavior can have important implications [1, 2, 3, 4, 5, 6, 7]. Existing measures of economic behavior and attitudes towards economic situations use surveys to assess consumer confidence or consumer sentiment, asking about the intentions to purchase goods or to make large investments. These include objective measures published by national agencies, such as the measure of personal consumption expenditures in the monthly personal income and outlays report, provided by The Bureau of Economic Analysis (BEA) [8]. Other indicators are based on consumers’ subjective reports, such as the monthly consumer confidence index (CCI) issued by The Conference Board [9], and the monthly index of consumer sentiment (ICS) published by the University of Michigan [10]. These data and the actual economic behavior, as expressed in consumption, can be strongly correlated [11], and they can therefore be used to evaluate the effects of events. However, their temporal resolution is low (consumer confidence and sentiment surveys are usually conducted on a monthly basis). Their spatial resolution is low, too, since the measures aggregate information over large areas and usually entire economies. Finally, the information usually cannot be used to look at the differential effect of events on different segments of a society.
The limitations of traditional approaches may be overcome by using digital tools to study human and social behavior, analyzing large-scale quantitative data in emerging fields such as computational social science [12, 13]. The analysis of large amounts of data accumulated in social networks, cellular phone records, or credit card transactions may allow us to gain a new understanding of social processes. Such data have been widely utilized to study situation awareness and response to emergencies [14, 15, 16, 17, 18], mostly due to natural events and disasters, as well as the dynamics of communication and information propagation that immediately follow [19, 20, 21, 22]. However, relatively little work has so far aimed to understand the impact of social events, and especially localized social shocks, using quantitative behavioral data.
Einav et al. recently highlighted the new possibilities of conducting economic research in the age of “Big Data”. Within this scope, in this work, we show how a particular type of data, namely credit card transaction records, can be used to quantify and understand the repercussions localized social events have on individuals’ economic behavior, quantifying when, where and how an event has an effect. In particular, credit card data help us overcome the limitations of traditional indicators by allowing us to measure purchase behavior directly, and, together with information about the types of merchants at which purchases are made, to infer what kinds of products are purchased. Notably, purchases at physical shops (rather than those made online) are clearly linked to specific spatial locations and time stamps. Furthermore, credit card issuers have information about the demographics and economic status of the credit card holder, which enables us to measure the effects of events on different demographic groups in the population. This would allow the authorities to develop targeted and timely policies in case of emergencies.
2 Data and methods
To demonstrate the use of this method, we analyzed more than ten million credit card transaction records provided by a major financial institution in an OECD country about more than 100 thousand individuals, at more than 100 thousand merchants during a period of three months. Each record consists of the date, time and amount (in local currency) for one credit card transaction, along with anonymized customer and store IDs. Additional information about the customers and stores is also available, such as customers’ gender and the neighborhoods in which they live, as well as the category of the stores and the neighborhoods in which they are located.
consumer action: the number of unique customers who made transactions at stores inside the neighborhood;
personal consumption: the median spending amount in local currency for all transactions inside the neighborhood.
We use the score of relative deviation from the mean for the two behavioral indexes, computed in each neighborhood for each day in the data set, to measure the impact of a series of protests that took place in a central site of the metropolitan area during a period of one month. In particular, we focus on the temporal, spatial, heterogeneous, and integral economic effect of these protest events.
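The two indexes and their relative deviation from the mean can be illustrated with a minimal sketch. The record layout `(day, neighborhood, customer_id, amount)`, the function names, and the toy values are assumptions made for illustration; they are not the authors' implementation or data.

```python
from collections import defaultdict
from statistics import mean, median


def daily_indexes(transactions):
    """Compute both behavioral indexes per (neighborhood, day).

    `transactions` is an iterable of hypothetical records
    (day, neighborhood, customer_id, amount).
    Returns {(neighborhood, day): (unique_customers, median_amount)}.
    """
    customers = defaultdict(set)
    amounts = defaultdict(list)
    for day, neighborhood, customer_id, amount in transactions:
        key = (neighborhood, day)
        customers[key].add(customer_id)   # consumer action: unique customers
        amounts[key].append(amount)       # personal consumption: all amounts
    return {key: (len(customers[key]), median(amounts[key])) for key in customers}


def relative_deviation(value, reference_values):
    """Relative deviation of `value` from the mean of the reference values.

    Assumes the reference mean is non-zero.
    """
    ref_mean = mean(reference_values)
    return (value - ref_mean) / ref_mean


# Toy data: three transactions by two customers in neighborhood "A" on day 1.
toy = [(1, "A", "c1", 10.0), (1, "A", "c2", 30.0), (1, "A", "c1", 20.0)]
n_customers, median_amount = daily_indexes(toy)[("A", 1)]
```

A negative score means the index on that day fell below its reference-period mean, and a positive score means it rose above it.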
3.1 Temporal variation of effects
To confirm our findings, we test the statistical significance of the results shown in Fig. 3(a), (b) in the Appendix (see Table A3(a)). On the days of two major events, we see a significant drop in both behavioral indexes for all three distances in Fig. 3(a), (b). In addition, there also exist statistically significant differences between distances. On days following the major protests (those after Day 62 and Day 77), the number of customers decreased significantly in neighborhoods within 4 km of the protest site, but not beyond. On the other hand, there was no significant decrease in the median spending at any distance, suggesting that the people who went out shopping did not tend to spend less money than usual.
3.2 Spatial decay of effects
These results indicate that, first, the amount of money people spent when making a purchase was less clearly affected by the protests than the number of customers, and its negative effect also spread less far from the event center. Second, it seems that the event on Day 77 had a stronger impact on consumer actions within the close vicinity of the event center, which is indicated by a higher initial quantity of 1.25 in the exponential fit and reflected by the magnitude of the first two points in Fig. 5(a), (b). This effect, however, decayed spatially faster compared to Day 62, which is suggested by a smaller mean distance of 2.86 km and reflected by the generally smaller magnitude from the third point (3 km) onwards in Fig. 5(a), (b). On the other hand, on Day 62 personal consumption was more affected close to the event center, while the negative effect on this index spread further away on Day 77. Such differences might be due to people’s responses to events changing over time. Finally, we can also study the event effect as a function of the estimated travel time (by car) from the event center, thus taking into account geographical constraints and transportation accessibility. The results are similar (in terms of the exponentially decaying patterns) and presented in Fig. A2 in the Appendix. We further show in Fig. A3 and Fig. A4 in the Appendix the event effect on the individual neighborhoods. Overall, these results demonstrate the possibility of utilizing transaction data to quantify the spatial decay of event effects.
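To make the two fit parameters concrete, the sketch below assumes an exponential decay of the form magnitude(d) = a · exp(−d / d̄), where a plays the role of the initial quantity and d̄ the mean distance. The text does not state how the fit was obtained; the log-linear least-squares procedure and the function name `fit_exponential_decay` are assumptions, and the sample points are synthetic rather than taken from Fig. 5.

```python
import math


def fit_exponential_decay(distances_km, magnitudes):
    """Fit magnitude = a * exp(-d / mean_distance).

    Uses ordinary least squares on log(magnitude), so every magnitude must be
    strictly positive and the fitted slope must be non-zero.
    Returns (a, mean_distance).
    """
    if len(distances_km) != len(magnitudes) or len(distances_km) < 2:
        raise ValueError("need at least two (distance, magnitude) pairs")
    logs = [math.log(m) for m in magnitudes]
    n = len(distances_km)
    mean_d = sum(distances_km) / n
    mean_log = sum(logs) / n
    sxx = sum((d - mean_d) ** 2 for d in distances_km)
    sxy = sum((d - mean_d) * (y - mean_log) for d, y in zip(distances_km, logs))
    slope = sxy / sxx                       # equals -1 / mean_distance
    intercept = mean_log - slope * mean_d   # equals log(a)
    return math.exp(intercept), -1.0 / slope


# Synthetic points generated from a = 1.25 and mean distance = 2.86 km.
distances = [1.0, 2.0, 3.0, 4.0, 5.0]
magnitudes = [1.25 * math.exp(-d / 2.86) for d in distances]
a, mean_distance = fit_exponential_decay(distances, magnitudes)
```

Under this parameterization, a larger a means a stronger effect at the event center, and a smaller mean distance means the effect fades faster with distance.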
3.3 Heterogeneous effects
Social events may impact neighborhoods of different characteristics and the purchase behavior of people from different demographic groups in different ways. They may also have a differential effect on purchases from different types of merchants. We therefore study the heterogeneous effects of the social events in the following scenarios, focusing on neighborhoods within 4 km of the event center.
3.3.1 Socio-economic status
3.3.2 Political preference
We note that the neighborhoods we analyze here are centrally located, hence it is possible that both the socio-economic status and political preference of the neighborhoods mainly reflect the typical profile of the residents but not that of the visiting customers. The results presented here are therefore mainly behavioral changes observed in these neighborhoods. However, according to our data, 54% (46%) of transactions in the wealthier (less wealthy) neighborhoods being studied come from their residents or residents of other wealthy (less wealthy) neighborhoods in the city, while 61% (48%) of transactions in the conservative (liberal) neighborhoods come from their residents or residents of other conservative (liberal) neighborhoods. Therefore we believe that shopping activities observed in these neighborhoods reflect to a certain extent the behavioral change of people of similar characteristics in terms of wealth and political preference. Tests for statistical significance of the differences between the two groups in Fig. 7 are presented in Table A3(c) in the Appendix.
3.3.3 Gender difference
3.3.4 Store category
3.4 Integral economic effects
Although societal unrest has led to major changes in people’s purchase behavior, particularly on the days of major events, it is interesting to investigate whether there exists an integral economic effect due to displaced or delayed shopping activities.
3.4.1 Displaced shopping
The possibility of displaced shopping may be suggested by Fig. 4(c) where it can be seen that certain neighborhoods further away from the event center, in particular those corresponding to the four squares on the top right, actually saw increased activities on Day 62. In this case, each of the squares represents a single neighborhood (with color indicating the number of stores in the neighborhood), which we analyze in detail as follows.
Increased shopping activities in these four neighborhoods were mainly due to (i) new merchants, and (ii) new customers that did not appear in the reference period. First, the neighborhood corresponding to the red square (100% increase in customers and 60% in spending) saw many payments at a fitness club and, since Day 62 happened to be the first day of the month, these may correspond to people paying subscription fees at the club. As the reference period does not contain first days of the month, this fitness club did not appear as a merchant in the reference period, and therefore the corresponding purchase behavior was not observed before.
Second, the two neighborhoods corresponding to the two light blue squares (220%/120% increase in customers and 20%/100% in spending) saw increased activities due to their close proximity to a new theme park and shopping complex, where those purchases were made on a Saturday (Day 62) at stores that were newly opened in the shopping complex.
Finally, instead of increased activities driven by new merchants, those in the neighborhood corresponding to the last square (60% increase in customers and 100% in spending) were mainly driven by new customers. Indeed, 13 out of the 19 customers who made purchases in that neighborhood on Day 62 never visited the neighborhood in the reference period, and at the same time they did not visit any neighborhood they used to go to on the same day of the week. Therefore, this suggests a pattern of displaced shopping where these customers were shifting their shopping locations, which might be due to the influence of the protest events.
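The new-customer check described for the last neighborhood can be sketched as follows. The input layout (dictionaries mapping a customer ID to the set of neighborhoods visited) and the function name `displaced_customers` are hypothetical. As a further assumption, customers with no same-weekday history in the reference period are also counted, since they trivially skipped their usual neighborhoods.

```python
def displaced_customers(event_visits, reference_all, reference_same_weekday,
                        neighborhood):
    """Return customers whose event-day shopping looks displaced.

    event_visits:            {customer: neighborhoods visited on the event day}
    reference_all:           {customer: neighborhoods visited at any time in
                              the reference period}
    reference_same_weekday:  {customer: neighborhoods visited on the same day
                              of the week in the reference period}
    A customer is flagged when they shopped in `neighborhood` on the event
    day, never visited it in the reference period, and visited none of their
    usual same-weekday neighborhoods on the event day.
    """
    flagged = set()
    for customer, visited_today in event_visits.items():
        if neighborhood not in visited_today:
            continue
        is_new_here = neighborhood not in reference_all.get(customer, set())
        usual = reference_same_weekday.get(customer, set())
        skipped_usual = usual.isdisjoint(visited_today)
        if is_new_here and skipped_usual:
            flagged.add(customer)
    return flagged


# Toy example with hypothetical customers and neighborhoods "N" and "M".
event = {"c1": {"N"}, "c2": {"N", "M"}, "c3": {"N"}, "c4": {"M"}}
ref_all = {"c1": {"M"}, "c2": {"M"}, "c3": {"N"}, "c4": {"M"}}
ref_weekday = {"c1": {"M"}, "c2": {"M"}, "c3": {"N"}}
flagged = displaced_customers(event, ref_all, ref_weekday, "N")
```

In the toy example only `c1` is flagged: `c2` still visited the usual neighborhood "M", `c3` already shopped in "N" during the reference period, and `c4` did not shop in "N" at all.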
In summary, displaced shopping may have indeed taken place, but being further away from the event center, there could be other factors that influence shopping behavior as well, such as those that came with a special day like Day 62 (being the first day of the month as well as a Saturday that came shortly after the opening of a shopping complex).
3.4.2 Delayed shopping
We see from Fig. 3 that, in terms of both number of customers and median spending, shopping activities generally increased on mid-week days after the weekends when a majority of the protests took place. For the number of customers, such an increase is most obvious in Fig. 9(a) for purchases at grocery stores. Compared to clothing stores and restaurants, grocery stores provide products needed for daily life, and it is therefore possible that people waited during the weekends and got the necessary shopping done shortly after the major protests. On the other hand, increases in median spending on mid-week days were mainly present in neighborhoods within 2 km of the event center, and are most clearly synchronized with the increased spending in neighborhoods of lower socio-economic status shown in Fig. 6(b). Given that median spending dropped significantly in these neighborhoods on weekends of major protests, this suggests that people may have decided to save in uncertain situations and regained some confidence for make-up purchases after the protest events.
Despite the discernible patterns of displaced or delayed shopping activities, the integral economic effect of the protest events remains largely negative, as is evident in Fig. A6, where the change in total sales is illustrated for neighborhoods at different distances from the event center.
Our study demonstrates the potential of using pervasive and passively collected behavioral data to quantify the impact of localized social shocks and events on an urban population. We found that major events, such as societal unrest due to protests in our case, can alter people’s purchase behavior, both in terms of consumers’ tendency to visit shops in certain areas and in terms of their willingness to spend money. From a temporal perspective, personal consumption levels recover relatively quickly from the negative influence of the events. Spatially, on days of major events, consumption levels also seem more resilient, although both measures show clear spatial exponential decay. Event effects also differ between groups in the society, defined in terms of demographics, socio-economic status, and political conservatism, and between categories of merchants. Finally, given the results above, the existence of discernible patterns such as displaced or delayed shopping activities suggests that the effect of social events was not always distributed in the way one would imagine, e.g., it is not purely a function of geographical distance, and may help explain the rather unintuitive resilience of (or lack of change in) personal consumption as a result of inelastic spending driven by needs.
Our analysis has certain limitations. The data set of credit card transaction records used in this study is based on a sample (about 10%) of all the individual customers of one financial institution and does not include people without credit cards. Therefore, sampling bias could exist and could potentially influence the results. Credit card transaction data may also represent only part of people’s daily spending, as people may choose to pay with cash in certain scenarios. Furthermore, the data set only covers a period of three months, and the observations might be affected by seasonality. Finally, there might exist unobserved external events that bias our results, even though the strong spatiotemporal patterns observed on Days 62 and 77 are probably mainly caused by the major events on those days.
The real-time monitoring of economic behavior can have important social, economic and political implications. As a research tool, it makes it possible to assign quantitative values to the inherently elusive effects of social events. One advantage of the method we present here is the ability to analyze events with relatively high temporal and spatial resolutions. These allow a fine-grained understanding of events, beyond what is often possible with traditional measures. Also, the measures are less affected by demand characteristics or other biases, inherent in methods such as surveys. Such understanding and measures can potentially be used to develop timely and effective policies targeted at specific communities and demographic groups, and can often be critical in emergency situations.
With our method we can also evaluate statements made, for instance, by politicians or the media, such as that people are greatly upset by certain events. We can compare them to the observed changes in behavior, as expressed in the number of customers making purchases and the average amount spent in a purchase. If people go about their activities as usual, the events probably have less impact, compared to when measures of behavior differ strongly from the usual values. At a practical level, these analyses may allow authorities to prepare for timely interventions to minimize possible negative consequences, possibly through a prediction mechanism that could estimate the recovery time of individual economic activities at an early stage of the event.
The framework we present here complements, rather than replaces, other methods such as traditional economic indicators or surveys. Combinations of it with other tools help create a comprehensive picture of the dynamic response of a population to social events.
The two latter steps aim to remove inactive customers and stores to prevent them from biasing subsequent analyses. The specific thresholds of ten customers and ten stores we used here have little effect on the amount of filtered data: with alternative thresholds of five customers/stores or three, 2.0 or 2.1 million transaction records, respectively, are left after the filtering process.
An alternative would be to compute a z-score that also captures the variability of the statistic. However, computing the sample standard deviation over a relatively small number of points may lead to large variation in the z-score and hence bias the subsequent analysis. We therefore choose the relative deviation from the mean, which is a more stable measure and is also adopted in [20, 22].
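The stability argument can be illustrated with a small comparison. The reference values below are hypothetical and chosen only to share the same mean; the function names are illustrative and not taken from the paper.

```python
from statistics import mean, stdev


def relative_deviation(value, reference_values):
    """(value - reference mean) / reference mean."""
    ref_mean = mean(reference_values)
    return (value - ref_mean) / ref_mean


def z_score(value, reference_values):
    """(value - reference mean) / sample standard deviation of the reference."""
    return (value - mean(reference_values)) / stdev(reference_values)


# Two hypothetical three-point reference samples with the same mean (100)
# but different spread, and the same event-day value (80).
tight_reference = [100, 102, 98]
loose_reference = [100, 110, 90]
event_value = 80

rel_tight = relative_deviation(event_value, tight_reference)
rel_loose = relative_deviation(event_value, loose_reference)
z_tight = z_score(event_value, tight_reference)
z_loose = z_score(event_value, loose_reference)
```

For these values the relative deviation is -0.2 with either reference sample, whereas the z-score is -10 for the tight sample and -2 for the loose one. This shows how strongly the z-score depends on a standard deviation estimated from very few points.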
The exception is that we remove three holidays in these six weeks, when shopping activities were clearly boosted and would introduce obvious bias into the analysis.
Notice that the behavioral indexes are computed as relative changes with respect to a reference period; therefore, the fact that activities may decrease further away from the city center because fewer shops are available has already been taken into account.
In our data set, we have 34% female customers who account for 38% of the transactions, which is moderately imbalanced. However, since the behavioral indexes are computed at the neighborhood level as relative changes with respect to a reference period, and gender split is assumed to stay reasonably stable across the reference and analysis periods, such imbalance would not cause an issue in the analysis and statistical tests presented in the paper.
X. Dong was supported by a Swiss National Science Foundation Mobility fellowship while completing this work. The authors are grateful to the financial institution that provided the credit card transaction data for this research.
Availability of data and materials
Data sufficient to reproduce all results in this paper will be made available upon request.
XD and JM conceived and designed the study. XD, JM, ES and BB conducted the analyses. XD, JM and ES wrote the manuscript. All authors reviewed the manuscript. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
5. Getz D (2005) Event management and event tourism, 2nd edn. Cognizant Communication Corporation, New York
8. Personal income and outlays, U.S. Department of Commerce, Bureau of Economic Analysis (BEA). http://www.bea.gov/newsreleases/national/pi/pinewsrelease.htm
9. Consumer confidence index, The Conference Board. https://www.conference-board.org/data/consumerconfidence.cfm
10. Surveys of consumers, University of Michigan. http://www.sca.isr.umich.edu
11. Carroll CD, Fuhrer JC, Wilcox DW (1994) Does consumer sentiment forecast household spending? If so, why? Am Econ Rev 84:1397–1408
13. Pentland A (2014) Social physics: how good ideas spread. Penguin, Baltimore
14. Sakaki T, Okazaki M, Matsuo Y (2010) Earthquake shakes Twitter users: real-time event detection by social sensors. In: Proceedings of the international conference on World Wide Web, pp 851–860
18. Martínez EA, Rubio MH, Martinez RM, Arias JM, Patane D, Zerbe A, Kirkpatrick R, Luengo-Oroz M (2016) Measuring economic resilience to natural disasters with big economic transaction data. In: Proceedings of the data for good exchange
19. Kapoor A, Eagle N, Horvitz E (2010) People, quakes, and communications: inferences from call dynamics about a seismic event and its influences on a population. In: Proceedings of AAAI symposium on artificial intelligence for development, pp 51–56
24. Dong X, Jahani E, Morales AJ, Bozkaya B, Lepri B, Pentland A (2016) Purchase patterns, socioeconomic status, and political inclination. In: International conference on computational social science
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.