
Predicting gasoline shortage during disasters using social media


Gasoline shortages are common during the onset of forecasted disasters such as hurricanes. Predicting future shortages can guide agencies in pushing supplies to the correct regions and mitigating the shortage. We demonstrate how to incorporate social media data into gasoline supply decision making. We develop a systematic approach that examines social media posts such as tweets to sense future gasoline shortage, and we build a four-stage shortage prediction methodology. In the first stage, we filter the tweet stream to retain tweets related to gasoline. In the second stage, we use an SVM-based classifier to identify tweets about gasoline shortage, using unigrams and topics identified by topic modeling techniques as features. In the third stage, we predict the number of future tweets about gasoline shortage using a hybrid loss function built to combine ARIMA and Poisson regression methods. In the fourth stage, we employ Poisson regression to predict shortage from the number of tweets predicted in the third stage. To validate the methodology, we develop a case study that predicts gasoline shortage using tweets generated in Florida during the onset and after the landfall of Hurricane Irma. We compare the predictions to the ground truth about gasoline shortage during Irma, and the results are accurate according to commonly used error estimates.
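To make the fourth stage concrete, the following is a minimal sketch of a Poisson regression that maps a predicted tweet count to an expected shortage level. It is not the authors' implementation. The function names (`fit_poisson`, `predict_shortage`), the scaling of tweet counts to hundreds, the use of a single covariate, and all data values are illustrative assumptions; the numbers are synthetic and are not results from the Hurricane Irma case study.

```python
import math


def fit_poisson(x, y, max_iter=50, tol=1e-10):
    """Fit log(E[y]) = b0 + b1 * x by Newton-Raphson on the Poisson log-likelihood."""
    # Start from the intercept-only fit: b0 = log(mean(y)), b1 = 0.
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(max_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Fisher information (negative Hessian) entries.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 linear system H * d = g.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


def predict_shortage(b0, b1, x):
    """Expected shortage count for a (scaled) predicted tweet count x."""
    return math.exp(b0 + b1 * x)


# SYNTHETIC, made-up data for illustration only.
# x: predicted shortage-related tweets per period, in hundreds (assumed scaling).
# y: a hypothetical shortage count, e.g. stations reporting no fuel.
tweets_hundreds = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
shortage_counts = [3, 5, 9, 14, 22, 37]

b0, b1 = fit_poisson(tweets_hundreds, shortage_counts)
print("intercept:", round(b0, 3), "slope:", round(b1, 3))
print("expected shortage at 350 tweets:", round(predict_shortage(b0, b1, 3.5), 1))
```

The log link keeps predictions non-negative, which suits count-like shortage measures. In a pipeline like the one described, the input to `predict_shortage` would come from the third-stage tweet forecast rather than from observed counts.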





The authors would like to thank two anonymous referees who provided detailed comments that significantly enhanced our paper.


Funding was provided by National Science Foundation (Grant No. 1663101).

Author information

Corresponding author

Correspondence to Rajan Batta.


Cite this article

Khare, A., He, Q. & Batta, R. Predicting gasoline shortage during disasters using social media. OR Spectrum 42, 693–726 (2020).



Keywords

  • Social media analytics
  • Gasoline shortage prediction modeling
  • Disaster management
  • Hybrid loss function
  • Hurricane Irma