Abstract
This chapter reviews the literature on the scan statistic, with particular emphasis on its applications. It then applies the scan statistic to social media data to explore changes in the sentiments of sadness, fear, anger, joy, and love. For each geographical location around the world, 2017 data are compared with 2016 data. The temporal scan statistic is used to flag periods within 2017 in which sentiment differs significantly from the average over the whole of 2016. This analysis was carried out first within Australia and then, in less detail, in other pockets around the world.
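To make the idea of a temporal scan concrete, the following is a minimal sketch and not the chapter's actual procedure. It assumes daily sentiment scores that are roughly independent and normally distributed, estimates a baseline mean and standard deviation from the reference year, and scans fixed-length windows of the later year for standardized departures from that baseline. The function name `temporal_scan`, the window length, and the threshold are all hypothetical choices for illustration; in practice the threshold would be calibrated, for example by simulation, to control the false-alarm rate.

```python
import numpy as np

def temporal_scan(baseline, current, window=7, threshold=4.0):
    """Scan `current` with a fixed-length window and flag departures
    from the baseline mean (illustrative sketch, not the chapter's method).

    Returns the z-score of every window and the start indices of the
    windows whose |z| exceeds `threshold` (a hypothetical cut-off).
    """
    baseline = np.asarray(baseline, dtype=float)
    current = np.asarray(current, dtype=float)

    # Baseline level and variability, estimated from the reference year.
    mu0 = baseline.mean()
    sigma0 = baseline.std(ddof=1)

    # Means of all windows of length `window`, computed via cumulative sums.
    csum = np.concatenate(([0.0], np.cumsum(current)))
    window_means = (csum[window:] - csum[:-window]) / window

    # Standardize each window mean under the independence assumption.
    z = (window_means - mu0) / (sigma0 / np.sqrt(window))

    flagged = np.flatnonzero(np.abs(z) > threshold)
    return z, flagged
```

Called as `z, flagged = temporal_scan(scores_2016, scores_2017)`, where both arrays are assumed to hold one sentiment score per day for a single location, `flagged` lists the starting days of the windows that depart most strongly from the 2016 average, and `np.abs(z).max()` plays the role of the scan statistic.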
Copyright information
© 2019 Springer Science+Business Media, LLC, part of Springer Nature
About this entry
Cite this entry
Sparks, R., Paris, C. (2019). The Scan Statistic for Multidimensional Data and Social Media Applications. In: Glaz, J., Koutras, M. (eds) Handbook of Scan Statistics. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8414-1_46-1
DOI: https://doi.org/10.1007/978-1-4614-8414-1_46-1
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8414-1
Online ISBN: 978-1-4614-8414-1
eBook Packages: Springer Reference Mathematics, Reference Module Computer Science and Engineering