Scan Statistics, pp. 1–25

# Joseph Naus: Father of the Scan Statistic

## Abstract

The literature on the scan statistic is currently vast and growing exponentially in diverse directions, with contributions by many researchers and groups. As time goes on, the early history of the problem bears telling. Joseph Naus, the father of the scan statistic, originated the modern work on the topic. The process took almost twenty years to reach maturity; I have chosen Naus (1982) as the marker of this maturity. The very name “scan statistic” does not appear to have become attached to the problem for fifteen years, and the interconnections among what is now regarded as a single problem, in both its statement and its common methods of solution, were far from obvious originally. This chapter will not attempt a full review of all of Naus’s statistical contributions, or even a full review of his contributions as they concern the scan statistic. Instead, it will focus on a few themes that had already emerged in Naus’s first twenty years of written research (1962–1982), and briefly follow those threads to the present. Since these early themes include such general issues as applications of the scan statistic, mentoring graduate students, and specific methodological issues, the review will encompass a significant portion of Dr. Naus’s research, without claiming to be exhaustive regarding either his research or the much broader body of research on the scan statistic that he influenced.

## Keywords

Exact Probability, Multiple Coverage, Poisson Approximation, Generalized Likelihood Ratio Test, Fixed Grid

## References

1. Balakrishnan, N. and Koutras, M.V. (2002). *Runs and Scans with Applications*, Wiley, New York.
2. Barton, D.E. and Mallows, C.L. (1965). Some aspects of the random sequence, *Annals of Mathematical Statistics*, **36**, 236–260.
3. Berg, W. (1945). Aggregates in one- and two-dimensional random distributions, *The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science*, **36**, 319–336.
4. Burnside, W. (1928). *Theory of Probability*, Cambridge University Press, Cambridge.
5. Cressie, N. (1977). On some properties of the scan statistic on the circle and the line, *Journal of Applied Probability*, **14**, 272–283.
6. Cressie, N. (1979). An optimal statistic based on higher order gaps, *Biometrika*, **66**, 619–627.
7. Ederer, F., Myers, M.H., and Mantel, N. (1964). A statistical problem in space and time: Do leukemia cases come in clusters? *Biometrics*, **20**, 626–636.
8. Elteren, Van P.H. and Gerrits, H.J.M. (1961). Een wachtprobleem voorkomende bij drempelwaardemetingen aan het oog, *Statistica Neerlandica*, **15**, 385–401.
9. Erdös, P. and Rényi, A. (1970). On a new law of large numbers, *Journal d’Analyse Mathématique*, **23**, 103–111.
10. Feller, W. (1958). *An Introduction to Probability Theory and its Applications*, Vol. I, 2nd Edition, John Wiley & Sons, New York.
11. Fisher, R.A. (1959). *Statistical Methods and Scientific Inference*, Hafner, New York.
12. Fu, J.C. and Lou, W.Y.W. (2003). *Distribution Theory of Runs and Patterns and Its Applications*, World Scientific, Singapore.
13. Glaz, J. (1978). *Multiple Coverage and Clusters on the Line*, Ph.D. thesis, Rutgers University, New Brunswick, NJ.
14. Glaz, J. and Balakrishnan, N., Editors (1999). *Scan Statistics and Applications*, Birkhäuser, Boston, MA.
15. Glaz, J. and Naus, J. (1979). Multiple coverage of the line, *Annals of Probability*, **7**, 900–906.
16. Glaz, J. and Naus, J. (1983). Multiple clusters on the line, *Communications in Statistics—Theory and Methods*, **12**, 1961–1986.
17. Glaz, J. and Naus, J. (1986). Approximating probabilities of first passage in a particular Gaussian process, *Communications in Statistics*, **15**, 1709–1722.
18. Glaz, J. and Naus, J. (1991). Tight bounds and approximations for scan statistic probabilities for discrete data, *Annals of Applied Probability*, **1**, 306–318.
19. Glaz, J. and Naus, J. (2005). Scan Statistics and Applications, *Encyclopedia of Statistical Sciences*, 2nd Edition, S. Kotz, N. Balakrishnan, C.B. Read and B. Vidacovic, eds., 7463–7471, Wiley, New York.
20. Glaz, J., Naus, J., Roos, M., and Wallenstein, S. (1994). Poisson approximations for the distribution and moments of ordered *m*-spacings, *Journal of Applied Probability*, **31A**, 271–281.
21. Glaz, J., Naus, J., and Wallenstein, S. (2001). *Scan Statistics*, Springer-Verlag, New York.
22. Greenberg, M., Naus, J., Schneider, D., and Wartenberg, D. (1991). Temporal clustering of homicide and suicide among 15–24 year old white and black Americans, *Ethnicity and Disease*, **1**, 342–350.
23. Huntington, R. and Naus, J.I. (1975). A simpler expression for *k*th nearest neighbor coincidence probabilities, *Annals of Probability*, **3**, 894–896.
24. Hwang, F.K. (1977). A generalization of the Karlin-McGregor theorem on coincidence probabilities and an application to clustering, *Annals of Probability*, **5**, 814–817.
25. Ikeda, S. (1965). On Bouman-Velden-Yamamoto’s asymptotic evaluation formula for the probability of visual response in a certain experimental research in quantum biophysics of vision, *Annals of the Institute of Statistics and Mathematics*, **17**, 295–310.
26. Karlin, S. and McGregor, G. (1959). Coincidence probabilities, *Pacific Journal of Mathematics*, **9**, 1141–1164.
27. Karwe, V.V. and Naus, J. (1997). New recursive methods for scan statistic probabilities, *Computational Statistics and Data Analysis*, **23**, 389–402.
28. Kulldorff, M. (1997). A spatial scan statistic, *Communications in Statistics, A—Theory and Methods*, **26**, 1481–1496.
29. Kulldorff, M. (2001). Prospective time-periodic geographical disease surveillance using a scan statistic, *Journal of Royal Statistical Society A*, **164**, 61–72.
30. Kulldorff, M. and Williams, G. (1997). *SaTScan v. 1.0, Software for the Space and Space-Time Scan Statistics*, National Cancer Institute, Bethesda, MD.
31. Loader, C. (1991). Large deviation approximations to the distribution of scan statistics, *Advances in Applied Probability*, **23**, 751–771.
32. Mack, C. (1948). An exact formula for *Q*<sub>*k*</sub>(*n*), the probable number of *k*-aggregates in a random distribution of *n* points, *The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science*, **39**, 778–790.
33. Mack, C. (1950). The expected number of aggregates in a random distribution of *n* points, *Proceedings Cambridge Philosophical Society*, **46**, 285–292.
34. Menon, M.V. (1964). Clusters in a Poisson process [abstract], *Annals of Mathematical Statistics*, **35**, 1395.
35. Naus, J. (1962). The distribution of the maximum number of points on the line, *ASD Paper 8*.
36. Naus, J. (1963). *Clustering of Random Points in the Line and Plane*, Ph.D. thesis, Rutgers University, New Brunswick, NJ.
37. Naus, J. (1965a). The distribution of the size of the maximum cluster of points on a line, *Journal of the American Statistical Association*, **60**, 532–538.
38. Naus, J. (1965b). Clustering of random points in two dimensions, *Biometrika*, **52**, 263–267.
39. Naus, J. (1966a). A power comparison of two tests of non-random clustering, *Technometrics*, **8**, 493–517.
40. Naus, J. (1966b). Some probabilities, expectations, and variances for the size of largest clusters, and smallest intervals, *Journal of the American Statistical Association*, **61**, 1191–1199.
41. Naus, J. (1968). An extension of the birthday problem, *American Statistician*, **22**, 27–29.
42. Naus, J. (1974). Probabilities for a generalized birthday problem, *Journal of the American Statistical Association*, **69**, 810–815.
43. Naus, J. (1979). An indexed bibliography of clusters, clumps and coincidences, *International Statistical Review*, **47**, 47–78.
44. Naus, J. (1982). Approximations for distributions of scan statistics, *Journal of the American Statistical Association*, **77**, 177–183.
45. Naus, J. (1988). Scan statistics, *Encyclopedia of Statistical Sciences*, Vol. 8, 281–284, N.L. Johnson and S. Kotz, eds., Wiley, New York.
46. Naus, J. (2006). Scan Statistics, *Handbook of Engineering Statistics*, H. Pham, ed., Chapter 43, 775–790, Springer-Verlag, New York.
47. Naus, J. and Sheng, K.N. (1996). Screening for unusual matched segments in multiple protein sequences, *Communications in Statistics: Simulation and Computation*, **25**, 937–952.
48. Naus, J. and Sheng, K.N. (1997). Matching among multiple random sequences, *Bulletin of Mathematical Biology*, **59**, 483–496.
49. Naus, J. and Wartenberg, D. (1997). A double scan statistic for clusters of two types of events, *Journal of the American Statistical Association*, **92**, 1105–1113.
50. Naus, J. and Wallenstein, S. (2004). Simultaneously testing for a range of cluster or scanning window sizes, *Methodology and Computing in Applied Probability*, **6**, 389–400.
51. Naus, J. and Wallenstein, S. (2006). Temporal surveillance using scan statistics, *Statistics in Medicine*, **25**, 311–324.
52. Neff, N. and Naus, J. (1980). The distribution of the size of the maximum cluster of points on a line, *IMS Series of Selected Tables in Mathematical Statistics*, Vol. VI, AMS, Providence, RI.
53. Newell, G.F. (1963). Distribution for the smallest distance between any pair of *k*th nearest-neighbor random points on a line, *Time series analysis, Proceedings of a conference held at Brown University*, M. Rosenblatt editor, pp. 89–103, John Wiley & Sons, New York.
54. Ozols, V. (1956). Generalization of the theorem of Gnedenko-Korolyuk to three samples in the case of two one-sided boundaries, *Latvijas PSR Zinatnu Akad. Vestis*, **10**(111), 141–152.
55. Parzen, E. (1960). *Modern Probability Theory and its Applications*, John Wiley & Sons, New York.
56. Rabinowitz, L. and Naus, J. (1975). The expectation and variance of the number of components in random linear graphs, *Annals of Probability*, **3**, 159–161.
57. Samuel-Cahn, E. (1983). Simple approximations to the expected waiting time for a cluster of any given size for point processes, *Advances in Applied Probability*, **15**, 21–38.
58. Saperstein, B. (1972). The generalized birthday problem, *Journal of the American Statistical Association*, **67**, 425–428.
59. Sheng, K.N. and Naus, J. (1994). Pattern matching between two non-aligned random sequences, *Bulletin of Mathematical Biology*, **56**, 1143–1162.
60. Sheng, K.N. and Naus, J. (1996). Matching fixed rectangles in 2-dimensions, *Statistics and Probability Letters*, **26**, 83–90.
61. Silberstein, L. (1945). The probable number of aggregates in random distributions of points, *The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science*, **36**, 319–336.
62. Takacs, L. (1961). On a coincidence problem concerning particle counters, *Annals of Mathematical Statistics*, **32**, 739–756.
63. Wallenstein, S.R. and Naus, J. (1973). Probabilities for the *k*th nearest neighbor problem on the line, *Annals of Probability*, **1**, 188–190.
64. Wallenstein, S. and Naus, J. (1974). Probabilities for the size of largest clusters and smallest intervals, *Journal of the American Statistical Association*, **69**, 690–697.
65. Wallenstein, S., Naus, J., and Glaz, J. (1993). Power of the scan statistic for the detection of clustering, *Statistics in Medicine*, **12**, 1829–1843.
66. Wallenstein, S. and Neff, N. (1987). An approximation for the distribution of the scan statistic, *Statistics in Medicine*, **6**, 197–207.
67. Wolf, E. and Naus, J. (1973). Tables of critical values for a *k*-sample Kolmogorov-Smirnov test statistic, *Journal of the American Statistical Association*, **68**, 994–997.