Approximations to the birthday problem with unequal occurrence probabilities and their application to the surname problem in Japan

  • Probability Distribution
Annals of the Institute of Statistical Mathematics

Abstract

Let $X_1, X_2, \ldots, X_n$ be i.i.d. random variables with a discrete distribution $\{p_i\}_{i=1}^{m}$. We discuss the coincidence probability $R_n$, i.e., the probability that at least two members of $\{X_i\}$ take the same value. If $m = 365$ and $p_i \equiv 1/365$, this is the famous birthday problem. We also give two kinds of approximation to this probability. Finally, we give two applications. The first is the estimation of the coincidence probability of surnames in Japan; for this purpose, we fit a generalized zeta distribution to frequency data of surnames in Japan. The second is the true birthday problem: we evaluate the birthday probability in Japan using the actual (non-uniform) distribution of birthdays in Japan.
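As an illustration of the quantity defined above, the following is a minimal sketch of an exact computation of the coincidence probability for arbitrary occurrence probabilities. It is not one of the two approximations developed in the article, whose details are not given in the abstract. It relies on the standard identity that the probability of n distinct values equals n! times the degree-n elementary symmetric polynomial of the p_i. The function name `coincidence_probability` is a hypothetical choice made for this example.

```python
def coincidence_probability(p, n):
    """Exact coincidence probability R_n for n iid draws from distribution p.

    Uses P(all n values distinct) = n! * e_n(p_1, ..., p_m), where e_n is the
    elementary symmetric polynomial of degree n. The name and interface are
    illustrative assumptions, not taken from the article.
    """
    if n > len(p):
        return 1.0  # pigeonhole: more draws than possible values
    # a[k] tracks k! * e_k over the probabilities processed so far, i.e. the
    # probability that k draws are distinct and use only those values.
    a = [0.0] * (n + 1)
    a[0] = 1.0
    for pi in p:
        # Descend in k so each probability is used at most once per term.
        for k in range(n, 0, -1):
            a[k] += k * pi * a[k - 1]
    return 1.0 - a[n]


if __name__ == "__main__":
    uniform = [1.0 / 365] * 365
    # Classical birthday problem: 23 people give a probability just above 1/2.
    print(coincidence_probability(uniform, 23))
```

The cost is on the order of m times n operations, which is practical for the birthday case (m = 365) but would be expensive for a large surname list, which is one reason approximations are of interest in that setting.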

Additional information

This research was supported in part by a Grant-in-Aid for Scientific Research from the Ministry of Education, Science and Culture under contract numbers 01540141 and 02640057.

About this article

Cite this article

Mase, S. Approximations to the birthday problem with unequal occurrence probabilities and their application to the surname problem in Japan. Ann Inst Stat Math 44, 479–499 (1992). https://doi.org/10.1007/BF00050700

  • DOI: https://doi.org/10.1007/BF00050700
