Human Interactive Proofs and Document Image Analysis
The recently initiated and rapidly developing research field of ‘human interactive proofs’ (HIPs) and its implications for the document image analysis (DIA) research field are described. Over the last five years, efforts to defend Web services against abuse by programs (‘bots’) have led to a new family of security protocols able to distinguish between human and machine users. AltaVista pioneered this technology in 1997 [Bro01, LABB01]. By the summer of 2000, Yahoo! and PayPal were using similar methods. In the fall of 2000, Prof. Manuel Blum of Carnegie-Mellon University and his team, stimulated by Udi Manber of Yahoo!, were studying these and related problems [BAL00]. Soon thereafter, a collaboration between the University of California at Berkeley and the Palo Alto Research Center (PARC) built a tool based on systematically generated image degradations [CBF01]. In January 2002, Prof. Blum and the present authors ran the first workshop on HIPs (at PARC). HIPs were defined there broadly as a class of challenge/response protocols that allow a human to authenticate herself as a member of a given group: e.g., human (vs. machine), herself (vs. anyone else), an adult (vs. a child), etc. All commercial uses of HIPs known to us exploit the gap in ability between human and machine vision systems in reading images of machine-printed text. Many technical issues that have been systematically studied by the DIA community are relevant to the HIP research program. This paper describes the evolution of HIP R&D, applications of HIPs now and on the horizon, highlights of the first HIP workshop, and proposals for a DIA research agenda to advance the state of the art of HIPs.
Keywords: Human interactive proofs, document image analysis, CAPTCHAs, abuse of web sites and services, the chatroom problem, human/machine discrimination, Turing tests, OCR performance evaluation, document image degradations, legibility of text
- [Bai92] H. S. Baird, “Document Image Defect Models,” in H. S. Baird, H. Bunke, and K. Yamamoto, Eds., Structured Document Image Analysis, Springer-Verlag: New York, 1992, pp. 546–556.
- [BAL00] M. Blum, L. A. von Ahn, and J. Langford, The CAPTCHA Project, “Completely Automatic Public Turing Test to tell Computers and Humans Apart,” http://www.captcha.net, Dept. of Computer Science, Carnegie-Mellon Univ., and personal communications, November, 2000.
- [Bar01] D. P. Baron, “eBay and Database Protection,” Case No. P-33, Case Writing Office, Stanford Graduate School of Business, 518 Memorial Way, Stanford Univ., Stanford, CA 94305-5015, 2001.
- [Bro01] Alta Vista’s “Add-URL” site: altavista.com/sites/addurl/newurl, protected by the earliest known CAPTCHA.
- [CBF01] A. L. Coates, H. S. Baird, and R. Fateman, “Pessimal Print: a Reverse Turing Test,” Proc., IAPR 6th Intl. Conf. on Document Analysis and Recognition, Seattle, WA, September 10–13, 2001, pp. 1154–1158.
- [Cro82] R. G. Crowder, The Psychology of Reading, Oxford University Press, 1982.
- [GKB83] L. M. Gentile, M. L. Kamil, and J. S. Blanchard, Reading Research Revisited, Charles E. Merrill Publishing, 1983.
- [HB97] T. K. Ho and H. S. Baird, “Large-Scale Simulation Studies in Image Pattern Recognition,” IEEE Trans. on PAMI, Vol. 19, No. 10, pp. 1067–1079, October 1997.
- [HB01] N. J. Hopper and M. Blum, “Secure Human Identification Protocols,” in C. Boyd, Ed., Advances in Cryptology, Proceedings of Asiacrypt 2001, LNCS 2248, pp. 52–66, Springer-Verlag, Berlin, 2001.
- [Kan96] T. Kanungo, Document Degradation Models and Methodology for Degradation Model Validation, Ph.D. Dissertation, Dept. EE, Univ. Washington, March 1996.
- [KLB01] T. Kanungo, C. H. Lee, and R. Bradford, “What Fraction of Images on the Web Contain Text?”, Proc. of Int. Workshop on Web Document Analysis, Seattle, WA, Sept. 8, 2001, web publication only, at http://www.csc.liv.ac.uk/~wda2001.
- [KWB80] P. A. Kolers, M. E. Wrolstad, and H. Bouma, Processing of Visible Language 2, Plenum Press, 1980.
- [LABB01] M. D. Lillibridge, M. Abadi, K. Bharat, and A. Z. Broder, “Method for Selectively Restricting Access to Computer Systems,” U.S. Patent No. 6,195,698, Issued February 27, 2001.
- [NS96] G. Nagy and S. Seth, “Modern optical character recognition,” in The Froehlich / Kent Encyclopaedia of Telecommunications, Vol. 11, pp. 473–531, Marcel Dekker, NY, 1996.
- [Pav00] T. Pavlidis, “Thirty Years at the Pattern Recognition Front,” King-Sun Fu Prize Lecture, 11th ICPR, Barcelona, September, 2000.
- [PBFM02] D. G. Pelli, C. W. Burns, B. Farell, and D. C. Moore, “Identifying letters,” Vision Research, [accepted with minor revisions; to appear], 2002.
- [RNN99] S. V. Rice, G. Nagy, and T. A. Nartker, OCR: An Illustrated Guide to the Frontier, Kluwer Academic Publishers, 1999.
- [RJN96] S. V. Rice, F. R. Jenkins, and T. A. Nartker, “The Fifth Annual Test of OCR Accuracy,” ISRI TR-96-01, Univ. of Nevada, Las Vegas, 1996.
- [SCA00] A. P. Saygin, I. Cicekli, and V. Akman, “Turing Test: 50 Years Later,” Minds and Machines, 10(4), Kluwer, 2000.