Abstract
We consider the problem of estimating the parameters of a distribution when the underlying events are themselves unobservable. The aim is to perform a task (for example, searching a web site or querying a distributed database) that depends on a distribution involving the state of nature, without being allowed to observe the individual “states of nature” involved. In particular, we concentrate on the task of searching for an object in a set of $N$ locations (or bins) $\{C_1, C_2, \ldots, C_N\}$, in which the probability of the object being in location $C_i$ is $p_i$, and where $P = [p_1, p_2, \ldots, p_N]^T$ is called the Target Distribution. The probability of locating the object in a bin within a specified time, given that it is in that bin, is given by a function called the Detection function, which, in its most common instantiation, is an exponential function. The intention is to allocate the available resources so as to maximize the probability of locating the object. The handicap, however, is that the time allowed is limited, and thus the fact that the object is not located in bin $C_i$ within a specified time does not necessarily imply that the object is not in $C_i$. This problem has applications in searching large databases, distributed databases, and the world-wide web, where the locations of the files sought are unknown, and in developing various military and strategic policies. All of the research done in this area has assumed knowledge of the $\{p_i\}$. In this paper we consider the problem of obtaining error bounds, estimating the Target Distribution, and allocating the search times when the $\{p_i\}$ are unknown. To the best of our knowledge, these are the first available results in this area, and they are particularly interesting because, as mentioned earlier, the events concerning the Target Distribution are themselves unobservable.
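To make the allocation task in the abstract concrete, the following is a minimal sketch of the classical setting in which the Target Distribution is known, which is the case the abstract says earlier work assumes. It is not the estimation method of this paper. It assumes an exponential Detection function of the form $1 - e^{-k_i t_i}$; the rate parameters `k`, the function names, and the use of bisection on a Lagrange multiplier are illustrative assumptions, not details taken from the source.

```python
import math


def detection_probability(p, k, t):
    """Overall probability of locating the object: sum_i p_i * (1 - exp(-k_i * t_i))."""
    return sum(pi * (1.0 - math.exp(-ki * ti)) for pi, ki, ti in zip(p, k, t))


def allocate_search_time(p, k, total_time, iterations=200):
    """Split total_time across bins to maximize detection_probability.

    Assumes a KNOWN target distribution p and hypothetical detection rates k.
    The optimality conditions give t_i = max(0, ln(p_i * k_i / lam) / k_i);
    the multiplier lam is found by bisection so the times sum to total_time.
    """
    weights = [pi * ki for pi, ki in zip(p, k)]
    hi = max(weights)  # at lam = hi every bin receives zero time
    if hi <= 0 or total_time < 0:
        raise ValueError("need a positive p_i * k_i and a non-negative total_time")

    def times(lam):
        return [math.log(w / lam) / ki if w > lam else 0.0
                for w, ki in zip(weights, k)]

    # Shrink lo until the allocation it implies uses at least total_time.
    lo = hi
    while sum(times(lo)) < total_time:
        lo /= 2.0

    # The total allocated time decreases as lam grows, so bisect on lam.
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if sum(times(mid)) > total_time:
            lo = mid
        else:
            hi = mid
    return times(hi)


# Illustrative values only: three bins with equal detection rates.
p = [0.5, 0.3, 0.2]
k = [1.0, 1.0, 1.0]
t = allocate_search_time(p, k, total_time=1.0)
print(t, detection_probability(p, k, t))
```

With these illustrative values the least likely bin receives no search time at all, which reflects the point made in the abstract: a limited time budget forces the searcher to leave some bins unexamined even though the object may be in them.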
Notes
It is possible to get very good estimates of $\theta$ if one is provided with random occurrences of a known function of $X$. Thus, if instead of receiving $\{X_i\}$ we are provided with $\{Y_i\}$, where, for example, each $Y_i = X_i^2$, an MLE can easily be devised to estimate $\theta$ by observing the $Y_i$’s.
Even though the review is extensive, we have tried to keep it crisp and “to the point”.
Typically, the value of $p_0$ should satisfy $0.5 \le p_0 \le 1$.
We are not aware of anyone who has any real-life data on this. While this is generally a long-term goal for learning from the web, we believe that, for the most part, this is an unsolved problem.
These results are based on the joint work of the second author and his former student, Mr. Amr Ellaithy. The details of these results and other simulations (which are still being compiled) are to be included in a forthcoming paper co-authored by Mr. Ellaithy and the second author.
In the simulation, this essentially meant assigning the binary value of that slot to unity.
For example, consider the scenario of searching for a missing climber on a mountain. Suppose the person was known to be at point $x_0$ at time 0. Then, when the search begins at time $t$, the target is distributed in an area that is centered at $x_0$ and has radius $vt$, where $v$ is the maximum speed at which the person can move. Because the risk is life-threatening, and because there is no time to repeat the search process in order to estimate and compare detection probabilities, intelligent search techniques akin to those we have explained can be crucial to the search endeavour.
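The first note above, on estimating θ from a known function of X, can be illustrated with a small sketch. The source does not say how X is distributed, so the code assumes, purely for illustration, that X is zero-mean Gaussian with variance θ. Under that assumption the likelihood of Y = X² is maximized at the sample mean of the Y values. The function names are hypothetical.

```python
import math
import random


def mle_theta_from_squares(ys):
    """MLE of theta from observations Y_i = X_i^2.

    ASSUMPTION (not from the source): X ~ N(0, theta), theta being the variance.
    The log-likelihood of the Y_i is then, up to a constant,
    -(n / 2) * ln(theta) - sum(Y_i) / (2 * theta), maximized at mean(Y_i).
    """
    return sum(ys) / len(ys)


def simulate_squares(theta, n, seed=0):
    """Draw n values of Y = X^2 with X ~ N(0, theta); the X_i are never returned."""
    rng = random.Random(seed)
    sigma = math.sqrt(theta)
    return [rng.gauss(0.0, sigma) ** 2 for _ in range(n)]


# Illustrative run: the estimator sees only the squared values.
ys = simulate_squares(theta=4.0, n=20000)
print(mle_theta_from_squares(ys))
```

The estimator never sees the X values themselves, only their squares, which is the situation the note describes.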
Additional information
B. J. Oommen is Fellow of the IEEE and IAPR. The work of B. J. Oommen was partially supported by the Natural Sciences and Engineering Research Council of Canada.
About this article
Cite this article
Zhu, Q., Oommen, B.J. Estimation of distributions involving unobservable events: the case of optimal search with unknown Target Distributions. Pattern Anal Applic 12, 37–53 (2009). https://doi.org/10.1007/s10044-007-0095-5
DOI: https://doi.org/10.1007/s10044-007-0095-5