Picking the Best Expert from a Sequence
We examine the problem of finding a good expert from a sequence of experts. Each expert has an “error rate”; we wish to find an expert with a low error rate. However, each expert’s error rate is unknown and can only be estimated by a sequence of experimental trials. Moreover, the distribution of error rates is also unknown. Given a bound on the total number of trials, there is thus a tradeoff between the number of experts examined and the accuracy of estimating their error rates.
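The tradeoff described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the algorithm the paper presents: error rates are drawn uniformly from [0, 1] (the paper leaves the distribution unknown), the trial budget is split evenly among the experts examined, and the expert with the fewest observed errors is chosen. The names `pick_expert` and `mean_chosen_error` are hypothetical.

```python
import random


def pick_expert(error_rates, trials_per_expert, rng):
    """Return the index of the expert with the fewest observed errors.

    Each expert is tested for the same number of trials; ties go to the
    expert examined first.
    """
    best_index, best_errors = 0, trials_per_expert + 1
    for i, p in enumerate(error_rates):
        # Each trial is an error with probability p (the hidden error rate).
        errors = sum(rng.random() < p for _ in range(trials_per_expert))
        if errors < best_errors:
            best_index, best_errors = i, errors
    return best_index


def mean_chosen_error(budget, num_experts, runs=2000, seed=0):
    """Average true error rate of the chosen expert over many simulated runs.

    Assumption (not from the paper): error rates are uniform on [0, 1] and
    the budget is divided evenly, so each expert gets budget // num_experts
    trials.
    """
    rng = random.Random(seed)
    trials_per_expert = budget // num_experts
    total = 0.0
    for _ in range(runs):
        rates = [rng.random() for _ in range(num_experts)]
        total += rates[pick_expert(rates, trials_per_expert, rng)]
    return total / runs
```

Calling `mean_chosen_error(100, k)` for several values of `k` shows both ends of the tradeoff: with `k = 1` the single expert is measured accurately but is no better than a random draw, while with `k = 100` many experts are seen but each gets only one trial.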
We present a new expert-finding algorithm and prove an upper bound on the expected error rate of the expert found. A second approach, based on the sequential ratio test, gives another expert-finding algorithm that is not provably better but which performs better in our empirical studies.
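For readers unfamiliar with sequential testing, the following is a minimal sketch of Wald's sequential probability ratio test applied to a single expert. It is the textbook test, not the paper's expert-finding algorithm, and every name and parameter here (`sprt`, `p0`, `p1`, `alpha`, `beta`, `max_trials`) is an assumption chosen for illustration. The test decides between a low error rate `p0` and a high error rate `p1`, stopping as soon as the accumulated evidence crosses one of two thresholds.

```python
import math


def sprt(trial, p0, p1, alpha=0.05, beta=0.05, max_trials=10000):
    """Wald's sequential probability ratio test for one expert.

    trial: zero-argument callable returning True when the expert errs.
    p0, p1: hypothesized low and high error rates, with 0 < p0 < p1 < 1.
    alpha: target probability of rejecting an expert whose rate is p0.
    beta: target probability of accepting an expert whose rate is p1.
    Returns (decision, number_of_trials_used).
    """
    upper = math.log((1 - beta) / alpha)   # crossing this rejects the expert
    lower = math.log(beta / (1 - alpha))   # crossing this accepts the expert
    llr = 0.0  # log-likelihood ratio of "rate is p1" versus "rate is p0"
    for n in range(1, max_trials + 1):
        if trial():
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "reject", n
        if llr <= lower:
            return "accept", n
    return "undecided", max_trials
```

The appeal of a sequential test in this setting is that clearly bad experts tend to be rejected after few trials, which leaves more of the trial budget for examining further experts.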