Biomimetics of Choice Behaviour for Autonomous Agents
Autonomy is inseparable from choice. Natural agents, including humans, make choices with probabilities that are proportional to the average reward from each choice (the ‘matching law’), rather than always choosing the alternative with maximum reward. Imbuing artificial agents with this flexibility is non-trivial. We introduce anti-competition mapping to overcome the tendency for high-reward choices to dominate (mask) those with lower reward. The simplest solution is to map reward to new choice signals that generate proportional hazard functions. We demonstrate a procedure based on the reciprocal Normal distribution, a popular model of human reaction times. We then discuss the value of the matching law in synthetic agents.
Keywords: Matching law, decision making, reward, choices, rate of response, hazard rate, exploitation, exploration
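As a rough illustration of why proportional hazard functions yield matching, the sketch below races independent exponential waiting times whose constant hazard rates are set equal to the rewards. This is not the reciprocal Normal procedure described in the abstract; the exponential race is assumed here only because it is the simplest proportional-hazards case, and the function names and reward values are hypothetical.

```python
import random


def matching_choice(rewards, rng):
    """Return the index of the option whose waiting time finishes first.

    Assumption: each option's waiting time is exponential with a constant
    hazard rate equal to its (strictly positive) reward.
    """
    times = [rng.expovariate(r) for r in rewards]
    return min(range(len(rewards)), key=times.__getitem__)


def maximising_choice(rewards):
    """Baseline for contrast: always pick the option with the largest reward."""
    return max(range(len(rewards)), key=rewards.__getitem__)


def choice_frequencies(rewards, trials=100_000, seed=0):
    """Estimate how often each option wins the race over many trials."""
    rng = random.Random(seed)
    counts = [0] * len(rewards)
    for _ in range(trials):
        counts[matching_choice(rewards, rng)] += 1
    return [c / trials for c in counts]


rewards = [3.0, 1.0]  # hypothetical average rewards for two options
freqs = choice_frequencies(rewards)
print(freqs)                       # close to rewards[i] / sum(rewards)
print(maximising_choice(rewards))  # the maximiser never explores option 1
```

In this race the probability that option i finishes first is its hazard rate divided by the sum of all hazard rates, so choice frequencies track relative reward, whereas the maximising baseline masks the lower-reward option entirely.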