Abstract
Can an agent’s intelligence level be negative? We extend the Legg-Hutter agent-environment framework to include punishments and argue for an affirmative answer to that question. We show that if the background encodings and Universal Turing Machine (UTM) admit certain Kolmogorov complexity symmetries, then the resulting Legg-Hutter intelligence measure is symmetric about the origin. In particular, this implies reward-ignoring agents have Legg-Hutter intelligence 0 according to such UTMs.
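The final claim can be unpacked in a short sketch (notation assumed here: \(\varUpsilon_U\) for the Legg-Hutter measure relative to UTM \(U\), and \(\bar{\pi}\) for the counterpart of agent \(\pi\) under reward negation; this is our gloss, not necessarily the paper's exact notation):

```latex
% Symmetry about the origin: negating rewards negates measured intelligence.
\varUpsilon_U(\bar{\pi}) = -\varUpsilon_U(\pi)
% A reward-ignoring agent behaves identically under negated rewards,
% so it is its own counterpart:
\bar{\pi} = \pi
\ \implies\
\varUpsilon_U(\pi) = \varUpsilon_U(\bar{\pi}) = -\varUpsilon_U(\pi)
\ \implies\
\varUpsilon_U(\pi) = 0.
```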
Notes
1. Thus, this paper falls under the broader program of advocating intelligence measures with ranges other than the nonnegative reals. Alexander has advocated more extreme extensions of the range of intelligence measures [1, 2]; by contrast, here we merely question the assumption that intelligence can never be negative, leaving aside the question of whether intelligence should be real-valued.
2. It is worth mentioning another difference between these two transforms. The hypothetical agent \(\mathrm {AI}_\mu \) with perfect knowledge of the environment’s reward distribution would not change its behavior in response to \(r\mapsto r-1\) (nor indeed in response to any positive affine transform \(r\mapsto ar+b\), \(a>0\)), but it would generally change its behavior in response to \(r\mapsto -r\). Interestingly, this behavioral invariance under \(r\mapsto r-1\) would not hold if \(\mathrm {AI}_\mu \) were capable of “suicide” (deliberately ending the environmental interaction): one should never quit a slot machine that always pays between 0 and 1 dollars, but one should immediately quit a slot machine that always pays between \(-1\) and 0 dollars. The agent AIXI also changes its behavior in response to \(r\mapsto r-1\), and it was recently argued that this can be interpreted in terms of suicide and death: AIXI models its environment using a mixture distribution over a countable class of semimeasures, and its behavior can be interpreted as treating the complement of the domain of each semimeasure as death; see [14].
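The slot-machine point can be illustrated with a toy simulation (a minimal sketch; the payout distribution, horizon, and trial count are illustrative assumptions, not constructions from the paper):

```python
import random

def average_return(quit_immediately: bool, shift: float,
                   horizon: int = 100, trials: int = 2000) -> float:
    """Average total reward from a slot machine paying Uniform(0, 1) + shift
    per pull, for an agent that either quits at once (total reward 0) or
    keeps pulling until the horizon."""
    if quit_immediately:
        return 0.0
    total = 0.0
    for _ in range(trials):
        total += sum(random.uniform(0.0, 1.0) + shift
                     for _ in range(horizon))
    return total / trials

# Payouts in (0, 1): playing on beats quitting (average return near +50 vs 0).
keep_playing = average_return(False, shift=0.0)
# After r -> r - 1, payouts lie in (-1, 0): quitting beats playing on
# (average return near -50 vs 0).
quit_better = average_return(False, shift=-1.0)
```

For a suicide-capable agent the optimal policy thus flips under \(r\mapsto r-1\), even though that map is a positive affine transform of rewards.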
3. Note that measuring intelligence as averaged performance might conflict with certain everyday uses of the word “intelligent”; see Sect. 5.
4. An answer to Leike and Hutter’s [13] question, “what are other desirable [UTM properties]?”
5. To quote Socrates: “Don’t you think the ignorant person would often involuntarily tell the truth when he wished to say falsehoods, if it so happened, because he didn’t know; whereas you, the wise person, if you should wish to lie, would always consistently lie?” [15].
6. Arrange that \(\varUpsilon ^\sqcap _U\) is dominated by \(\mu \) and \(\bar{\mu }\), where \(\mu \) is an environment that initially gives reward 0.01 and then waits for the agent to input the code of a Turing machine T; if the agent does so, \(\mu \) gives reward \(-0.51\), then gives reward 0 while simulating T, and finally gives reward 1 if T halts. If \(\mathrm {sgn}(\varUpsilon ^\sqcap _U(\pi ))\) were computable (even in the weak sense), one could then compute it for strategically chosen agents and thereby solve the Halting Problem.
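The arithmetic behind this reduction can be made explicit in a toy sketch (assuming, as in the note, that the contribution of \(\mu \) dominates, so the sign of the agent's total reward under \(\mu \) determines \(\mathrm {sgn}(\varUpsilon ^\sqcap _U)\); the function names here are hypothetical):

```python
def net_reward_under_mu(t_halts: bool) -> float:
    """Total reward in the environment mu from the note, for an agent that
    inputs the code of a Turing machine T: +0.01 initially, -0.51 after the
    input, reward 0 during simulation, and +1 iff T eventually halts."""
    return 0.01 - 0.51 + (1.0 if t_halts else 0.0)

def sign(x: float) -> int:
    """Sign of x: +1, 0, or -1."""
    return (x > 0) - (x < 0)

# The sign is +1 (total +0.50) when T halts and -1 (total -0.50) when it
# runs forever, so a computable sgn(Upsilon) would decide the Halting Problem.
```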
References
1. Alexander, S.A.: The Archimedean trap: why traditional reinforcement learning will probably not yield AGI. JAGI 11(1), 70–85 (2020)
2. Alexander, S.A., Hibbard, B.: Measuring intelligence and growth rate: variations on Hibbard’s intelligence measure. JAGI 12(1), 1–25 (2021)
3. Bostrom, N.: The superintelligent will: motivation and instrumental rationality in advanced artificial agents. Minds Mach. 22(2), 71–85 (2012)
4. Gamez, D.: Measuring intelligence in natural and artificial systems. J. Artif. Intell. Conscious. 08(2), 285–302 (2021)
5. Gavane, V.: A measure of real-time intelligence. JAGI 4(1), 31–48 (2013)
6. Goertzel, B.: Patterns, hypergraphs and embodied general intelligence. In: IJCNNP, IEEE (2006)
7. Hernández-Orallo, J.: C-tests revisited: back and forth with complexity. In: CAGI (2015)
8. Hernández-Orallo, J., Dowe, D.L.: Measuring universal intelligence: towards an anytime intelligence test. AI 174(18), 1508–1539 (2010)
9. Hibbard, B.: Bias and no free lunch in formal measures of intelligence. JAGI 1(1), 54 (2009)
10. Hibbard, B.: Measuring agent intelligence via hierarchies of environments. In: CAGI (2011)
11. Legg, S., Hutter, M.: Universal intelligence: a definition of machine intelligence. Minds Mach. 17(4), 391–444 (2007)
12. Legg, S., Veness, J.: An approximation of the universal intelligence measure. In: Dowe, D.L. (ed.) Algorithmic Probability and Friends. Bayesian Prediction and Artificial Intelligence. LNCS, vol. 7070, pp. 236–249. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-44958-1_18
13. Leike, J., Hutter, M.: Bad universal priors and notions of optimality. In: Conference on Learning Theory, pp. 1244–1259. PMLR (2015)
14. Martin, J., Everitt, T., Hutter, M.: Death and suicide in universal artificial intelligence. In: CAGI (2016)
15. Plato: Lesser Hippias. In: Cooper, J.M., Hutchinson, D.S., et al. (eds.) Plato: Complete Works. Hackett Publishing, Indianapolis (1997)
Acknowledgments
We acknowledge José Hernández-Orallo, Shane Legg, Pedro Ortega, and the reviewers for comments and feedback.
© 2022 Springer Nature Switzerland AG
Alexander, S.A., Hutter, M. (2022). Reward-Punishment Symmetric Universal Intelligence. In: Goertzel, B., Iklé, M., Potapov, A. (eds) Artificial General Intelligence. AGI 2021. Lecture Notes in Computer Science(), vol 13154. Springer, Cham. https://doi.org/10.1007/978-3-030-93758-4_1
Print ISBN: 978-3-030-93757-7
Online ISBN: 978-3-030-93758-4