Process Account of Curiosity and Interest: A Reward-Learning Perspective

Previous studies have suggested roles for curiosity and interest in knowledge acquisition and exploration, but there has been a long-standing debate about how to define these concepts and whether they are related or different. In this paper, we address the definition issue by arguing that there is an inherent difficulty in defining curiosity and interest, because both are naïve concepts, which are not supposed to have a priori scientific definitions. We present a reward-learning framework of autonomous knowledge acquisition and use this framework to illustrate the importance of a process account as an alternative way to advance our understanding of curiosity and interest without being troubled by their definitions. The framework centers on the role of the rewarding experience associated with knowledge acquisition and learning and posits that the acquisition of new knowledge strengthens the value of further information. Critically, we argue that curiosity and interest are concepts that people subjectively construe through this knowledge-acquisition process. Finally, we discuss the implications of the reward-learning framework for education and empirical research in educational psychology.

, and enhancing student interest and curiosity has been one of the primary goals in many educational programs (e.g., Engel 2015; Harackiewicz et al. 2012; Hulleman and Harackiewicz 2009). Indeed, a vast number of empirical studies over the past decades have revealed the beneficial effects of curiosity and interest on a variety of learning outcomes such as self-regulation and academic performance (e.g., Hidi 1990; Sansone et al. 2010; von Stumm et al. 2011).
Despite the abundance of literature, the distinction between curiosity and interest has been a source of frustration for researchers working on these concepts. Researchers have discussed the similarities and differences between curiosity and interest (Grossnickle 2014; Jirout and Klahr 2012; Silvia 2006), but they have failed to reach agreed-upon definitions. Researchers often use the two terms interchangeably, or they strategically use only one of them. However, such practices are frequently discouraged by editors who request precise and distinctive definitions. These requests to separately define curiosity and interest often put researchers in a difficult position. On the one hand, it is true that we need precise and accurate definitions of concepts to conduct scientific work. On the other hand, it may not be easy to provide a definition that everyone agrees with.
The current article aims to address the definition issue of curiosity and interest in the following way. First, we point out that the difficulty of defining curiosity and interest lies in the fact that they reflect our subjective construction of an underlying psychological process (i.e., they are naïve concepts). To make our point, we propose a process account of curiosity and interest. Specifically, we present a reward-learning framework of autonomous knowledge acquisition, arguing that curiosity and interest do not have unique characteristics that define them, but that they are what people subjectively construct to capture some aspects of this knowledge-acquisition process. Second, by presenting the reward-learning framework, we highlight the importance of reward learning in the knowledge-acquisition process underlying curiosity and interest. This framework is greatly inspired by prominent theorizations of curiosity and interest in psychology (e.g., Hidi and Renninger 2006; Loewenstein 1994) and neuroscience (e.g., Gottlieb and Oudeyer 2018) and represents our preliminary attempt to provide insight into how a wide variety of ideas of curiosity and interest can be understood in a coherent manner.

A Process Account as an Alternative Approach to Understand Curiosity and Interest
Curiosity and interest are both naïve (folk) concepts in origin and are unlikely to have a priori objective definitions. Long before psychologists initiated the scientific endeavor to understand curiosity and interest, these words had been used by people in daily life. People use the terms "curiosity" and "interest" in everyday life to describe distinct subjective feelings that are different from other mental states. These subjective feelings are likely to reflect a certain set of behavioral, psychological, and neural processes. However, lay people do not normally have the capability to directly access these mechanisms (Nisbett and Wilson 1977); they intuitively and loosely construct the concepts of curiosity and interest from their subjective feelings. Therefore, there is no reason to expect that curiosity and interest have a priori essential defining characteristics. Thus, in our view, searching for correct definitions of curiosity and interest may not be a very fruitful task (see Kidd and Hayden 2015, for a similar point), because there is unlikely to be a definite right answer to this type of question.
Note that by calling curiosity and interest naïve concepts, we do not mean that studying curiosity and interest is futile. Researchers can still examine the behavioral, psychological, and neural processes underlying these constructs. This has significant theoretical and practical implications. Indeed, previous theoretical and empirical work on curiosity and interest has provided critical insights into these mechanisms (e.g., Hidi and Renninger 2006; Loewenstein 1994; Silvia 2001), leading to a greatly improved understanding of the roles of curiosity and interest in various educational issues. However, even if we were able to correctly specify the psychological and neural mechanisms, our naïve perceptions of curiosity and interest would likely still be too subjective and ambiguous to map directly and precisely onto these mechanisms. In other words, understanding the underlying processes would not necessarily tell us the correct definition of curiosity and interest.
To instantiate our point, in this article, we present a reward-learning framework of autonomous knowledge acquisition (for a full-fledged version of the framework, see Murayama, 2019a) as a process account of curiosity and interest. The framework centers on reward learning as the key to understanding sustainable knowledge acquisition. The critical role of reward processing in curiosity and interest has been a recurrent idea in the literature (Berlyne 1971; Panksepp 1998) and has received increased attention in the recent literature of cognitive neuroscience (Gruber et al. 2014; Kang et al. 2009; Kidd and Hayden 2015; Murayama et al. 2010; Sakaki et al. 2018). However, the role of reward processing in curiosity and interest has been somewhat overlooked in the literature of educational psychology, with a few notable exceptions (e.g., Ainley and Hidi 2014; Hidi 2016; Renninger and Hidi 2016). But as we will see, it is essential to incorporate the idea of reward learning to explain a variety of aspects of the knowledge-acquisition process (see also General Discussion on this issue). Our reward-learning framework is greatly inspired by the neuroscientific literature mentioned above, as well as by a number of influential theories on curiosity and interest in psychology, including the knowledge-gap theory of curiosity (Loewenstein 1994), the four-phase model of interest development (Hidi and Renninger 2006; for an updated version, see Renninger and Hidi 2016), the expectancy-value approach to interest (Schiefele 2009; Wigfield and Cambria 2010), and the self-regulation of motivation model (Sansone and Thoman 2005). These existing theories have provided considerable insights into the psychological process underlying curiosity and interest. Thus, our reward-learning framework can be considered an integration of these neuroscientific and psychological theories on curiosity and interest.
Critically, however, the current framework attempts to explain curiosity and interest with a knowledge-acquisition process that does not include curiosity or interest as a constituent element. Our argument is that curiosity and interest are concepts that people subjectively construct or interpret to capture some aspects of the knowledge-acquisition process; as such, once we have explained the precise neural and psychological processes that underpin knowledge acquisition, it is no longer necessary to assume curiosity or interest in the psychological process itself. Our framework also sheds light on how we should approach the various operational conceptualizations of curiosity and interest that researchers have made in the literature (for reviews, see Grossnickle 2014; Renninger and Hidi 2011); these operational conceptualizations can be considered important parts of the knowledge-acquisition process, but as we will show, our framework indicates that there is no need to decide which conceptualization is the right one. We suggest that, to understand students' motivation and learning in education, our primary attention and effort should be paid to this knowledge-acquisition process, rather than to defining curiosity or interest themselves. We think the current state of confusion about curiosity and interest arises because researchers started from these naïve concepts as if they had some definitive features. But we should change the order of thinking: we should start by thinking about how we can sustain the knowledge-acquisition process, and then turn to how this process maps onto our various views on curiosity and interest.

A Reward-Learning Framework of Autonomous Knowledge Acquisition
General Framework
Figure 1 presents the proposed framework that explains how people engage in sustainable knowledge acquisition (Murayama, 2019a). Essentially, the framework posits that the acquisition of knowledge serves as a reward and that the expected feeling of reward is the key modulator of our information-seeking behavior. When gaps in knowledge are recognized, people initiate information-seeking behavior because they expect the positive rewarding value of acquiring the knowledge. The information-seeking behavior results in acquisition of knowledge, which is integrated into people's pre-existing knowledge base. Importantly, the acquisition generates the actual feeling of reward, and this rewarding feeling strengthens further information-seeking behavior by increasing the expected reward value of new information. Furthermore, the expanded knowledge base also facilitates the awareness of further knowledge gaps, motivating additional information-seeking behavior. Consequently, the process forms a positive feedback loop, enabling a sustainable knowledge acquisition/seeking process of learning.
Fig. 1 A reward-learning framework of autonomous knowledge acquisition
The framework is based on reward-learning (or reinforcement-learning) models (Dayan and Niv 2008; Montague and Berns 2002), but we incorporated several features that are specific to the knowledge-acquisition process (e.g., new knowledge provides room for further knowledge gaps). The reinforcement-learning model is an extension of modern operant learning theory (Berridge 2004; Dickinson and Balleine 2002) and posits that our behavior is guided by the rewarding value of the behavior, which is computed and updated through a reinforcement process. Notably, although some researchers (especially researchers in education) tend to use "reward" to refer only to extrinsic incentives such as food or money (Kohn 1993), reinforcement-learning theory is not concerned with whether the rewarding value comes from extrinsic incentives or from more internally generated values (often called "intrinsic rewards"; see Murayama 2019b). In our framework, we use the term reward in this general sense, rather than referring to tangible rewards such as money. The stages of the model are described in more detail in the following sections.

Knowledge Gaps and Information-Seeking Behavior
We begin by describing the core knowledge-acquisition process as it is specified in our framework. We start from the state where people become aware of a lack of knowledge on a specific topic (e.g., how to solve quadratic equations in mathematics). This is also equivalent to a state of uncertainty. As Loewenstein (1994) argued in his influential knowledge-gap theory of curiosity, when a knowledge gap is made salient, people are motivated to initiate information-seeking behavior to acquire the knowledge. The degree to which people are motivated to seek information depends on the "expected" rewarding value of the information (e.g., "how pleasant would it be to understand the solution of a quadratic equation?"). If the knowledge is not felt to be rewarding enough, there is little motivation for information-seeking behavior.
Information-seeking behavior is likely to lead to acquisition of knowledge. It should be noted that, in the context of classroom education, information or knowledge may be provided externally (e.g., the solution is simply taught by a teacher), which does not involve students' explicit information-seeking behavior. However, even in such a case, whether the student understands the externally provided information depends on the extent to which the student actively processes the information, and in this respect, students still mentally perform information seeking. In other words, information-seeking behavior does not always mean explicit and visible behavior; it also includes mental sense-making processes.

Knowledge Acquisition and Rewarding Experience
Upon the acquisition of new knowledge, people experience the feeling of reward (e.g., the excitement produced when you think you have understood how to solve the quadratic equation). Indeed, some studies have shown that closing knowledge gaps, for example by discovering what a blurred picture represents, activates the reward network in the brain (Brydevall et al. 2018; Jepma et al. 2012; Ligneul et al. 2017). This rewarding experience is at the core of the sustainable knowledge-acquisition process, because this experience reinforces the value of new knowledge, motivating further information-seeking behavior (explained in more detail later). The magnitude of this rewarding feeling is of course influenced by the (subjective) amount of knowledge the person gained (the magnitude of uncertainty reduction) but also by whether and how people value the new knowledge in light of their pre-existing knowledge base (the value signal; shown as a dashed arrow in Fig. 1).
Here we refer to the knowledge base as the compound of a person's pre-existing knowledge, goals, and experiences (e.g., a student's general knowledge about algebra, their reasons for studying algebra, their personal experiences in mathematics class). In this respect, the knowledge base can also be called the self-scheme (Conway and Pleydell-Pearce 2000). Importantly, the rewarding value of every piece of new information is gauged in relation to this knowledge base: "is the new information consistent with my future goal?" and "how much is the new information related to my pre-existing knowledge?". For example, understanding how to solve quadratic equations should provide some rewarding experience to many people, but this would be more so if they want to study mathematics in their future career or if they can connect the knowledge to other domains such as physics. It is important to note that the newly acquired knowledge would also contribute to the knowledge base, and it could boost the value of new information in the next learning cycle (e.g., learning a new way of factoring to solve quadratic equations can increase the value of solving more quadratic equations).

Formation of the Positive Feedback Loop
After this phase, two psychological mechanisms operate to strengthen further information-seeking behavior. First, the rewarding experience updates the expected value of future new information, which would influence information-seeking behavior in the future. According to the reinforcement-learning model, the amount of change in the expected reward value is determined by so-called reward prediction errors, that is, the difference between the expected reward value and the actual reward value that individuals obtained. In other words, reward prediction errors represent how much individuals are "surprised" by the new reward. Similarly, in our reward-learning model of knowledge acquisition, we suppose that the amount of change in the reward value of the information is (at least partially) determined by how surprising the new information was. Marvin and Shohamy (2016) called this surprise signal "information prediction errors." Formally, information prediction errors are defined as the reward value of the new information relative to the expected reward value of the new information. If the obtained information is unexpected, information prediction errors are positive (i.e., individuals are surprised) and increase the expected rewarding value of future new information. Indeed, previous research has indicated that new information-seeking behavior is enhanced when new knowledge is inconsistent with people's expectations (high confidence error; e.g., when people learned that the statement "chameleons match their color to their environment" is false; Vogl et al. in press). On the other hand, if the new information is not more than what people expected, people tend to be disappointed and the value of new information may be undermined (e.g., when a student found out that solving a quadratic equation requires no more skills than just applying a formula or when people learned that the solution of a detective story is nothing more than what they expected).
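The update described above can be illustrated with a minimal sketch. It assumes a simple delta rule, in which the expected reward value moves toward the obtained value by a fraction of the prediction error; the function name `update_expected_value` and the `learning_rate` parameter are illustrative assumptions and are not specified by the framework.

```python
def update_expected_value(expected, obtained, learning_rate=0.2):
    """Return (new_expected, prediction_error) after one learning episode.

    Assumption: a delta-rule update. `learning_rate` is a hypothetical
    free parameter, not a quantity specified in the framework.
    """
    # Prediction error: obtained reward value relative to the expected value.
    prediction_error = obtained - expected
    # Positive errors raise the expected value of future information;
    # negative errors (disappointment) lower it.
    new_expected = expected + learning_rate * prediction_error
    return new_expected, prediction_error


# Information that turned out to be more rewarding than expected.
surprised_value, surprise = update_expected_value(expected=0.4, obtained=0.9)
# Information that offered less than expected.
let_down_value, let_down = update_expected_value(expected=0.4, obtained=0.1)
```

In this sketch, the first call yields a positive prediction error and a higher expected value, whereas the second yields a negative error and a lower expected value, mirroring the surprise and disappointment cases in the text.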
The expanded knowledge base also triggers the second mechanism that strengthens information-seeking behavior. Specifically, in real learning situations, the acquisition of new knowledge and the expanded knowledge base often create more questions, which leads to further awareness of knowledge gaps. This notion was pointed out by Renninger (2000) and is nicely phrased by Loewenstein (1994) as "new information provides ever-changing idea of what there is to be known" (p. 89). For example, upon understanding the solution of quadratic equations, students may wonder whether it is possible to solve cubic equations in a similar manner. Indeed, previous studies have suggested positive relations between domain knowledge (e.g., knowledge about psychology, which was assessed by a test asking factual matters in psychology) and domain-specific interest (assessed by self-reported questions; e.g., Alexander et al. 1995; Lawless and Kulikowich 2006). Note that this feature does not exist for standard reinforcement learning with extrinsic incentives. Once extrinsic rewards are consumed (e.g., the food is eaten), hunger is immediately satiated, and consumption does not normally produce further appetite. In that respect, reward-learning processes for knowledge and extrinsic rewards are critically different, although both learning processes are governed by the same computational mechanism.
Importantly, these two processes explain why people can sustain or even accelerate their commitment to the learning process over time: knowledge acquisition has an inherent mechanism to self-boost the expected reward value of information as learning progresses. In other words, the knowledge-acquisition process involves cyclical generation of intrinsic rewards through the formation of a positive feedback loop.
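To make the interplay of the two mechanisms concrete, the following toy simulation may help. It is only a sketch under strong simplifying assumptions that are not part of the framework itself: the number of noticed gaps is taken to be proportional to the size of the knowledge base (`gap_rate`), seeking effort is proportional to the expected reward value, the experienced reward equals the knowledge gained (capped at 1), and the expected value follows a delta rule with a hypothetical `learning_rate`.

```python
def simulate_feedback_loop(cycles=8, knowledge=3.0, expected_value=0.2,
                           learning_rate=0.3, gap_rate=0.5):
    """Toy loop; returns a list of (knowledge, expected_value) per cycle.

    All parameters and functional forms are illustrative assumptions.
    """
    history = []
    for _ in range(cycles):
        # Mechanism 2: a larger knowledge base exposes more knowledge gaps.
        gaps = gap_rate * knowledge
        # Seeking effort scales with the expected reward value of information.
        gained = expected_value * gaps
        knowledge += gained
        # Experienced reward: knowledge gained, capped at 1.0.
        reward = min(1.0, gained)
        # Mechanism 1: the prediction error updates the expected value.
        prediction_error = reward - expected_value
        expected_value += learning_rate * prediction_error
        history.append((knowledge, expected_value))
    return history


rich_start = simulate_feedback_loop()                 # established knowledge base
sparse_start = simulate_feedback_loop(knowledge=1.0)  # little prior knowledge
```

With these particular toy settings, the expected value rises on every cycle when the starting knowledge base is rich, whereas it drifts downward over the same number of cycles when the starting knowledge base is sparse. The numbers themselves carry no empirical meaning; the point is only that the two mechanisms can reinforce each other once enough knowledge has accumulated.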

Moderators of Information-Seeking Behavior
Of course, this positive reinforcement process does not always work in reality. In classrooms, for example, we observe many students who do not sustain the knowledge-acquisition process. Clearly, there are some critical factors that moderate the process. Some such factors are noted in Fig. 1. For example, certain general personality traits (e.g., sensation seeking, conscientiousness) play a role here. In addition, people's decision to initiate information-seeking behavior should strongly depend on their perceived ability to acquire the information (expectancy belief) and on its cost. If one does not believe that s/he has sufficient ability to acquire the information (e.g., "I am not confident that I will understand the logic behind quadratic equations") or believes that the behavior is too costly (e.g., "it would require a lot of effort to understand the solution"), s/he would be discouraged from initiating information-seeking behavior.
Another important moderating factor is the emotional valence of the expected information. Our motivation to seek information is often compromised by the potential negative valence of the information (see Hertwig and Engel 2016, for a review). For example, some pregnant women may want to avoid prenatal testing that informs them of the potential risk of genetic disease (e.g., Down syndrome) for their expected child. In fact, Charpentier et al. (2018) empirically showed that people tend to seek positively valenced information more frequently than negatively valenced information (see also Marvin and Shohamy 2016). Although this sort of situation is less common in education, it is possible that students are reluctant to seek information about materials that include negatively valenced information (e.g., stories of the Holocaust).
It is worth noting that these factors (perceived expectancy, including skills; perceived cost; and expected valence of information) are also subjective beliefs that are updated in the course of learning. For example, if one fails to acquire knowledge, the expectancy belief may be decreased and, as a result, the person may not feel motivated to initiate information-seeking behavior again (Durik et al. 2015; Tanaka and Murayama 2014). This point implies that the knowledge-acquisition process in real life involves constant on-the-fly calculations and integration of the expected reward value of information, the perceived ability to find the information, the perceived cost of information-seeking behavior, and the expected valence of information. This computation process may well be automatic and unconscious. Although our framework highlights the importance of the reward value of information, a full understanding of the process of knowledge acquisition that is applicable to practical settings such as education would require a comprehensive description of this dynamic computational process.
In addition, after gaining knowledge, there is the possibility that a person does not find any more information gaps, subjectively feeling that s/he completely understands the topic. This "knowledge satiation" is likely to halt the task engagement process. This situation is common when curiosity and interest are empirically examined using controlled materials such as trivia questions (e.g., Marvin and Shohamy 2016; Murayama and Kuhbandner 2011). For most participants, the answers to trivia questions are indeed trivial and have little practical value. They are normally prepared as stand-alone materials that can trigger and satisfy curiosity by themselves. As a result, the knowledge of the answer is likely to cause satiation. Even in real learning contexts, however, students may sometimes stop learning because they believe, often falsely, that they have mastered the materials (e.g., students may feel that they understand perfectly how to solve quadratic equations when they are presented with the formula, although they do not actually understand the logic behind it). In fact, the literature on metacognition has repeatedly shown that learners are inaccurate and optimistic about their mastery of learning materials (e.g., learning foreign words), often terminating their learning behavior prematurely (e.g., Kornell and Bjork 2008; Murayama et al. 2016).
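One way to picture this integration is the following sketch. The combination rule is an assumption made purely for illustration (an expectancy-weighted benefit minus cost, in the spirit of expectancy-value models); the framework does not commit to any specific formula, and the names `seeking_motivation`, `update_expectancy`, and `learning_rate` are hypothetical.

```python
def seeking_motivation(expected_reward, expectancy, cost, valence=0.0):
    """Toy net motivation to seek information (assumed functional form).

    expected_reward: expected reward value of the information
    expectancy:      perceived probability of acquiring it (0 to 1)
    cost:            perceived cost of the information-seeking behavior
    valence:         expected emotional valence (negative if aversive)
    """
    # Benefit is discounted by perceived ability, then cost is subtracted.
    return expectancy * (expected_reward + valence) - cost


def update_expectancy(expectancy, succeeded, learning_rate=0.3):
    """Move the expectancy belief toward the latest outcome (assumed delta rule)."""
    outcome = 1.0 if succeeded else 0.0
    return expectancy + learning_rate * (outcome - expectancy)


# A confident learner facing a moderately costly topic.
confident = seeking_motivation(expected_reward=0.8, expectancy=0.9, cost=0.2)

# Repeated failures lower the expectancy belief and, with it, the motivation.
belief = 0.9
for _ in range(4):
    belief = update_expectancy(belief, succeeded=False)
after_failures = seeking_motivation(expected_reward=0.8, expectancy=belief, cost=0.2)

# Negatively valenced information can offset an otherwise high reward value.
aversive = seeking_motivation(expected_reward=0.8, expectancy=0.9, cost=0.2,
                              valence=-0.7)
```

Under these assumed values, motivation is positive for the confident learner but falls below zero after four consecutive failures or when the information is expected to be strongly aversive. Other combination rules would be equally compatible with the text.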

Extraneous Factors That Influence Awareness of Knowledge Gaps
We believe awareness of knowledge gaps mainly comes from the learning process (i.e., the expanded knowledge base), but some other factors also contribute to this awareness. For example, we are often motivated to seek information about things that we come across just by chance (e.g., an interesting book seen through the window of a bookstore). Such encounters may appear random, but this process is also constrained by the environment in which people are placed. For example, if a child is raised in a family of piano players, the child has a much higher chance of encountering novel information about piano music. In that respect, this is not a completely random process but is still an important extra means of becoming aware of new knowledge gaps.

How Can the Framework Explain Different Definitions/Types of Curiosity and Interest?
Again, a critical feature of the process framework presented in Fig. 1 is that the framework does not include "curiosity" or "interest" as an element, indicating that there is no absolute need to assume such constructs to explain sustained knowledge acquisition, although we acknowledge that our framework is still preliminary. We believe that what people naïvely call curiosity and interest are things that they subjectively construe through this knowledge-acquisition process.
Researchers have conceptualized curiosity and interest in many different ways (for detailed reviews, see Grossnickle 2014; Renninger and Hidi 2011; Renninger and Hidi 2016). For example, some researchers argue that interest is an emotion (Reeve et al. 2015; Silvia 2008) and other researchers regard interest as a collection of values (Gati 1991; Schiefele 2009). Some researchers define curiosity as a feeling or emotion arising from a knowledge gap (Loewenstein 1994) while others consider curiosity as resulting from or as part of intrinsic motivation (Day 1971; Deci and Ryan 1985; White 1959). We agree with the importance of providing operational definitions of these concepts. In light of our framework, these elements are captured by components or outcomes of the knowledge-acquisition process. For example, value belief is one of the important factors that determine the reward value of new knowledge. There are multiple places where emotion is generated in our framework (and researchers have different opinions on which emotion should be called interest/curiosity); for example, the rewarding feeling after knowledge acquisition may be considered an emotion. Subjective experience should also arise when a knowledge gap is made salient, and this may also be considered an emotion. Importantly, our framework makes it clear that these different conceptualizations are all matters of labeling (which part should be labeled as curiosity/interest), and for the aim of understanding how people sustain learning and engagement, it is not essential to determine which label mappings are correct; they are all important parts of the knowledge-acquisition process.
What is the difference between curiosity and interest? Again, this is a matter of labeling in our view, but generally speaking, the literature on curiosity seems to focus on the series of processes in which awareness of a knowledge gap leads to information-seeking behavior (Kidd and Hayden 2015; Loewenstein 1994; Sakaki et al. 2018). On the other hand, research on interest seems to be more diverse: Some researchers are mainly concerned with the maturity of this engagement process (i.e., how the feedback loop is completed; Alexander et al. 1995; Hidi and Renninger 2006; Krapp and Prenzel 2011), whereas others focus more on emotional aspects of the process (Silvia 2006). The strength of our framework is that, regardless of how researchers define curiosity or interest, the framework is clear that both curiosity and interest are integrated in the same knowledge-acquisition process, providing a specific picture of how curiosity and interest can be related. For example, Markey and Loewenstein (2014) and Renninger and Hidi (2016) argued that curiosity is distinct from interest in that curiosity is a short-term psychological state that would disappear on closing the momentary knowledge gap, whereas interest represents long-term commitment to a particular topic. We have no objection to these definitions. From our perspective, however, it is important to be clearer that curiosity (as defined by these researchers) is still an essential element of the long-term knowledge-acquisition process: People build knowledge as a consequence of these numerous curious experiences, and the accumulation of curious experiences would be the basis for self-generation of intrinsic rewards, which makes sustainable engagement possible. We believe both Markey and Loewenstein (2014) and Renninger and Hidi (2016) share a similar view with us, but by demarcating a borderline between curiosity and interest, this critical point may be somewhat obscured.
Beyond the general distinction between curiosity and interest, there have been attempts to identify different types and dimensions of curiosity and interest. In the following, we will show how our process framework can explain these typologies.

Situational Interest vs. Individual Interest
A developmental continuum of situational interest and individual interest was originally proposed by Hidi and Renninger (2006) in their four-phase model of interest development, and it has been one of the most influential theories in the literature. People are said to have situational interest when they initiate and sustain their learning behavior, but it also depends on available resources such as social relationships and environmental resources. On the other hand, people are said to have more developed individual interest when they can value the opportunity to (re)engage in a task and can self-generate interest for that task, engaging in enduring learning behavior voluntarily and autonomously.
As noted earlier, our reward-learning framework is influenced by the four-phase model of interest development as a psychological theory to explain long-term engagement (note also that the authors discussed the role of rewarding experiences in some papers; see Ainley and Hidi 2014; Hidi 2016; Renninger and Hidi 2016). Therefore, it is not surprising that our framework can accommodate the distinction between situational interest and individual interest. But critically, the framework does so by elaborating more closely the reward-learning process behind these two distinct developmental phases of interest. Specifically, at the beginning of the learning process, the feedback loop is unstable and easily disrupted (and thus requires external input or support), as the value of knowledge acquisition is not well developed and the knowledge base is not rich enough to generate new questions (e.g., students may be intrigued by the fact that quadratic equations produce a U-shaped curve, but this initial interest easily wanes unless students really understand why it does). Also, at this early phase, the knowledge-acquisition process is more likely to be driven by chance encounters. Once the feedback loop is established, developed knowledge generates a number of new questions and adds value to behaviors aimed at answering such questions, allowing the system to autonomously and continuously self-generate the feeling of reward (e.g., students who deeply understand the shape of quadratic equations may autonomously seek a way to graph cubic equations). That is, people store the value of the knowledge. Situational interest and individual interest can be interpreted as the respective phases of this knowledge-acquisition process.

Specific Curiosity vs. Diversive Curiosity
Berlyne (1960) distinguished two types of curiosity: specific curiosity and diversive curiosity. Specific curiosity involves detailed investigation of novel stimuli, whereas diversive curiosity is more unspecified exploration of the environment to find novel stimuli. Specific curiosity serves to decrease uncertainty whereas diversive curiosity is aimed at increasing uncertainty. This distinction is intuitive and has been widely adopted, especially in studies that develop self-report measures of curiosity (e.g., Kashdan et al. 2009; Litman and Spielberger 2003; Mussel 2013).
In our model, specific curiosity is straightforward: it is a label assigned to information-seeking behavior to close a knowledge gap. This is consistent with the process of curiosity that Loewenstein (1994) described (e.g., a situation in which a person works on a puzzle). But what about diversive curiosity? From our perspective, there is no pure exploratory motivation to increase knowledge gaps, as all of our epistemic behavior is, either explicitly or implicitly, guided by the desire to close a knowledge gap or to gain information (see Murayama 2019a for a more elaborated discussion). Think about a student who scans through a bookshelf of detective stories in a library without having a specific book in his/her mind. This behavior seems like a random exploration, but close scrutiny of the behavior reveals that it is not that simple. Specifically, the psychological mechanisms behind this scanning behavior can be explained by his/her motivation to close the generic knowledge gap or gain information (and ultimately feel rewarded) on the genre of detective stories.
Think also about infants' exploratory behavior, in which they seek novel and arousing stimuli. This is considered one of the typical manifestations of diversive curiosity. Even in this situation, at every moment, infants' behavior is guided by the expected reward value of information that is afforded by the stimulus they are confronting, and in that respect, their behavior is still driven by a knowledge gap (i.e., difference between expected amount of information and current information). Indeed, this type of exploration behavior has been well explained by the existing reward-learning models of information seeking (e.g., Oudeyer et al. 2007). In contrast to specific curiosity, the content of information one expects for diversive curiosity is more generic, and in that respect, specific curiosity and diversive curiosity may be distinguished. In our view, however, the generality of expected information is on a continuum (i.e., not a categorical matter), and however general/specific the information is, the basic psychological process underlying knowledge acquisition is the same. Therefore, we do not believe that there is a hard borderline between these two types of curiosity.

Dimensions or Domain Specificity
Researchers also proposed that different types of curiosity can be distinguished by what it is that the curiosity is directed towards. For example, Berlyne (1954) distinguished epistemic curiosity and perceptual curiosity. Epistemic curiosity refers to the curiosity triggered by conceptual stimuli, whereas perceptual curiosity refers to the curiosity aroused by sensory stimulation (e.g., ambiguous pictures). Some other researchers also identified social or interpersonal curiosity (i.e., curiosity for social relationships) as a distinct type of curiosity (Kashdan et al. 2018; Litman and Pezzo 2007).
From our perspective, the basic knowledge acquisition and maintenance process would be the same, regardless of the stimulus type. Even for perceptual curiosity, awareness of ambiguity of sensory stimuli requires knowledge, and such sensory curiosity may be a basis of our curiosity about more complex stimuli (e.g., abstract arts, see Van de Cruys and Wagemans 2011). Therefore, rather than considering them as completely separate types of curiosity, we believe it would be more promising to search for potential common psychological mechanisms underlying these different types of curiosity.
Note, however, that this claim does not refute any previous attempts to develop a scale to distinguish individual differences in these distinct types of curiosity. It is also true that there are huge individual differences in the types of topics people are curious about (see Fastrich et al. 2017, for a quantitative demonstration). Domain specificity in curiosity and interest (i.e., individual differences in the topics people are curious about/interested in) is likely to exist. One important future question may be to consider why domain specificity exists (rather than asking how these domains are different), despite the domain general psychological mechanism underlying curiosity and interest. We speculate that the self-boosting property of curiosity and interest would be the key to address this question but further investigation would be needed to empirically address the issue.

Traits
Both curiosity and interest have been posited to take state-like forms and more enduring, trait-like forms. Indeed, in the literature of personality, there are several personality scales that are closely related to some aspects of curiosity or interest, such as sensation seeking (Zuckerman 1979), need for cognition (Cacioppo and Petty 1982), and novelty seeking (Cloninger et al. 1993). Curiosity and interest are also related to two of the Big-Five factors (McCrae and Costa 1987): openness to experience and conscientiousness in many respects (see also Trautwein et al. 2009). In the field of occupational psychology, researchers have identified several different categories of vocational interests depending on people's general preference for different types of jobs (Gati 1991).
Importantly, most of these trait-level constructs can be considered as representing the generalized reward value of knowledge in our proposed framework. For example, novelty seeking can be considered as the general expected reward value for the acquisition of new information (rewarding value for novelty may be modulated by the noradrenergic system; see Sakaki et al. 2018). Need for cognition can be regarded as the general action value for information-seeking behavior (see also the discussion on "Learned industriousness," Eisenberger 1992). Vocational interests are the general rewarding value for the different types of learning and knowledge. These generalized reward values may in large part be explained by the generalization process in reward learning (Wimmer et al. 2012). As individuals go through the cycle of knowledge acquisition repeatedly, they develop a default rewarding value of these different processes, forming a context-independent, personality-like curiosity or interest. This idea is consistent with a social-cognitive view of personality development (Williams and Cervone 1998) and explains why curiosity and interest can be described as both state and trait (Grossnickle 2014).
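One simple way to picture how repeated state-level experiences could give rise to a trait-like default value is sketched below. This is an illustrative assumption on our part rather than the specific generalization mechanism described in the cited work: the class name, the two learning rates, and the rule that an unfamiliar domain inherits the generalized value are all hypothetical.

```python
class GeneralizingLearner:
    """Toy learner with domain-specific values and one generalized value."""

    def __init__(self, specific_rate=0.3, general_rate=0.05):
        self.specific_rate = specific_rate  # fast, context-bound learning
        self.general_rate = general_rate    # slow, context-independent learning
        self.general_value = 0.0            # trait-like default value
        self.domain_value = {}              # state-like values per domain

    def expected_value(self, domain):
        # An unfamiliar domain falls back on the generalized value.
        return self.domain_value.get(domain, self.general_value)

    def update(self, domain, reward):
        # Delta-rule update of the domain-specific value ...
        current = self.expected_value(domain)
        self.domain_value[domain] = current + self.specific_rate * (reward - current)
        # ... and a slower update of the generalized value.
        self.general_value += self.general_rate * (reward - self.general_value)


experienced = GeneralizingLearner()
novice = GeneralizingLearner()
for _ in range(50):
    for domain in ("math", "history"):
        experienced.update(domain, reward=1.0)

# Expected value of knowledge in a domain neither learner has encountered.
print(experienced.expected_value("music"), novice.expected_value("music"))
```

In this sketch, a learner who has repeatedly experienced rewarding knowledge acquisition in some domains approaches an entirely new domain with a higher expected value than a learner without such a history, which is one way to read the idea of a context-independent, personality-like curiosity or interest.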

Curiosity as an Aversive State
In the literature of curiosity, there has been a recurrent idea that curiosity involves an aversive state, and people seek information to eliminate the aversive feelings (i.e., "relief" of curiosity). For example, Berlyne (1954) approached curiosity from drive reduction theory and argued that curiosity produces an uncomfortable state of uncertainty. From this perspective, people are motivated for new information because it would reduce uncertainty and eventually the uncomfortable feeling associated with uncertainty. Loewenstein (1994) also argued that the awareness of knowledge gaps shifts people's attention to the lack of knowledge, and this aversive feeling of deprivation is the driving force of people's curious behavior (i.e., loss aversion). One of the reasons why this idea has attracted popularity is that it can explain the strong seductive power of curiosity, which sometimes leads people to make irrational decisions (e.g., Hsee and Ruan 2016; Lau et al. 2018; Oosterwijk 2017). However, this idea is still controversial and empirical evidence is not conclusive (Ruan et al. 2018; Silvia 2006). For example, Jepma et al. (2012) found that presentation of ambiguous pictures (i.e., material that is likely to trigger perceptual curiosity) activates the anterior cingulate cortex and anterior insular cortex. As these brain areas are associated with negative arousal states, the authors indicated that perceptual curiosity involves aversive mental states. However, the anterior cingulate cortex and anterior insular cortex have been implicated in many different functions (including reward expectation, see Shidara and Richmond 2002), and therefore, it is not easy to derive a strong conclusion. To our knowledge, there is little direct and conclusive evidence supporting the aversive nature of curiosity. 4 In the literature of reward learning, the idea of drive reduction and satiation is no longer a viable account for motivated behavior (Berridge 2004; Dickinson and Balleine 2002). Instead, to explain strong urges or craving of motivated behavior, recent researchers have proposed an incentive salience model of learning, arguing that expected reward value consists of two distinct appetitive components: liking and wanting (Berridge 2004). Liking refers to hedonic experiences or subjective feelings associated with rewards or expected rewards. On the other hand, wanting (also called incentive salience) is purely an incentive motivational value (i.e., feeling of craving or vigor) and this state of wanting is often triggered during the expectation of a reward (perhaps through tonic dopamine release; Niv et al. 2007). Research in neuroscience has provided support for the distinction between liking and wanting in the reward-learning process (Berridge 2012).
The proposed framework of autonomous knowledge acquisition can naturally incorporate the distinction as our framework is a reward-learning model. Specifically, wanting and liking can be regarded as two distinct components of the expected reward value in the framework. In other words, when people become aware of a knowledge gap, information-seeking behavior may be driven by two different components of expected reward value: incentive salience of new information ("I crave the information!") and expected positive feeling about the information ("It would be pleasant if I knew that").
A nice feature of this extension is that the framework can explain both the hedonic experience and the strong motivational power of the knowledge-acquisition process, without making an unnecessary assumption that the knowledge-acquisition process can involve an aversive feeling and avoidance-based motivation, which has been the main source of the controversy. Theoretically speaking, wanting is a valence-free pure motivational vigor (Berridge 2004). Some animal research has suggested that the state of wanting involves aversive facial expressions (e.g., Berridge and Valenstein 1991), but the evidence is still not conclusive. Our perspective is that viewing incentive salience as a pure motivational factor would provide a more parsimonious and intuitive account for curiosity-related behavior. 5
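The two-component view can be made concrete with a small sketch. It is offered only as an illustration under our own assumptions: the additive combination of the two components, the logistic choice rule, and the weights and bias are hypothetical and are not specified by the incentive salience model or by our framework. The point of the sketch is that both components are non-negative, appetitive quantities, so that strong information seeking can arise without any aversive term.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpectedRewardValue:
    wanting: float  # incentive salience ("I crave the information!")
    liking: float   # expected hedonic value ("It would be pleasant if I knew that")

    def __post_init__(self):
        # Both components are appetitive; no aversive (negative) term is assumed.
        if self.wanting < 0 or self.liking < 0:
            raise ValueError("wanting and liking must be non-negative")


def seeking_probability(value, wanting_weight=1.0, liking_weight=1.0, bias=-2.0):
    """Probability of seeking information (hypothetical logistic choice rule)."""
    drive = wanting_weight * value.wanting + liking_weight * value.liking + bias
    return 1.0 / (1.0 + math.exp(-drive))


# Strong craving with little expected pleasure still yields vigorous seeking.
craving = seeking_probability(ExpectedRewardValue(wanting=3.0, liking=0.0))
baseline = seeking_probability(ExpectedRewardValue(wanting=0.0, liking=0.0))
print(craving > baseline)
```

In this toy formulation, a high wanting component alone is enough to raise the probability of information seeking well above baseline, which mirrors the claim that the motivational power of curiosity need not be attributed to an aversive state.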

Intrinsic Motivation
The difference between intrinsic motivation and interest is another matter of debate. Quite a few researchers use them almost synonymously, whereas others suggest that interest is one constituent element of intrinsic motivation or a "source" of intrinsic motivation (e.g., Deci and Ryan 1985). We believe that these discrepancies come from the fact that researchers tend to conceptualize interest and intrinsic motivation as entities, rather than processes. In our view, intrinsic motivation refers to the internally generated rewarding process that is not dependent on extrinsic incentives (Murayama 2019b; Oudeyer and Kaplan 2009). This is exactly the self-rewarding process we have laid out in this paper. Therefore, it is an arbitrary matter which part of the process we should call interest or intrinsic motivation. In other words, their relationship simply depends on how you define them.
5 Litman and his colleagues also argued that the liking and wanting components are the two underlying processes of curiosity (Litman 2005; Litman and Jimerson 2004). However, they called the wanting component deprivation-type (D-type) curiosity, retaining the aversive state and avoidance-based motivation arising from the state of deprivation as one of the definitive features of curiosity. This may be because their model was originally motivated by Loewenstein's (1994) knowledge gap theory (Litman and Jimerson 2004), and the concept of wanting was integrated after D-type curiosity was proposed (Litman 2005).

General Discussion
The current paper presents a reward-learning framework of autonomous knowledge acquisition, with the aim to conceptualize curiosity and interest as the naïve constructs emerging from a knowledge-acquisition process. Our proposed framework seems to be able to capture many aspects of what people naïvely call curiosity and interest, as well as some theoretically important constructs discussed in the literature (e.g., situational interest vs. individual interest, state vs. trait curiosity). Importantly, by focusing on the knowledge-acquisition process rather than the constructs of curiosity and interest themselves, we can bypass debates on the definitions of curiosity and interest and facilitate constructive discussions on how curiosity and interest are related to our learning process. This process-oriented approach is common in other fields of psychology (e.g., cognitive psychology). Even in the literature of educational psychology, a well-cited definition of motivation actually takes a process perspective. Specifically, Pintrich and Schunk (2002) defined motivation as "the process whereby goal-directed activity is instigated and sustained" (p. 5, emphasis added). Nevertheless, this process perspective has not been incorporated well enough in the motivation research in educational psychology.
The proposed framework is also an attempt to incorporate the idea of curiosity and interest into general reward-learning models (Dayan and Niv 2008; Sutton and Barto 1998). Currently, research on curiosity and interest seems to be segregated across different fields including educational psychology, with different terminologies and concepts used within the groups. However, as the research on curiosity and interest becomes more and more interdisciplinary, we believe it is important to provide common and unified ground that serves as the basis for further cross-disciplinary discussion (Braver et al. 2014). For example, current research on curiosity tends to focus on information-seeking behavior for knowledge that has only short-term consequences (e.g., information about an outcome that would be revealed a few seconds later; Bennett et al. 2016) or little practical value (e.g., trivia questions; Kang et al. 2009). This approach, together with computational and neuroscientific methods, can provide a fine-grained picture of the mechanisms of information-seeking behavior. But our proposed framework indicates that this approach only provides a limited window into the whole knowledge-acquisition process. On the other hand, empirical research on interest is abundant in the field of educational psychology and prefers using realistic materials (e.g., text passages; Alexander et al. 1995), which may be well suited to capture the development of the knowledge-acquisition process. But realistic materials may make it difficult to examine knowledge acquisition in a fine-grained manner. For example, one may find a relationship between the amount of knowledge for a topic and perceived value for the topic, but with realistic materials which typically do not have experimental control, it may be difficult to further examine the psychological mechanisms about why and how the accumulation of knowledge promotes the valuation of the topic. Future research should look for new research paradigms that achieve granular investigation of the knowledge-acquisition process as well as its long-term development.

Recommendation
Given that it is inherently difficult to provide a precise definition of curiosity and interest (as they are naïve concepts), how should we use these terms in the research of educational psychology and other fields? One obvious (but perhaps radical) option is to completely eliminate the terms of curiosity and interest in scientific work and simply to specify which aspect or component(s) of the knowledge-acquisition process are being referred to (e.g., "rewarding experience of knowledge acquisition," "information prediction errors"). Indeed, in the field of cognitive science, researchers prefer to use the term "information seeking" rather than curiosity when describing curiosity-related phenomena (e.g., Gottlieb et al. 2013). Alternatively, one potential compromise is to specify the aspects or components of the knowledge-acquisition process whenever using these terms, rather than boldly using the terms of "curiosity" and "interest" as they are. For example, one might refer to "the rewarding experience of interest," "information-seeking behavior in curiosity," or "the self-rewarding process in individual interest." This way, researchers can avoid unnecessary confusion, enabling smooth and efficient communication between researchers.

Implications for Measurements
In educational psychology, curiosity and interest have been commonly assessed using self-report questionnaires (Renninger and Hidi 2011). This methodology provides us with a great opportunity to examine the interrelationship between curiosity and interest, but in light of our framework, our biggest reservation is that it is extremely difficult to capture the recursive process of knowledge acquisition with one-shot survey questions. Learning is a continuous process. During a class, students' knowledge is repeatedly updated, and students experience numerous small feelings of reward, which fluctuate moment-by-moment. Even if a researcher focuses on one specific component of the reward-learning process (e.g., the feeling of value), one could only capture the aggregated view of experience, which is likely to be biased and inaccurate. An experience sampling approach and within-person analysis may be a promising methodology to capture the momentary nature of the knowledge-acquisition process (see also Murayama et al. 2017a, for the advantage of this methodology), but this method still at most provides only some snapshots of the process.
Nevertheless, we suggest that our framework would help researchers craft a set of items that can better capture theoretical concepts of curiosity and interest. For example, situational interest and individual interest represent a maturity of the knowledge-acquisition process as a whole. By analyzing the distinct features of these two developmental phases for each step of the knowledge-acquisition process, we may be able to provide a comprehensive set of items that sufficiently cover the distinct features of situational interest and individual interest.
As indicated in our paper, there are a number of trait-level scales that aimed to understand the individual differences in different aspects of curiosity and interest (Kashdan et al. 2009;Litman 2008). These scales are typically validated using factor analysis but factor analysis is of limited use to detect the items that are essential and important, as factor analysis simply makes use of the correlational information to group the items. As correlation can be produced by many miscellaneous confounding factors (e.g., similarities of the wording), it is difficult for factor analysis to reveal the psychological process underlying a construct. As Whitely (1983) argued, construct validity may be better established by focusing on psychological processes, rather than focusing on the nomothetic network. Our reward-learning framework may provide a useful theoretical guideline on which items should be deemed relevant or not.

Implications for Education
The value of reward in education has been controversial. In fact, while some educational programs emphasize the value of reward systems (e.g., badges; see also Howard-Jones and Jay 2016; O'Byrne et al. 2015), some other researchers argue against any form of rewards or incentive systems in education (e.g., Kohn 1993). An important phenomenon to discuss, relating to the role of reward in education, is the undermining effect (Deci 1971). Specifically, a number of studies have shown that providing extrinsic rewards for an enjoyable task undermines people's voluntary engagement in the task. Although this phenomenon is conditional on several factors (e.g., the task needs to be interesting to participants; rewards should be contingent on task performance), once the conditions are met, the undermining effect has been robustly observed in the empirical literature (for a meta-analysis, see Deci et al. 1999).
Our proposed framework provides a potential middle ground in this debate. As indicated in our framework, once students are deeply engaged in an activity (i.e., students enjoy it), the feedback-loop system generates internal rewards, making task engagement self-sustainable. At this point, providing extrinsic rewards may shift students' attention away from internal rewards, potentially causing a breaking-up of the self-rewarding cycle (i.e., the undermining effect). In fact, Murayama et al. (2010) showed that providing monetary rewards for an intrinsically motivating activity dramatically decreases the reward network activation in the brain (after the rewards are removed). On the other hand, when students are in the initial stage of task engagement, in which students do not feel sufficient reward value to acquire knowledge, extrinsic incentives or external supports to motivate information-seeking behavior should help to "start-up" the system. In other words, effectiveness of extrinsic rewards or external supports depends on the phase of students' task engagement. Indeed, Ryan and Deci (2000) suggested that, when people are not engaged in a task, satisfaction of the need for competence and the need for relatedness (e.g., providing the means to increase their self-confidence and perceived acceptance from others) should help them internalize the value, realizing a more autonomous form of motivation. We believe certain forms of rewards and social support should be useful to facilitate task engagement. 6 It is important to note, however, that people generally overestimate the motivating power of extrinsic rewards and underestimate people's capacity to sustain task engagement. For example, Murayama et al. (2017b) showed that people believe that extrinsic incentives are effective even in situations where the undermining effect is likely to happen. This tendency was observed even for participants who are currently in a position of teaching others. Murayama et al. (2018) also showed that people underestimate how much they can sustain task engagement for boring tasks. These studies indicate that our belief about motivation (known as metamotivation; see Miele and Scholer 2018; Murayama et al. 2017b; Scholer et al. in press) is often inaccurate. Such an inaccurate belief may lead schoolteachers to rely on extrinsic rewards more often than necessary in educational practice, hampering the potential development of interest. As discussed in the paper, we have an inherent (evolutionarily developed) mechanism to engage the knowledge-acquisition process without extrinsic rewards; this assumption should be well recognized and shared by educators and teachers. To facilitate task engagement, we should support students' rewarding experience in a variety of ways (e.g., rewards, social interaction), but once the wheels have started turning, appropriate adjustments should be made, with slightly more belief in students' capacity to self-sustain their task engagement.