Introduction

“Aesthetics” is a philosophical concept, rooted in the Greek word aesthesis that can be translated as understanding through sensory perception (Hekkert & Leder, 2008). In recent years, aesthetics has emerged as a highly popular field of research in psychology, cognitive science, neuroaesthetics and affective science. When we talk about aesthetics usually we refer to the concept of beauty or attraction in nature or artwork—something very special and different from other objects or events in the world. For example, we are attracted to certain natural objects or scenes (e.g., flower, landscape, seascape, fauna, flora, etc) but not to others (Carlson, 2000; Porteous, 1996). This biased sensory attraction is not limited to natural objects or scenes only but can expand to artificial objects or events as well. Thus, in many different settings, from art galleries to supermarkets, we choose specific arts or artifacts to purchase, while discarding a number of alternatives.

The biased sensory attraction reflects our innate affinity for beauty that fulfils our psychological needs (e.g., pleasure, mental wellbeing; Postrel, 2003), and highly influences our attitudes and decision making in different walks of life. For example, we prefer to love, date, or marry someone because of a pretty appearance with smooth skin, thick shiny hair, a symmetrical face, and a curved waist, or because of a smart and tall figure (Scheller et al., 2021); we prefer to wear clothing that looks beautiful and gives us a feeling of comfort; we choose a nice place to visit; and in our wedding ceremony we may hire a singer whose voice sounds very sweet and a dancer who dances in an appealing, eye-catching fashion. The practical implication of aesthetics, particularly in designing and packaging, is also hard to deny. In this digital era, we are surrounded by high-performance interactive technology and products, such as cars, smart phones and tablet computers. These products are mostly oriented toward enhancing user experience, and much of the battle involves attempts to catch the consumer’s eye and heart with appearance and design-based symbolic value (Tractinsky, 2013). The overall design aesthetics of these interactive products can be improved by acoustic quality. Research has shown that acoustic quality plays an important role not only in the design aesthetics of interactive technology but also in the overall perception and evaluation of such interactive products as cars and cell phones (Mahlke et al., 2007). Marketing research and practice have acknowledged the importance of product aesthetics as a source of competitive advantage (Bloch et al., 2003; Cox & Cox, 2002; Liu et al., 2017). Leading brands, such as Apple and Dell, are adored and coveted due to the high aesthetics and superior design of their products, which enable them to sustain themselves in competitive global markets (Hsiao, 2017). 
Because product design affects the quality of our life (Crilly et al., 2004), we often give more importance to (visual) aesthetic appeal than to functional attributes when choosing a product. Moreover, visual aesthetics adds value to the product (Bloch, 1995), reduces consumers’ price sensitivity (Mumcu & Kimzan, 2015), and enhances their purchasing intention (Postrel, 2003). Thus, aesthetics is increasingly becoming an important criterion for consumers to evaluate and differentiate between product qualities and make purchasing decisions. For this reason, companies or marketers take aesthetics into account in their marketing strategies (Bloch, 1995; Bloch et al., 2003; Cox & Cox, 2002; Simonson & Schmitt, 1997), setting higher prices for aesthetically attractive products (Kristensen et al., 2012). The iPhone is a good example of how a phone manufacturer uses visual aesthetics as a differentiating factor in everything from the actual phone to its packaging (Tractinsky, 2013). Visual aesthetics is important not only for product design and packaging but for store or interior designs as well. The visual design aesthetics of a store or indoor environment is highly influenced by acoustic quality and is a critical determinant of consumer response and a retailer’s success. Research has demonstrated that the aesthetic quality of a store or room, that is, the extent to which it is attractive and induces hedonic (pleasant or unpleasant) experience, affects store loyalty and the sorts of evaluations made while in that setting (Kopec, 2006; Muhammad et al., 2014). Well-designed or highly aesthetic shopping environments have the power to evoke positive responses, introduce environmental cues (Sharma & Stafford, 2000), and stimulate perception that affects consumers’ purchasing behavior. Aesthetic aspects, such as color (Babin et al., 2003), scent (J. Chebat & Michon, 2003), and music (J. 
Chebat et al., 2001), are capable of swaying consumer preference, shopping duration, arousal, and acquisition. Like visual aesthetics tactile aesthetics may also influence consumer perception in various ways. Prosaically, we experience the touch of clothing against our bodies every day, and this tactual contact determines the comfort of the garments we wear (Cardello et al., 2003). Thus hedonic touch likely determines the estimated product quality (Grohmann et al., 2007), and guides consumer behavior and attitudes (Carbon & Jakesch, 2013; Peck & Childers, 2003; Peck & Shu, 2009). A growing body of evidence suggests that hedonic tactile stimulations are powerful motivators that facilitate product evaluation, product choices, and purchase decisions (see Arora et al., 2017; De Canio & Fuentes-Blasco, 2021; Duarte & e Silva, 2020; Manzano et al., 2016; McCabe & Nowlis, 2003). Thus retailers can directly benefit from allowing customers to touch their arts and other products. However, the influence of aesthetics on consumers is not limited to tangible or visible products in the physical shopping environments but can expand to those in virtual environments as well. The aesthetic appeal of virtual environments is determined by such features as color, graphics, and the layout of a website (Cai & Xu, 2011). Website aesthetics is a significant component of perception of online service quality, security, and convenience (Yoo & Donthu, 2001). The high aesthetic appeal of a website is related to an enjoyable virtual experience (Cai & Xu, 2011; van der Heijden, 2003), and can garner positive reviews, regardless of its utility (Lindgaard & Dudek, 2003). Taken together, aesthetics can significantly affect consumer's product perception or evaluation, purchase intention, and satisfaction (Bitner, 1992; Donovan et al., 1994; Morrin & Ratneshwar, 2003) which in turn determine the success and satisfaction of the retailers or marketers.

It follows from the above discussion that aesthetics has to do with human perception from all of the sensory modalities, both visual and nonvisual (Barry, 2014; Joy & Sherry Jr., 2003; Lauwrens, 2019; Roberts, 2022; Thakral et al., 2012), including how it feels to interact with something (e.g., as a result of physically touching an artifact, sculpture, architecture), listen to something (e.g., music, melody), taste something (food) and smell something (e.g., food, body odor, or cosmetics). Recent research suggested that beauty lies not only in the eye but in the ear and nose of the beholder as well (Groyecka-Bernard et al., 2017). Perhaps it is more appropriate to say beauty lies in each sense of the beholder (Scheller et al., 2021). Indeed, our experience of the world is mostly multisensorial and integrated across different sensory modalities (Karim, Proulx, et al., 2021b). Therefore, in everyday life, many of our decisions are based on aesthetics sensed by multiple sensory modalities, rather than a single sensory modality. For example, as attractiveness lies in both visual cues (Sorokowski et al., 2013; Yu & Shepard, 1998) and nonvisual cues, such as voice (for reviews, see Hill & Puts, 2016; Pisanski, 2017), we might be more willing to choose potential partners who are both physically attractive and have a nice voice than those who have attractive looks but a very rude voice, or a very nice voice but unattractive or ugly looks (both physical and vocal attractiveness are aesthetic qualities; see Hill & Puts, 2016; Jefferson, 2004; Johnson & Tassinary, 2007; Livingston, 2008; Mchiza & Parker, 2020; Pisanski, 2017; Sarwer et al., 2003; Swami et al., 2006a, b; Vadachkoriia et al., 2007; Zangwill, 1995). We may not be willing to buy a garment that is visually beautiful but is not pleasant or comfortable to touch (comfort is an aesthetic quality of interactive objects; Jeon, 2010; Karim, Prativa, & Likova, 2021a; Salem et al., 2009; Suzuki, 2019). 
Similarly, if a food looks nice but is not tasty most of us will not choose that food to eat, and we will not buy a scent just by seeing its color, but by sensing its smell as well. Thus multisensory cues can, separately or in combination, influence our perceived attractiveness or aesthetics of an individual or object, and our attitudes and actions toward that person or object. Therefore, it is important to deepen understanding of how the process of human aesthetics operates in various sensory modalities, and how this process is different from basic perceptual process. The current literature cannot tell us anything about the general nature of aesthetics in various sensory modalities. There is no unified model of aesthetics that can explain how human aesthetics in different sensory modalities are similar or different. Though we are not interested in multimodal aesthetics in this integrative review we intend here to highlight aesthetic processing in both visual and nonvisual modalities, to advance holistic understanding of aesthetics as differentiated from basic perception and their neural underpinnings in humans.

To this end, we rearticulate the notion of human aesthetics by critically appraising the conventional definitions, offering a new, more comprehensive definition, and identifying the fundamental components associated with it. As part of this rearticulation, we also differentiate aesthetic sensitivity from basic perceptual sensitivity. Then we analyze the nature of information processing in the brain, and propose a novel local-global integrative model, starting with a foundation on vision and visual aesthetics to build toward newer propositions about nonvisual aesthetics. This model builds on hierarchical information processing styles, disentangling aesthetic processing from basic perceptual processing. It also sheds light on how the affective and cognitive influences interact to modulate aesthetic preferences under top-down and bottom-up control. In support of this model, we present findings from cognitive neuroscience, neuroaesthetics, affective science, psychology, and the arts that highlight the crucial role different cortical regions play in object or stimulus recognition and appreciation of its beauty in both visual and nonvisual modalities, and how their roles can be mediated by experience. Our current challenge is to understand the mechanisms and processes that distinguish perception geared toward aesthetic experience from perception geared toward object or stimulus identification. Contemporary studies of arts and culture focus on aesthetic understanding of arts through the eye and the ear, and with the advent of cutting-edge brain-imaging techniques, there is now strong evidence that beauty lies not only in the eye or the ear but in the brain of the beholder as well (Cheung et al., 2014). However, there is still a gap in the current literature, as several significant issues have not yet been addressed and explained clearly, particularly the associative or dissociative nature of basic feature processing and aesthetic processing in the brain. 
So, in addition to the development of a novel hierarchical model for human aesthetics, a second goal of this review is to fill that gap to a certain degree by analyzing how basic perceptual processing and aesthetic processing are accomplished in visual and nonvisual modalities and how they are related to each other. Thus, we attempt to clarify the aesthetic phenomena by differentiating the mechanisms of visual aesthetics and nonvisual aesthetics, and those of basic perception and aesthetic perception in both visual and nonvisual modalities. We also highlight the extent to which the proposed model of human aesthetics can be generalized to nonvisual modalities. Based on the proposed model of aesthetics, theoretical considerations, and the past findings outlined in this review, we conclude with specific questions or hypotheses that remain to be addressed or tested in future studies aimed at further advancing this burgeoning field of research.

Method

The methodology of integrative reviews varies substantially, because there is no well-established or standard format for this kind of research as there is for empirical research (Christmals & Gross, 2017; Jackson, 1980; Torraco, 2005). However, following the conventional guidelines available in the literature on integrative review methodology (Cooper, 1982; de Souza et al., 2010; Whittemore & Knafl, 2005), this review was implemented in five overlapping stages as discussed below.

  • 1. Defining the problem or guiding question. The following research question was formulated: Do we enjoy what we sense and perceive? An elaboration of this research question can be: How is aesthetic appreciation dissociated from basic perception in the various sensory modalities that we use to explore, understand, and appreciate the world? This elaborated research question guided us in identifying what should be addressed to examine the theme of our interest. Here, we defined what would be extracted from the selected studies, with the aim of organizing the key information in a concise and comprehensive way to construct the review.

  • 2. Searching or sampling the literature. The literature search was carried out in a wide, diversified way across reliable databases. First, an electronic search of both behavioral and neuroimaging studies about human aesthetics and perception was done using a large number of keywords encompassing five major sensory modalities in a variety of databases, namely PsycNET, PubMed, Scopus, PsycINFO, MEDLINE, ScienceDirect, Google Scholar, and Web of Science. The keywords used in this search were:

    Perception, perceptual sensitivity, perception without attention or awareness, visual perception, tactile perception, tactile perception in blindness, auditory perception, gustatory perception, olfactory perception, perception in visual modality, perception in nonvisual modality, perception in tactile modality, perception in auditory modality, perception in gustatory modality, perception in olfactory modality, art appreciation, arts and aesthetics/esthetics, empirical aesthetics, ecological aesthetics, sense of aesthetics/beauty, beauty of arts, neuroaesthetics/neuroesthetics, aesthetics and moral appraisal, aesthetic/esthetic perception, aesthetic sensitivity, aesthetic preference, aesthetic value, aesthetic pleasure, aesthetic interest, aesthetic chill, aesthetic catharsis, aesthetic emotions, everyday/basic emotions, affective appraisal/evaluation, aesthetic perception versus everyday perception, visual aesthetics, physical attractiveness as aesthetics, tactile/haptic aesthetics, tactile aesthetics in blindness, comfortableness as aesthetics, affective touch versus discriminative touch, affective/aesthetic touch, affective/social touch hypothesis, auditory aesthetics, vocal attractiveness as aesthetics, musical aesthetics/pleasures, hedonics of a music/melody, emotion in music/melody, gustatory aesthetics/pleasures, olfactory aesthetics/pleasures, aesthetics in visual modality, aesthetics in nonvisual modality, aesthetics in tactile modality, aesthetics in auditory modality, aesthetics in gustatory modality, aesthetics in olfactory modality, hedonic aspects of taste, hedonic aspects of olfaction, aesthetic/hedonic experience, complexity of aesthetic experience, aesthetic pleasures of a sad song, aesthetic pleasures of a scary movie or horror film, aesthetic pleasures of a brutal film, aesthetic aspects, determinants of aesthetics, sensory properties and aesthetic properties, aesthetic properties versus descriptive properties, explicit and implicit stimulus 
properties and aesthetic preference, personal and cultural factors of aesthetic preference, personal and cultural factors of musical preference, aesthetics of webpages, virtual aesthetics, acoustic quality and design aesthetics, product’s aesthetic value, aesthetics and marketing strategies, theories of human aesthetics, perspectives of aesthetics, philosophical aesthetics, evolution of aesthetics, evolution of physical attractiveness, models of human aesthetics, the information-processing model of visual aesthetics, the three-component model of visual aesthetics, the two-pathway neural scheme of visual aesthetics, the triadic model of visual aesthetics, the dynamic model of visual aesthetics, pleasure-interest model of visual aesthetics, hierarchical model of haptic aesthetics, neural models of musical aesthetics, hybrid models of musical aesthetics, attention in perception, attention in aesthetics, deployment of attention, lateralized local-global model of attention, perceptual versus cognitive processing, cognition-emotion independence, top-down and bottom-up processing in perception, top-down and bottom-up processing in aesthetics, local and global processing in perception, local and global processing in aesthetics, local and global processing in the blind, local and global processing across sensory modalities, local and global processing in visual modality, local and global processing in nonvisual modalities, local and global processing in tactile modality, local and global processing in auditory modality, local and global processing of auditory stimuli, local and global processing of musical information, local and global processing in gustatory modality, local and global processing in olfactory modality, neural substrates of aesthetics, neural substrates of visual aesthetics, neural substrates of tactile aesthetics, neural substrates of discriminative and affective touches, neural substrates of auditory aesthetics, neural substrates of musical aesthetics, 
neural substrates of gustatory aesthetics/pleasures, neural substrates of olfactory aesthetics/pleasures, task-dependent activity of brain regions, and beauty-dependent activity of brain regions.

    In parallel, a manual search was done by tracking down references from the relevant retrieved articles, because articles might be inaccurately indexed or might lack the keywords used in the electronic literature search (Higgins & Green, 2011). In addition to peer-reviewed journal articles, relevant book chapters/papers and gray literature (e.g., unpublished studies, reports, dissertations, conference or symposium proceedings and abstracts) were also searched to identify more references to published works.

  • 3. Search outcome or data collection. In order to identify relevant studies, various terms referring to human aesthetics and perception in different sensory modalities were checked. All the electronic articles and book chapters/papers that contained the keywords, as well as the articles and book chapters/papers found manually from various sources, were assessed for inclusion in this review. The articles were selected using three inclusion criteria: (1) the study should be empirical (quantitative or qualitative) or theoretical, (2) the study should be conducted on human aesthetics or perception or hedonic aspects of perception in any sensory modality, and (3) the study should be published in English in a peer-reviewed scholarly journal. After passing these inclusion criteria, a total of 424 journal articles, 1 conference proceeding, and 42 book chapters/papers (excluding 8 journal articles and 1 book chapter/paper cited here about integrative review methodology) published during the period of 1895 to 2022 were deemed relevant and included for analysis in the next stage.

  • 4. Data analysis and synthesis. At this stage, we critically appraised the selected studies, taking into account the above guiding question as the basis for analysis. First, the title and abstract were reviewed, followed by an in-depth review of the full text of each article. Second, synthesis of findings from individual studies was done using the ‘best fit’ framework synthesis by creating deductive themes and codes against which the data were analyzed thematically (Carroll et al., 2011; Carroll et al., 2013). Third, data that did not fit into the ‘best fit’ framework were considered iterative and analyzed using inductive thematic analysis (Carroll et al., 2011; Carroll et al., 2013). Thus, the selected study findings were integrated to develop a comprehensive conceptual or theoretical framework of human aesthetics and perception. All articles were categorized, analyzed, appraised, and synthesized by the first author; this work started in January 2015 and continued as needed until the write-up of this review. The results were checked for accuracy and relevance by the other contributing authors, and discrepancies, if any, were resolved through discussion and consensus.

  • 5. Presentation and interpretation of results. The results obtained in the selected studies are discussed and a critical analysis is performed on what is evidenced. The data of the studies included in this review are categorized, analyzed and interpreted or discussed, establishing relationships with the proposed theoretical model in focus. Results are structured and presented below in order to answer the aforementioned question in all major sensory modalities, containing enough information for the reader to make an analysis of the review performed.

Results and discussion

The aforementioned five-stage review approach allowed us to integrate a large pool of data from diverse sources and to serve a wide range of purposes: analyzing the current theories/models of human aesthetics, identifying conceptual problems and gaps in the current understanding of human aesthetics, developing new and more comprehensive propositions about human aesthetics, bridging related issues of human aesthetics and perception, developing a novel theoretical framework that explains human aesthetics and perception in behaviorally and neurally dissociable fashions, generalizing the framework across sensory modalities, identifying the overlapping and distinct issues of aesthetics and perception across sensory modalities, identifying a domain-general faculty of beauty or aesthetics, and identifying the need for future research directed at validating the proposed theoretical framework. The diversity of the sampling frame in conjunction with the multiplicity of purposes results in a deepening of the knowledge about human aesthetics and perception, a comprehensive portrayal of complex concepts in this field, and a novel theoretical framework of human aesthetics applicable not only to the visual modality but to nonvisual modalities as well.

Conceptualizing human aesthetics

What is aesthetic perception?

The first (neurological) theory of human aesthetics was put forward by Ramachandran and Hirstein (1999), followed by three seminal neuroimaging studies on human aesthetics in the early 2000s (Cela-Conde et al., 2004; Kawabata & Zeki, 2004; Vartanian & Goel, 2004). Since then, neuroaesthetics has been growing as an independent field of research. Over the last two decades or so, a large number of research studies have been published, leading to the development of a number of models of visual aesthetics, most notably the information-processing model (Leder, 2013; Leder et al., 2004; Leder & Nadal, 2014), the three-component model (Nadal et al., 2008), the two-pathway neural scheme (Ishizu & Zeki, 2013), the triadic model (Chatterjee & Vartanian, 2014, 2016), the dynamic model (Redies, 2015), and the two-component model (Graf & Landwehr, 2015, 2017) of visual aesthetics. However, compared to visual aesthetics, nonvisual aesthetics has received scant scientific attention, yielding a hierarchical model of haptic aesthetics (Carbon & Jakesch, 2013) and a few neural or hybrid models of musical aesthetics (see Brattico et al., 2013; Juslin, 2013; Reybrouck & Eerola, 2017; Schubert, 1996). It is undeniable that those studies and models made invaluable contributions to the understanding of arts and aesthetics in their own ways. However, an in-depth analysis of those studies and models reveals a few fundamental problems with how the concept of aesthetics has been used in the current literature. First, those studies and models restrict the concept of human aesthetics to an appraisal of the spatial or structural composition of an object or art (Carbon & Jakesch, 2013; Ishizu & Zeki, 2013; Juslin, 2013; Leyssen et al., 2012; Palmer et al., 2008; Reybrouck & Eerola, 2017; Scherer, 2004), with little or no explanation of the local or global information processing operating during aesthetic appreciation (see P. Brattico et al., 2017; Carbon & Jakesch, 2013). 
Second, some previous studies limit their model to the explanation of the perception of a specific stimulus property (e.g., brightness) and aesthetic judgments of paintings, and fail to give a general account of how aesthetic judgments and basic perceptual judgments are executed (e.g., Ishizu & Zeki, 2013). Third, they rarely highlighted the true nature of aesthetic experience that differentiates aesthetic perception from basic perception. Instead, they generally explained basic perception and aesthetic perception in a non-differentiated fashion (see Cela-Conde et al., 2011; Conway & Rehding, 2013; Ramachandran & Hirstein, 1999), with only some authors discussing aesthetic perception as different from everyday perception (Boccia et al., 2015; Cupchik et al., 2009; Cupchik & Winston, 1996; Mamassian, 2008; Marković, 2012). The latter group of authors proposed that everyday perception is pragmatic and oriented toward object identification, whereas aesthetic perception consists of subjective and emotional reactions to the stylistic and structural properties of artworks (Scherer, 2004). They further conceived of aesthetic experience as a special psychological process involving attention focused on the object and the suppression of everyday concerns. Such an attempt to differentiate aesthetic perception from everyday perception seems appealing. But it is not so simple and straightforward to distinguish them from each other, because common sense indicates that aesthetic perception can also be part of our everyday perception (see C. Mo et al., 2016; Tractinsky, 2013; Swaminathan & Schellenberg, 2015; Venkatesh & Meamber, 2008; Weggeman et al., 2007), and that it is not limited to arts or artefacts only. So, we coin the term basic perception rather than everyday perception to distinguish it from aesthetic perception. 
We define basic perception as a process of sensory information analysis used primarily for the recognition and understanding of the basic physical distinguishing features or compositional properties (explicit attributes, such as size, color, orientation, shape, texture, pitch, frequency) of an object or event, and aesthetic perception, by contrast, as an attention-driven psychological process operating primarily for discriminating the qualitative and affective aspects (implicit attributes, such as prettiness, pleasantness, sweetness) of the object or event experienced through the use of a relevant sensory modality. Because perception of basic physical features results in stimulus recognition, hereafter we use the term ‘basic perception’ interchangeably with the term ‘perceptual recognition’; similarly, because a person’s felt appreciation of a stimulus or event serves as an indicator of its perceived aesthetic appeal (Schindler et al., 2017), hereafter we use the term ‘aesthetic perception’ interchangeably with the term ‘aesthetic appreciation’.

We propose that basic perception depends on explicit stimulus properties and the cognitive agent’s perceptibility, whereas aesthetic perception or appreciation may or may not depend on explicit stimulus properties (see Carbon & Jakesch, 2013), but depends on the cognitive agent’s (perceiver’s) personal characteristics as well (see Juslin, 2013). The explicit stimulus properties that have been found to modulate aesthetic preference include symmetry and regularity (e.g., Jacobsen et al., 2006; Jacobsen & Höfel, 2002, 2003; Karim & Likova, 2018), surface smoothness (e.g., Karim, Prativa, & Likova, 2021a; Lindström et al., 2016), sharpness or angularity (e.g., Bar & Neta, 2006, 2007, 2008; Cotter et al., 2017; Karim & Likova, 2018; Palumbo et al., 2015), novelty or originality (e.g., Berlyne, 1971; Haertel & Carbon, 2014; Hung & Chen, 2012; Juslin, 2013), complexity (Berlyne, 1971), and so forth. The cognitive agent’s personal characteristics that can further shape aesthetic preference include culture, experience, interest, aesthetic mind, emotional state or motivation, etc. (Cela-Conde et al., 2011; Darda & Cross, 2021; Fingerhut & Prinz, 2020; Jacobsen, 2010; Masuda et al., 2008; Menninghaus et al., 2019; Menninghaus et al., 2020; Zysset et al., 2002). These sorts of characteristics of the cognitive agent likely produce individual differences in aesthetic preference. Most modern analyses of aesthetics suggest that aesthetics emerges from a dynamic interaction between the cognitive agent and the object, rather than solely from explicit ‘objective’ properties of the object or ‘subjective’ characteristics of the cognitive agent (see Juslin, 2013; Reber et al., 2004). 
The explicit object properties can be associated with the implicit or perceived qualities of the object (e.g., visual domain: Marković & Radonjić, 2008; Spehar & Stevanov, 2021; tactile domain: Essick et al., 2010; Essick et al., 1999; Etzi et al., 2014; Karim, Prativa, & Likova, 2021a; Kitada et al., 2012; Klatzky & Peck, 2012; Pasqualotto et al., 2020; Verrillo et al., 1999); however, such an association does not guarantee the causal role of explicit properties. For example, a beautifully designed statue (even if it does not comprise any nudity) may not be appreciated by the Muslim community because people of this community believe that there is no place for images, sculptures or statues of humans or any other animals in Islam. What goes against this religious code and value is perceived as unaesthetic and ugly, which supports the proposition that beauty lies in the eye of the beholder (Germine et al., 2015; Johnston & Franklin, 1993; Yu & Shepard, 1998). Thus it has been suggested that the perceived quality of an object or product can reflect the perceiver’s opinion or attitude about its (aesthetic) quality independent of its actual physical qualities (Carbon & Jakesch, 2013).

What is an aesthetic quality or property then? An aesthetic quality or property is the extent to which an object or stimulus is attractive, beautiful/pretty, elegant, sublime, catchy, and induces hedonic (pleasant or unpleasant) experiences. An aesthetic property is different from a descriptive or basic physical property by the fact that the perception of an aesthetic property involves cognitive appraisal and hedonic valuation, but the perception of a descriptive or basic physical property does not (see Gagnon & Peretz, 2000; Ishizu & Zeki, 2013; Jacobsen et al., 2006; Nasar, 1984; Zangwill, 2000). A descriptive property, such as being rectangular or being red, can be attributed without any belief about its appraisal and hedonic status—whether it is positive, negative, or neutral (De Clercq, 2008). However, an aesthetic property can also possess a descriptive component, such as a dress may look attractive to a child because of its bright color. Similarly, a descriptive property can have a nonaesthetic component, such as “sharpness,” as literally applied to sharp objects (for a detailed thesis on aesthetic property versus descriptive property; see De Clercq, 2008). The perception of descriptive property (e.g., being a circle, a triangle, or a square; lexical status of letter strings) can operate with or without awareness (Forster & Davis, 1984; Forster & Veres, 1998; Merikle et al., 2001; Williams Jr., 1938) or attention (Chen et al., 2021; Mack & Rock, 1998; Moore & Egeth, 1997; Rock et al., 1992), but the perception of aesthetic property does not (see the following section). Taking all these together, we contend that basic perception is a nonappraisal form of cognitive process or a purely noncognitive process that does not generally induce any emotional feelings, whereas aesthetic perception or appreciation is not only a definite cognitive appraisal process, but induces emotional feelings as well (Schindler et al., 2017; Xenakis et al., 2012). 
These emotional feelings, popularly known as aesthetic emotions, are elicited by different sensory impressions generated by visual arts, natural scenes, tactile arts, music, theater, or film (Augustin, Carbon, & Wagemans, 2012a; Beermann et al., 2021; Karim, Prativa, & Likova, 2021a).

It follows from the above discussion that aesthetic qualities depend in part on basic sensory properties (Zangwill, 2000), and the felt aesthetic emotions might vary depending on personal characteristics or sociocultural discourse of the cognitive agent. However, one complexity associated with aesthetic perception is the conflicting aesthetic emotions elicited by multifeatured stimulus composition. A stimulus can be composed of purely aesthetic properties or partially aesthetic properties (i.e., a combination of both aesthetic and non-aesthetic properties or positive as well as negative aesthetic properties; De Clercq, 2008). For example, flowers, landscapes, and some artworks (those of Hilma af Klint; Carter, 2019) possess purely aesthetic properties. On the contrary, some parts of an artwork can be attractive and novel with the other parts being unattractive and very traditional; an individual may possess two or more different, even conflicting aesthetic qualities, such as a pretty look but a rude voice, or a nice voice but an ugly look. According to framework principle, the aesthetic property of such a partially aesthetic stimulus is determined by the presence of its non-aesthetic property (Zangwill, 1998, 2000). The coexistence of both aesthetic and non-aesthetic properties in the same stimulus is likely to simultaneously induce both positive and negative emotions in the cognitive agent. In such an approach-avoidance dilemma, aesthetic preference might be determined by the resultant impact of the two opposites on elicitation of aesthetic emotions. If the resultant impact of those properties is in the direction of a positive aesthetic emotion the cognitive agent will prefer the stimulus; otherwise s/he will reject it. A second possibility is that the aesthetic preference in such a dilemma can be driven by the cognitive agent’s self-interest, an interest in changing the valence of the stimulus aspects. 
For example, in forming an aesthetic preference the cognitive agent can devalue the stimulus by actively searching for negative aspects (referred to as an approach-reduction or avoidance-increment strategy), or can overvalue the stimulus by actively searching for positive aspects (referred to as an avoidance-reduction or approach-increment strategy). In philosophical aesthetics, the distinction here is between ‘interested perception’ and ‘disinterested perception’ (Kant, 1790/2000). According to the German philosopher Kant, interested perception is biased and tainted with our personal experience and emotional baggage, whereas disinterested perception is pure and independent of pragmatic interests (Kant, 1790/2000). Thus, pleasures emerging from interested or intentional perception are bound up with desire or self-interest (e.g., one takes in attractiveness, status symbols, etc.), and pleasures emerging from disinterested perception are free of desire or self-interest and universal: we judge objects or events as aesthetically pleasing whether or not we believe them to serve our desires or interests (e.g., when listening to a Beethoven symphony; Botstein, 2010; contemplating an abstract painting by Hilma af Klint; Carter, 2019). In support of Kant’s thesis, Scherer (2005) proposed that aesthetic pleasure is elicited in response to the intrinsic quality, or virtue, of the aesthetic stimulus per se, and is independent of the individual’s current needs and goals. For example, a flower looks beautiful and a landscape looks attractive and charming, both for their own sake, not for any useful purposes; their beauties are pure and objective and are shared among the public. To evoke pleasure from such objects or scenes, no conceptual judgment is required: the response is immediate and not bound to much evaluation by thought. 
In the aesthetic judgments of these sorts of objects or scenes, attention is fixed on their qualities, but not on their usefulness or theoretical interests or on the pleasures expected to derive from them.

One notable aspect of the Kantian thesis is Kant’s belief that since beauty is a disinterested feeling that is not responding to any interest or desire of the subject, it is similar to the disinterested feeling of pleasure involved in moral appraisals. Current philosophical, psychological and neuroscience research supports this link to a certain degree. In current philosophical aesthetics, ethicism, for example, claims that the aesthetic value of an artwork is, in part, determined by its moral value (for a critical discussion, see Halwani, 2009). Research in empirical moral psychology has demonstrated that moral judgments become stricter when participants are exposed to stimuli eliciting disgust, irrespective of whether the moral transgression under evaluation itself involved triggers of disgust, for example, eating your dog or not, or not returning a lost wallet (Schnall et al., 2008). It has also been shown that witnessing unfairness in an economic game triggers exactly the same physical facial motor activity that an awful taste does (Chapman et al., 2009). Finally, neuroimaging studies have shown that there is an overlap in the brain regions that process aesthetic and moral judgments (Jacobsen et al., 2006; Zaidel & Nadal, 2011). Beauty itself is morally valuable; however, beauty, as a form of sensory pleasure or gratification, is either trivial or potentially irresponsible in the face of serious moral concerns, such as sentencing a physically attractive man to prison due to his moral degradation or damaging a beautiful statue or painting on the ground of a strict religious code (see above). Thus, what aesthetic properties depend on is less secure than what moral properties depend on (Zangwill, 2000).

It is undeniable that the Kantian thesis was groundbreaking but internally contradictory and phenomenologically opaque (Cannon, 2008). Kant regarded aesthetic judgment as subjective while he still believed that aesthetics or beauty is pure and objective—something that exists in its own right within the art or object. Contrary to the so called disinterested aesthetics, Santayana (1896) argued that the central quality of aesthetics is pleasure, and that aesthetics or beauty is not an objective property of arts or objects, but rather is a self-interested subjective pleasure experienced through the perception of a stimulus or a person. There is growing evidence that beauty is highly influenced by such personal characteristics as self-interest and motivation (e.g., Fingerhut & Prinz, 2020; Juslin, 2013; Menninghaus et al., 2019; Menninghaus et al., 2020). Some people may be willing to choose physically attractive partners even though they have a rude voice, whereas other people may choose partners having a nice voice but ugly or unattractive looks. Thus aesthetic pleasures are not wholly devoid of personal interest and relevance. Even the so called disinterested, purely beautiful object may also have pragmatic interest and utility. For instance, we use beautiful flowers to meet many of our personal and social purposes – we give our loved ones a bouquet of flowers on their birthdays, and also use them in decorating various ceremonies or socio-cultural events. Similarly, living within aesthetically pleasing and culturally meaningful landscapes enhances our sense of wellbeing and quality of life. So, the so-called disinterested nature of pleasure is not desire- or interest-free in a true sense. Moreover, Kant’s approach to aesthetics appears to be concerned with a limited number of objects or events, particularly those that are natural, such as flowers and landscapes. 
However, as discussed above, many artificial objects or products are aesthetically or beautifully designed to serve certain utilities and purposes, and the perceived beauty also enhances the perceived usability of products (Lavie & Tractinsky, 2004; Tractinsky et al., 2000). For example, we purchase a beautiful flat to live a comfortable and secure life, and we buy a nice car for our self-satisfaction; thus their purposes or functionalities are directly associated with desire or self-interest.

Thus, departing from Kant’s view, we propose that aesthetic appreciation of an object or event either directly or indirectly involves immediate ‘interested sensory pleasure’ resulting from exposure to that object or event. In this respect, Berlyne looked extensively into novelty and complexity and investigated the topics in terms of “interestingness” and “pleasingness” that contribute to the hedonic value of an artwork or object (Berlyne, 1970; Berlyne et al., 1968; Berlyne & Parham, 1968; Berlyne & Peckham, 1966). More recent studies have also examined the relationship between pleasure and interest with respect to aesthetic liking, and proposed a two-component model of aesthetics in the visual modality (Graf & Landwehr, 2015, 2017). This model, known as the Pleasure-Interest Model of aesthetics, posits that aesthetic liking can be triggered by the processing dynamics of two distinct and separate components: a pleasure-based response and an interest-based response. The pleasure-based response involves stimulus-driven automatic processing and the interest-based response involves perceiver-driven controlled processing. Pleasure is a positive valence of emotion that involves feelings of enjoyment, happiness, and satisfaction (Becker et al., 2019), whereas interest is a feeling that motivates someone to focus on or explore an object or event (Graf & Landwehr, 2015). Taking all these together, we contend that pleasure and interest are core emotional processes and central components of aesthetic appreciation, and that these are possibly the most appropriate terms to describe felt aesthetics amodally. 
As outlined earlier, aesthetics is a multisensorial complex construct that can vary from extremely positive to extremely negative on such dimensions as attractive – unattractive or beautiful – ugly (visual), comfortable – uncomfortable (tactile), catchy – monotonous (auditory), and so forth (see Jacobsen et al., 2004; Karim, Prativa, & Likova, 2021a; Marković & Radonjić, 2008; Menninghaus et al., 2019). The hedonics and interests associated with these dimensions do also vary from extremely positive to extremely negative. That is, the emotions induced by such an appraisal do vary on a continuum extending from extremely pleasurable (known as aesthetic chills induced by music, visual art, natural scenes, film/movie, play, and poetry; Bannister, 2019; Goldstein, 1980; Konečni, 2008; Schoeller & Perlovsky, 2016; Sloboda, 1991) to extremely unpleasurable in terms of pleasure, and from extremely interesting to extremely uninteresting in terms of interestingness. However, to be aesthetic (or unaesthetic), a stimulus should not be necessarily pleasurable (or unpleasurable) and interesting (or uninteresting) in respect to all its physical elements or features (Gopnik, 2012). The pleasurableness and interestingness of some local elements or features can indeed make the stimulus aesthetic or unaesthetic (Levinson, 2003). For example, some yellowish or reddish parts of a mango surface make it look nice, whereas a few distributed black spots on another one make it look ugly. Thus, we further contend that the processing of local instead of global information might be the primary strategy in aesthetic appreciation of a stimulus. However, an object or event can be interesting but not necessarily pleasurable; and an unpleasurable object or event can nevertheless be interesting, appealing and enjoyable (Andersen et al., 2020; Hanich et al., 2014; Marin et al., 2016; Muth et al., 2019; Silvia, 2005a, 2005b; Turner Jr & Silvia, 2006; Vuoskoski & Eerola, 2017). 
For example, we may aesthetically enjoy and appreciate a sad-sounding song, a scary movie or a horror film (see Andersen et al., 2020; Hanich et al., 2014; Martin, 2019; Vuoskoski & Eerola, 2017). Because of the coexistence of positive and negative emotions in those experiences they are known as complex aesthetic experiences.

Two components of aesthetic perception

It follows from the above discussion that human aesthetics comprises two fundamental components. The first component is aesthetic emotions, the emotions that are elicited through aesthetic experience in response to aesthetic appeal or virtues of sensory objects or arts (see Menninghaus et al., 2019; Schindler et al., 2017). There is an ancient view that arts may bring ‘catharsis’, the purification of the soul through aesthetic experience that evokes pleasant feelings (Cook & Dibben, 2010; Paskow, 1983; Schaper, 1968). In line with this philosophical view, numerous recent ERP studies suggested that the emotional feelings induced by aesthetics or beauty might be stronger than the emotional feelings induced by control or neutral stimuli. For example, one visual ERP study showed that the amplitudes of P1 and P3b components were larger for attractive faces as compared to unattractive faces, indicating stronger emotional feelings and the involvement of emotion and reward pathways in judging facial attractiveness (Zhang & Deng, 2012). Auditory ERP studies demonstrated that the late positive potential (LPP) amplitude was larger during the evaluation of beauty of chord sequences as compared to the evaluation of correctness of chord sequences, particularly in naive participants (e.g., Müller et al., 2010). These findings indicate an enhanced affective, motivational component in the computation of visual or auditory beauty, with the experience of auditory beauty being more emotionally loaded than the experience of visual beauty (Augustin, Carbon, & Wagemans, 2012a; Augustin, Wagemans, & Carbon, 2012b). Thus, there is an inherent link between aesthetics or beauty and emotional (inside) feelings (Egermann & Reuben, 2020; Juslin, 2013; Schindler et al., 2017). Here, an outstanding question is: how are aesthetic emotions different from the basic or everyday emotions and from those associated with affective evaluation in general? 
To answer this question we delineate below the characteristics of aesthetic emotions as distinct from the characteristics of the basic or everyday emotions and from those of the emotions associated with affective evaluation.

First, a close comparative look at the literatures on basic emotions and aesthetic emotions indicates that the basic or everyday emotions involve appraisal of a situation in relation to the individual’s goal and action oriented coping (Zentner & Eerola, 2010), whereas the aesthetic emotions are elicited through sensory and cognitive processing in response to aesthetic appeal or quality of the object or event per se (Menninghaus et al., 2019; Scherer, 2004, 2005; Schindler et al., 2017). Ortony et al. (1988) defined aesthetic emotions as object-related emotions, such as pleasure, interest, awe, being moved, admiration, delight, and rapture (Juslin, 2013; Scherer, 2004), and everyday emotions as outcome-related basic emotions, such as happiness, interest, sadness, disappointment, anger, fear, surprise, and disgust (Juslin, 2013; Schindler et al., 2017; Suzuki et al., 2008). Thus, interest can be an ‘everyday emotion’ or an ‘aesthetic emotion’, depending on how it was aroused (Juslin, 2013). Along the same lines, Chatterjee and Vartanian suggested that aesthetic emotions are triggered by objects rather than outcomes (Chatterjee & Vartanian, 2014), a contrast that may also be reflected in the activity of two dissociable neural systems. Object-related (aesthetic) emotions correspond to activity in the liking system, while outcome-related (utilitarian) emotions correspond to activity in the wanting system (Berridge & Kringelbach, 2008, 2013). 
Thus, everyday emotions are utilitarian emotions (i.e., oriented towards the satisfaction of bodily needs) and are useful in cognitive agent’s goal-oriented adaptive functions (Juslin, 2013; Pelowski et al., 2017; Scherer, 2004; Xenakis et al., 2012), such as protection from danger, reproduction, orientation, and exploration (Lazarus, 1994), whereas aesthetic emotions are not goal-relevant, but involve feelings of subjective pleasure in response to the structural characteristics of the stimulus per se (Scherer, 2004, but also see the evolutionary perspective of aesthetic emotions). The basic or everyday emotions are believed to be primitive and universal (Ekman, 1992) and are found in all human cultures, whereas aesthetic emotions can be culturally learned and therefore more likely to vary across cultures (see Darda & Cross, 2021; Lazarus, 1994). For example, a nude and erotic artwork might induce aesthetically negative emotions in individuals of a Muslim or conservative society, but not in individuals of a radical western society. Moreover, the basic and aesthetic emotions are elicited in different contexts. For example, casual or inattentive listening to music in everyday situations mainly induces basic emotions, such as sadness, happiness, and fear, whereas listening to a piece of music with an aesthetic attitude or within an aesthetic context, such as in a concert hall, generates such aesthetic emotions as enjoyment, awe, and nostalgia (Brattico et al., 2013; Brattico & Pearce, 2013; Juslin, 2013; Sloboda, 2010). Thus aesthetic emotions are qualitatively distinct from everyday emotions albeit aesthetic emotions are built out of basic emotions (Xenakis et al., 2012).

Second, a comparative analysis of the literatures on aesthetic appreciation and affective evaluation indicates that the aesthetic emotions are distinct not only from everyday emotions but from the emotions associated with affective evaluation as well. The aesthetic appreciation involves assessments of the quality or value (analytical/originality, semantic, typicality, affective) of a sensory object or event, whereas affective appraisal involves assessments of affective contents in an object or event (see Egermann & Reuben, 2020). Aesthetic appreciation of an object or event is a predominantly cognitive process involving emotions as an after-effect being associated with the cognitive process of identifying the meaning of that object or event (Baltissen & Ostermann, 1998), whereas affective appraisal is a predominantly sensory or perceptual process of emotions contained in an object or event. The latter one may also involve cognitive process but that does not necessarily tell us anything about what the cognitive agent is feeling, since perception of emotions may well proceed without any emotional involvement (Gabrielsson, 2002; Harré, 1997). Thus aesthetic judgment induces aesthetic emotions, the emotions that the cognitive agent actually feels, rather than emotions that are represented, expressed, or alluded to in sensory stimuli or events (Gabrielsson, 2002; Schindler et al., 2017; Swaminathan & Schellenberg, 2015). For example, affective evaluation involves emotion that we simply perceive in an art, such as a music or painting, whereas aesthetic appreciation involves emotion that we actually feel in response to aesthetic features (see Gabrielsson, 2002), the features that are relevant to the aesthetic status/value of the art that possesses them (see Gopnik, 2012; Levinson, 2003). 
However, affective judgment of certain stimuli, such as the judgments of affective pictures of the IAPS (International Affective Picture System), can also evoke emotions, but in a way qualitatively different from how they are induced in aesthetic judgment (Baltissen & Ostermann, 1998).

Associated with the affect or emotion is attentional resource, a second component (cognitive) that probably makes direct contribution in aesthetic (quality or richness) analysis by connecting time and appraisal (Singh et al., 2019). To interpret the role attention plays alone in aesthetic preference, Proulx (2010) argued that by attentional mechanism people select stimulus features, objects, and spatial locations in the environment for increased scrutiny, which allows them to selectively extract from the environment the information that is most relevant and needed to achieve their goals. We propose that objects or arts that are aesthetically pleasing may be more attention-demanding (see Pool, Brosch, Delplanque, & Sander, 2016), and (aesthetically) more sensitive than those that are aesthetically neutral. Prior research has demonstrated that an attractive face captures greater spatial attention than does an unattractive face (Nakamura & Kawabata, 2014). However, an aesthetically unpleasant or ugly object or artwork may also be equally or even more sensitive and attention-demanding. For example, our attention can be captured not only by a sweet melody but by a very loud and unpleasant noise as well. However, an aesthetically pleasing object or event is different from an aesthetically displeasing object or event by the quality of affect/emotion associated with pleasantness or ugliness of that object or event. Here, we are not completely denying the role attention plays in basic perception. We propose that unlike basic perception or basic object recognition which can operate, as mentioned before, with or without attention (e.g., Chen et al., 2021; Mack & Rock, 1998; Moore & Egeth, 1997; Rock et al., 1992) aesthetic perception necessarily recruits attentional resource. 
Even if attention is considered a common requirement for both basic perception and aesthetic perception (Conway & Rehding, 2013), the quality and weight of the attentional resources recruited in these two processes might be different (see Fazekas, 2016; Nanay, 2015). First, as compared to basic perception, aesthetic perception perhaps involves a more emotionally driven component of attention focused on selective and attractive/pleasant (or unattractive/unpleasant) feature(s) of the object or event. Second, because aesthetic features (see above) are rewarding or threatening, aesthetic perception appears to recruit additional and more sustained attentional resources and to receive preferential processing as compared to basic perception (Kret et al., 2013; Öhman, Flykt, & Esteves, 2001a; Öhman, Lundqvist, & Esteves, 2001b).

More intriguing evidence for the role of attention in aesthetic perception comes from a wealth of neuroimaging studies. For example, a number of studies using fMRI or MEG techniques demonstrated that the aesthetic experience is related to increased activity of cortical regions involved in the allocation of attentional resources and evaluative judgments, including the dorsolateral and ventrolateral prefrontal cortex, temporal pole, posterior cingulate cortex, and precuneus (fMRI: Cupchik et al., 2009; Jacobsen et al., 2006; MEG: Cela-Conde et al., 2013). Other studies which used fMRI techniques only showed that aesthetic appreciation involved an attention-related enhancement activity in visuoperceptual areas, such as bilateral fusiform gyri, angular gyrus, and the superior parietal cortex (for a review, see Cela-Conde et al., 2009; Cupchik et al., 2009; Ishizu & Zeki, 2013; Lacey et al., 2011). These findings have been corroborated by a number of ERP/VEP studies in both visual and nonvisual modalities. For example, one visual ERP study demonstrated that the amplitude of the attentional component P3b (known to be modulated by motor-inhibition) was greater for visual images perceived as more beautiful than for neutral or ugly images (de Tommaso et al., 2008). A VEP study showed an enhancement in C1 and N1, P3 and N4 components and increased attention-related occipital alpha desynchronization for more appreciated visual images (Sarasso et al., 2020). An auditory ERP study demonstrated that electrophysiological indexes of attentional engagement (N1/P2) and motor inhibition (N2/P3) were enhanced during aesthetic appreciation of musical intervals (Sarasso et al., 2019). Consistently, a very recent auditory ERP study showed a significant trial-by-trial correlation between subjective aesthetic judgments of musical sounds and single trial amplitude fluctuations of the attention-related N1 component (Sarasso et al., 2021). 
This indicates that aesthetic appreciation correlates not only with perceptual facilitation but with attentional amplification as well. Taking all these findings together, it can be concluded that there is enhanced attentional modulation during appreciation of beautiful objects or events (see Kingstone et al., 2016; Kirsch et al., 2016; Nadal, 2013; Sarasso et al., 2020).

Are the two components interactive or dissociable? Aesthetic theorists posit that aesthetic experience necessitates an active engagement or intentional orienting of perception toward distilling the affective properties of an object or artwork (Cupchik et al., 2009; Leder et al., 2004). This refers to the interactions between attention and emotion in the modulation of aesthetic pleasure (see Fenske & Raymond, 2006; Oliveira et al., 2013; Pourtois et al., 2013). Behavioral studies have suggested that experiences of beauty require attention and are typically accompanied by feelings of pleasure (Blood & Zatorre, 2001; Grabenhorst & Rolls, 2011; Jacobsen et al., 2006; Vartanian & Goel, 2004). A number of other studies have reported that there is a reciprocal interplay between visual attention and reward, and that this interplay is not only at the behavioral level but at the neural level as well (e.g., Okon-Singer et al., 2013; Raymond, 2009; Raymond et al., 2003; Serences & Saproo, 2010; Viviani, 2013; Vuilleumier et al., 2003; Yamaguchi & Onoda, 2012). This interplay operates in a stimulus-driven bottom-up manner via emotion-related centers of the brain, particularly the amygdala (Cisler & Koster, 2010; Dolan, 2002; Öhman, 2002, 2005), under the control of top-down influences via two frontal regions, namely the orbitofrontal cortex (OFC) and the anterior cingulate cortex (ACC), with which the amygdala is thought to have reciprocal interconnection (see Compton, 2003; Fenske & Raymond, 2006; Pourtois et al., 2013; Vuilleumier et al., 2003). Through this process attention, affect, and the interactions between them extract rewarding values from sensory stimuli, and lead to the generation of appropriate responses to them (see Yamaguchi & Onoda, 2012). We propose that this might be true not only for affective evaluation in general but for aesthetic appraisal as well. 
There is evidence that the amygdala exhibits a nonlinear response profile for facial beauty, responding maximally to extremely attractive and extremely unattractive faces, and relatively less to faces of average attractiveness (Winston et al., 2007). Consistently, a review study suggested that the amygdala, among other regions, was more strongly activated during aesthetic than during non-aesthetic judgments (Jacobs & Cornelissen, 2017a, 2017b). This review further suggested that the amygdala might be involved in aesthetic judgments, and in emotional decision making in general.

To summarize, we propose that emotions and attention are two intertwined components necessary to generate aesthetics in humans. Indeed, emotion and attention interact with one another and affect the prioritization of information processing (see Cupchik et al., 2009; Fenske & Raymond, 2006; Leder et al., 2004; Oliveira et al., 2013; Pourtois et al., 2013). The aesthetic emotions and the attention paid to aesthetic stimuli or objects are qualitatively distinct from everyday emotions and from the attention we pay most of the time. These two are fundamental and universal components associated with aesthetics (be it pleasant or unpleasant, pretty or ugly) not only in the visual modality but in other sensory modalities as well, although evidence from other sensory modalities is scanty.

Aesthetic sensitivity versus perceptual sensitivity

Because aesthetic experience is thought to involve unique perceptual and emotional processes (Makin, 2017), we theorize that aesthetic sensitivity is different from (basic) perceptual sensitivity. We define aesthetic sensitivity as a pattern of emotional or affective reactions that an individual uses to appraise the quality or richness (look, sentiment, taste) of a sensory object or event (Karim, Prativa, & Likova, 2021a). Our aesthetic sensitivity allows us to make affective comments on the quality or richness of arts or artistic objects and events which are brought into existence in the pursuit of creating them as beautiful or ugly (see Eysenck, 1983; Meier, 1928; Parker, 1978). It is the extent to which variations in a particular stimulus attribute lead to variations in an individual’s hedonic valuation of and liking for that stimulus (Corradi et al., 2019; Corradi et al., 2020). Conversely, perceptual sensitivity can be conceived of as the capacity of an individual to detect slight differences in environmental stimulation using a sensory system and is usually expressed in terms of threshold, with a lower threshold indicating higher sensitivity and a higher threshold indicating lower sensitivity (see Bolders et al., 2017).
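The contrast between the two definitions above can be illustrated with a small numerical sketch. It is only an illustration under stated assumptions, not the procedure of any cited study: aesthetic sensitivity is approximated here as the ordinary least-squares slope of one person's liking ratings across levels of a stimulus attribute, and perceptual sensitivity as the reciprocal of a detection threshold (a common convention, assumed here). The function names and all data values are hypothetical.

```python
from typing import Sequence


def liking_slope(attribute_levels: Sequence[float],
                 liking_ratings: Sequence[float]) -> float:
    """Least-squares slope of liking on a stimulus attribute.

    Used here as a hypothetical index of aesthetic sensitivity: how much
    one person's liking changes per unit change in the attribute.
    """
    n = len(attribute_levels)
    if n != len(liking_ratings) or n < 2:
        raise ValueError("need two equally long sequences with at least two points")
    mean_x = sum(attribute_levels) / n
    mean_y = sum(liking_ratings) / n
    sxx = sum((x - mean_x) ** 2 for x in attribute_levels)
    if sxx == 0:
        raise ValueError("attribute levels must vary")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(attribute_levels, liking_ratings))
    return sxy / sxx


def sensitivity_from_threshold(threshold: float) -> float:
    """Perceptual sensitivity as the reciprocal of a threshold (assumed
    convention): a lower threshold yields a higher sensitivity."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return 1.0 / threshold


# Hypothetical data: five complexity levels and three made-up observers.
complexity = [1, 2, 3, 4, 5]
likes_complex = [2, 3, 4, 5, 6]   # liking rises with complexity
likes_simple = [6, 5, 4, 3, 2]    # liking falls with complexity
indifferent = [4, 4, 4, 4, 4]     # liking does not depend on complexity

slopes = {
    "likes_complex": liking_slope(complexity, likes_complex),
    "likes_simple": liking_slope(complexity, likes_simple),
    "indifferent": liking_slope(complexity, indifferent),
}
```

In this sketch the three observers could share the same detection threshold, and hence the same perceptual sensitivity, while differing in the sign and size of their liking slopes, which is the sense in which the two sensitivities are treated as distinct.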

Perceptual sensitivity is typically linearly related to stimulus intensity, and can also correspond to an inverted U-function, as in the case of speed-tuning function (Curran & Benton, 2003), whereas aesthetic sensitivity is likely to possess a nonlinear relationship to stimulus intensity. For example, reaction time may be linearly related to stimulus intensity, such as complexity (Schweizer, 1998; Venables, 1958). However, some people consistently prefer complex designs or music, some people consistently prefer simple ones, while others are aesthetically indifferent to design or music complexity (Clemente et al., 2022; Corradi et al., 2020). Thus, aesthetic sensitivity does not correspond to perceptual sensitivity: it does not gauge whether someone can discriminate fine variations in complexity (Clemente et al., 2022). However, in the case of a non-linear relationship, stimulus sensitivity does not normally correspond to the inverted-U function that has been posited by Berlyne for aesthetic preference (Berlyne, 1971). Thus, from the standpoint of Berlyne’s view, it is reasonable to argue that aesthetic preference is perhaps independent of perceptual sensitivity. This does not imply that aesthetic sensitivity is necessarily independent of stimulus intensity. Simply changing the intensity of a stimulus may change the perceived pleasantness/prettiness. For example, there is anecdotal evidence that brown noise (named after Brownian motion, not after the color per se), when played relatively quietly, is perceived as relatively pleasant, and can even be used to induce sleep and relaxation. Yet, when played very loudly, it is definitely unpleasant.

The question of the relation of aesthetic sensitivity to perceptual sensitivity has been directly addressed in studies on perception and appreciation of tangible textured surfaces and oriented visual textures/pictures. For example, one line of research in tactile modality has shown that the perceived magnitude of roughness of a stimulus surface varies proportionally with (Karim, Prativa, & Likova, 2021a) or as a power function of the physical magnitude of roughness (Ekman et al., 1965; Verrillo et al., 1999), and that the perceived magnitude of softness increases proportionally with the physical magnitude of softness (Karim, Prativa, & Likova, 2021a) or monotonically as a function of increasing object compliance (Pasqualotto et al., 2020). A second line of research in the same modality has revealed that the perceived magnitude of pleasantness of tactile sensation is monotonically and inversely related to the physical/estimated magnitude of surface roughness (Karim, Prativa, & Likova, 2021a; Kitada et al., 2012; Klatzky & Peck, 2012; Verrillo et al., 1999), or increases monotonically with the physical/estimated magnitude of softness or object compliance (Karim, Prativa, & Likova, 2021a; Pasqualotto et al., 2020). Because smooth or soft tactile stimuli likely engender less friction (Essick et al., 2010; Klöcker et al., 2012; Klöcker et al., 2013), other research has demonstrated that people rate smooth and soft stimuli (e.g., silk material, cosmetic brushes) as more pleasing than rough and hard stimuli (e.g., burlap material, plastic mesh, polyester, sandpaper, sponge, cotton) under both active (Etzi et al., 2014; Karim, Prativa, & Likova, 2021a; Major, 1895; Ripin & Lazarsfeld, 1937) and passive (Essick et al., 1999; Essick et al., 2010; Etzi et al., 2014) touch conditions. Thus, our sensitivity to texture aesthetics is inversely related to (basic) perceptual sensitivity which typically increases with roughness or coarseness of a stimulus surface.
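The two relations described in this paragraph can be written compactly. The notation below is an illustrative assumption rather than that of the cited studies: \phi denotes the physical magnitude of roughness, \psi its perceived magnitude, P the rated pleasantness, and k and \beta are free parameters whose values are not given here.

```latex
% Perceived roughness as a power function of physical roughness
\psi = k\,\phi^{\beta}, \qquad k > 0,\ \beta > 0
% Equivalent linear form in log-log coordinates
\log \psi = \log k + \beta \log \phi
% Pleasantness as a monotonically decreasing function of roughness
\phi_1 < \phi_2 \;\Longrightarrow\; P(\phi_1) > P(\phi_2)
```

Under these assumptions perceived roughness rises with \phi while pleasantness falls, so the two judgments move in opposite directions along the same physical dimension.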

A second factor that can affect (visual) perceptibility is stimulus orientation (e.g., Appelle, 1972; Gros et al., 1998; Westheimer, 2003). For example, gratings at cardinal orientations are more accurately recognized or discriminated than gratings at oblique orientations (Gros et al., 1998; Westheimer, 2003). It has been further shown that the orientation effect on perceptibility is not specific to the visual modality but also generalizes to nonvisual modalities (e.g., the tactile modality; Lechelt et al., 1976; Lechelt & Verenka, 1980), and that such an effect is not limited to basic perceptual discrimination of sensory stimuli but extends to aesthetic appeal as well (Latto et al., 2000). For example, Latto et al. (2000) reported that stimuli (Mondrian's paintings) at the cardinal orientations (vertical or horizontal) are closely tuned to the properties of the visual system and are found to be aesthetically more pleasing than stimuli at oblique orientations. Thus aesthetic appreciation appears to involve basic feature processing analogous to basic perceptual discrimination or recognition. However, this does not necessarily imply that aesthetic sensitivity and orientation sensitivity (a basic perceptual sensitivity) are the same or necessarily interdependent. We conjecture that, after analyzing how well visual stimuli or objects are tuned to its properties, the visual system perhaps dispatches the resulting output (tuned or not tuned) to the affective system centered on the amygdala (Elliott et al., 2011), which is probably predominantly biased to qualify the stimuli or objects that are tuned to the properties of the visual system (in a priori analysis) as aesthetically more pleasing and the others as displeasing or neutral.
This means that there is probably a relay station between the visual, the supramodal and the reward-processing (affective) areas of the brain, particularly the amygdala, one of the most highly connected subcortical structures of the brain, which is thought to modulate aesthetic emotional processing (Becker et al., 2019; Cisler & Koster, 2010; Dolan, 2002; Öhman, 2002, 2005) under the control of top-down influences via frontal regions (see Compton, 2003; Fenske & Raymond, 2006; Pourtois et al., 2013; Vuilleumier et al., 2003).

A recent study has demonstrated that oblique orientations of visual textures correlate with higher beauty ratings (Jacobs et al., 2016). Thus, unlike the orientation effect on basic perception, the orientation effect on aesthetic appreciation appears to be inconsistent across stimuli and across populations. This again indicates that aesthetic preference is probably independent of orientation sensitivity, although the two may show a common trend for certain types of visual stimuli or objects (e.g., Mondrian's paintings). Although an earlier study demonstrated a strong correlation between visual sensitivity and aesthetic preference for simple visual patterns (sine-wave gratings varying in spatial frequency and random textures with varying scaling exponent; Spehar et al., 2015), we cannot affirm that they are causally related or that their relationship generalizes to other stimuli or objects. We argue that an increased sensitivity to basic stimulus features may not always lead to an increased aesthetic sensitivity. For example, if someone is asked to look at or touch a sharp object, their basic sensitivity to it will typically be stronger; however, the aesthetic sensitivity or aesthetic feeling will probably decrease, resulting in the evaluation that the object is not aesthetically pleasing. Similarly, basic sensitivity to a highly textured surface may be stronger, but will the person prefer to touch such a surface rather than a smooth one? Perhaps not, as it is irritating and displeasing (i.e., it creates more friction but less aesthetic sense; see Essick et al., 2010). It is very likely that forming such an impression of stimulus or object quality involves recognition of the basic stimulus aspects or elements during the initial stage of analysis and processing.
Prior studies suggested that stimulus beauty can be related to the features present in the stimuli, such as symmetry and regularity (e.g., Jacobsen et al., 2006; Jacobsen & Höfel, 2002); thus beauty judgments are predictable from stimulus features to some extent (Jacobs et al., 2016), indicating a stimulus-driven effect on aesthetics. However, this does not necessarily preclude the dissociable nature of aesthetic sensitivity and (basic) perceptual sensitivity. In support of this, an fMRI study investigating aesthetic judgments showed functionally dissociable networks underlying beauty judgments and basic perceptual (e.g., symmetry) judgments (Jacobsen et al., 2006). Thus, the concepts of basic perception and aesthetic appreciation should always be differentiated in terms of sensitivity: unlike basic perceptual sensitivity, aesthetic sensitivity has a reward value, an emotional or affective component, and highly focused attention associated with specific feature(s) of a particular stimulus or object, such as attractive faces (Nakamura & Kawabata, 2014; see above). We propose that aesthetic appreciation of an object is more than just understanding its identifying physical features, and requires higher-order processing under top-down control. This view is in line with our daily experience. In everyday life, we encounter many types of complex stimuli, objects or events, and we judge their qualities and emotional aspects even if we fail to recognize their non-emotional aspects or cannot recognize them well, especially when they are unfamiliar to us. Suppose you are having lunch with your friends at a foreign restaurant for the first time in your life, and there are many foods, all unfamiliar to you. If they have no smell and you have no previous experience of eating them, how would you choose your food?
Probably you would do it by matching the appearance of those foods, such as color, with the foods you already have in mind, especially if you are too shy to ask your friends or the waiters. Here the processing of color (a basic stimulus feature) is not so important in itself; what matters instead is whether it matches the color of the tasty foods you have in mind (for a color-taste association, see Velasco et al., 2016). This indicates the role of your past experience in selecting preferred foods (top-down processing). It further indicates that aesthetic perception, which is closely bound to the emotional aspects of objects, may not necessarily depend on basic object identification, although recognition of the basic aspects or elements of an object may be the initial and crucial stage in forming an aesthetic preference (also see below).

Visual aesthetics versus visual perception

Aesthetic perception versus basic perception in visual modality

In the prior section, we conceptualized human aesthetics with particular attention to basic visual perception and visual aesthetic perception or appreciation. In this section, we first introduce a new hierarchical, local-global integrative model of perception and aesthetics, and then discuss a large pool of research evidence that lends support to the propositions generated by this model. Grounded in the current literature, the model differentiates aesthetic perception from basic perception by logically explaining how they operate and how they are associated with cognition and emotion.

A hierarchical model of aesthetics: Perception-appreciation independence

Humans typically cannot memorize or store any information in the brain in the absence of attention (Chun & Turk-Browne, 2007). It is commonly believed that attention is the key to both perception and memory; however, the perception of affective components requires selective attention (Yamaguchi & Onoda, 2012). As discussed earlier in this review, attending to the physical distinguishing features of an object or scene is not the same as attending to the features that make the object or scene beautiful or ugly, although beauty or ugliness is typically reflected in the physical features (Yoshino et al., 2009). Because affect or emotion is at the core of aesthetic appraisal, perceiving the aesthetics or beauty of an object or scene perhaps requires deployment of attention to selective local features that appear pleasing or displeasing at a glance. Research has shown that people can pay attention to the same object or stimulus in two different ways: (1) by zooming out and deploying attention to the whole or (2) by zooming in and deploying attention to the details (Förster, 2011). The former is known as the global-to-local (or simply global) processing strategy and the latter as the local-to-global (or simply local) processing strategy (Love et al., 1999). More precisely, global processing involves directing attention to the whole and encoding spatial relationships between discrete local elements to form a coherent global structure of an object or scene (e.g., Kimchi, 1992; Kovács, 1996; Lewis et al., 2004; Neiworth et al., 2006), whereas local processing is based on attention directed to the individual local elements that make up the object or scene (Kimchi, 1992; Navon, 1977; Nayar et al., 2015). Thus, in local processing, attention is gradually extended or shifted, following sequential allocation, to the other elements of the object or scene in the current field of view (VanRullen et al., 2007).

The visual system is confronted with a huge amount of information even in a single object or scene (Wolfe & Horowitz, 2017; Yantis, 2008), but not every perceivable feature or piece of information conveys its aesthetic appeal (Gopnik, 2012); there may be certain features that are relevant to the aesthetic status/value of the object or scene, the features that make it beautiful or ugly (Levinson, 2003). While appreciating aesthetics we search for those features, the features that match our aesthetic mind and interest. In this search, the problem with the global strategy is that when the object or scene is in the current field of view the visual system may not be able to deploy attention to all its features or elements at once, owing to fundamental limits on visual processing (Wolfe & Horowitz, 2017; Yantis, 2008). As a result, the features or elements important for quality or richness analysis might be missed or confounded with many irrelevant or unimportant features or elements (known as distractors) in the aesthetic process, or, even if the aesthetic features are somehow located, they require further analysis for an aesthetic decision that cannot be made through a global analysis alone. The local processing strategy is also problematic in aesthetic analysis, as it does not tell us which elements of the object or scene the viewer is likely to focus attention on first. Despite these limitations, the global-to-local or local-to-global (extended local) strategy can be sufficient for basic perceptual processing but not for aesthetic processing.
We can recognize the object or scene by seeing the whole at a glance (global level) or by seeing details of the local elements (local level) and integrating them into a global frame (Beaucousin et al., 2013; Gerlach & Poirel, 2018; Stoesz et al., 2007), but aesthetic appraisal requires more focused attention and further, top-down mediated analysis of quality or richness that depends not only on the physical features of the object but also on the cognitive agent’s affective and cognitive resources. Here, we propose that while interacting with environmental objects or scenes we do not necessarily follow a global-to-local or local-to-global processing strategy; in certain cases we may instead use a restricted local processing strategy only. By restricted local processing we mean allocating and limiting attention to a few selective or focal features (novel or previously experienced) that make the object or scene pleasing or displeasing, beautiful or ugly. For example, a beautiful lady is beautiful because of her beautiful face and eyes; she becomes more beautiful just by beautifying her lips, eyes and face with relevant cosmetics, and she does not need to beautify her whole body to be perceived as beautiful. Thus the cognitive agent’s attention focuses on the lady’s beautiful lips, eyes and face, but not on her whole body, to generate an impression of her global beauty at first sight. The proposition of restricted local processing has been directly or indirectly supported by the findings of a few prior studies. For example, one study has demonstrated that hedonic (pleasant or unpleasant) pictures have a higher fixation response rate than neutral pictures (Nummenmaa et al., 2006). More interestingly, this study further showed that when participants were asked to avoid looking at the hedonic pictures, these were still more likely to be fixated first and gazed at longer during first-pass viewing than neutral pictures.
A review study suggested that emotionally arousing images capture attention to such an extent that individuals cannot detect target stimuli for several hundred milliseconds after the emotional stimulus (McHugo et al., 2013). Consistently, a recent study showed that during aesthetic judgment participants tend to fixate on patches that are richer in color information, and that the differences in the distribution of attention, as evident from the distributions of fixations, are feature-driven (Jacobs & Cornelissen, 2017a, 2017b).

Based on the above information processing strategies we propose here a dual-channel model which differentiates aesthetic processing from basic perceptual or recognition processing, and at the same time rules out the hypothesis that cognition is absent in affective or aesthetic processing (Fig. 1). The first channel (route ABC in Fig. 1), the ‘aesthetics-only’ channel, primarily involves ‘restricted local processing’ to analyze the quality or richness of sensory inputs (e.g., prettiness, pleasantness) under top-down and bottom-up controls (Wolfe & Horowitz, 2017; Yantis, 2008) in the total absence of stimulus or object recognition. Here, we propose that stimulus or object recognition is not necessary for aesthetic appreciation under three specific conditions: availability/visibility of familiar aesthetic features, subjective limitation and short stimulus exposure. First, when previously known aesthetic features of an object or scene are immediately available in the current field of view, such as when those features are at the front side of the object or scene (see Wolfe & Horowitz, 2017) the appreciation of that object or scene does not require deeper semantic understanding and basic recognition processing (Kunst-Wilson & Zajonc, 1980; Seamon et al., 1983a, 1983b). We propose that in the absence of global perceptual features the cognitive agent makes aesthetic appreciation by immediately comparing the currently visible local features with those previously stored in cognitive faculty solely based on his/her phenomenal state of interest developed through past experience (see Graf & Landwehr, 2015; Leder & Nadal, 2014; Mendonça et al., 2019). Thus aesthetic pleasure/displeasure occurs as a result of immediate harmony/disharmony between the aesthetic features of the current visual input and the agent’s existing cognitive faculty which is characterized by familiarity, expectations, subjective taste, emotions and culture (Höfel & Jacobsen, 2007; Tatarkiewicz, 1963). 
As exemplified earlier, some yellowish or reddish parts of a mango surface make it look nice whereas a few scattered black spots on another make it look ugly. Thus the quality of a mango can easily be appreciated just by looking at its focal features in the front view, without seeing its global shape and features or its other side. Second, aesthetic appreciation also occurs when the object or event being judged has intrinsic aesthetic value (see Kant, 1790/2000; Menninghaus et al., 2019; Xenakis et al., 2012), but recognition of that object or event inevitably fails owing to subjective limitations. Because intrinsically beautiful or aesthetic objects or events do not necessarily require any semantic comprehension, the cognitive agent does not need to analyze the basic extrinsic perceptual features to appreciate them. Thus, an individual with subjective limitations in semantic comprehension can also enjoy and appreciate the aesthetics or beauty of those objects or events without basic perceptual processing, but not without cognitive processing. For example, the ‘aesthetics-only’ channel may operate when we (laypeople) enjoy and appreciate the beauty or aesthetics of a dance without any prior knowledge of dance rules and without being able to properly analyze the choreographic expression, dynamism, and exceptionality of the dance. Dance has an intrinsic and unique power to attract the human mind regardless of culture, race, religion, age, or complexion. It is not only the dancer who moves her/his body; our minds are also moved by the dancer’s creative body movement despite our inability to semantically differentiate the spatio-temporal features of a dance movement (Calvo-Merino et al., 2008).
Thus, we do enjoy and appreciate the artistic expressions in a dance based on our subjective taste, attention and thoughts restricted to some focal and pleasant spatio-temporal features of this creative art (Best, 1975; Orlandi et al., 2020), indicating the involvement of cognitive processing but not necessarily perceptual processing. Third, regarding the duration of stimulus exposure, numerous studies have examined aesthetic appraisals after very short exposure to webpages. One study suggested that a stable aesthetic impression can be formed after exposure to a web design for only 50 ms (Lindgaard et al., 2006). The extraordinary rapidity of judgments about web displays that participants had never seen before suggests that an aesthetic impression might be formed prior to basic perceptual or recognition processing. Although the robustness of these findings can be questioned, participants do not need more than half a second to form the first, stable aesthetic impression of a webpage (Tractinsky et al., 2006). We argue that such a short duration might be sufficient for the restricted local processing used in understanding aesthetic richness, but not for the global-to-local or local-to-global processing used in understanding the physical characteristic features of the webpage. This indicates that an aesthetic impression of the webpage can take place prior to basic perceptual processing of those features. Taken together, we conclude that perceptual processing of an object or scene is not a necessary first step of aesthetic appreciation; depending on the context, aesthetic appreciation may instead proceed directly.

In contrast to the aforementioned ‘aesthetics-only’ channel, which comprises cognitive and affective processing of objects or events, the second channel, the ‘perception-to-aesthetics’ channel, comprises an initial perceptual processing stage followed by cognitive and affective processing. That is, the ‘perception-to-aesthetics’ channel is more typical and likely operates in two consecutive stages: (i) a basic perceptual recognition stage which helps locate the affective or aesthetic features, and (ii) an aesthetic stage that involves cognitive and affective processing of those features. The perceptual recognition stage (routes ADE in Fig. 1) involves either a global-to-local or a local-to-global processing style under top-down and bottom-up controls to analyze the pictorial content and structural organization of visual inputs for an accurate recognition or meaningful representation of the percept (Beudt & Jacobsen, 2015; Egermann & Reuben, 2020; Leder & Nadal, 2014; Martindale & Moore, 1988; Mendonça et al., 2019). The aesthetic stage (routes EDBC in Fig. 1), which operates concurrently with or immediately after the perceptual or recognition stage, involves processing of a few selective local features (restricted local processing) for quality or richness analysis of the output data and for generating an aesthetic emotional response (e.g., attractive or unattractive, pleasant or unpleasant; Xenakis et al., 2012) that cannot be generated at the sensory or perceptual level. We propose that this two-stage aesthetic processing likely operates under two general conditions: unfamiliarity or absence of familiar aesthetic features, and the viewer’s analytic intention. The first condition involves a stimulus setting in which the previously known aesthetic features are not present in an object or scene available in the current field of view, so that analysis of the whole object or scene becomes necessary.
For example, we might be interested in purchasing a beautifully designed sofa that we have seen in a friend’s house. If a sofa of exactly the same design is available in the market we might immediately decide to purchase it, as our focus of attention is restricted to the known aesthetic features of that furniture, and this probably requires the operation of the ‘aesthetics-only’ channel. However, if a sofa of exactly the same design is not available in the market we might look for a new one, which requires the operation of the ‘perception-to-aesthetics’ channel. According to the second condition, the viewer might intend to analyze the whole object or scene (or might be required to do so) even though the previously known aesthetic features are immediately visible or available in the current field of view. For example, in contrast to the layperson discussed above, a dance expert, a person who understands the spatial and temporal features of a dance movement, evaluates the beauty or aesthetics of a dance using the ‘perception-to-aesthetics’ channel. The dance expert uses his/her prior choreographic knowledge to semantically differentiate the spatio-temporal features of a dance movement in the first (perceptual) stage, followed by the induction of a psychological state, the state of aesthetic experience, in a second stage (see Calvo-Merino et al., 2008). Although the expert’s basic perception of the dance is based on a global-to-local or a local-to-global processing style, the impression of the dance aesthetics likely depends on the processing of certain dance features that aesthetically move him/her. A second example is the appreciation of erotic/nude art, which may operate in a similar fashion.
While evaluating such art, viewers first perceive the different features of the art form, locate its erotic aesthetic features, and then restrict their attention to those erotic features alone, which induce aesthetically negative or positive emotions in them depending on such cultural and personal factors as religion and values. Thus, although the initial perceptual process helps in locating the aesthetic or affective features of the object, the two processes are different and are likely to be integrated towards a final preference decision.

The above discussion illustrates how the proposed dual-channel aesthetic model can explain a simple aesthetic experience, such as the aesthetic experience of an object/stimulus which is either pretty (e.g., a rose) or ugly (e.g., a rotten mango). Here, one outstanding question is: How does the model explain a partial/semi aesthetic experience, an experience of an object or stimulus in which both a positive aesthetic property and a negative aesthetic property coexist (e.g., an attractive lady with a rude voice)? We propose that in such an approach-avoidance aesthetic dilemma both the positive and negative properties are concurrently processed following the principle of the ‘aesthetics-only’ channel or the principle of the ‘perception-to-aesthetics’ channel, depending on the conditions discussed above. However, in such a situation, the cognitive agent is likely to form, as outlined before, his/her aesthetic preference depending on the resultant impact of the two opposites on the elicitation of aesthetic emotions, or by devaluing the object through an active search for negative aspects (approach-reduction or avoidance-increment strategy), or by overvaluing the object through an active search for positive aspects (avoidance-reduction or approach-increment strategy). The devaluation or overvaluation of the object is possibly determined by the cognitive agent’s self-interest or desire (see earlier for more detail).

Now, a second outstanding question is: How does the proposed model account for a complex aesthetic experience, such as the aesthetic experience of a scary movie or a horror film? Before answering this question, let us first see how current theories explain this. One theory is the excitation transfer theory (Zillmann, 1980, 1996), which posits that we derive enjoyment of a horror or frightening film from the feeling of suspense and the resolution of the threatening event. It assumes that suspense arises from events that signify conflict, dissonance and instability, and that with the resolution of the threatening event, suspense ends and the negative affect built up during exposure to the horror film converts to euphoria (see Lehne & Koelsch, 2015). A second theory is the arousal- or thrill-seeking theory, which argues that we like and appreciate a horror film because the act of watching horror provides us with a thrill or arousal regardless of the resolution of the threatening event (Tamborini, 1991). Research has suggested that certain personality traits, such as sensation seeking, verbal aggression, and argumentativeness, are positively correlated (Greene & Krcmar, 2005), whereas empathy is negatively correlated (Hoffner & Cantor, 1991; Sparks, 1991; Zillmann et al., 1986), with enjoyment of horror and violent films. However, a common limitation of these two theories is that they explain the cause but not the process of such a complex phenomenon. The proposed dual-channel aesthetic model addresses this limitation by proposing that we enjoy and appreciate a horror or frightening film through the operation of the ‘perception-to-aesthetics’ channel. Here, this analytic channel probably follows a local-to-global rather than a global-to-local processing style in the perceptual analysis stage and restricted local processing in the aesthetic valuation stage.
In the perceptual analysis stage, a local-to-global processing style might be unavoidable, as the whole film cannot be viewed at once. We propose that in this stage the horror film viewer is more likely to perceive and evaluate the film locally, episode by episode and feature by feature within an episode, building toward a global impression once the film is finished rather than the other way round. Concurrently, in the aesthetic valuation stage, the stage of enjoyment and liking, the viewer is likely to be moved by thrilling or exciting features and to devote more attention to those features of the film (restricted local processing). Here, it can be noted that people may also enjoy and appreciate a sad film, as distinct from a horror or scary film, by being moved (Hanich et al., 2014) by the film episodes that correspond to their personal life events/experiences, and that this enjoyment and appreciation can also be accounted for by the proposed dual-channel model in a way similar to how the model explains the enjoyment of a sad song (for details, see the section for auditory aesthetics).

The aforementioned proposition of the ‘perception-to-aesthetics’ channel receives support from both theoretical views and prior empirical observations. Specifically, in line with this proposition, current aesthetic theory posits that aesthetic judgment involves a sensory-perceptual process, a cognitive process and an affective process (Berlyne, 1971; Cupchik et al., 2009; Diessner et al., 2008). Similarly, the results of prior neuroaesthetic studies indicate that the brain areas involved in aesthetic judgment include the ventral visual system (V1, V2, V4 and inferior temporal gyrus/ITG), which is associated with visual processing, the superior frontal gyrus (SFG), which is associated with cognitive processing, and the orbitofrontal cortex (OFC), which is associated with affective processing (Avram et al., 2013). It has been suggested that the sensory and perceptual processing of an image or event is not only the primary step in the process but is also crucial for making aesthetic decisions (Leder et al., 2004).

Because of the initial perceptual process, aesthetic appreciation, a form of cognitive and evaluative judgment, made through the ‘perception-to-aesthetics’ channel appears to be slower than descriptive judgments made during basic perceptual recognition alone. Both behavioral and neurophysiological measures from a few studies support this notion. For example, the behavioral data of an ERP study showed that basic perceptual judgments, such as symmetry judgments of novel graphic patterns, took 1013 ms (for ‘Yes’ responses) to 1044 ms (for ‘No’ responses), whereas aesthetic judgments of the same stimuli took 1111 ms (for ‘No’ responses) to 1221 ms (for ‘Yes’ responses) (Jacobsen & Höfel, 2003). Consistently, the Lateralized Readiness Potential (LRP) and the N200 data of a second ERP study demonstrated that the processing of art style follows the processing of content-related information, with style-related information being available at around 224 ms, or between 40 and 94 ms later than content-related information (Augustin et al., 2011). The longer time taken for art or aesthetic judgments compared to basic perceptual judgments indicates that art and aesthetic judgments probably involve perceptual processing as a first step.
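
As a simple worked illustration, the latency gap implied by the response times quoted above from Jacobsen & Höfel (2003) can be computed directly. The symbol Δt is illustrative notation introduced here; the differences are derived only from the four values quoted in the text, not from any further analysis of the original data.

```latex
% Aesthetic-judgment latency minus symmetry-judgment latency, per response type.
\Delta t_{\mathrm{Yes}} = 1221 - 1013 = 208\ \mathrm{ms}, \qquad
\Delta t_{\mathrm{No}} = 1111 - 1044 = 67\ \mathrm{ms}
```

On these figures, aesthetic judgments were slower than symmetry judgments by roughly 67 to 208 ms, depending on the response type. A ‘Yes’ or ‘No’ response means something different in the two tasks, so the pairing by response type is a descriptive convenience only.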

Now, an outstanding question is do we see the details, such as textures of a visual art (local level) followed by restricted local analysis, such as analysis of certain pleasant/unpleasant textures; or the overall outlay, such as the whole visual art (global level) followed by restricted local analysis? An early study suggested that in visual perception global structuring of a visual scene, such as forest, precedes analysis of local details, such as trees (Navon, 1977). However, more recent research has shown that this depends on such personal characteristics of the viewer as mood, experience and age. There is evidence that positive moods broaden the scope of attention (Fredrickson & Branigan, 2005) whereas negative moods narrow the scope of attention (e.g., Derryberry & Tucker, 1994). Thus individuals with positive mood and optimism are likely to use a global style whereas those with negative mood (depression, anxiety) are likely to use a local style (e.g., Basso et al., 1996; Derryberry & Reed, 1998; Gasper & Clore, 2002; Mokhtari & Buttle, 2015; Yovel et al., 2005; but see also von Mühlenen et al., 2018). There is a global precedence in young individuals that declines with age (Staudinger et al., 2011), indicating the effect of experience on information processing style. Indeed, aesthetic perception is both stimulus- and perceiver-driven, and may elicit a pleasure-based aesthetic response depending on dynamic interactions between stimulus properties and cognitive agent’s past experience (see Graf & Landwehr, 2015). We propose that the past experience might be more important for appreciating the objects or events that are not intrinsically aesthetic or beautiful; the aesthetics/beauty of those objects is discovered by the cognitive agent by associating the current object features with his/her past experience stored in cognitive faculty. 
In support of this, research has suggested that familiarity with certain objects or stimuli through repeated exposure induces positive affect that can directly influence memory formation and subsequent preference for those objects or stimuli (e.g., Bateson, 1973; Bohrn et al., 2013; de Zilva et al., 2013; Leder, 2001; Sluckin et al., 1982). However, due to the lack of prior experience or cognitive mismatching perceptual recognition might not always be successful even after exploring the object or stimulus through global-to-local or local-to-global processing, but still it is followed by the latter, the restricted local processing stage through which the person may be able to make (though not necessarily due to subjective inability or cognitive mismatching in some cases) aesthetic preferences, again indicating that aesthetic processing does not depend on recognition processing. In further support of the independence of aesthetic and recognition processing, an ERP study has suggested that aesthetic judgment process and symmetry judgment process, a form of basic perceptual or recognition process, differ dramatically and recruits, at least in part, different neural machinery (Jacobsen & Höfel, 2001). Other studies have suggested that without the amygdala (responsible for guiding feature-based attention during aesthetic judgment; Jacobs, Renken, Aleman, & Cornelissen, 2012a) one might be able to recognize stimuli but his aesthetic judgment becomes strongly deviant due to severe disruption in top-down guidance of feature-based attention (Jacobs & Cornelissen, 2017a, 2017b). The aesthetic (restricted local) processing stage is perhaps a one-way stage (Fig. 1) which might not typically operate when the task is purely perceptual or recognition. 
Thus, aesthetic appreciation is task-dependent (e.g., Boccia et al., 2015; Ishizu & Zeki, 2013; Jacobs, Renken, & Cornelissen, 2012b; Jacobsen et al., 2006; Thakral et al., 2012; for details see the section on neuroscientific evidence for perception-appreciation independence). The aesthetic process and the basic perceptual process might not be interdependent even when both stages operate successfully; the two processes are separate and operate independently, either consecutively or concurrently. In the aesthetic judgment of a previously well-known object or stimulus, aesthetic processing perhaps occurs concurrently with perceptual or recognition processing, yet recognition is not a necessary precondition for such functioning. This might be true even when the perceived aesthetic qualities, such as hedonic tone and arousal, are associated with the physical stimulus properties, such as form and complexity (Marković & Radonjić, 2008; Spehar & Stevanov, 2021). Thus, it has been suggested that the perceived quality of a product reflects the perceiver’s opinion about the product’s quality independent of the product’s actual physical qualities (Carbon & Jakesch, 2013).

It follows from the above discussion that aesthetic processing can operate with or without sensory recognition, but not without cognition: appreciating beauty or quality requires not only attention (Conway & Rehding, 2013; Singh et al., 2019), but thought as well (Brielmann & Pelli, 2017). This indicates that aesthetic processing involves a distinct faculty for imagination or cognitive mastering (Consoli, 2017), a faculty active, according to the Kantian thesis, in the generation of aesthetic pleasure (Kant, 1790/2000). Indeed, aesthetic arts or objects must be attended to, analyzed, and categorized to generate aesthetic emotional responses. We call these sub-processes of cognition together ‘aesthetic cognition’. Broadly speaking, ‘aesthetic cognition’ comprises the cognitive emotional processes necessary for rational analysis and decision about the quality or richness (e.g., attractiveness, beauty/prettiness, elegance, sublimeness, catchiness, hedonic value) of an object or stimulus. Research has shown that aesthetic appreciation is a predominantly cognitive process that involves an after-effect emotion associated with the cognitive process of identifying the meaning of a painting (Baltissen & Ostermann, 1998; Xenakis et al., 2012). Martindale (1984) proposed a "hedonic calculus" according to which pleasure is determined by the activation of cognitive units which help identify the meaning of a painting and by the positive associations which accompany them. At the interface of ‘aesthetic cognition’, affective and cognitive processes are integrated to identify the attractive (or unattractive), affectively colorful features of an object or stimulus, and to assign some aesthetic value to it. This indicates how the contents (sub-processes) of ‘aesthetic cognition’ differ from the contents of ‘basic cognition’ (e.g., attention, thoughts, memory), which are simply geared toward identifying the physical distinguishing features of an object or stimulus. 
These two cognition faculties (Fig. 1) are different at the functional level though not exclusively at the neural level: they may share the neural substrates of the same brain regions that might be involved in modulating the cognitive components in both basic perceptual (recognition) processing and aesthetic processing (see Ishizu & Zeki, 2013). In order to operate aesthetic cognitive functions, the shared brain regions, mostly the prefrontal cortices, likely form a neural network with the affective system centered on the amygdala (Elliott et al., 2011; Ochsner & Gross, 2005, 2007; Phillips et al., 2008), and in order to operate basic cognitive functions perhaps they form a neural network with sensory and other relevant regions of the brain. This does not necessarily preclude the network that the affective system has with the sensory regions (Barbas, 1995; Dolan, 2002; Swanson, 2003; Young et al., 1994); instead, it excludes the sensory regions from ‘aesthetic cognition’ only. However, the brain regions modulating these cognitive functions are not necessarily universal; rather, they vary across sensory modalities and across the physical properties of sensory inputs (e.g., brightness, smoothness, sharpness, symmetry). Although different levels of affects or emotions (positive, negative) might have different systems or faculties in the brain (Duncan & Barrett, 2007), here we are not interested in subdividing the functionality of our ‘aesthetic cognition faculty’ because in either case the aesthetic valence (positive or negative) of the object or stimulus will probably be analyzed through restricted local processing. Thus we limit our model to aesthetic processing as differentiated from basic perceptual or recognition processing (Fig. 1).

Fig. 1

A dual-channel model that differentiates aesthetic processing from basic perceptual or recognition processing in the visual modality. The first channel (route ABC), the Aesthetics-only channel, involves restricted local processing to analyze the quality or richness of sensory inputs under top-down and bottom-up controls in the absence of stimulus recognition (i.e., when recognition is not necessary or failure of recognition occurs). The second channel, the Perception-to-Aesthetics channel, operates in two consecutive stages: a basic perceptual stage and an aesthetic stage. The basic perceptual stage (routes ADE) involves either global-to-local or local-to-global processing under top-down and bottom-up controls to analyze the basic physical distinguishing features of sensory inputs for an accurate recognition or meaningful representation of the percept. The aesthetic stage (routes EDBC) which operates concurrently or immediately after the perceptual stage involves restricted local processing. This latter one is perhaps a one-way stage which does not typically operate when the task is purely perceptual or recognition. The two cognition (aesthetic cognition and basic cognition) faculties in the model are different at functional level but not necessarily at neural level. There are reciprocal interactions between aesthetic emotion and aesthetic cognition and between perception and basic cognition. The brain regions modulating these cognitive functions are not necessarily universal; rather, they do vary across sensory modalities, and across the properties of sensory inputs within a sensory modality

In our model, the ‘aesthetics-only’ channel appears to be more direct, more economical and faster (it can operate during brief exposures) than the ‘perception-to-aesthetics’ channel for two reasons: (1) the latter channel involves additional cognitive operations necessary for object or stimulus recognition at the initial stage, and (2) the cognitive operations involved in extracting a meaningful percept through the analysis of global structure or local details at the initial stage might be slower than the cognitive operations involved in aesthetic appraisal through the analysis of only a few selective local features. This relative efficiency of restricted local processing over global-to-local or local-to-global processing leads us to propose that aesthetic appraisal may precede semantic processing in certain cases (for a similar proposition for affective appraisal, see Zajonc, 1980, 1984, 2000). Thus, during the operation of the ‘aesthetics-only’ channel only the aesthetic cognition faculty becomes active, whereas during the operation of the ‘perception-to-aesthetics’ channel both the basic cognition faculty and the aesthetic cognition faculty become active, but independently and in separate stages: the former during initial recognition processing and the latter during aesthetic processing.

To summarize, we conclude that the aesthetic perception is independent of basic perception but not of cognition (Baltissen & Ostermann, 1998; Mirams et al., 2016). The basic perceptual process operates through a global-to-local or a local-to-global analysis of sensory inputs whereas aesthetic process operates through a restricted local analysis either directly or via perceptual recognition process. At this stage, the two processes appear to share cognition (e.g., thoughts, memories, attention), but still they are functionally different as they are connected to non-affective and affective systems respectively. The perceptual process involves ‘basic cognition faculty’ responsible for basic feature analysis whereas the aesthetic process involves ‘aesthetic cognition faculty’ responsible for quality or richness analysis. Both these cognition faculties operate under top-down and bottom-up controls (Chatterjee & Vartanian, 2014, 2016; Leder & Nadal, 2014; Ochsner & Gross, 2005, 2007; see Pelowski et al., 2017; Phillips et al., 2008; Redies, 2015). The basic cognitive functions and aesthetic cognitive functions are perhaps modulated by different neural networks involving both shared and separate brain regions; however, the modulatory cortical regions are not necessarily universal; they do vary across the properties of sensory inputs.

Neuroscientific evidence for perception-appreciation independence

A wealth of studies in neuroaesthetics has demonstrated that different brain regions underpin aesthetic perception and basic perceptual recognition of visual objects or arts. Those studies have identified both task-dependent activity and beauty-dependent activity of brain regions in healthy humans (e.g., Cela-Conde et al., 2004; Ishizu & Zeki, 2011, 2013; Jacobs, Renken, & Cornelissen, 2012b; Jacobsen et al., 2006; Kawabata & Zeki, 2004; Thakral et al., 2012; Zeki et al., 2014). The studies reporting task-dependent activity of brain regions are summarized in Table 1, and those reporting beauty-dependent activity are summarized in Table 2. These two lines of studies together provide intriguing evidence for the relative independence of basic perceptual (recognition) processing and aesthetic processing. Here, we take the opportunity to highlight those studies individually as they used different stimulus parameters and made unique contributions to the understanding of perception-appreciation independence.

Table 1 A summary of prior studies showing task-dependent activity of brain regions in healthy humans
Table 2 A summary of prior studies showing beauty-dependent activity of brain regions in healthy humans

Task-dependent activity of brain regions

As shown in Table 1, the task-dependent activity of brain regions was first reported in an fMRI study by Jacobsen and colleagues (Jacobsen et al., 2006). In this study, participants viewed a variety of visual geometric shapes (triangles, squares, rhombuses, and various oriented bars) and judged their aesthetics and symmetry. This study showed that aesthetic judgments caused more specific and stronger activations in the right frontomedian cortex near BA 9/10, right cingulate cortex, left inferior precuneus, bilateral ventral prefrontal cortex around BA 45/47, left temporal pole, and temporo-parietal junction, whereas symmetry judgments, a type of basic perceptual judgment, elicited more specific and stronger activations in several areas related to visuospatial analysis, including the superior parietal lobule, left intraparietal sulcus, left fusiform gyrus, left ventral premotor cortex, dorsal premotor cortex (PMC) and left extrastriate visual cortex (Jacobsen et al., 2006). The task-dependent activities of these brain areas were identified when participants were tested with the same visual stimuli, indicating that aesthetic appreciation probably proceeds through neural channels independent of the neural channels for basic feature perception. The same study further demonstrated that stimulus complexity enhanced activity in the right lateral fronto-orbital cortex (BA 47/11) during aesthetic judgments, and activity in the right anterior inferior frontal gyrus and the right ventral PMC during symmetry judgments. Moreover, the effect of stimulus complexity was descriptively more dominant in the fusiform gyri during symmetry judgments than during aesthetic judgments. Thus complexity adds a new feature or dimension to the stimuli or objects and activates new brain areas, but aesthetic processing and basic recognition processing are still neurally dissociated (Jacobsen et al., 2006).

A second fMRI study examined the neural basis of motion and aesthetic experiences in humans (Thakral et al., 2012). Participants viewed and judged the pleasantness of van Gogh paintings that evoked a range of motion experiences. This study demonstrated that activity in MT+ [the middle temporal (MT) plus other adjacent motion-sensitive areas, including medial superior temporal (MST); Dukelow et al., 2001] was associated with the degree of motion experience and activity in the right anterior prefrontal cortex (PFC) was associated with the experience of pleasantness of paintings, but not the other way round. The authors interpreted these findings as supporting both the (low-level) sensory hypothesis and the (high-level) conceptual hypothesis of aesthetic experience. However, the findings also provide clear evidence that aesthetic processing and motion information processing, a type of basic perceptual processing, are neurally dissociated.

A third fMRI study used visual textures as stimuli, asking participants to judge their beauty and roughness, and showed that the frontomedian cortex (ventral and dorsal clusters), the amygdala and the posterior cingulate cortex were more strongly activated during beauty judgments, whereas the frontal operculum, the supramarginal gyrus and the fusiform gyrus were more strongly activated during roughness judgments, a type of basic perceptual judgment (Jacobs, Renken, & Cornelissen, 2012b).

A fourth and more recent fMRI study used Arcimboldo's portraits, asking participants to perform an explicit aesthetic judgment task and an artwork/non-artwork classification task (Boccia et al., 2015). This study demonstrated that, as compared to the classification task, aesthetic judgments produced stronger activation in the OFC, insula (insular cortex), supplementary motor area (SMA), left superior and left inferior frontal gyrus and the right middle cingulum, as well as the bilateral anterior middle cingulum. The authors did not run the reverse contrast, so no information is available about the brain regions activated by the classification task. However, they further reported that both positive and negative aesthetic experiences activated the fusiform face area (FFA), with the ambiguous artworks eliciting a negative aesthetic experience leading to more pronounced activation than the ambiguous artworks eliciting a positive aesthetic experience. These findings suggest that the same neural substrates subtend both positive/beauty and negative/ugly aesthetic experiences, but with different patterns, so that the pattern of neural activity predicts the category of stimuli or objects (Boccia et al., 2015).

A fifth (not chronological) fMRI study examined the differences in brain activation during beauty/aesthetic judgment and brightness judgment of simultaneously presented paintings (Ishizu & Zeki, 2013). This study demonstrated that the PMC, SMA, dorsolateral PFC and intraparietal sulcus were activated by both perceptual and aesthetic judgments. However, as compared to brightness judgments aesthetic judgments produced greater activation in the medial and lateral subdivisions of OFC, SMA, inferior and superior frontal gyrus, left anterior insula, and in the subcortical regions that are associated with affective motor planning, such as globus pallidus, putamen, thalamus, amygdala, and cerebellar vermis. Based on these findings, Ishizu and Zeki (2013) proposed a hypothetical scheme to illustrate the separation between brain systems involved in perceptual or cognitive judgment and those involved in affective or aesthetic judgment (Fig. 2). According to this scheme, there are two pathways for perceptual and aesthetic judgments in the brain. There are also functional specializations in both the non-motor pathways and the motor pathways, with aesthetic judgment recruiting cortical systems not recruited by perceptual judgment, in addition to those recruited by both kinds of judgments.

Fig. 2

Ishizu and Zeki’s hypothetical neural scheme for aesthetic and perceptual judgments of paintings. The system to the left (anterior insula, dlPFC, and IPS) is involved in both brightness (perceptual–cognitive) and beauty (affective–aesthetic) judgments, whereas that to the right (mOFC and iOFC) is involved in aesthetic judgment only. The two motor pathways involved in both kinds of judgments (PMC and SMA) are shown to the left, and the motor structures involved in affective judgment alone (basal ganglia and cerebellar vermis) are shown to the right (after Ishizu & Zeki, 2013)

We advocate the notion of functional specialization as well as the shared brain systems; however, we disagree with how the neural scheme has been interpreted, which makes it problematic in a number of ways. One major problem is that the proposed neural scheme appears internally contradictory. According to this scheme (Fig. 2), brain system ‘A’ is responsible for affective-aesthetic functions, brain system ‘B’ is responsible for perceptual-cognitive functions, and brain system ‘B’ is shared by both brightness and aesthetic judgments; thus brightness judgment involves cognitive functions but aesthetic judgment does not! From this view, it appears that aesthetic judgment is a purely affective process and perceptual judgment is a cognitive process, and that the aesthetic process operates independently of the cognitive process, a view similar to the so-called cognition-emotion independence hypothesis (Zajonc, 1980, 1984, 2000) that has been severely criticized and rejected by other researchers (e.g., Lazarus, 1982, 1984, 1991; Phelps, 2004; Storbeck & Clore, 2007). This type of view is unrealistic and contradicts the well-established models of visual aesthetics and the mounting body of evidence that aesthetic judgment involves cognitive functions (see Cattaneo et al., 2014; Cela-Conde et al., 2013; Chatterjee & Vartanian, 2014, 2016; Cupchik et al., 2009; Ferrari et al., 2015; Lengger et al., 2007; Redies, 2015; Ridderinkhof et al., 2004). Thus, departing from Ishizu and Zeki’s (2013) view, we propose that the brain regions which are involved in basic cognition underlying brightness perception are also involved in affective cognition underlying aesthetic appreciation, with the latter function involving additional brain regions for emotional processing (Fig. 2). Indeed, the authors’ view builds on the Kantian philosophical belief that aesthetic judgment is highly subjective but cognitive judgment is not (Ishizu & Zeki, 2013). 
Contrary to this view, we argue that cognitive judgment can also be subjective, as the cognitive schema is shaped by past experience and by cultural or contextual influence (as exemplified earlier), and that aesthetic judgment can in fact involve a higher level of cognitive operations than the perceptual judgment of brightness (see Cela-Conde et al., 2013).

A second problem is that Ishizu and Zeki’s (2013) neural scheme fails to clarify why some brain areas are shared while others are not, and how the shared brain areas interact with those that are specialized for aesthetic judgment. Here, we propose two plausible reasons for which brightness judgment and aesthetic judgment might share some areas of the brain. First, those areas might be actually specialized for brightness perception and they are also recruited during aesthetic appraisal because an initial perceptual analysis might be necessary prior to aesthetic processing – an idea consistent with how the second analytic channel of our dual-channel aesthetic model works (Fig. 1). Second, those brain areas might have been genetically programmed not only for basic cognitive functions but for affective cognitive functions as well, and during aesthetic appraisal they might interact with other brain areas that are specialized for eliciting aesthetic emotion. In support of their involvement in affective cognitive functions, research has suggested that the anterior insula plays an important role in making choices (Ernst & Paulus, 2005; Sanfey et al., 2003) and in cognitive–affective integration (Gu et al., 2012). An fMRI study on the emotional aspect of aesthetic appreciation suggested that the dorsolateral PFC plays a crucial role in aesthetic appreciation related to executive functions in general, and to orienting and sustaining attention in particular, and that the pattern of activity observed in this and related frontal regions might constitute a signature of an aesthetic response (Vessel et al., 2012). It has been further suggested that aesthetic judgments (which are cognitive) and aesthetic emotions are interactive (Armstrong & Detweiler-Bedell, 2008; Dio & Gallese, 2009; Yeh et al., 2015; Zeki et al., 2014), and that beauty is best thought of as an exhilarating emotional experience (Armstrong & Detweiler-Bedell, 2008).

A third and final problem of Ishizu and Zeki’s (2013) neural scheme is that it is limited to explaining brightness judgments and aesthetic judgments of paintings, and fails to give a general account for how visual aesthetic judgments and basic visual perceptual judgments are executed. However, the dual-channel analytic model we propose in this review (Fig. 1) is free from such a limitation. Because our dual-channel model is not specific to a stimulus property it can explain basic visual perceptual judgments and visual aesthetic judgments in general. According to this model, aesthetic judgments likely recruit the same emotion-related centers of the brain irrespective of stimulus parameter; however, as mentioned before, the brain regions recruited in basic perceptual judgments may vary across sensory modalities and across stimulus parameters (e.g., brightness, smoothness, sharpness, symmetry). A detailed discussion of stimulus parameter-induced activation of brain regions is beyond the scope of this review.

Beauty-dependent activity of brain regions

Evidence for beauty-dependent activity of brain regions comes from a number of neuroimaging studies (Table 2). For example, an MEG study demonstrated that the left dorsolateral PFC exhibited stronger activation when participants perceived beautiful stimuli (natural or artistic) rather than non-beautiful stimuli (Cela-Conde et al., 2004). Using a variety of artistic visual stimuli, such as portraits, landscapes and still lifes (beautiful, neutral and ugly), an fMRI study demonstrated that aesthetic appreciation of different categories of paintings was associated with distinct and specialized visual areas of the brain, and that the modulation of activity within the same areas correlated with the judgment of a painting as being beautiful or not (Kawabata & Zeki, 2004). The same study further demonstrated that, regardless of painting type, the OFC, anterior cingulate gyrus (BA 32), and the left parietal cortex (BA 39) were activated more by the perception of beautiful than neutral stimuli, the medial OFC was activated more by the perception of beautiful than ugly stimuli, whereas the motor cortex was mobilized more by the perception of ugly than beautiful stimuli. In a second fMRI study, participants were presented with pictures of paintings while fMRI data were acquired. The results generally showed that, as compared to both ugly and neutral paintings as well as their combination, beautiful paintings produced stronger activation in the medial OFC and left caudate nucleus (Ishizu & Zeki, 2011). On the other hand, as compared to beautiful paintings, ugly paintings produced stronger activation in the amygdala, right fusiform gyrus, left inferior occipital gyrus, left superior medial gyrus, left postcentral gyrus, and left somatomotor cortex (Ishizu & Zeki, 2011). The beauty-dependent brain activation was also corroborated by Zeki et al. (2014) in another fMRI study conducted on the appreciation of mathematical formulae or equations. 
This study showed that, compared to the mathematical equations rated as ‘neutral’, the mathematical equations rated as ‘beautiful’ produced greater activations in the medial OFC, the left angular gyrus and the left superior temporal gyrus. Similarly, the mathematical equations rated as ‘beautiful’ produced greater activation in the medial OFC as compared to those rated as ‘ugly’.

A fourth fMRI study demonstrated that compared with neutral stimuli beautiful stimuli produced stronger activations in the left middle frontal gyrus, left angular gyrus, cingulate cortex, left precuneus, and left medial OFC (Martín-Loeches et al., 2014). Though beautiful stimuli and ugly stimuli produced similar activations in the medial OFC as well as in the posterior and medial portions of the cingulate gyrus, this study reported beauty-dependent activations in other areas. Specifically, the left caudate/nucleus accumbens (NAcc), the left ACC and SMA showed stronger activations for beautiful faces or bodies compared to ugly faces or bodies, whereas basal occipital areas displayed an inverse pattern of activations. However, in contrast to the beautiful or ugly stimuli, the neutral stimuli elicited stronger and wider activations in the somatosensory and somatomotor systems (Martín-Loeches et al., 2014), the regions that are thought to be responsible for basic perception.

The fMRI study of Jacobs, Renken, and Cornelissen (2012b) which lends support to the task-dependent activity of brain regions (Table 1) also provides evidence in support of beauty-dependent activity. As shown in Table 2, this study demonstrated that BA18/19, middle occipital gyrus and fusiform gyrus were more strongly activated by the most beautiful than by the least beautiful textures, with the neutral textures showing no activations. The study further demonstrated task-stimulus interactions, in the frontomedian cortex and the amygdala, which were qualitatively different for the regions responding to the main effect of judgment task when compared to the regions responding to the main effect of beauty level (Jacobs, Renken, & Cornelissen, 2012b). The regions responding to the main effect of judgment were more responsive to beauty level during beauty judgments, and the differences were particularly pronounced for the beautiful stimuli. On the other hand, the regions responding to the effect of beauty level appeared rather to be less responsive to the ugly stimuli during beauty judgment than during other judgments.

Finally, a very interesting fMRI study was conducted by Bohrn and colleagues in which participants read a number of proverbs without explicitly evaluating them (Bohrn et al., 2013). In a post-scan reading each participant rated the beauty of each proverb. The authors correlated BOLD activity with individual post-scan beauty ratings, reporting some important findings. For example, post-scan beauty ratings showed a parametric modulation of the BOLD activation in the right caudate nucleus extending to putamen (and at a more lenient threshold also in the left ventral striatum), suggesting that the more rewarding a proverb was during initial reading, the more aesthetically pleasing or beautiful it was judged in a post-scan reading. A similar parametric effect of post-scan beauty ratings on BOLD activation was also recorded in the anterior rostral part of the medial frontal cortex associated with ACC. This region of the medial frontal cortex is thought to be functionally connected to the amygdala, OFC, insula and hippocampus, and is generally involved in affective tasks, such as valence ratings, emotional Stroop tasks, or mood induction (Bush et al., 2000).

An inspection of the findings listed in Tables 1 and 2 reveals some important aspects of human aesthetics and perceptual processing in relation to neural substrates. First, it appears that the perception of both beauty and ugliness of an object or stimulus recruited the same emotion-related subcortical and cortical regions that were highly consistent across studies (Table 2), whereas the perception of neutral stimuli (Table 2) or basic perception (Table 1) recruited brain regions entirely different from the regions recruited by beauty or ugliness perception, or aesthetic perception in general, and these sites of activation varied widely across studies (Tables 1 and 2). This wide discrepancy among the activation sites for neutral stimulus perception or basic perception might reflect methodological differences and stimulus differences across studies. This further indicates, consistent with our dual-channel model, that the process of basic perception or neutral stimulus perception recruits neural substrates mostly depending on the physical distinguishing features of an object or stimulus, whereas aesthetic perception recruits neural substrates depending more on the (affective) quality or richness than simply on the physical identifying properties of an object or stimulus. Second, it further appears that beauty perception produced stronger activation in some emotion-related sites (e.g., OFC, caudate nucleus/NAcc, cingulate cortex; Ishizu & Zeki, 2011; Kawabata & Zeki, 2004; Martín-Loeches et al., 2014; Zeki et al., 2014; Table 2), whereas ugliness perception produced stronger activation in some other emotion-related sites (e.g., amygdala; Ishizu & Zeki, 2011; Table 2). This indicates that beauty perception and ugliness perception differentially modulate the neural activity in the same emotion-related brain regions; however, these regions are clearly distinct from the regions responsible for basic perception. 
Third, some studies showed that in addition to the emotion-related regions, aesthetic perception recruited higher-order cortical regions but no sensory regions (Boccia et al., 2015; Bohrn et al., 2013; Tables 1 and 2), indicating that aesthetic perception might occur directly without operating the basic perceptual process, and this lends support to the first analytic channel of our dual-channel aesthetic model (Fig. 1). However, other studies demonstrated that aesthetic perception recruited sensory regions, such as basal occipital regions and occipital gyrus, in addition to emotion-related subcortical and cortical regions, particularly when participants were involved in aesthetic judgment (Ishizu & Zeki, 2011; Jacobs, Renken, & Cornelissen, 2012b; Martín-Loeches et al., 2014; Tables 1 and 2). This indicates that assigning participants an aesthetic judgment task might lead to basic perceptual processing in an initial analysis prior to the operation of aesthetic processing in the second stage, and this lends support to the second analytic channel of our dual-channel aesthetic model (Fig. 1). Taking all these aspects of prior studies together, we conclude that the aesthetic process and the basic perceptual process proceed by relatively separate neural channels; the former process always recruits emotion-related brain regions essentially for non-perceptual processing, such as affect and decision making, but the latter process does not. Similarly, the latter process always recruits sensory and other relevant brain regions for basic perceptual processing, which in some cases, but not necessarily all, may also be an initial stage of aesthetic processing.

Nonvisual aesthetics versus nonvisual perception

Beauty lies not only in the visual modality but in the nonvisual modalities of the beholder as well (Barry, 2014; Groyecka-Bernard et al., 2017; Joy & Sherry Jr., 2003; Lauwrens, 2019; Roberts, 2022; Scheller et al., 2021). As outlined earlier in this review, many of our everyday decisions are based on a combination of both visual and nonvisual sensory experiences. Research has shown that nonvisual means might be even more important than visual means for evaluating certain products (e.g., audition for a vacuum cleaner and touch for a computer mouse; Schifferstein, 2006). Therefore, we briefly discuss in this section how human aesthetics operates in the tactile, auditory, and other nonvisual modalities, and how it differs from nonvisual basic perception. To this end, our discussion mainly focuses on the mechanisms of nonvisual aesthetics, and the extent to which the proposed model for visual aesthetics can be generalized to aesthetics in nonvisual modalities.

Aesthetic perception versus basic perception in tactile modality

Perception and appreciation of tactile objects

Touch is a fundamental means of perceiving and appreciating the nonvisual world, which comprises tangible arts and objects (see Barry, 2014; Lauwrens, 2019; Roberts, 2022). It is the first sense to develop and perhaps the second most important sensory modality humans tend to rely on. Every day, we experience a wide range of sensations through touch, from the feeling of clothing against our skin to the feeling of tactile vibrations from electronic devices like cell phones. Salem et al. (2009) suggested two or three distinct aesthetic experiences people may have with such interactive objects or products: (1) the aesthetics of perception, the degree to which all our senses are gratified; (2) the aesthetics of cognition, the meaning we attach to the product; or (3) the aesthetics of action, the way we feel comfortable, satisfied, or pleased through bodily action. Our aesthetic appraisal of interactive objects or products emphasizes the comfort or pleasantness of tactile contact and makes a difference to how we feel in our clothes, and how we enjoy using electronic devices and other interactive objects or products. This affective aspect of touch is distinct from the discriminative aspect of touch, which refers to the basic perceptual attributes of tactile stimulation, linked to quantifiable, physical features of the stimuli or objects (Essick et al., 2010; Pasqualotto et al., 2020). Thus, we perceive basic tactile attributes through discriminative touch and appreciate their quality or richness through affective touch, and this is accomplished by the activation of skin receptors innervated by different types of nerve fibers or afferents. Human skin is innervated with two major types of tactile afferents: A-beta (Ab) afferents and C-tactile (CT) afferents (Ackerley et al., 2014).
Ab afferents, which exist in the glabrous (nonhairy) skin, such as palm skin (Ackerley et al., 2014; McGlone et al., 2007), are involved in discriminative touch, a kind of touch used to identify or discriminate physical properties of an object, such as form, texture, shape, and size (McGlone & Reilly, 2010). On the other hand, CT afferents, which are exclusively found in hairy skin sites, such as the face or arm (Ackerley et al., 2014; Johansson et al., 1988; McGlone et al., 2007; Nordin, 1990; Vallbo et al., 1993; Vallbo et al., 1999; Yu et al., 2019), are responsible for sensing affective or pleasant aspects of touch (Cerritelli et al., 2017; Etzi et al., 2014; Liljencrantz & Olausson, 2014; Löken et al., 2009). A recent study examined the relationship between stroking hardness and affective touch over palm and forearm skin sites by delivering affective tactile stimulation with brushes of four different hardness levels at three different forces (Yu et al., 2019). This study showed that light, soft stroking was rated as more pleasant than heavy, hard stroking. Moreover, in terms of perceived pleasantness, the hairy skin of the forearm was more susceptible to stroking hardness than the glabrous skin of the palm.

Both the discriminative and affective aspects of touch are important in human life. Discriminative aspects of touch support object recognition and motor activities that are necessary for human survival, and affective aspects of touch allow the detection of hedonic environmental features that help maintain emotional wellbeing and homeostasis. The most significant affective touch in human life is social touch, a kind of touch that elicits intimate emotional responses (grooming, nurturing) in skin-to-skin contact and cements interpersonal bonds between individuals, such as parents and infants, close friends, and romantic partners (Kress et al., 2011; Morrison et al., 2010; Olausson et al., 2010). For example, a friend’s hug can give us comfort, a parent’s pat on the back can give us courage, and a lover’s kiss can excite us (Wijaya et al., 2020). These kinds of social touch play a powerful role in human life, with important physical and mental health benefits throughout the lifespan (Gentsch et al., 2015; Pasqualotto et al., 2020; van Erp & Toet, 2015). A recent study showed that participants receiving less tender physical contact with family members, partners, or close friends judged social touch as significantly less pleasant than participants who received more interpersonal touch in everyday life (Sailer & Ackerley, 2019). This suggests that experience plays a role in the development of social or affective touch, in a use-it-or-lose-it fashion. A detailed discussion of the social touch hypothesis is beyond the scope of this review. We intend here to show evidence from a fascinating line of research that affective touch, which is essential to sensing tactile quality or richness (referred to as tactile aesthetics), has a unique sensory system, and that it is neurally dissociated from discriminative touch, which is essential to perceiving basic tactile features.

Distinct neural processing of discriminative and affective/aesthetic touch

A growing number of neuroimaging studies in healthy participants as well as patients, summarized in Table 3 (the list is not exhaustive), have demonstrated that different brain regions underpin discriminative (basic perceptual) touch and affective (aesthetic) touch, and that the aforementioned two nerve fibers are responsible for detecting and transmitting these touch signals to different regions of the brain. Specifically, Ab afferents are responsible for detecting and transmitting discriminative touch signals to SI, whereas CT afferents are responsible for detecting and transmitting affective touch signals to (posterior) insular cortex, a brain region thought to process information related to emotions and interpersonal experiences (Craig, 2002, 2008; Kress et al., 2011; Morrison, 2016; Morrison, Löken, et al., 2011b; Olausson et al., 2002; Olausson et al., 2010). This is supported by studies in both healthy humans and patients lacking Ab or CT fibers. For example, an fMRI study showed that soft brush stroking on the forearm (CT stimulation) of a participant lacking Ab afferents produced activation in the insular region, but not in somatosensory areas SI and SII (Olausson et al., 2002). A second fMRI study in Ab-deafferented patients demonstrated that CT stimulation not only activated insular cortex but deactivated somatosensory cortex as well (Olausson et al., 2008). A third fMRI study, conducted in both healthy humans and a patient lacking Ab afferents, demonstrated that soft brush stimuli to the right forearm and thigh activated the contralateral (left) posterior insular cortex in both the healthy group and the patient (Bjornsdotter et al., 2009). The consistency in insular activation patterns across the patients lacking Ab fibers and the healthy participants supports the view that the identified organization reflects the central projection of CT fibers (Bjornsdotter et al., 2009).
A fourth fMRI study, by contrast, was conducted in healthy humans and patients with fewer CT afferents due to a genetic mutation (Morrison, Löken, et al., 2011b). This study reported three key findings. First, gentle, slow stroking on the forearm activated posterior insular cortex in the healthy group but not in the patient group. Second, the patients with fewer CT afferents perceived arm stroking as less pleasant than did the controls, indicating that the perception of the hedonic aspect of dynamic touch likely depends on CT afferent density. Third, individuals’ ratings of felt touch and seen touch followed closely similar patterns, which suggests that the appraisal of seen touch is anchored in one’s own (hedonic) perceptual experience. More interestingly, a fifth fMRI study demonstrated that when healthy participants viewed videos of other people’s arms being stroked at a pleasant speed, their posterior insular cortex was activated in the same way as when they had been stroked themselves (Morrison, Björnsdotter, & Olausson, 2011a). This similarity between felt touch and seen touch indicates that the role of insular cortex is not specific to the tactile modality but can be generalized to the visual modality as well. Research in the visual modality, as outlined before, also lends support to this proposition by showing that insular cortex is strongly activated during aesthetic judgment of visual stimuli (Boccia et al., 2015; Ishizu & Zeki, 2013).

Table 3 A summary of prior studies showing brain areas activated in healthy humans during discriminative and affective/aesthetic touch

In addition to insula activation, other emotional regions of the brain have been found to exhibit activation during affective touch. These particularly include two prefrontal regions, namely the OFC and the ACC, with which the amygdala, the core emotional center of the brain, is thought to have reciprocal interconnection (Compton, 2003; Fenske & Raymond, 2006; Pourtois et al., 2013; Vuilleumier et al., 2003). For example, some fMRI studies demonstrated that somatosensory areas, including SI and part of SII in the superior temporal plane, are activated more by a neutral or discriminative touch than by a pleasant or painful touch, whereas affective touch produced stronger activation in the OFC than a neutral or discriminative touch (Francis et al., 1999; Rolls et al., 2003). Moreover, pleasant touch and painful touch are represented in different parts of the OFC (Rolls et al., 2003). This indicates that pleasantness perception and pain perception differentially modulate neural activity in the OFC consistently with the modulation of neural activity by beauty perception and ugliness perception in the visual modality as discussed in a prior section (see Ishizu & Zeki, 2011; Kawabata & Zeki, 2004; Martín-Loeches et al., 2014; Zeki et al., 2014). A recent study asked healthy adults to rate intensity (basic perceptual aspect) and pleasantness (aesthetic aspect) of brushing, each aspect on a 100-point Visual Analogue Scale (VAS) during fMRI (Case et al., 2016). This study demonstrated that perceived intensity significantly predicted activation in contralateral SI, whereas perceived pleasantness predicted activation in the ACC, a finding consistent with the finding of Lindgren et al. (2012). The same study further demonstrated that ratings of intensity and pleasantness were inversely related within participants, indicating their processing independence. 
Moreover, when some of the participants were subjected to inhibitory rTMS over the right SI, their sensory discrimination was reduced, and the participants with reduced sensory discrimination rated touch as more intense; however, the perceived touch pleasantness was unaffected by rTMS. These findings support divergent neural processing of touch intensity and touch pleasantness, with affective touch encoded outside of SI.

A few other studies are worth noting, which demonstrated that pleasant touch can be mediated by both CT afferents and Ab afferents (Krämer et al., 2007; McGlone et al., 2012). However, pleasant touch from hairy skin, mediated by CT afferents, is thought to be innate (Pasqualotto et al., 2020), to be processed in limbic-related brain regions such as posterior insular cortex and mid-anterior OFC, and to represent an unlearned process. In contrast, pleasant touch from glabrous skin, mediated by Ab afferents, is thought to be learned (Pasqualotto et al., 2020), to be processed in the somatosensory or parietal cortex, and to represent an analytical process dependent on previous tactile experiences (Gordon et al., 2013; McGlone et al., 2012). Thus it has been suggested that pleasantness perception based on Ab input might be more dependent on experiential or contextual factors in a top–down manner than pleasantness perception based on CT input (Olausson et al., 2010; Pasqualotto et al., 2020). However, recent neuroimaging studies argued that the evidence for somatosensory activation during Ab-projected affective touch is insufficient and might contain confounds related to attention, motivation and stimulus properties, and they failed to replicate that activation (e.g., Case et al., 2016; Karim & Likova, 2018). For example, one recent fMRI study on tactile aesthetics asked healthy blindfolded-sighted adults to slowly and softly explore 3D tactile objects (geometric, irregular), taking one at a time in the palms of both hands together (Ab stimulation), and to judge their aesthetic pleasure (Karim & Likova, 2018). This study demonstrated that most of the reward networks established in visual experimental paradigms, such as the ventro-medial PFC, OFC, ACC, and NAcc, were activated during tactile aesthetic judgments (Table 3). This finding closely corresponds with the findings of most prior studies discussed above.
Thus the insufficient prior evidence for somatosensory activation during Ab-projected affective touch does not necessarily affirm that the somatosensory cortex is inherently responsible for processing affective tactile inputs; either those findings were confounded or they might simply reflect the predisposed somatosensory response tendency to Ab projected inputs, whether affective or discriminative, or at best, such activation might be an indication that the somatosensory cortex is involved in passing Ab-projected affective information on to the insular or other reward-related regions. This latter possibility is most likely as there are afferent and efferent connections between posterior insular regions and parietal somatosensory areas (Augustine, 1996). For example, tactile responses in posterior insular cortex may be influenced by discriminative processing in SI and SII, and somatosensory responses in SI and SII may be modulated by affective coding of tactile stimuli in insular cortex (Olausson et al., 2008).

The aforementioned findings lead us to contend that the affective aspects of touch that underlie tactile aesthetic appraisal and the discriminative aspects of touch that underlie basic tactile recognition are neurally dissociated. More generally, basic tactile recognition processing, which requires discriminative touch, and tactile aesthetic processing, which requires affective or emotional touch, are possibly mediated by distinct neural networks in both skin-to-skin and skin-to-object contact (Case et al., 2016; Karim & Likova, 2018; Morrison et al., 2010; Olausson et al., 2010). In support of this, one fMRI study demonstrated that gentle brush stroking to the arm (CT-mediated pleasant touch) activated a network of brain areas, including the right posterior superior temporal sulcus, medial prefrontal cortex, dorso-ACC and amygdala (Gordon et al., 2013). In contrast, some fMRI and PET studies of tactile roughness discrimination showed that discriminative touch (rubbing fingertips with gratings that differed in roughness) activated a network of brain areas in the lateral prefrontal cortex (Kitada et al., 2005) as well as in the parietal cortex, including SI/postcentral gyrus and SII/parietal operculum (Burton et al., 1997; Burton et al., 1999). A recent study demonstrated similar results for softness discrimination (Kitada et al., 2019). This study showed that activity in the parietal operculum, insula, and medial prefrontal cortex was positively associated with perceived softness magnitude, regardless of the applied force, and that the control regions of tactile softness perception are located in the parietal operculum/insula, postcentral gyrus, posterior parietal lobule, and middle occipital gyrus. Taking these findings together, we conclude that affective touch, the core of tactile aesthetics, and discriminative touch, the core of basic tactile perception, are highly dissociable at both the regional and network levels.

A similar dual-channel hierarchical model for tactile aesthetics

The findings discussed above provide compelling evidence for the existence of relatively separate neural networks for basic perceptual judgment and aesthetic (affective) judgment of tactilely felt stimuli, consistent with such judgments of visual stimuli. An inspection of the regions of activation in the visual and tactile modalities indicates that there are four brain regions, namely the OFC, cingulate cortex, NAcc and amygdala, shared by the aesthetic judgments of both visual and tactile objects. This sharing does not necessarily rule out the relative independence of the two sensory modalities. Indeed, the two sensory modalities involve many separate brain regions and they appear to share only a few: those that are responsible for or associated with value judgments, decisions and the elicitation of emotional responses (Conway & Rehding, 2013). On the question of neural modulation strategies in the tactile modality, we propose that a dual-channel model, similar to the dual-channel model of visual aesthetics (Fig. 1), can explain the aesthetic process and the recognition process, modulated by aesthetic cognition and basic cognition respectively, in both sighted and blind people with typically developed Ab and CT afferents. In this modality, the “aesthetics-only” channel (Fig. 1) likely operates following a restricted local processing style (see earlier) when perceptual recognition is not necessary or failure of recognition is obvious, such as when CT afferents are briefly and softly stimulated (affective stimulation).
The perception-to-aesthetics channel perhaps operates when both Ab afferents (responsible for mediating perceptual discrimination; McGlone & Reilly, 2010) and CT afferents (responsible for mediating affective appraisal; Cerritelli et al., 2017; Etzi et al., 2014; Löken et al., 2009) are concurrently stimulated, or even when only Ab afferents are stimulated, because they are responsible not only for perceptual discrimination but for sending input projections to the affective system as well (see above). However, in everyday life, we do not purposively stimulate our CT or Ab afferents; rather, we usually touch and hold objects in the palms of our hands and appraise their richness or quality (pleasantness, interestingness). The deployment of attention in this kind of daily tactile setting is perhaps different from the deployment of attention in a similar visual setting. That is, unlike visual attention, tactile attention perhaps cannot be directed immediately to the few attractive local features before the whole object or artwork, or all the parts that comprise it, has been explored and felt by touch. Thus, in contrast to our view about aesthetics in the visual modality, we propose that the analytic channels in the tactile modality might operate differently in our daily life: a local-to-global or, most likely, a global-to-local processing might initially be necessary to feel and notice the few attractive or unattractive local features, before the restricted local processing of those features operates in the second stage. A recent study demonstrated that global processing increased pleasantness ratings of a set of high-frequency tactile vibrations (Mirams et al., 2016), although this study did not assess the effect of restricted local processing.
We conclude that perhaps the second analytic channel (a two-stage perception-to-aesthetics channel) is more prevalent and more typical in the tactile modality than is the first analytic channel (a one-stage aesthetics-only channel) in daily natural setting, which likely operates under top-down and bottom-up controls (Carbon & Jakesch, 2013; McCabe et al., 2008; Mirams et al., 2016) not only for pure aesthetic experiences but for partial/semi aesthetic experiences as well in a fashion similar to how it operates in the visual modality (see earlier). However, recognition of a tactile object/art might not always be successful (due to the lack of prior experience or familiarity, cognitive mismatching) even after exploring the whole object or art through a global-to-local or a local-to-global analysis, but still this stage can be followed by the second stage, the stage of restricted local analysis through which the person may be able to make (although not necessarily due to subjective inability or cognitive mismatching in some cases) aesthetic preferences modulated by aesthetic cognition. This further indicates that tactile aesthetic preferences are independent of tactile recognition but not independent of tactile cognition (Mirams et al., 2016). This proposition is consistent with the perception-appreciation independence hypothesis that we have proposed for the visual modality in a prior section.

We contend that such a proposition might apply not only to the sighted but to the blind as well. Blind people, who have little or no sight, likely rely on the tactile modality the most. Touch is a principal means for them to experience and evaluate objects necessary in everyday life and to experience the beauty of interactive artworks in a gallery or museum (see Lauwrens, 2019). Yet perhaps they have never had, for example, a prior experience of tactile octagons. Now, if we present them with a mixed set of tactile octagons, some with sharp edges and some with smooth edges, perhaps they will prefer, like the sighted (Bar & Neta, 2006, 2007; Guthrie & Wiener, 1966; Silvia & Barona, 2009), the octagons with smooth edges over those with sharp edges, although they are likely unable to recognize the objects fully. This example lends further support to the “aesthetics-only” channel, which states that aesthetic appreciation can operate even without object recognition. Does this imply that cognitive processes will not operate in the blind while they make choices among the tactile octagons and irregular tactile shapes or appreciate their quality? Our answer is no; we propose that because aesthetic appraisal is a higher-order function, the brain areas involved in this process will make choices within the cognitive system irrespective of shape recognition capacity, and irrespective of blindness or visual experience. Evidence in support of this idea is scant, as very little research on tactile aesthetics has been conducted with this special population thus far.

One rare exception that supports the above view is a recent fMRI study by Karim and Likova (2018) who used a set of 3D irregular tactile shapes (symmetric and asymmetric versions) and a set of 3D geometric tactile shapes (sharp and curved versions) in blind and blindfolded-sighted participants, asking them to detect symmetry or sharpness of each shape and judge its aesthetic pleasure through touch. The behavioral data of this study showed that the blind population performed better than the sighted population in symmetry detection but not in sharpness detection. However, both the populations judged symmetric 3D tactile shapes as significantly more pleasing than asymmetric 3D tactile shapes, and curved 3D geometric tactile shapes as significantly more pleasing than sharp 3D geometric tactile shapes. These findings suggest two propositions about aesthetic appreciation. First, viewed from the evolutionary perspective sharp tactile shapes and asymmetric tactile shapes appear to be less adaptive and less aesthetic (see Verpooten, 2018). Second, aesthetic appreciation of tactile shapes might be independent of visual experience. Research in congenitally blind and sighted participants demonstrated that similar to tactile aesthetic appreciation (basic) tactile shape representation also can be independent of visual experience (Peelen et al., 2014). However, there might be experience-dependent differences in the use of a global or local processing strategy, as well as in the recruitment of cortical resources in basic tactile perception and tactile appreciation. In support of this, research in tactile perceptual discrimination has shown that when required to name compound Braille letters sighted people employ both a global and a local analysis, whereas blind people, especially congenitally blind people, likely employ a local analysis more than a global analysis (Heller & Clyburn, 1993). 
Consistent with this, in another study investigating a haptic drawing task, totally blind children tended to process information locally more often than blind children with minimal light perception (Puspitawati et al., 2014), although this has not been assessed in tactile aesthetics. Second, the neural substrates underlying information processing in the blind might differ from those in the sighted to a certain degree, due to reorganization of the functional architecture of the cortex resulting from visual deprivation compensated by heightened tactile experience. This is supported by the fMRI data of the same study by Karim and Likova (2018), who further examined the brain areas involved in aesthetic judgment and sharpness or symmetry judgment of the aforementioned 3D tactile objects in blind and blindfolded-sighted participants, and demonstrated that both populations commonly recruited the somatosensory and motor areas of the brain, but with stronger activations in the blind than in the sighted. In addition, sighted people recruited more frontal regions (ventro-medial PFC, OFC, and ACC), including the NAcc in the basal forebrain, whereas blind people paradoxically recruited more classic “visual” areas of the brain. These differences were more pronounced between the sighted and the congenitally blind than between the sighted and the late blind, indicating the key influence of the onset time of visual deprivation on the activation level and functional reorganization of various brain regions. As the perceptual and aesthetic tasks used in that study involved touch and hand movements, the recruitment of the somatosensory and motor areas by both populations was to be expected. The involvement of the somatosensory and motor networks in tactile manipulation (Kägi et al., 2010) and tactile perceptual processing, and the involvement of the frontal regions and basal forebrain structure in visual cognitive and/or aesthetic processing, are well known (see above).
Therefore, based on their findings, Karim and Likova (2018) theorized that the role of the somatosensory and motor networks in tactile perceptual processing might be universal and independent of visual experience, and that the visual deprivation-driven reorganization of the visual cortex might serve the blind in cognitive and/or aesthetic processing—a role that is generally performed by the frontal regions and the basal forebrain structure in typically developed sighted humans. However, it merits further investigation to see whether the tactile aesthetic brain networks and tactile perceptual brain networks are dissociable in the blind population.

Visual-tactile dissociation in aesthetic and basic perception

The visual and tactile modalities are thought to be inherently capable of representing reverse versions of the same stimuli or objects (Amedi et al., 2007; Auvray et al., 2007; Kim & Zatorre, 2008, 2010). If so, how are they neurally dissociated in aesthetic and basic perceptual processing? To answer this question, the brain regions discussed thus far as underpinnings of aesthetic perception and basic perception in the visual and tactile modalities in healthy sighted humans (see the relevant sections) are summarized and integrated in Fig. 3 (but these are not exhaustive). A comparison of the two sensory modalities shown in this figure indicates that aesthetic perception of visual objects and that of tactile objects share five regions, namely the OFC, cingulate cortex, NAcc, amygdala, and the PFC, a region that also is shared with basic perception in the tactile modality only. The basic perception of visual objects and that of tactile objects share the parietal or somatosensory cortices. Perhaps these shared brain regions work as hubs for potential crossmodal connectivity or crossmodal effects in sensory perception (Hansson et al., 2009; see Karim, Proulx, et al., 2021b; Siuda-Krzywicka et al., 2016; Zangaladze et al., 1999). All other brain regions depicted in this figure are specialized for certain sensory functions, some for visual and some for tactile, indicating that the two sensory modalities are functionally largely dissociated.

Now, a functional comparison of the brain regions within each of the two sensory modalities indicates that aesthetic perception and basic perception of visual objects share the fusiform gyrus, frontal gyrus and the somatomotor cortex, and aesthetic perception and basic perception of tactile objects share only the PFC, indicating a within-modality functional dissociation of brain regions. The involvement of shared brain regions in aesthetic perception and basic perception can be explained by the proposed dual-channel aesthetic model, which posits that aesthetic processing can operate concurrently or immediately after perceptual recognition operating in a preceding stage (Fig. 1). All other brain regions within each of the two sensory modalities are specialized for certain functions, some for aesthetic perception and some for basic perception. This indicates that the two functions are neurally dissociated to a large extent in both the visual and tactile modalities.

An inspection of the brain regions in this figure further shows that the regions shared by visual aesthetic perception and tactile aesthetic perception are mostly the emotion-related brain centers. This suggests that these regions are perhaps commonly responsible for evaluating the aesthetic quality or richness of arts or objects irrespective of sensory modality (Blood & Zatorre, 2001; de Araujo et al., 2003; de Araujo et al., 2005; Grabenhorst et al., 2008; Ishizu & Zeki, 2011; Koelsch, 2010; Rolls et al., 2010; Small & Prescott, 2005) and irrespective of stimulus parameter. Here, it can be argued that sharing emotion-related brain centers for a functional purpose, such as to induce aesthetic emotions, does not deny the fact that aesthetic emotions are different from basic or everyday emotions (see earlier in this review). However, as noted earlier, the reason for such sharing might be that aesthetic emotions are built out of basic emotions (Xenakis et al., 2012).

Fig. 3

Functional organization of brain regions involved in aesthetic and basic perception in visual and tactile modalities of healthy sighted humans (summarized and integrated from different studies as outlined in the text, but these are not exhaustive)

Aesthetic perception versus basic perception in auditory modality

Perception and appreciation of auditory stimuli

Auditory perception, the perception of sound, is a complex process (Baldwin, 2012). The quality of our life is greatly influenced by the quality of the sounds or acoustics that we are exposed to every day. As an inevitable and integral part of our life, acoustic quality influences us not only while we listen to music or watch a film, but throughout daily life. Good acoustics have such essential consequences as feelings of warmth and contentment. They ensure better speech intelligibility, mainly during conversations that involve multiple people, other sound sources (e.g., television) or hearing difficulties. An acoustically well-designed working environment boosts productivity by reducing stress and anxiety. As outlined earlier in this review, acoustic quality also plays an important role in the design aesthetics of interactive technology and in the overall perception and evaluation of such interactive products as cars and cell phones (Mahlke et al., 2007).

Auditory basic perception, the ability to perceive or detect the difference between sounds, is largely influenced by three major physical properties of a sound, namely loudness (amplitude), pitch (frequency), and timbre (sound quality). Auditory aesthetics, the perceived pleasantness of a sound or sweetness of a voice, is also mediated by these factors. For example, a very loud sound or voice is perceived as unpleasant to listen to, whereas a very soft sound is perceived as pleasant. A high-pitched voice is perceived as less pleasant to listen to than a low- or medium-pitched voice (Collins & Missing, 2003). Timbre, the third property of sound, allows the auditory system to distinguish between different types of sound production, such as choir voices and musical instruments. Moreover, the loudness and pitch of a sound can also interact in a complex manner to mediate both our auditory aesthetic perception and auditory basic perception (e.g., higher tones are perceived as higher with increasing amplitude; Baldwin, 2012). However, as explained below, the influence of the physical properties of a sound on perceived auditory aesthetics, such as music aesthetics, does not deny the proposition that auditory aesthetic perception and auditory basic perception are neurally dissociable.

Applicability of the proposed dual-channel visual aesthetic model to auditory modality

Research has suggested that humans are likely to use two cognitive styles, global and local, to process auditory information, similar to the styles used in visual and tactile information processing (Bouvet et al., 2011). This leads us to contend that a dual-channel model, similar to the dual-channel model of visual aesthetics (Fig. 1), can also explain the aesthetic process and the recognition process in the auditory modality. Whether the aesthetic perception of a musical tone or melody involves the aesthetics-only channel or the perception-to-aesthetics channel depends on the listener's expertise and capacity to understand its physical composition. We propose that, as in the visual modality, the aesthetics-only channel of the proposed dual-channel aesthetic model may involve both cognitive and affective processes (Brattico & Pearce, 2013) and operate following a restricted local processing style (see earlier) in the auditory modality when someone is listening to and evaluating foreign music, such as when a Bangladeshi listener evaluates a Hindi melody that she/he does not understand but whose pleasantness or unpleasantness she/he can feel and appreciate. A second and more general example is the FIFA World Cup theme song “La La La (I dare you)” or “Waka Waka (This Time for Africa)” by the famous Colombian pop star Shakira, which moved people of all cultures, races, religions, and languages, including those who do not understand Spanish or the semantic meaning of either anthem but can feel its hedonic appeal. This indicates that auditory aesthetic perception does not necessarily depend on auditory basic perception, the perception of the physical features of a musical tone or melody. This leads us to propose two conjectures. First, perhaps some music or melodies are intrinsically pleasant and beautiful enough to move the human mind regardless of language, culture, race, religion, or complexion.
Perhaps music has culture-general cues that allow listeners to identify emotions in music from other cultures fairly accurately, although culture-specific features of emotion in music cannot be denied (Swaminathan & Schellenberg, 2015). Second, the human brain is probably genetically and universally programmed to appreciate certain types of harmonized sounds as aesthetically pleasant and others as aesthetically unpleasant. Although perceptual or semantic comprehension of those sounds is not a necessary precondition for such appreciation, it does require attention and thought restricted to certain pleasant/unpleasant aspects of the music or melody, processed under the control of higher-order brain centers. However, this proposition is speculative and merits further investigation, and it does not necessarily rule out the impact that sociocultural factors (including language) might have on the enjoyment and appreciation of music or a melody (Juslin, 2013).

Conversely, the two-stage perception-to-aesthetics channel likely operates in an individual who has the expertise and capacity to understand and recognize the physical composition of music or a melody. In support of this, a review by Zatorre and Salimpoor (2013) presented findings from cognitive neuroscience showing that musical pleasure emerges from the perception of sound patterns, and that this process likely operates in two interactive stages. The first stage involves perceptual analysis, in which auditory cortical circuits encode and store the tonal patterns of music, and the cortical loops between auditory and frontal cortices play important roles in maintaining musical information in working memory and in recognizing structural regularities (e.g., tempo, tonality, pitch range, timbre, rhythm) in musical patterns (Chanda & Levitin, 2013). The second stage involves analysis of aesthetic and emotional quality (happy, sad, peaceful), which requires cognitive and affective interpretations (Brattico & Pearce, 2013). In this stage, the mesolimbic striatal system, a dopaminergic system, is involved in mediating the pleasure associated with music; here, the reward value of music is coded by activity levels in the NAcc, whose functional connectivity with auditory and frontal areas increases as a function of increasing musical reward. Thus the authors suggested that musically induced pleasure arises from interactions between auditory cortical networks responsible for auditory perception and mesolimbic networks responsible for reward and valuation. We propose that these interactions might be crucial for the acoustic information (intensity, pitch, and timbre) and structural regularities extracted from perceptual analysis to be integrated into memory and compared against previous musical aesthetic experiences. The proposition of interactions between the two sorts of networks has been assessed and confirmed in a subsequent empirical study (Martínez-Molina et al., 2016).
However, these interactions do not rule out their processing independence, as indicated by another study demonstrating that pleasant melodies were predominantly processed in the left hemisphere and unpleasant melodies predominantly in the right hemisphere, with no apparent lateralization in descriptive (tonal and atonal) judgments of a melody (Gagnon & Peretz, 2000). Thus, as in the aesthetic appreciation of visual art, the aesthetic appreciation of musical art is dissociable from its structural evaluation.

Although the aforementioned studies do not tell us anything about auditory information processing strategies, we speculate that, in line with the two-stage perception-to-aesthetics channel of our dual-channel model for visual aesthetics (Fig. 1), the first stage in musical information processing likely operates using a local or global processing style for perceptual analysis and the second stage using a restricted local processing style for quality or richness analysis. Suppose you are asked to listen to a piece of classical music and rate its aesthetic quality (pleasantness). It is likely that you will perceive all or some of its contents, such as tonal patterns, pitches, melodies, harmonies, rhythm, texture, structural regularities, and expression, perhaps using a local or a global processing style, depending on your musical expertise, cultural and personal meaning, personality, as well as your current mood and emotions (Brattico et al., 2017; Brattico & Pearce, 2013; Pereira et al., 2011; Skov & Nadal, 2021; Van den Bosch et al., 2013). For example, if you have high musical expertise, you are likely to be more analytic (Susini et al., 2020) and prefer a local instead of a global processing style (Black et al., 2017; Ouimet et al., 2012; Stoesz et al., 2007). If you are a novice, you are likely to be less analytic and prefer a global instead of a local processing style. However, in either case, your attention is likely to center on certain aspects or elements of the music (e.g., expression, novelty in piece or composition, melodic originality; Juslin, 2013; Juslin & Isaksson, 2014) that appear pleasant or unpleasant to you, and this restricted local analysis is likely to play a crucial role in your appraisal of the music's quality or richness. The preference for information processing style may also vary across cultures, for instance when appreciating an erotic song. The appreciation of an erotic song may involve perceptual analysis followed by aesthetic valuation.
We propose that, compared to individuals in a radical Western society, individuals in a Muslim or any other conservative society might be more analytic about the contents of such a song and prefer a local-to-global instead of a global-to-local processing style in the perceptual analysis stage. In the aesthetic valuation stage, they are likely to be more sensitive and devote more attention to the erotic or sexy tones and words/lyrics (restricted local processing) that might induce aesthetically negative emotions in them; however, this might not necessarily be the case for individuals in a radical Western society. Similarly, compared to a novice, a person who has had sad experiences in a prior love relationship might be perceptually more analytic of a sad song, prefer a local-to-global instead of a global-to-local processing style for perceptual analysis, and concurrently enjoy and appreciate the song by being moved and devoting more attention to its sad aspects, such as sad-sounding tones and lyrics (restricted local processing) that evoke similar sad events s/he experienced in the past relationship. Because of its sad-sounding tones, lyrics, and themes, that song is able to engage the person emotionally, as if someone is saying exactly what s/he is feeling inside. A sad song is a powerful trigger for nostalgic memories of bygone times. It can be suggested that those memories carry special meaning for that person and contribute to the enjoyment of that song, similar to the enjoyment of a sad film (Hanich et al., 2014), by inducing feelings of being moved (Vuoskoski & Eerola, 2017).

The effect of sad experience on the enjoyment or appreciation of a sad song does not necessarily imply that a novice does not enjoy or appreciate a sad song at all. The extent to which someone will enjoy and appreciate a sad song depends on the felt sadness, which can be mediated by her/his personality, current moods, emotions, etc. (Brattico & Pearce, 2013; Pereira et al., 2011; Skov & Nadal, 2021; Van den Bosch et al., 2013). In support of this, research has shown that individuals who score high on empathy (Garrido & Schubert, 2011), openness-to-experience (Ladinig & Schellenberg, 2012; Vuoskoski et al., 2012), introversion (Ladinig & Schellenberg, 2012), or absorption (Garrido & Schubert, 2011, 2013) are likely to enjoy sad-sounding music. It has been further demonstrated that individuals with clinical depression may be especially likely to listen to music that expresses negative valence because it matches their chronic mood state (Wilhelm et al., 2013). Consistently, when individuals are in a sad mood, they are likely to exhibit mood congruency effects: increased liking for sad-sounding music, and increased perceptions of sadness in music that is selected to sound neutral (Hunter et al., 2011). Here, we propose that a person who is governed by any of these factors is likely to use a local-to-global instead of a global-to-local processing style in the perceptual analysis stage and restricted local processing in the aesthetic valuation stage. In the perceptual analysis stage, a global-to-local processing style might not be feasible (although not completely unlikely; see above) because the listener cannot be exposed to the whole song at once. Thus, while listening to the song, s/he is more likely to perceive and evaluate it locally, feature by feature, moving toward the global, rather than the other way round after listening to the whole song.

As discussed above, the proposed dual-channel model for visual aesthetics can account for aesthetic experience of a song with wonderful/catchy (or awful/monotonous) melodies and of a sad-sounding song. An outstanding question is: How does the model explain aesthetic judgment of a song with awful/monotonous melodies but great lyrics or a song with wonderful/catchy melodies but bad lyrics? We contend that the aesthetic judgment of such a partially aesthetic song is probably made in a fashion similar to how the aesthetic judgment of a partially aesthetic art/object is done in the visual modality following the principles of “aesthetics-only” channel or “perception-to-aesthetics” channel (see earlier). However, in such a melodic versus lyrical dilemma, the cognitive agent is likely to make an aesthetic preference depending on the resultant impact of the two opposites on elicitation of aesthetic emotions, or by devaluing the song through an active search for negative aspects (approach-reduction or avoidance-increment strategy), or by overvaluing the song through an active search for positive aspects (avoidance-reduction or approach-increment strategy). The devaluation or overvaluation of the song is possibly determined by the cognitive agent’s self-interest or desire. Thus some individuals who enjoy the song might overlook the (bad) quality of lyrics, giving more importance to the (wonderful/catchy) quality of melodies, whereas other individuals who reject the song might overlook the (wonderful/catchy) quality of melodies and give more importance to the (bad) quality of lyrics. Similarly, there might be individuals who enjoy the song by giving more importance to the (great) quality of lyrics and overlooking the (awful/monotonous) quality of melodies, whereas other individuals might reject the song by giving more importance to the (awful/monotonous) quality of melodies and overlooking the (great) quality of lyrics.

Aesthetic perception versus basic perception in other nonvisual modalities

As mentioned earlier, human aesthetics lies in any sensory modality. Yet, since the inception of neuroaesthetic research in the early 2000s, the focus has mainly been on visual aesthetics and to some extent on tactile and auditory aesthetics. Aesthetics in the gustatory and olfactory modalities has not attracted as much neuroscientific study as aesthetics in other sensory modalities. Thus far, only a few studies have attempted to explore how gustatory and olfactory information can contribute to the multimodal aesthetics of an object or product. Our bodily sensations cannot always be treated as isolated sensory experiences; they often are multisensory and take place in a specific context (Howes, 2005; Howes & Classen, 2014). For example, the taste of tea cannot be separated from its aroma and visual appearance. Moreover, gustatory and olfactory information can also account for additional variation in the “attractiveness premium,” which is unaccounted for by measuring visual attractiveness alone (Saxton et al., 2009). For example, although visual cues are considered strong predictors of overall attractiveness judgments (Sorokowski et al., 2013; Yu & Shepard, 1998), attractiveness is also influenced by someone’s nonvisual cues, such as voice (for reviews, see Hill & Puts, 2016; Pisanski, 2017) and smell (Roberts et al., 2011). These findings undeniably provide some valuable information about the role the gustatory and olfactory modalities play in our aesthetic experience, although they are too scanty to support the development of a rigorous model of aesthetics in these sensory modalities. Nevertheless, based on the limited data available, we discuss below the applicability of the dual-channel model for visual aesthetics to these nonvisual modalities as well.

Applicability of the proposed dual-channel visual aesthetic model to other nonvisual modalities

As discussed above, the proposed dual-channel model of visual aesthetics can be generalized to the tactile and auditory modalities to a certain extent, with some differences in the operation of a local or global processing style and in the underlying neural mechanisms. Now, a final question is: can this model also be generalized to the gustatory and olfactory modalities? This is an important question but a difficult one to answer, for two reasons. First, knowledge of shape, size, orientation, texture, and many other physical properties that underlie the aesthetics of an object or artwork (see earlier and above) is primarily acquired through both vision and touch, and knowledge of the acoustic/sound properties that underlie the aesthetics of music or a melody (see above) is acquired through audition, but none of these stimulus properties that we are exposed to every day can typically be sensed through gustation and olfaction. Second, the proposed dual-channel model of visual aesthetics (Fig. 1) mainly builds on global and local processing styles, the cognitive styles of information processing that have been widely assessed in vision (Kimchi, 1992; Kovács, 1996; Lewis et al., 2004; Love et al., 1999; Navon, 1977; Nayar et al., 2015; Neiworth et al., 2006) and to some extent in touch (Heller & Clyburn, 1993; Puspitawati et al., 2014) and audition (Ouimet et al., 2012; Putkinen et al., 2017; Sanders & Poeppel, 2007), but rarely in gustation and olfaction, the two unique, intertwined, and complex senses (de Araujo & Simon, 2009; Karunanayaka et al., 2015; Small & Green, 2012).

Therefore, we need to be cautious when generalizing our hierarchical model of visual aesthetics to the gustatory and olfactory modalities. We conjecture that the proposed dual-channel model for visual aesthetics can also be generalized to these nonvisual modalities to some extent, as research has suggested that humans are likely to use similar cognitive styles, global and local, across sensory modalities (Bouvet et al., 2011). Although this suggestion was based on findings in the visual and auditory modalities, a second study systematically examined global versus local processing across all major sensory modalities (visual, auditory, tactile, gustatory, and olfactory) and demonstrated crossmodal processing shifts not only between the visual, tactile, and auditory modalities, but between the visual and other nonvisual modalities as well (Förster, 2011). For example, in one set of experiments, Förster (2011) showed that global visual perception was enhanced when participants focused on the global composition of food or aromas, and local visual perception was enhanced when they focused on the ingredients of food or aromas. In a different set of experiments, the same study further demonstrated that visually priming the global/local processing styles carried over to the respective tasting and smelling tasks. More interestingly, a third study showed that when thinking about the distant future, people listen to, grasp, taste, and smell in a more global way compared with when they think about the proximal future (Förster & Becker, 2012), indicating a similar trend in cognitive style across sensory modalities. Thus, it is evident that the global and local processing styles likely operate not only in the visual, tactile, and auditory modalities but in the gustatory and olfactory modalities as well (Förster & Denzler, 2012). Because our dual-channel model of visual aesthetics (Fig. 1) builds on these cognitive styles, and because appraisal of aesthetic quality in a sensory modality requires deployment of restricted local attention, we conjecture that a similar dual-channel model can probably explain the basic perceptual and aesthetic experiences in the gustatory and olfactory modalities, although specific evidence is needed to draw firm conclusions about this.

Thus far, we have shown that the brain systems engaged to represent a sensory stimulus during aesthetic judgment in the visual, tactile, and auditory modalities are different from those engaged during recognition of the physical properties of the stimulus. This is also likely to be the case for the gustatory and olfactory modalities. Research has shown that selective attention to the pleasantness (hedonic/aesthetic aspect) of taste stimuli increases activation in the OFC and pregenual cingulate cortex, and selective attention to the intensity (basic perceptual aspect) of the same stimuli increases activation in the (anterior) insular taste cortex (Grabenhorst & Rolls, 2008, 2010). Similar effects have been recorded for olfactory stimuli, with selective attention to the pleasantness of odors increasing activation in the OFC and pregenual cingulate cortex, and attention to intensity increasing activation in the pyriform cortex and olfactory tubercle but not in the OFC and pregenual cingulate cortex (Rolls, Grabenhorst et al., 2008). On the basis of these findings, we conclude that, as in the other three major sensory modalities, aesthetic perception and basic perception in the gustatory and olfactory modalities are also neurally dissociated, and this lends further support to the proposed perception-appreciation independence model (Fig. 1).

Is there a single aesthetic faculty in the brain for different sensory experiences?

According to the proposed dual-channel model, an aesthetic cognition faculty is responsible for human aesthetics whereas a basic cognition faculty is responsible for basic sensory perception. Now, an outstanding question is: is there a single aesthetic cognition faculty in the brain for aesthetic appraisals of different sensory experiences? As discussed above, the proposed dual-channel model can explain aesthetics in all sensory modalities to a certain degree. However, the two analytic channels of the model may not necessarily comprise all the same brain regions across sensory modalities. Instead, they are very likely to comprise both shared and separate brain regions depending on the stimulated sensory modality, the neural projections or networks it has and the nature of inputs feeding the modality. Perhaps the major emotional or affective brain centers involved in aesthetic appreciation are shared by all sensory modalities and by all stimulus properties, whereas brain regions recruited for perceptual recognition are exclusively different across sensory modalities, and across stimulus parameters within a sensory modality. A number of empirical studies provide evidence in support of this. For example, one study has suggested that affective (pleasant and unpleasant) judgments recruit the same core network of OFC, the temporal pole and the superior frontal gyrus, regardless of sensory modality, and that this core network is activated in addition to a number of circuits that are specific to individual sensory modalities (Royet et al., 2000). A study in audition has shown that a network of limbic (e.g., amygdala, hippocampus) and paralimbic structures (e.g., OFC, NAcc, cingulate cortex, temporal poles, insula) are involved in the emotional processing of music (Blood & Zatorre, 2001; Koelsch, 2010). 
Ishizu and Zeki (2011) found that the medial OFC was activated when participants experienced the beauty of both music and paintings, suggesting a domain-general faculty of beauty. Consistently, studies in gustation (taste) and olfaction have shown robust representation of taste and odor in the insula, frontal operculum, amygdala, OFC, and the cingulate cortex (Anderson et al., 2003; de Araujo et al., 2003; de Araujo et al., 2005; Gottfried, 2010; Grabenhorst et al., 2008; Grabenhorst & Rolls, 2008, 2010; O’Doherty et al., 2001; Rolls et al., 2008; Rolls et al., 2010; Small & Prescott, 2005). As discussed above and earlier in this review, there is strong evidence that the major emotional centers, such as insular cortex, ACC, and OFC also are involved in aesthetic processing in the visual and tactile modalities. This striking convergence between the studies in different sensory modalities suggests that the core emotional centers are modality-independent and can be activated not only by the experience of visual beauty but by other sources of beauty as well. This also leads us to speculate that there might be a common faculty (aesthetic cognition faculty) in the human brain for beauty or aesthetics in all sensory modalities; however, other parts of the brain are also activated because of processing basic sensory elements before aesthetic processing according to the second analytic channel (perception-to-aesthetics) of our dual-channel model of human aesthetics.

Conclusions

This integrative review rearticulates the notion of human aesthetics by critically appraising the conventional definitions, offering a new, more comprehensive definition, and identifying the fundamental components associated with it. Then, it addresses the recent advances and opportunities in human aesthetic research and explains a number of important but unresolved issues in the current literature. By distilling the literature on visual aesthetics and basic feature perception in this modality, we propose a novel local-global integrative model that comprises two analytic channels: an aesthetics-only channel and a perception-to-aesthetics channel. The aesthetics-only channel primarily involves restricted local processing for quality or richness analysis, whereas the perception-to-aesthetics channel involves global/extended local processing for basic feature analysis, followed by restricted local processing for quality or richness analysis. This dual-channel aesthetic model differs from all other models of aesthetics on a number of grounds. First, unlike the other models for visual and nonvisual modalities (see earlier in this review), this model considers aesthetic perception as a psychological process that is both behaviorally and neurally dissociable from the basic perceptual process. Second, this model is the first to take the local and global information processing strategies into account to explain aesthetic appreciation. According to this model, human aesthetics can be explained primarily in terms of restricted local processing, an information processing strategy differentiated from the so-called (extended) local and global processing. We argue that, unlike global and (extended) local processing, restricted local processing is exclusively associated with the affective part and greater attentional resources of the top-down route (Conway & Rehding, 2013; Pool et al., 2016).
Third, this model explains the cognitive and affective processes underlying aesthetics from a novel perspective that can apply to both visual and nonvisual modalities. According to this model, irrespective of sensory modality, the process of aesthetic appreciation operates independently of basic perception, but not independently of cognition (Baltissen & Ostermann, 1998; Mirams et al., 2016); perhaps aesthetic appreciation is modulated by aesthetic cognition and basic perception is modulated by basic cognition. These two cognition faculties are different at the functional level though not exclusively at the neural level; they may share the neural substrates of the same brain regions that might be involved in modulating the cognitive components in both basic perceptual processing and aesthetic processing (Ishizu & Zeki, 2013). However, the brain regions modulating these cognitive functions are not necessarily universal; rather, they vary across sensory modalities and across the properties of sensory inputs. Finally, this model can account not only for simple and pure aesthetic experiences (e.g., the beauty of a rose) but for partial (e.g., a song with a catchy melody but bad lyrics) and complex (e.g., a sad song, a scary movie, or a horror film) aesthetic experiences as well. Thus, the model presented in this review can be considered a unique generalized model that can explain aesthetics in all sensory modalities to a large extent. The propositions of this model remain to be tested in future research directed toward advancing this emerging field.