Introduction

Generally speaking, aesthetics is the study of beauty as well as its opposite, ugliness. Some philosophers regard artistic experience, or the arts, as the essence of aesthetics, but most aesthetic philosophers see the discipline as encompassing beauty and ugliness as a whole. The term “aesthetics” was first used in 1750 by Alexander Baumgarten (1714–1762) to refer to a science of sensory perception focused on beauty in particular. Philosophers, however, have discussed beauty for thousands of years; sources from both ancient Greece and ancient China comment on “good” and “bad” music.

Just as painting and drawing are fundamental parts of a culture, music plays a similar role, and it has been ubiquitous in human culture since ancient times. But what makes music so special to us? Music plays a vital role throughout our lives: it evokes emotions and is routinely used to regulate moods. All of these observations point to music’s powerful influence, whether explicit or implicit; a huge amount of money would not be spent on music if this were not true.

The human behavior of listening to and playing music has existed since prehistoric times. Music is often valued for the emotions it generates, and listening to music can boost mood and increase well-being; there is presumably a reason why people listen to music every day. Yet because music is abstract and subjective, it is difficult to quantify what sustains our engagement with it. In recent years, advances in neuroimaging have fueled empirical studies of what makes music so enjoyable. It is our belief that musical pleasure results from interactions between the sensory, cognitive, and emotional systems, as well as reinforcement circuits. Moreover, music listening recruits large-scale brain networks in regions of the cerebral cortex, subcortical areas, and cerebellum that are involved in audition, motor imagery, planning, and emotion.

However, our aesthetic responses to music depend on many “internal factors”, such as internal state, mood, personality, and attitude. They also depend on the physical and social environment, the “external context”: whether the listener is alone or in a group, or sitting in a concert hall, for example.

Music aesthetics can be studied from two perspectives: the psychological level of well-being and enjoyment, and its physiological and neurological correlates. The two perspectives are conceptually distinct but inextricably bound, with the neurological level providing the ultimate explanation for the aesthetic emotion of enjoyment. In the following sections, we dive deeper into the “neuroaesthetics of music” and explore why music is so appealing to us.

A Historical Overview of Neuroaesthetics of Music

Since the commencement of experimental psychology, the question of how music provides an aesthetic experience has been studied using scientific approaches. To understand the profound and extraordinarily radical change brought about by the new discipline of the neuroaesthetics of music, it is important to recall some of the beginnings of music aesthetics research.

Hermann von Helmholtz (1821–1894) (von Helmholtz, 1863), inspired by Eduard Hanslick (1825–1904) (Hanslick, 1891), initiated a new phase of empirical studies on musical perception by associating the aesthetic characteristics of musical notes and scales with their psychoacoustic features (particularly frequency ratios between partials of complex tones). In addition, Wilhelm Wundt (1832–1920), a pioneer of experimental psychology and a former assistant of Helmholtz in Heidelberg, applied a psychological approach to aesthetics, reflecting on how his own feelings of pleasure, excitement, and exuberance changed with the tempo of a metronome, for example. Wundt (Wundt, 1912) also showed that physiological arousal is related to stimulus complexity, and that aesthetic pleasure is highest at intermediate levels of complexity.

Music psychologists of the first half of the twentieth century, such as Farnsworth, Hevner, Schoen, and Seashore, studied the affective response to music. A wide range of data showed that music can evoke deep and complex affective responses, but direct causal relationships between musical variables and specific affective responses were difficult to establish, and such studies had very low predictive power. Indeed, the data revealed that affective responses to music were often highly idiosyncratic to the individual listener and their musical enculturation, as well as to the particular listening or testing conditions: subjects brought their own mood states or arousal needs to the listening situation, for instance.

It was during the second half of the twentieth century that music research made significant breakthroughs in aesthetic theory. New developments in theories of music and meaning were shaped by two highly influential proposals put forward by Leonard Meyer (Meyer, 1957) and Daniel Berlyne (Berlyne, 1957) between the 1950s and the 1970s.

Meyer (1918–2007) proposed a model of how music evokes meaningful affective responses in the listener or performer. Drawing on Dewey’s conflict theory of emotion, in which emotional responses result from inhibited responses, Meyer argued that compositional schemes build expectations that are continuously delayed, inhibited, and resolved, weaving a complex sound architecture of anticipation, tension, and resolution. It is through this interplay that suspense evokes arousal, particularly during the tension phase (a temporary inhibition of expectations), together with a quest for resolution that is ultimately satisfied within the musical structure.

Daniel Ellis Berlyne (1924–1976) developed this idea into an inverted U-shaped function linking a stimulus’s “arousal potential” with its “hedonic value”, such that intermediate degrees of arousal correspond to maximum pleasure. In his new experimental aesthetics, he attempted to identify how stimulus properties (such as complexity, familiarity, novelty, and uncertainty) influence aspects of the aesthetic experience such as arousal, pleasure, and interestingness.
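
Purely as an illustrative sketch (our notation, not Berlyne’s own formulation), the inverted U can be written as a concave function of arousal potential that peaks at an intermediate optimum:

    \[ \text{hedonic value}(a) \;\approx\; h_{\max} - k\,(a - a^{*})^{2}, \qquad k > 0, \]

where \(a\) is the stimulus’s arousal potential (jointly driven by properties such as complexity, novelty, and uncertainty), \(a^{*}\) is the intermediate level that yields maximum pleasure, and \(h_{\max}\) and \(k\) are free parameters. Hedonic value falls off on either side of \(a^{*}\), whether the stimulus is too simple and familiar or too complex and unpredictable.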

From these early explorations of aesthetics, neuro-musical research has exploded in recent decades, with a relative flood of publications growing out of pioneering efforts that began in the 1800s.

Music aesthetics underwent its most fundamental change over the last 25 years of the twentieth century, when brain imaging techniques made it possible to study the human brain in vivo as it performs complex cognitive functions. Through the use of positron emission tomography (PET), functional magnetic resonance imaging (fMRI), and electroencephalography (EEG), a new field of scientific inquiry, cognitive neuroscience, has been established. These technologies provided unprecedented access to the brain basis of human cognition and music aesthetics. As a result of brain imaging techniques, research directions have changed dramatically, and the cognitive science of music embedded in music aesthetics has virtually been reinvented. With advances in brain imaging technology, scientists can now map and track connections and directional influences between brain regions while people listen to music, rather than relying on static topographical maps or liking-and-disliking questionnaires.

From Neuroscience to Neuroaesthetics of Music

Why do we enjoy music so much? People appreciate music largely for its aesthetic qualities: the emotions it evokes, the memories it brings, and its beauty per se. Listening to or performing music, like engagement with other aesthetic domains such as visual art, architecture, or dance, produces aesthetic experiences that also include emotional responses and evaluative judgments of beauty, aesthetic quality, and liking when combined with a favorable environment and listening situation. Many argue that neuroscience has nothing to tell us about aesthetic questions. Nevertheless, since the beginning of experimental psychology, the question of how music provides an aesthetic experience has been studied using scientific approaches.

Box 1: What Do We Mean by Aesthetic Experience of Music?

We define an aesthetic musical experience as one in which the individual immerses herself in the music, devoting her attention to perceptual, cognitive, and emotive interpretation based on the formal qualities of the perceptual experience. We identify three key outcomes: first, emotion perception and induction (e.g., “this song is sad”); second, aesthetic evaluation (e.g., “this song is lovely”); and third, liking (e.g., “I like this song”) and preference (e.g., “I adore rock & roll”). Not all of these outcomes need be present at the same time, but they usually combine to make up a genuine aesthetic situation. An aesthetic experience does not even require the presence of a music-induced feeling.

Neuroscience has its roots in physiology, and its ultimate goal is to explain the neural basis of human behavior. There are multiple reasons why scientists study music aesthetics through the lens of neuroscience. First, musicians provide a useful model for neuroplasticity in the brain because their performance involves online integration of sensory and motor information: neural changes (measurable non-invasively with sophisticated brain imaging tools) occur after intensive and extended exposure to a specific stimulus environment. Second, music, a complex multidimensional stimulus that is not equally familiar to all humans, makes it possible to distinguish the effects of training or exposure from innate predispositions, since musical expertise is easier to delineate than language expertise; music’s weaker semantic level further helps to separate these influences. Last but not least, music has always existed in human societies (as evidenced by prehistoric musical instruments discovered by archaeologists), making it unique among human entertainment activities.

Several studies have been conducted on the neural bases of pitch, timbre, and rhythm perception. Combined research on the neuroscience of musical appreciation and production may then scientifically account for how sounds are processed in the central nervous system, and how they are organized, comprehended, and “felt” by each individual as the coherent unity known as music. To put it another way, we could add scientific justification to the anecdotal evidence and brilliant philosophical arguments that attest to the unifying and expressive power of music, which likely evolved over the long course of human evolution.

Having said that, a question arises: where should we begin in order to understand music aesthetics from a neuroscientific perspective? To answer this question, we could say that to better understand musical appreciation, we must understand the neural basis of sound perception and the neural underpinnings of musical emotions. In the following sections, we examine in detail different aspects of music aesthetics and the human brain.

Emotions in Musical Experience

The neuroscience of music has benefited from a surge of neuroscientific interest in affective processes. However, research has focused on the most common emotions that people experience in everyday life, such as happiness and sadness, and their role in mood regulation. Accordingly, musical emotions are usually categorized either as “basic emotions” (universal emotional experiences, recognized across cultures and required for species survival) or within general dimensional models of emotion. It has nonetheless been proposed that music induces feelings that are qualitatively different from goal-oriented, everyday emotions, although there has been little neuroscientific study of such aesthetic emotions to date.

Basic Emotions: Work on the categorical perception of facial emotion has sparked considerable interest in music and emotion studies. This approach highlights basic emotions such as happiness, sorrow, anger, fear, and disgust, which are proposed to be recognized across all cultures and associated with intrinsic motor and physiological responses. Music can convey and induce these basic feelings in people of all ages, including infants, and across many cultures, although negative emotions lose some of their unpleasant quality in a safe aesthetic context.

Brain Regions and the Processing of Basic Emotions in Music: The amygdala is crucial in the processing of significant negative emotions, particularly fear, caused by unpleasant stimuli, and it appears to be an important brain region for fear perception and recognition in music. How do we know? Patients with a medial temporal ablation that included the amygdala, as well as a patient with bilateral amygdala injury, misinterpreted scary music as tranquil music while displaying intact perceptual skills. In addition, considerable evidence shows that the amygdala is activated by sad and inharmonious music (compared to emotionally neutral and consonant music, respectively), and even by single random chords. Sad emotions linked with slow minor-key classical piano music engaged the left medial frontal cortex and the adjacent superior frontal gyrus when compared to happy, fast major-key pieces. The ventral striatum and left superior temporal gyrus are also candidate regions for music associated with positive emotions (for example, happiness). The latter region is thought to be part of the non-primary auditory cortex, which integrates sounds across longer time spans than the primary auditory cortex and hence processes more abstract features of sounds.

Dimensional Models of Emotion: Dimensional models try to find a set of dimensions that can represent all conceivable emotional states; the dimensional structure should, in theory, be rich enough to represent the basic emotions as points in space. The circumplex model, which treats valence (pleasure–displeasure) and arousal (activating–relaxing) as two orthogonal dimensions of emotional experience, is the most widely recognized dimensional model of emotion, and many behavioral and neuroscientific investigations have used this approach to study music. A variant of this approach includes the three dimensions of valence (pleasant–unpleasant), arousal (awake–tired), and tension (tense–relaxed).
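
As a minimal computational sketch of this “points in space” idea (the coordinates below are rough, illustrative placements on the circumplex, not empirically measured values), each emotional state can be stored as a valence–arousal pair and compared by simple geometric distance:

    from math import hypot

    # Illustrative valence-arousal coordinates on a -1..1 scale (toy values, not data):
    # valence = pleasure-displeasure, arousal = activating-relaxing.
    circumplex = {
        "happiness": (0.8, 0.5),
        "sadness": (-0.6, -0.4),
        "fear": (-0.7, 0.7),
        "calm": (0.5, -0.6),
    }

    def emotion_distance(a: str, b: str) -> float:
        """Euclidean distance between two emotional states in the valence-arousal plane."""
        (v1, r1), (v2, r2) = circumplex[a], circumplex[b]
        return hypot(v1 - v2, r1 - r2)

    # In this toy layout, "happiness" lies closer to "calm" than to "fear".
    print(emotion_distance("happiness", "calm"), emotion_distance("happiness", "fear"))

A third coordinate for tension could be added in the same way to capture the three-dimensional variant mentioned above.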

This model has also been applied to music. For instance, loudness and tempo were found to increase arousal and tension ratings, while loudness and pitch height increased pleasantness. fMRI studies have shown that, while people listen to classical music, changes in arousal and valence are reflected in changes of activation in the reward and limbic systems (including the striatum, ventral tegmental area, and orbitofrontal cortex for valence, and the ventromedial prefrontal cortex and subgenual cingulate for arousal), along with additional effects in brain areas related to memory, motor control, and self-reflective processes.

In simple terms, when one listens to energetic music, the surge in electrodermal activity, which is generated by the sympathetic autonomic nervous system, is greater than when one listens to relaxing music.

It should be noted that listening to music might also trigger other kinds of emotions, so-called aesthetic emotions, such as awe, nostalgia, and enjoyment. You may have experienced, for instance, how listening to a piece of music from your past evokes autobiographical memories and a kind of nostalgia. Janata (2009) used excerpts by artists with a long history of working in the pop/rock genre. Individual judgments of autobiographical relevance and changes in brain metabolism revealed that the dorsal areas of the medial prefrontal cortex are critical for experiencing music-induced nostalgia.

In another fMRI study, Bogert et al. (2016) manipulated the visual instructions accompanying 4-s music clips, asking participants either to attend to the number of instruments playing (implicit condition) or to explicitly categorize the emotions conveyed by the music (explicit condition). The implicit condition, contrasted with the explicit one, involved bilateral activation of the inferior parietal lobule, premotor cortex, and reward-related areas such as the caudate (dorsal striatum) and ventromedial frontal cortex. During the explicit listening mode, in which the musical emotions expressed in the clips were judged, dorsomedial prefrontal and occipital areas previously associated with emotion recognition were active (Fig. 1).

Fig. 1 Brain regions involved in different aspects of music (perception, production, and imagery), shown in superior, sagittal, and other views, with the relevant areas highlighted. Reproduced with permission from Pando-Naude et al. (2021)

Now, how do we measure the aesthetic emotion of enjoyment in music neuroscience? Most of the time, it has been examined by focusing on the chill response. Chills, also known as thrills or frisson, correspond to physiological changes such as goose bumps and shivers down the spine, and they are possibly the most well-studied aesthetic experience of music. Although not everyone gets goosebumps when listening to or playing music, those who do tend to experience them regularly. Chills have the added benefit of eliciting physiological signals such as changes in heart rate, breathing depth, and skin conductance, in addition to being easy to track behaviorally.
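
As a toy illustration of how such a signal might be tracked computationally (a hypothetical sketch on synthetic data; the sampling rate, prominence threshold, and simulated response are assumptions, and this is not a validated chill-detection pipeline), transient rises in skin conductance can be located as peaks and then aligned with listeners’ real-time chill reports:

    import numpy as np
    from scipy.signal import find_peaks

    # Hypothetical skin-conductance trace sampled at 32 Hz (synthetic data for illustration).
    fs = 32
    t = np.arange(0, 60, 1 / fs)
    scl = 2.0 + 0.05 * np.random.randn(t.size)            # baseline tonic level plus noise
    scl[t > 30] += 0.6 * np.exp(-(t[t > 30] - 31) ** 2)   # one simulated response around 31 s

    # Phasic peaks above an assumed prominence threshold stand in for candidate "chill" events;
    # in practice these would be cross-checked against button-press reports from the listener.
    peaks, _ = find_peaks(scl, prominence=0.3, distance=fs)
    print("candidate chill events at (s):", t[peaks])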

How Does Music Generate Emotions?

The relationship between music and emotion is a much-debated subject that has been at the heart of musical aesthetics since the late eighteenth century. In eighteenth- and nineteenth-century Europe, the advent of instrumental music and, later, program music sparked a debate between referentialists such as Hegel (1770–1831) and Wagner (1813–1883) and formalists such as Hanslick (1825–1904) and Stravinsky (1882–1971). The discussion focused on whether music has referential content, that is, whether musical patterns can indicate nonmusical elements such as physical objects, individuals, or feelings. This led Hanslick to argue that music’s aesthetic function is not to elicit emotion. In other words, music cannot convey distinct sentiments (which have objects) because it cannot represent the thoughts that underlie these feelings; it can represent dynamic variations in the intensity of such feelings, but not as attributes of specific emotions, because other phenomena share such dynamic changes as well. Hanslick accordingly claimed that music’s aesthetic qualities are unique to music.

According to Juslin and Västfjäll (2008), emotion perception, in which a listener detects or recognizes emotions represented in music, is distinguished from emotion induction, in which music generates an emotion in the listener. First, the induction of arousal by sudden, harsh, discordant, or fast pulsating sounds is mediated via brainstem responses (originating from structures such as the inferior colliculus). Second, through evaluative conditioning, music can elicit emotion by being associated, as a conditioned stimulus, with an unpleasant or rewarding experience. Third, musical patterns can cause emotional contagion by imitating other forms of emotional expression, including language, posture, and movement. Fourth, music can evoke emotions by engaging structures in the sensorium that have close external referents, resulting in visual imagery (e.g., a storm). Fifth, music can elicit emotion by triggering an episodic memory associated with it (“Darling, they’re playing our tune”). Finally, creating and breaking expectations can lead to feelings of tension, release, and surprise.

Concisely, then, the psychological mechanisms most specific to musical aesthetic perception, as well as those most investigated by neuroscientists, are brainstem responses (such as those evoked by dissonance), emotional contagion or imitation, and expectancy. In addition to these structural issues, neurochemical research has identified the impact of neurotransmitters on the affective aspects of music listening. Combining psychophysical, neurochemical, and hemodynamic measures may uncover peaks in autonomic nervous system activity, which can explain music’s mood-enhancing effects. Using ligand-based positron emission tomography with the radioligand raclopride, which binds to dopamine receptors, researchers have found that listening to highly pleasurable music triggers the release of dopamine in the mesolimbic striatal system as well as in sensory areas for auditory reception.

Areas Involved in High-Level Temporal Sequencing

The ability to appreciate music requires recognizing patterns based on structural information, a process that is continuously updated, refined, and revised as new information comes in. The frontal cortices of the brain, such as the inferior frontal gyrus (IFG), are typically hotspots of this operation. The superior temporal gyrus (STG) and the IFG may be co-activated in order to process various aspects of music in a collaborative manner. Interestingly, it has also been demonstrated that white matter connectivity in this pathway contributes to the ability to learn syntactic structures in the auditory domain. How do we know all this? The logic behind these claims is that the loss of a function is evidence of its neural basis: for instance, people with congenital amusia, who show deficits in music perception, have been found to have disrupted STG–IFG pathways.

Moreover, part of music processing is carried out in the superior temporal cortex (STC), which houses both primary and secondary auditory areas. This region is also where pitch extraction and tonal relationships are processed, and it accumulates sound templates over the course of our lives. For instance, when transcranial magnetic stimulation (TMS) is applied to the STC, musical hallucinations can be elicited, and increased activity in this area is linked to imagery and familiarity with music.

Sensory Dissonance

Sensory consonance and dissonance, which have long been used by composers and performers in Western and non-Western cultures to shape aesthetic responses to music, have received the greatest attention in studies of the neuroaesthetics of music. Dissonant sounds are experienced by the listener as beating amplitude modulation or roughness: the two tones excite overlapping regions of the basilar membrane and neighboring hair cells, causing neurons in the cochlear nucleus and brainstem to fire without adequately resolving the two sounds. The signal drives neurons in the primary auditory cortex at the beat frequency, resulting in higher neuronal activity than for consonant sounds. These sensory responses to dissonant sounds are accompanied by an affective feeling of irritation, which appears to have a neurological basis in the parahippocampal gyrus, a brain region linked to withdrawal behavior, and the amygdala, which is linked to salience and negative affect.
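
The beating phenomenon itself follows from elementary acoustics: two pure tones of nearby frequencies \(f_1\) and \(f_2\) sum to a tone at their mean frequency whose amplitude is modulated at their difference frequency,

    \[ \sin(2\pi f_1 t) + \sin(2\pi f_2 t) \;=\; 2\cos\!\left(2\pi\,\tfrac{f_1 - f_2}{2}\,t\right)\sin\!\left(2\pi\,\tfrac{f_1 + f_2}{2}\,t\right), \]

so the envelope waxes and wanes \(|f_1 - f_2|\) times per second. When this difference is small, it is heard as slow beating; at somewhat larger separations (roughly a few tens of hertz), it is heard as the roughness characteristic of sensory dissonance, because the two tones fall within the same critical band and remain unresolved on the basilar membrane.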

Trost et al. (2012) found a clear lateralization of parahippocampal gyrus activity while participants listened to emotional classical music: highly arousing music activated the left parahippocampal gyrus, whereas tender and nostalgic music with low arousal activated the right parahippocampal gyrus. Consonance, on the other hand, is commonly regarded as the mere absence of dissonance, although other researchers believe it involves a more active process engaging reward regions in the brainstem and ventral striatum.

Box 2: Dissonance in music is defined as discordant sound or a lack of harmony. When the two notes shown on the accompanying staff are performed at the same time, the result is a dissonant sound.

Musical pleasure is associated with patterns of tension and resolution that arise from the confirmation and violation of perceptual expectations of which we are normally unaware. In other words, the music’s prior context creates expectations, or predictions, in the listener’s brain about what will happen next: the pitch of the next note in a melody, the next chord in a harmonic progression, or the timing of the next note, for example. In an experiment by Steinbeis and colleagues (2006), unexpected chords produced more physiological arousal than expected chords, as measured by skin conductance.
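
The chapter does not commit to a particular formalism, but one common way such unexpectedness is quantified in computational models of musical expectation (an illustrative assumption here, not a claim about the Steinbeis study) is as the surprisal, or information content, of each event given its preceding context:

    \[ \mathrm{IC}(e_i) \;=\; -\log_2 \, p\!\left(e_i \mid e_1, \ldots, e_{i-1}\right), \]

so an improbable continuation, such as an unexpected chord, carries high information content, and high-surprisal events are the ones expected to generate the stronger tension and physiological arousal described above.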

Aesthetic experience without aesthetic judgment may be possible in some circumstances. According to some findings, regions of the prefrontal cortex, notably the dorsolateral prefrontal and orbitofrontal cortices, are active when we do make an aesthetic judgment, such as deciding that an object is beautiful. Numerous neuroimaging studies of music listening support the orbitofrontal cortex’s role in the pleasant emotional experiences connected to aesthetic evaluations of musical preference or beauty. Moreover, in contrast to dissonant chords evaluated as ugly, consonant chords assessed as beautiful activate the dorsomedial midbrain nuclei, which are part of the brain’s dopaminergic reward circuit (Suzuki et al., 2008); this holds regardless of whether the chords are in a major or minor key.

Musical Preference and Aesthetic Experience

Preference, which differs from enjoyment or subjective pleasure in that it involves making a choice about the stimulus as a whole, is another significant outcome of the musical aesthetic experience. Such a choice can persist for a very long time, and preference is usually formed only after a piece has been heard in its entirety. The choice may be based on the degree of enjoyment, on an aesthetic evaluation of the stimulus’s beauty or other formal qualities (although it may also differ from and be independent of such an evaluation), and on other intrapersonal considerations such as the listener’s past habits, current mood, or personality. For instance, the more we are moved by music, the more we prefer it: Schubert (2007) showed that inducing either a positive or a negative feeling through music predicts preference. Another predictor of preference was identified by Vuoskoski and Eerola (2011), who found that when listening to sad music, people with high trait empathy tend to appreciate it more and experience greater sadness than those with low trait empathy.

Musical preference appears to activate lateralized brain networks. In a groundbreaking electroencephalography (EEG) study, Altenmüller and colleagues (2002) found left-lateralized frontotemporal activations when listeners liked 15-s classical, pop, or jazz excerpts, whereas disliked excerpts produced right-lateralized anterior responses (neutral music produced bilateral brain responses). In a subsequent EEG and fMRI study, liked 30-s excerpts by Bach and Mahler similarly activated left-hemispheric regions, including Heschl’s gyrus, the middle temporal gyrus, and the cuneus, whereas disliked excerpts by a contemporary composer produced brain responses in the bilateral inferior frontal gyrus and insula.

Many experiments emphasize the role of familiarity in musical liking and preference, showing that enjoyment and related liking judgments increase with increasing exposure (the mere exposure effect). Pereira et al. (2011) report activation of limbic and paralimbic areas, including the nucleus accumbens, for familiar music (contrasted with unfamiliar music), but only minimal activation when contrasting liked musical pieces with disliked ones regardless of familiarity, demonstrating the close relationship between familiarity and hedonic musical experiences. These results imply that familiarity is one of the major influences on the brain’s emotional and hedonic reactions.

Another factor that is central to musical aesthetic experience is attention. In other words, the listener must focus on the music to appreciate the emotions and memories evoked by it, to judge whether it is attractive or well performed, and to determine its aesthetic value. The superior parietal lobule, the precuneus, and other parietal structures associated with the ventral network are involved in stimulus-driven attention.

Musical Expectancies

In order to appreciate tension and release, a listener must perceive musical relations within a hierarchy of tonal stability. For example, moving away from a tonal center to unstable chords (or keys) is perceived as tensing, while returning to the stable tonal center is perceived as relaxing. Tension can also be produced by dissonance and by tones (or chords) that are harmonically unrelated to the musical context. In fact, the appreciation of tonal music depends heavily on the interplay between expectations as they unfold over time and how they are fulfilled or violated.

Do Musicians and Non-Musicians Respond Differently to Musical Sounds?

Meta-analyses demonstrate an enlarged volume of gray matter in the temporal lobe, for primary and non-primary auditory regions, in music experts with skillful listening abilities. Additionally, neurophysiological measurements indicate that musicians’ neuronal assemblies respond to sounds, and to deviations in sounds, faster and more strongly than non-musicians’. Skillful listening requires cognitive mastery, a crucial stage of information processing that produces judgments and emotions based on knowledge and understanding. Professional musicians, for instance, use different cognitive strategies from non-professionals: as a result of training and practice, they develop auxiliary mental representations of music and recruit more complex neuronal networks, including activation of the left hemisphere attributed to the recruitment of inner speech through automatically naming pitches and harmonies.

Shortcomings and Gaps in the Study of the Neuroaesthetics of Music

Although music neuroscience is now considered an independent sub-discipline of cognitive neuroscience, music neuroaesthetics is still in its infancy. A survey of the articles published in this field shows that many studies have examined the impact of musical competence on the brain and on perceptual and cognitive skills, but only a few have looked at aesthetic or emotive judgments. Even within music psychology, researchers tend to avoid investigating aesthetic reactions to music and instead focus on more mundane factors such as “preference”. It should be noted that we consider preference to be an important and necessary, but not sufficient, component of the aesthetic experience of music.

Epilogue

Although there is a great deal of evidence on the neuronal mechanisms underlying music aesthetics, ambiguities remain about how these systems function in the brain. At this stage, however, the reader should be convinced of the significance of the brain’s neuronal mechanisms for our enjoyment of music and for other aesthetic aspects of music. With respect to experimental work in the neuroaesthetics of music, more research is also required to determine in greater depth where, why, and how such aesthetic experiments should be conducted.

Lastly, it should be noted that this chapter on the neuroaesthetics of music does not intend to suggest that the brain networks discussed here are the only ones involved in musical aesthetics. Rather, it aims to sketch a schematic picture for new researchers who come from fields other than neuroscience or psychology and want to become familiar with studies in music neuroaesthetics. We believe this chapter can help such newcomers learn more about musical aesthetics and its underlying brain mechanisms.