Music information visualization and classical composers discovery: an application of network graphs, multidimensional scaling, and support vector machines

This article illustrates different information visualization techniques applied to a database of classical composers and visualizes both the macrocosm of the Common Practice Period and the microcosms of twentieth century classical music. It uses data on personal (composer-to-composer) musical influences to generate and analyze network graphs. Data on style influences and composers ‘ecological’ data are then combined to composer-to-composer musical influences to build a similarity/distance matrix, and a multidimensional scaling analysis is used to locate the relative position of composers on a map while preserving the pairwise distances. Finally, a support-vector machines algorithm is used to generate classification maps. This article falls into the realm of an experiment in music education, not musicology. The ultimate objective is to explore parts of the classical music heritage and stimulate interest in discovering composers. In an age offering either inculcation through lists of prescribed composers and compositions to explore, or music recommendation algorithms that automatically propose works to listen to next, the analysis illustrates an alternative path that might promote the active rather than passive discovery of composers and their music in a less restrictive way than inculcation through prescription.


Introduction
have recently surveyed a large amount of the literature that focuses on the unique link between musicology and visualization by classifying 129 related works according to the types of data that were visualized and the visualization techniques that were applied to respond to certain research inquiries. The intersection of musicology and visualization brings a diversity of innovative applications designed for a variety of purposes. As Khulusi et al. (2020) explain, musicologists are served by interactive tools to analyze musicological data, on the one hand, and on the other, applications are tailored to the broad public with the aim of communicating and teaching aspects of music in a more intuitive, playful manner.
According to Magnuson (2008), the Common Practice Period (1650-1900) of Western classical music is a macrocosm in that rules and practices were comparatively unified and composers could be described as belonging to comparatively long periods such as the late Renaissance, Baroque, Classical or Romantic. By contrast, the music of the twentieth century is both pluralist and microcosmic in that, throughout, composers explored more personal and individual approach to music creation, forming their own microcosms or 'small universes'. This article visualizes on maps both the macrocosm of the Common Practice Period and the microcosms of twentieth century classical music. This article falls into the realm of an experiment in music education, not musicology. The ultimate objective is to explore parts of the classical music heritage and stimulate interest in discovering composers located on 'maps' in the vicinity of better-known composers by proposing, first, network graphs tracking their interconnections based on personal (composer-to-composer) musical influences, and second, multidimensional scaling graphs that locate composers' relative positions while preserving the pairwise distances among them. In this last case, the distances originate from a matrix of composer similarity indices computed on the basis of musical (personal and style) influences and 'ecological' data (or composers' features). 1 In an age offering either inculcation through lists of prescribed composers and compositions to explore, or music recommendation algorithms that automatically propose works to listen to next, this article shows an alternative path that might promote the active rather than passive discovery of composers and their music in a less restrictive way than inculcation through prescription. All visualization techniques were developed using Python programming and libraries.
The article applies these visualization techniques to The Classical Music Navigator, thereafter referred to as CMN, a website created by Charles H. Smith (2000), and available at http:// people. wku. edu/ charl es. smith/ music/. 2 C. H. Smith created it as a reference work 1 Musical influences are of two types: personal and musical style influences. The word 'personal' clarifies that the musical influence on a composer originates from another person who happens to be a composer (i.e., a composer-to-composer musical influence). Musical style influences are defined in Section "Composer similarity indices and heat maps", which also introduces 'ecological characteristics' of a composer, that is, specific features (instead of influences) that describe a composer and his or her ecological niche. 2 Charles H. Smith (B.A., M.A., Ph.D., M.L.S) is Professor Emeritus of Library Public Services at Western Kentucky University, Bowling Green). His research has primarily involved bibliography and bibliometrics, collection development, history and philosophy of science, biogeography and biodiversity, evolutionary theory, music history, and general systems theory. By formation an academic geographer, he is also a professionally trained statistician with considerable expertise and experience in the statistical characterization of complex ecological and historical systems, both natural and human-organized. As for the CMN per se, as described therein, "[t]his project was originally conceived by Charles H. Smith (…) in 1993, at which time data collection has begun. Dr. Smith personally collected and integrated all of the basic information represented here, but eventually enlisted two additional individuals to assist him in finalizing Version One of and experiment in music education. According to the CMN, it consists of five compilations of material: (1) An alphabetically-arranged, 'Composers' list containing basic data, major works, and influences of 500 individuals (composer-to-composer influences); (2) A 'Basic Library' list of works culled from this composer list (and re-arranged by musical genre); (3) A 'Geographical Roster' in which the names of the 500 composers are listed under the names of the countries with which they were (/are) associated; (4) An alphabetical 'Index of Forms and Styles' listing the names of composers associated with each subject entry; and (5) A 'Glossary' of terms used in the CMN. Although the CMN is a 'short list' of 500 composers, the initial pool was a 'long list' of 750 composers. The long list was then shortened to 500 on the basis of 'objective' criteria to avoid mere advocacy or subjective preferences. 3 Overall, the CMN aimed at reflecting a composer's status (at the time it was put together). The 500 individuals are those who scored highest on a combination of eleven variables such as, among others, the length of a composer entry in the Grove's Dictionary of Music and other catalogs, the total number of recordings referring to each composer, the total number of recordings over the past five years (as of c. 2000), the holdings of sheet music and other items in 50,000 libraries in the U.S. and worldwide (through searches in the OCLC WorldCat database). For further details, see the CMN website section on Statistics. 4 The CMN would be more likely to meet its original educational objective if it incorporated tools making use of the ability of the human visual system to identify patterns and trends, which we explore in this article. In the next section, we discuss the use of social networks for analysing musicological data and describe the development and collection of the CMN personal musical influence database. Section "An application of network graphs" applies standard network graph theory to the CMN influence database. Section "Composer similarity indices and heat maps" illustrates the method underlying the computation of similarity indices and then Section "Multidimentional scaling (MDS) and classification maps" applies multidimensional scaling (MDS), a technique that transforms the composers' similarity/distance matrix into MDS maps. The section also uses a support-vector machines algorithm to group composers into several classes, while several data-fitting parameters are used to explore further the music styles of the twentieth century. The final section concludes and discusses a series of issues related to natural language processing as applied to music information discovery, collaborative filtering, recommender systems based on convolutional recurrent neural networks, and the ability of these systems to surprise pleasantly a user (serendipity).

Social networks and classical composers
Although composers typically compose music alone, creative work also depends on influences, interaction, and collaboration. Indeed, certain periods and places are considered hotspots of creativity where new musical ideas are shared and movements arise. For McAndrew and Everett (2015), "[t]alent and status attract network connections, but networks may independently foster creative output. Composers acquire tacit as well as formal musical knowledge from networks of teachers and peers; they build on this to create their own styles and music innovations. They use social networks to signal their own and others' talent to patrons, agents, concert promoters and publishers." Influences are not strictly limited to a composition teacher or even social interactions between peers. As argued by Pfitzinger (2017), even if one of the most personal and profound influences is that of individual composition teachers, in general, the character or style of composing is made up of accumulated influences of countless other musicians and composers. As Conte et al. (2014) wrote in a eulogy for Conrad Susa "[a]rtists may or may not have children, but in the words of Plato, they produce 'eternal progeny'. These progeny are their musical works. They also aid in the creation of the works of others." Hence, a composer is also a product of all the musical influences they receive throughout their lives, for example, when studying scores or performing music of others. As further argued by Georges (2017), "Western classical music evolved gradually, branching out over time and throwing off many new styles. This overall development is not due to simple creative genius alone, but to the influence of past masters and genres, as constrained or facilitated by the cultural conditions of time and place". Smith and Georges (2014) even propose studying similarities between composers through their common and distinct composerto-composer musical influences, assuming that a greater number of common influences is likely to be reflected in more similar musical styles. Their approach permits indices of similarity between pairs of composers to be computed, as reported in Section "Composer similarity indices and heat maps".
Besides the fact that influences and networks are important to the development of creative works, the analysis of networks helps us focus on questions such as the composers who are highly influential, innovative, pure information sinks, at the periphery of their group of contemporaries or who stand between groups, schools, or periods, giving the network connectivity and cohesion. Furthermore, such analysis may benefit music historians and composers wishing to identify the pedagogical influences of particular composers on their students, whether to describe particular schools of composition, or trace compositional lineages and examine how a composer might fit into a compositional family. As the composer Pfitzinger (2017) puts it, "[h]ow did my teachers' teachers influence them? And their teachers? And further back? If I am a compositional descendant of Beethoven or Mahler or Widor or Chadwick, has their compositional style affected me? Can I see it in how I write or the types of piece I choose to compose?" Network theory can be used efficiently to trace genealogies and lineages in the form of 'shortest paths' from one composer to another, as described in Section "An application of network graphs". For a network analysis of teacherstudent connections, see the study by Jänicke and Focht (2017) based on musicological data from the Bavarian Musicians Encyclopedia Online (Bayerisches Musiker Lexikon Online, BMLO) that now includes about around 28,000 musicians. The data used in this study is an ambitious digital humanities effort more akin to the preservation of musical heritage than an educational one as we try to pursue here.
Here we use the data on composer-to-composer musical influences collected by Smith (2000) (see the Composers page of the CMN) and apply standard graph theory to visualize and study the network of all 500 composers. For articles that discuss social networks in other musical worlds, see McAndrew and Everett (2015) and Crossley et al. (2015), among others. 5 As the CMN database reflects more than six centuries of classical music, it enables us to include the composers of both the Common Practice Period (1650-1900) and twentieth century music. Unlike Pfitzinger's (2017) database of composers, which focuses exclusively on actual 'teacher-student' relationship among composers, the CMN network includes all reported composer-to-composer musical influences. The CMN website gives some information on how the list of influences was developed. In Version '2.1', the influences noted were based on more than 20,000 opinions derived from biographical and dictionary sources, and dissertations and other databases available online including information retrieved from liner notes for recordings, programme notes and reviews. For the electronic searches, the opinions were typically retrieved by entering pairs of composers' names in online search engines together with the word "influence". 6,7 On average about five opinions/sources were used to corroborate a specific composer-to-composer influence, on the assumption that an influence that is mentioned in five different sources is likely to be a real one. However, the five-times-mentioned influence could not be an absolute standard. In particular, for lesser-known/lesser-studied composers, a lower standard had to be accepted. 8 At the end, there is a trade-off in choosing the relevant number of sources corroborating an influence, between Type I versus Type II errors, that is, errors of commission (including an influence that should not have been included in the list), and errors of omission (excluding an influence that should have been included in the list). 9 5 Of course, networks of composers do not need to start with composer-to-composer influences as their primary inputs. Besides the teacher-student connections of Jänicke and Focht (2017), see also Park et al. (2015) who use pairings of composers on CDs as their primary input for building their network. The clear advantage of this approach is the ease of establishing 'ties' between composers on the basis of easilyaccessed and large CD databases, but the nature or inherent quality of these ties is somewhat elusive. 6 This might seem a somewhat inefficient way to collect information on the web. According to C.H. Smith it represented a one-person one-year long full-time job. Remember however that the CMN was put together well before the introduction of technologies like natural language processing and the current efforts in digital humanities that may help collecting musicological data on the web more efficiently. More on this in the conclusion. 7 An 'influenced-by' list was therefore compiled first, from information on both backward influences (influenced by) and forward influences (influences on) and, from this list, a secondarily 'had-influence-on' list was compiled. Thus, there is a complete correspondence between both lists: if composer j is listed as influenced by composer i in the 'influenced-by' list, then, in the 'had-influence-on' list, we will find that composer i influenced j. 8 The lesser-known/-studied composers are mainly those ranked 350-500 in the CMN. But for two-thirds of these, at least three sources were identified, and for all, at least two. 9 For more famous, better-studied composers (say, the top 150 in the CMN) there are some influences on them that were, say, less than five-times mentioned in sources and eventually not included as an influence, possibly creating errors of omission. Beyond Type I and Type II errors, the absolute accuracy of each pronouncement of influences in the CMN is only as good as the literature is extensive regarding a particular composer, and of course the ability of each musicological source to come to their own intelligent conclusions. That there are some errors in the CMN is unavoidable, but the purpose of the effort was less musicological than it was educational: suggesting to users some 'new-to-them' composers they might listen to next. This is precisely the objective of efforts like Pandora (which went online a few years after the CMN version 1 was available).

An application of network graphs
A graph or network is a set of nodes (also referred to as vertices), with a finite number of members n. The CMN database is a network of n = 500 nodes/composers, i 1 , i 2 , …i n (or, i, j, k…). From the CMN, we can extract an n × n (row × column) adjacency matrix. Think of the n rows as 'influencers' and the n columns as 'influencees '. A cell (i,j) in this matrix is equal to 1 if composer i has influenced composer j, otherwise it is equal to 0. When we observe a value of 1, we say that there is an edge (or link) between i and j. The CMN is a 'directed' network, that is (i,j) = 1 does not necessarily imply (j,i) = 1 (e.g., that Alkan influenced Debussy does not, of course, imply that Debussy influenced Alkan). The 'out-degree' of a node i, k out i , is the number of composers that i influenced. The 'in-degree' of a node i, k in i , is the number of composers that influenced i. The degree of a node is defined as (This is of interest when tracing composers lineage.) The geodesic distance between two nodes i 1 and i K in a directed network is the length of the shortest directed path between them.
There are several algorithms that can be used to visualize a network through a pictorial representation of the nodes and edges. There can be very different layouts or representations of the network itself depending on the algorithms used. In Section "An application of network graphs", we use the ForceAtlas2 algorithm (Jacomy et al., 2014), a forced-based layout whereby the algorithm modifies an initial (random) node placement by continuously moving the nodes according to a system of forces based on a metaphor of springs and electric charged particles. The 'spring-electric' layout uses the attraction formula of springs (between nodes connected with an edge) and the repulsion formula of electrically charged particles (between any nodes). It uses the attraction force (or restoring force) formula of springs,F a (i 1 , i 2 ) = k a dist(i 1 , i 2 ) , where dist(,) is the initial geometric distance between two nodes. The more one stretches something, the harder it becomes to keep stretching. Or as one stretches something out, there is a restoring force (of opposite sign) that one has to compete with. Thus, given the initial random position, connected nodes with closer geometric distance dist(,) attract less (the restoring force is lower) than for more distant connected nodes. It also uses the repulsion formula of electrically charged particles (electrons), F r (i 1 , i 2 ) = k r ∕(dist(i 1 , i 2 )) 2 so that closer nodes (charges) repulse more. Hence the spring-electric analogy suggests that initially closer nodes attract less but repulse more. These forces create a movement that converges to a balanced state. 10 10 The main difference between several force-based algorithms is in the actual value of the exponent associated with dist(,) in the attraction and repulsion (a, r) formulas (with a '−' sign if dist is in the denominator). For example, the spring-layout analogy explained above imposes (a, r) = (1, − 2), the popular Fruchterman and Rheingold algorithm is (2, − 1). On the other hand, the ForceAtlas2 algorithm imposes that (a, r) = (1, − 1). For example, the attraction force in ForceAtlas2 is the one given in the text above (so that a = 1), while the degree-dependent repulsion force is given by: and k i is the degree of a node i while k r is a constant. According to this formula, the repulsion force is proportional to the degrees (plus 1) of the two nodes. The ForceAtlas2 algorithm tends to increase the repulsion between highly connected nodes and to produce a lesser repulsion force between poorly connected nodes and highly connected ones, and an even lesser repulsion force between poorly connected nodes. This avoids the cluttering effect of some algorithms, such as the Fruchterman and Rheingold algorithm, in which a forest of leaves (poorly connected nodes) surrounds the few highly connected nodes. (This originates from the fact that a − r (= 2) in ForceAtlas2, is less than a − r (= 3) in the Fruchterman and Rheingold algorithm. Visual clusters denote structural densities (the ratio of actual edges on potential edges) when a-r is low, that is when the attraction force depends less on distance and when the repulsion force depends more on it. We introduce more formally the notion of a network density below.) The user can choose the value of k r which provides a scaling effect. The higher k r the larger the graph will be. Note that with respect to the attraction Figure 1 shows a balanced state of the CMN network using ForceAtlas2 algorithm. 11 Nodes in blue represent composers from the Medieval and Renaissance periods; green represents Baroque; red is Classical; cyan is Romantic; and magenta, composers of the twentieth century. The size of the nodes reflects the number of influences (the 'out-degree') of a composer. The larger nodes are known as 'hubs'-composers who have influenced many other composers in the network. For readability, Fig. 1 labels only the 'top-150' composers according to a ranking proposed in the CMN. Figure 1 shows that the network of influences structures and reflects music periods, with composers from the same period (e.g., Renaissance, Baroque, Classical, etc.) appearing closer together, in clusters, reflecting a stronger density of 'intra-period' influences. As referred before, this appears like a 'macrocosm' for the Common Practice Period , with somewhat unified views on music and music rules and practices (the five pillars of the Common Practice Period-Tonality, Vocabulary, Texture, Sonority, and Time Organization) to which composers belonged to for rather long periods of time (i.e., Late Renaissance, Baroque, Classical or Romantic periods). 12 We will contrast this shortly with the microcosms or small universes of the twentieth century classical music (Figs. 5, 6, 7 and 8), a period unique in its pluralism and its partial or total modification of earlier music rules.
Quantitative metrics remain essential to shed further information that would be quite difficult to extract from the initial CMN database or from Fig. 1. 13 The density of a network is the ratio of actual edges in the network, to all possible edges (an arrow between any two nodes) given by n*(n − 1) where n is the number of nodes. The network density (a number between 0 and 1) gives a sense of how closely knit the network is. In our case, we have 3724 edges and 500 nodes so that the composer network density is 3724/ (500 × 499) = 0.0149. Hence, the CMN network is on the lower edge of the (0, 1) range, but still far from 0. As the CMN is a directed network, we may also be interested in the density of the reciprocal edges of the network, that is the ratio of the number of edges pointing in both directions to the total number of edges in the graph. The reciprocity of the CMN = 0.0417, so that one can infer that there are 155 edges where any composer i both influenced j and was influenced by j. 12 Magnuson (2008) describes the five pillars of the Common Practice Period as follows: 1. Tonality-The essential organisation around a single pitch, the tonic, which provides a home base to the ear; 2. Vocabulary-A diatonic pattern of seven stepwise pitches called major and minor scales; 3. Texture-Texture of the Common Practice Period is created with counterpoint, which is two or more simultaneous individual and independent lines, each of which confirms the pre-eminence of tonic and utilizes the vocabulary of a major or minor scales. 4. Sonority-consonant sonority of the Common Practice Period is a group of three notes arranged in thirds (tertian triad). Dissonance can be used on occasion in the form of a group of four notes arranged in thirds (tertian tetrad); and 5. Time-The essential time organization of the Common Practice period is based on a consistent and unchanging beat. These beats organize into 2, 3, or 4 essential pulses per measure, with the first beat always the strongest. Each beat can sub-divide into two parts (simple meters) or three parts (compound meters). 13 Ladd et al. (2017) discuss some of these metrics and explore in details how to compute them using the Python NetworkX library. 11 Network figures have been generated using the Python library ForceAtlas2 algorithm implemented for NetworkX by Bhargav Chippada (2017).
force, the ForceAtlas2 algorithm proposes an alternative option (dissuade hubs) given by . This means that if node i 1 is an 'hub' (i.e., a node with a high out-degree) then this option will tend to reduce the hub's attraction toward other connected nodes (i 2 ), pushing hubs towards the periphery while keeping nodes with a high in-degree but low out-degree in the center.

3
A second structural measure is the network global (or average) clustering coefficient. A clustering coefficient is a measure of the degree to which nodes in a graph tend to cluster together. The CMN global clustering coefficient is C = 0.2189 and is computed as the average of the local clustering coefficients of all nodes, C = (1∕N) n ∑ i=1 C i . The local clustering coefficient of a node quantifies how close its neighbours are to being a clique and is given by C i = N i ∕ k i k i − 1 where N i is the number of edges among 'direct' neighbors (both influencers and influencees) of i (and therefore the number of triangles through i), k i = k out i + k in i is the sum of the in-degree and out-degree of i and k i k i − 1 is the maximum (potential) number of edges among the neighbours of i. The local clustering coefficient can be viewed as a type of centrality measure (see below), albeit one that takes small values for 'important' individuals. For example, for i = Debussy, C i = 0.0417. This implies that there are N i = 430 edges among all neighbours of Debussy. When C i is close to zero, as is the case for Debussy, i is a 'star' − i has a lot of neighbors but his/her neighbors are little connected between themselves, whereas if C i is close to 1, then the neighborhood of i forms a clique of composers (i is connected to its neighbours, and all the neighbors are connected to each others). 14 A traditional method to assess real network results is to compare them to those obtained from a purely random network (an Erdos-Renyi graph) with same number of nodes and edges as the tested network. For instance, a well-established result (see Ravasz & Barabási, 2003) is that C i is independent of k i in random networks while in real networks, C i decreases with k i (i.e., composers with a large number of neighbours (edges) have a low value for C i and vice-versa). Figure 2 shows this negative relationship for the CMN as well as a few 'stars' with low local clustering coefficients (e.g., Stravinsky, Debussy, J.S. Bach, JS, and Wagner). The intuition for this negative relationship is roughly as follows. On the one hand, composers tend to group in community sharing mostly neighbors within the same community. A composer with a low degree tends to be connected to nodes within his/her single community, leading to a denser community and a relatively high local clustering coefficient. On the other hand, a composer with a high degree tends to be connected to several communities, which increases the maximum (potential) number of edges among his/her neighbours, reducing this composer's local clustering coefficient. As stars are relatively few in real social networks, the global clustering coefficient computed as the average of the local clustering coefficients tends to be high, and typically higher than the one computed for purely random graphs. This will be shown, below, to lead to the small world characteristic of social networks.
Another structural measure of interest starts with the notion of geodesic path (or shortest path) which is the shortest possible series of nodes and edges that stand between any two nodes. In absence of a direct influence of composer i on k, but if i has influenced j and j has influenced k, then the shortest path length (or geodesic distance) between i and k is 2. This may lead to conjectures about whether or not an 'indirect' or residual musical influence of i exists on the music of k. As pointed out by Pfitzinger (2017), this might be of interest to musicologists. Shortest paths computed between any two composers may help discovering influence lineages. For example, taking Debussy as the target, we can discover in the CMN network all composers that are one step away (direct influence), two-step away, etc. 15 The concept of six degrees of separation was made famous by Milgram (1967) in his study of "The Small World Problem". 16 Watts and Strogatz (1998) have studied the small world property of networks more formally and Humphries and Gurney (2008) proposed a measure to determine if a network has a small-world characteristic, which is given by S = (C∕C rand )∕ L∕L rand , where C and L are the clustering coefficient (defined above) and the characteristic path length of the network (i.e., the average shortest path length for the entire network), 17 while C rand and L rand are the same concepts for a randomly constructed network. Networks that show both a small average path length and a high clustering coefficient are known as small-world network. For the CMN we find that S = (0.2189/0.0153)/ (1.1366/3.29) = 41.43 > > 1, which confirms that the CMN network has a small-world characteristic.
Beyond some structural measures we can also discuss which nodes are the most 'important' in the network through measures of centrality, in particular, degree centrality, betweenness centrality and eigenvector centrality. Figure 3 shows the placement of composers according to their 'in' and 'out' centrality. Composers with a high in-degree who also have a high out-degree may be viewed as 'communicators' or 'facilitators' of the network (e.g., Debussy, JS Bach, WA Mozart, Ravel, Liszt, Wagner, Stravinsky, Bartók, Brahms). They are influencers building on the shoulders of others. There are seen as prestigious or knowledgeable ('authorities') and other composers seek to be influenced by  16 In the experiment participants from a particular town were asked to get a letter to a particular person in a different town by passing it from acquaintance to acquaintance. The letters that arrived made it in an average of 5.2 steps. See also the subsequent study by Travers and Milgram (1969). 17 The average shortest path length is the sum of shortest path lengths between all nodes (assuming the length is zero if a target j is not reachable from i) normalised by n*(n − 1) where n is the number of nodes in the network. them. At the other extreme, some composers have low in-and low out-degrees, composing at the periphery. Hildegard, Lara, Milán, and Amram are actually disconnected from the rest of the network (no 'in' or 'out' influences in the network). Composers with a low out-to-in ratio are information sinks-to some extent, in purely quantitative terms, they do not spread or transfer the knowledge they may have learned from others, at least regarding to the low quantity of direct influences they have had). Some composers are a low source of influence because they are viewed as too indiscriminate in their own sources of influences. For others, this may reflect voluntary isolationism and lack of interest in transmitting knowledge. Perhaps Sullivan is an example of the former. He had little influence on others and might have suffered from his reputation among the musical establishment of writing frivolous music. Britten is perhaps an example of the latter. Of course, chronologically more recent composers are, by definition, sink of influences. Time is playing against them in being influencers, which also reflects a database limitation bias because the CMN has few composers born in the second part of the twentieth century. Finally, composers with a high out-to-in ratio are perhaps best described as outsiders and innovators, defining new directions in music while relying less on previous influences. Composers such as Carissimi, Corelli, D. Scarlatti, Chopin, R. Schumann, and Schoenberg, leading to a transition from a period or style to another, all fit this description to some extent.
It is common in network analysis to provide a visual check of the node (composer) distribution of 'in' and 'out' degree centrality. A random network will typically have a binomial degree distribution, or at the limit when n is large, a Poisson degree distribution-distributions that underestimate the presence of nodes with large degrees-while real networks have highly right-skewed distribution-the bulk of the distribution occurring for low degree nodes with just a small number of nodes having a very high degree. For the CMN out-degree distribution, we would expect that the bulk of composers have influenced very few composers while a few composers have influenced many. Similarly, for in-degree distribution we would expect that the bulk of composers have been influenced by very few influencers while a few composers might have been influenced by many. Real networks are often said to follow a Pareto or power law distribution (also known as scale-free  Figure 4 shows the fraction p k of nodes with degree k, (p k = n k /n), in level (left part of Figure) and in log-log transformation (right). The out-degree average is 7.5, the median is 2, and the standard deviation is 15.9. The corresponding numbers for the indegree distribution are: average = 7.5, median = 7, sd = 4.3. 19 Typically, if the in-and outdegree distributions followed a power law then the log-log transformation would appear as a downward sloping straight line (log p k = c − α log k for some constant α, c > 0). It is clear in Fig. 4 that the in and out-degree distributions of the CMN are highly right-skewed as expected from real-world networks, but that the in-degree distribution does not follow a power law for low values of k in (the frequency of composers associated with zero to four influencers is lower). 20 This reveals a bias in the long list of selected CMN composers; only well-established/known/connected composers were selected at the expense of little established/known/disconnected composers. This bias reveals the stated CMN educational purpose as opposed to preservation of the classical music heritage for musicology purposes. Table 1 shows two other measures of centrality, betweenness centrality and eigenvector centrality. Betweenness centrality tries to capture nodes that are important not because they have a high in-or out-degree, but because they stand between groups, giving the network connectivity and cohesion. If a composer is often found on shortest paths (as defined above) between any two composers, they will score high on this measure. Table 1 shows that composers with high betweenness-centrality are present in all major periods. Network 18 Many distributions have a typical size or scale around which individual measurement are centered, such as the height of human beings. A free-scale distribution does not peak, i.e., is not centered, 'around' a typical value. 19 In a directed connected graph, each edge implies both an 'in' and 'out' direction, so that the average degree distribution must be the same for in's or out's and is given by the ratio of the number of edges to the number of connected nodes = 3724/496 = 7.51. 20 Amaral et al. (2000) were not able to find a power law scale-free network among seven examples of networks in social, economic, biological, and physical networks. connectivity can be observed by the presence (in the top-25 list-not shown here) of composers such as Josquin des Prez (transitional stage in Renaissance music), Sweelinck and Schütz (transition from Renaissance to early Baroque), and F. Couperin and Purcell (both composers incorporated in their compositions different Baroque styles, French and Italian for the former, English, Italian and French for the later who is also an important source of influences on English musical renaissance of the early twentieth century). Finally, eigenvector centrality cares not only about the number (quantity) of connections (influences) but also the 'quality' of these connections (i.e., whether their connections are also well connected). In a directed network, there are two measures of eigenvector centrality-the left-(or in-) eigenvector which takes into account the quantity and quality of the composers who influenced a subject composer, and the right-(or out-) eigenvector which accounts for the quantity and quality of the composers the subject composer influenced. Given this definition, it is intuitive that the top-10 composers, according to the out-eigenvalue, are likely to be from earlier periods (Renaissance and Baroque), while the top-10 composers with respect to the in-eigenvalue, are from the twentieth century, which reflects the advantage of coming chronologically later. Modern composers had the opportunity to cherry-pick among an increasingly large pool of influencers from earlier periods, an opportunity that, say, a Renaissance composer did not have (partly also because knowledge of medieval and earlier music is limited). For example, Ligeti (1923Ligeti ( -2006 was, according to the CMN, influenced by a rather long list of important and influential composers of all periods from the Renaissance to the twentieth century. 21 Finally, observe that the top-10 lists for degree centrality measures (in and out) and eigenvector centrality (in and out), have no intersection, which illustrates the role of adjusting for influences quality. A characteristic of the network of composers studied up to now is that it connects composers from the medieval to the renaissance, all the way to the twentieth century. We now focus on the period since 1862. The choice of 1862, the birth year of Debussy, reflects Griffiths (1978) argument in his 'Concise History of Modern Music: From Debussy to Boulez', that Debussy is a transitional figure from late romantic to the modern period. The pool of composers is reduced to 249, so that the adjacency matrix is of dimension 249 × 249. Some modern composers have been influenced by composers from earlier periods, but we only account for musical influences among composers of this period. The number of nodes is 249 and the number of edges is 1325. Hence, the density of the network is 0.02146. The twentieth century network is denser than the previous network, reflecting on average a larger number of influences (among all potential links) per composer.
We color-coded composers (nodes) by style in Figs. 5, 6 citizenship (or regions of affinity) in Fig. 7, and gender and ethnic background in Fig. 8. Figure 6 zooms on Fig. 5 for improved visibility. To ensure the same final 'equilibrium' placement of nodes across maps, a pseudorandom number generator was used so that the initial placement of nodes, although statistically random, was created in a deterministic manner. Unlike composers from earlier periods, many twentieth century composers are little-known to the public at large, and Figs. 5, 6, 7 and 8 may help in discovering composers through their connections with better-known composers (whose larger node size (out-degree) reflects their influence).
In an interesting thesis Pauls (2014) describes the state of the twentieth century classical music as if there were 'two centuries in one'. As Pauls puts it: "An outstanding feature of the twentieth-century has been the divergence of European 'art' music into two general areas which do not overlap to the same extent that they do in previous centuries. That is, the performing repertoire is at odds, sometimes dramatically so, with a competing canon of works considered to be of greater importance from an evolutionary historical point of view". This feature can roughly be seen in Fig. 5 where composers located towards the West, South, and East edges of the map make up the bulk of the performing repertoire, pursuing (to some extent) the romantic style of the nineteenth century and, more generally, pursuing the five pillars of the Common Practice Period mentioned earlier. As we move towards the center, the North, and especially the Northeast of the map, however, we find composers who have changed most or all the musical elements and rules of the 'Common Practice Period' and are often viewed (loosely) as the 'avant-garde' of the music of their time. Magnuson (2008) offers an interesting discussion about which of the five pillars of the Common Practice Period have been basically maintained, generally modified, or completely modified by the different styles of twentieth century (e.g., impressionism, primitivism, neoclassicism, expressionism, serialism, indeterminism, minimalism, neo-romanticism, etc.). He assumes that when a composer either generally modifies or completely changes more than one of these five elements, then, a new music (or Uncommon Practice) is created.
Associating a unique style (node color) to one composer (as in Fig. 5) is a restrictive and misleading assumption as many composers explored different styles over their lifetime. The CMN provides a list of styles for most composers. When more than one style was provided, additional manual searches were done, reviewing the bios of composers in several sources such as Wikipedia and composers websites, to ascertain the dominant style during the composer lifetime. We understand the limitations of this approach and Section "Multidimentional scaling (MDS) and classification maps" deals with this issue in a different way. 22 However, for the time being, we deliberately take this short-cut in Figs. 5, 6 to explore the main styles of the twentieth century.
According to Magnuson (2008), the music of the twentieth century is unique in its pluralism as "[c]omposers began to explore a more personal and individual approach to music creation, forming their own microcosms or 'small universes'. No longer bound to the rules formed by one musical approach, they customized sound to suit their own views and preferences." For Magnuson, there were three important small universes or microcosms near the turn of the twentieth century (see Figs. 5, 6): Impressionism (e.g., Debussy, Ravel, in the Southwest), Primitivism (e.g., Stravinsky, Bartók, in the East and Center), and Expressionism (e.g., Schoenberg, Berg, Webern, in the Northwest).
Impressionism is represented by Debussy, Ravel, and other composers pictured with a blue node and located in the South/Southwest. 23 Impressionism was a reaction to the state of music at the end of the nineteenth century, that is, late Romantic composers who lived well into the twentieth century, who made some concessions to the new century, but essentially belong to the Common Practice Period. These essentially Romantic composers are located (with deepskyblue nodes) in the East/Southeast and West/Southwest parts of Fig. 5. 24 Expressionism followed the path of the Common Practice Period but completely mutated its basic pillars of tonality, vocabulary, texture, sonority, and time organization. Besides Schoenberg, Berg and Webern, other expressionists gravitate around them in Figs. 5 and 6. 25 Expressionism itself led to Serialism and some representative composers, at least during part of their life, are located with a pink node. 26 Primitivism (Stravinsky, Bartók) positioned itself somewhere between Impressionism and Expressionism, and eventually led to 1) Neo-Classicism (composers pictured with dark-green nodes in Figs. 5 and 6 and essentially grouped in the East part) and to 2) the revival of Nationalism as a source of inspiration (composers with gray nodes), a trend that began with Glinka, Smetana and Dvorák in the nineteenth century. 27 Note that nationalist composers are spread in the South part of the map closer to late-Romantic composers. This reflects the idea that Nationalism is the continuation of a nineteenth century trend and that the network of influences of nationalists is also driven by their citizenship (see below in Fig. 7).
Other styles, represented by the Avant-Garde (in saddlebrown) and Experimentalists (in sandybrown) in the Nord-East of Figs Practice'. According to Magnuson (2008), new technology created Electronicism 28 while in the second half of the twentieth century there has been an unprecedented attention to new elements of Texturalism 29 -the relationships of timbre, density of pitch and rhythm being given a new primordial status relative to melody and harmony. Reactions to these styles created Indeterminism/Chance/Aleatory music, 30 a reaction against the total control that is the basis for integral Serialism. Minimalism (Riley, Reich, Glass, Adams-in cyan in Figs. 5 and 6) opposed the ideas of atonality itself and reintroduced the vocabulary and sonority of the Common Practice Period. For Magnuson (2008), neo-Romanticism 31 (in springgreen in Figs. 5 and 6) opposed these things too, but also represents a complicated relationship between today's composer (and listener) and the music of the past (as opposed to the late Romantic composers, mentioned above, who belong to the Common Practice Period). Popular music, Jazz, exotic influences and the criss-crossing of styles led to Eclecticism-choosing diverse elements from many different sources. As argued by Magnuson (2008), this is the essence of the twentieth century but certain composers are put in this group as they simply cannot be placed into neat categories due to their originality and individuality. 32 Figures 7 and 8 focus on citizenship, gender, and ethnic background. 33 Figure 7 shows that citizenship (and possibly geography) still matters in the influence network. 34 We see a concentration of North-European composers (deepskyblue) 35 in the East and towards the center; Russian composers (cyan) 36 in the South-East and British composers (hotpink) 37 in 28 For example, Varèse, Luening, Babbitt, Maderna, Berio, Stockhausen, Druckman, Davidovsky, and Wuorinen. 29 E.g., Varèse, Carter, Lutoslawski, Ginastera, Ligeti, Xenakis, Stockhausen, and Penderecki. 30 E.g., Cage, Xenakis, Feldman, Stockhausen. 31 For example, pre-1950: Walton, Tippett, Shostakovich, Barber, and Britten, and post-1950: Rochberg, Henze, Penderecki, and Corigliano. 32 E.g., Scriabin, Ives, Ruggles, Cowell, Partch, Messiaen, Hovhaness, Cage, Berio, Crumb, Górecki, and P.M. Davies. 33 The CMN, focusing on an Anglo-American perspective, emphasizes twentieth century American and British composers and has an intended bias in that respect. There are 77 Americans and 29 British composers, while there are 31 French, 23 Germans/Austrians, 16 Italians, 16 Russians, 15 'Central-Europeans' (6 Hungarians, 5 Bohemians/Czech and 5 Polish), 12 North-Europeans (3 Danish, 3 Swedish, 4 Finnish, 2 Estonians), 12 South-Americans (4 Mexicans, 3 Argentinians, 2 Brazilians, 1 Paraguayan, 1 Venezuelan, and 1 Cuban), 8 Spanish, and 10 composers of other nationalities (2 Dutch, 2 Swiss, and one each from Belgium, Greece, Canada, Australia, Japan, and China). 34 Some composers have had several citizenships over their lifetime. To attribute a single color per node, only one citizenship was kept, depending on the biography of the composer. For examples, Varèse (1883-1965), a French-born composer, spent the greater part of his career in the United States from 1915 onwards and took the American citizenship in 1927. He is thus classified with Americans (in red). Xenakis , a Greek composer who moved to France in 1947 and became a naturalised citizen in 1965, was generally more entwined with French than Greek cultural life. He is thus classed with French composers (in blue). Nancarrow  was born in the USA and became a Mexican citizenship in 1956 but he lived in relative isolation in Mexico while his musical influences are American. He is therefore counted as American (in red). 35 Here we count Swedish, Norwegian, Danish, Finnish and Estonia as North European composers. E.g., Stenhammar, Sibelius, Nielsen, Holmboe, Nørgård, Sallinen, Larsson, Tubin, and toward the center, Kokkonen, Pärt, Rautavaara, etc. 36 E.g., Rachmaninov, Medtner, Miaskovsky, Glière, Glasunov, Khachaturian, A. Tcherepnin, Shchedrin, Prokofiev, and slightly further to the North, Stravinsky and Shostakovich. 37 E.g., Butterworth, Vaughan-Williams, Holst, and, from Holst and moving in the Northward direction, Howells, Finzi, Moeran, Rutter, Bax, Alwyn, Walton, Berkeley, Tippett, Rubbra, and further toward the center, Britten and Warlock. Also starting from Holst we see a strand of British composers in the Westward direction: Delius, Clarke, Ireland, Bridge, Bliss.
the South-East/ South. In the South-West we see a series of Spanish (yellowgreen) 38 and Italian (magenta) 39 composers. In the South-West/West, we find a concentration of French composers (darkblue). 40 In the West we encounter both German (yellowgreen) and Austrian composers (darkgreen). 41 American composers (red) firmly occupy the center and the Northeast side of the map. We also notice here composers from several nationalities, whose music is of a much more experimental and avant-garde style. 42 Their presence here tends to suggest that the influences of nationality/geography may have dissipated somewhat in the second part of the twentieth century.
In Fig. 8, color-coding permits to identify gender (fuchsia for Women). All in all, there are 12 women born since 1862 (out of 249 composers in the CMN). 43 To put things in perspective, there were just four women (out of 251 composers in the CMN) born before 1862. 44 For men, we also show ethnic background (yellow for African-US/African-European composers, green for Latin-American composers, and blue for the rest, i.e., essentially 'white males'). Although there are two Asian composers (Toru Takemitsu, Japan and Dun Tan, China) in the database, they are lumped together with the rest (in blue) in Fig. 8. 45 ' 46 Fig. 8 therefore shows how much 'white-male' dominant the twentieth century composers network (as portrayed by the CMN) remains. As mentioned in the Introduction, the short list of 500 composers included in the CMN was based on objective criteria with the aim of reflecting composers' status (at the time it was put together), instead of being a mere advocacy instrument or reflecting subjective preferences. This said, any list of 500 (when lists of several thousand composers exist) will always be open to criticism that 'some other' composers should have been included into the list. First, there is a wellknown discourse within music schools and gender and women studies departments that the narrative and canon formation of Western classical music has privileged white men of the 'European' tradition (e.g. Citron, 1990Citron, , 1993. In so far as the 500 composers were selected Gottschalk is also of mixed ethnic background (having a French Créole mother). However, he was born before 1862 and therefore is not in Fig. 8. 46 Latin-American composers in the CMN are Leo Brouwer (Afro-Cuban) and Agustín Barrios Mangoré (partly of Guarani origin form Paraguay). Other Latin-American composers who often used native and folk music of their country are, from Mexico: Carlos Chávez, Silvestre Revueltas, Agustín Lara, and Manuel Ponce; from Brazil: M. Camargo Guarnieri and Heitor Villa-Lobos; from Argentina: Alberto Ginastera and Astor Piazzolla; from Venezuela: Antonio Lauro. based on items and catalogues that reflect this narrative, then the CMN exposes this bias. 47 Second, the CMN does not cover more recent developments in classical music (Dun Tan, born in 1957, is the youngest composer in the CMN). The problem here is not the CMN per se (available online in 2000), but that 20 years later new and important composers have emerged. Finally, through the work of musicologists, once-forgotten composers are regularly re-evaluated or re-discovered. 48 The re-discovery process and the emerging of new composers may of course have an impact on catalogues, length of composer entries in dictionaries, etc., so that an eventual update of the long and short lists of composers as candidates to inclusion in the CMN might be required, at some point, to reflect changes in the relative status of composers.
Network graphs show how musical influences structure and reflect both the macrocosm of the Common Practice Period and its main components, and the late nineteenth and twentieth century musical styles-Impressionism, Expressionism, and Primitivism (and its development as Neo-Classicism)-and their interactions with composers' nationality, gender, and ethnic background. Yet, the criss-crossing of styles used or developed by twentieth century composers makes it difficult to associate only one style per composer. We therefore propose to extend and enrich the network analysis of composers analysed here by computing similarity indices across composers taking account of composer-to-composer musical influences, style influences, and ecological characteristics (features) that best describe the 500 CMN composers. This permits to develop a multidimensional scaling analysis where composers located closer on the map are more 'similar'.

Composer similarity indices and heat maps
Beyond establishing lists of musical influences, one objective of the CMN was linked to early efforts in music information retrieval (MIR). For instance, the CMN site explains that many introductions to the classical music world are in the business of inculcation through lists of 'mandatory' composers and compositions to explore. Yet, most people explore new subjects by starting with the familiar, and in the case of music, this may mean hearing a composer or a composition that one likes and then searching for more music of the same type. The site gives the following example: "Suppose you hear the Ravel G major piano concerto on the radio, and take an immediate liking to it. Our database will help you extend this interest to other music by making it possible for you to quickly identify: additional works by Ravel, other piano concerti, other works for piano in general, other concerti in general, composers allied 47 Advocacy groups for diversity in music have developed databases reporting the names of thousands of female composers (e.g., composerdiversity.com). Sadie and Samuel (1995) also provide biographies of about 900 women composers of Western classical music. 48 Examples of 'rediscovered' or more recent composers who could eventually make the long list of an updated CMN, are, among many other male and female composers: Francesca Caccini (1587-1640), Barbara Strozzi (1619-1677, Isabella Leonarda (1620-1704), Antonio Cesti (1623-1669), Ferdinand Ries (1784-1838), Louise Farrenc (1804-1875), Florence Price (1887-1953), Allan Pettersson (1911-1980), Mieczyslaw Weinberg (1919-1996), Galina Ustvolskaya (1919-2006), Alexander Goehr (1932-), Helmut Lachenmann (1935-), Brian Ferneyhough (1943-), Gérard Grisey (1946-1998 to the same general period and style (Impressionism) as Ravel, other French composers, composers and styles that influenced Ravel, and composers influenced by Ravel." Thus, the CMN anticipated the general idea of a recommender system that is now commonly and automatically implemented in Spotify, Apple Music, Pandora, YouTube and other music streaming platforms that have algorithms proposing works to listen to next. These algorithms and their improvement are largely indebted to the field of MIR, which develops innovative content, context, and user-based searching schemes, music recommendation systems, and novel interfaces to make the vast store of music available to all. 49 This section illustrates a method to compute similarity indices across composers. The general method has been explored and progressively refined in several papers such as Smith andGeorges (2014, 2015) and Georges (2017). Here, we assemble all elements together, using two basic sets of information given in the CMN. First, we have used the list of composer-to-composer influences as explained in the earlier sections. Second, we extracted 42 general musical style influences (e.g., African music, Native American music, Spanish music, Indian music, folk music (by specific regions), popular music (by specific regions), world music, jazz, ragtime, blues, electronic, gamelan music, nature sounds, birdsong, etc.), also provided in the 'musical influences' list of the CMN 'Composers' page. 50 Finally, we extracted 298 'ecological' categories from the 'Index of Forms and Styles' page of the CMN so that each of the 500 composers are associated with a subset of these ecological categories (i.e., characteristics such as time period, geographical location, school association, instrumentation emphases, etc.). 51 An 8-page list of all ecological categories is available in . Once this information was gathered for the 500 composers of the CMN database, bilateral similarity indices were constructed by means of pairwise comparison of presence-absence data, using similar concepts to those encountered in bibliographic coupling (e.g., Glänzel & Czerwon, 1996;Sen & Gan, 1983). In essence, we inferred similarities among composers by assuming that if two composers share many of the same musical (personal and style) influences and ecological features, their music will likely have some similarity. On the other hand, if two composers have very distinct sets of musical influences and ecological categories, then their music is likely to have little similarity.
Technically, for any pair of composers (i,j) for i,j ∈ C (among the n × n possible pairs of n = 500 composers in the set C of CMN composers), we have a set A i of all attributes k (personal influences, style influences, and ecological categories) that apply to composer i, and a set A j of all attributes k that apply to composer j. Given all attributes n k in the database we can therefore count the number a of attributes shared by i and j, the number b of attributes that characterize i but not j, the number c of attributes that characterize j but not 49 One of the earlier survey articles on MIR is Orio (2006). Schedl et al. (2014) provide a survey of more recent developments and applications. For a recent text on audio/content-based MIR, see Müller (2015). For a text that puts more emphasis on alternative 'non-audio'/context-based MIR, see Knees and Schedl (2016). 50 In the example of Debussy: {Asian music, gamelan music, Renaissance Period music}. 51 See the 'Index of Forms & Styles of music' in the CMN. For example, the ecological characteristics associated with Debussy are represented by the following array of elements:{ballets 1900 on, cello chamber music, chamber music 1825 to 1925, 'Dance' in composition title, etudes, flute unaccompanied, flute chamber music, harp chamber music, harp orchestral music, Impressionist style, nocturnes, operas 1900 on, orchestra incidental music, orchestra symphonic poems, orchestration, Paris composers 1800 on, piano unaccompanied 1775 to 1900, piano unaccompanied 1900 on, piano chamber music general, quartets for strings, song cycles and collections, songs 1800 to 1900, songs 1900 on, suites, trios for other combinations, viola unaccompanied or chamber music, violin chamber music 1850 on}.
i, and the number of attributes d that neither characterize i nor j (d = n k − a − b − c). Among several similarity indices, we have used the centralised cosine similarity measure given by the following formula: This measure is equivalent to a Pearson correlation coefficient, r, taken between two Boolean vectors of attributes describing a pair of composers (i,j), where each Boolean vector is a series (of length n k ) of 1's and 0's depending on whether an attribute belongs or not to a composer. It can be shown that values of the centralised cosine measure range from − 1.0 to 1.0. A value of 1.0 indicates that two composers are identical. A value of − 1.0 indicates that two composers are complete opposite. A value of 0 shows that the two composers are independent (unassociated). 52 This methodology permits to compute a 500 × 500 matrix of similarity indices between pairs of composers (i,j). Let us call the similarity matrix S comb where the subscript 'comb' refers to the fact that we combined all attributes k (musical influences and ecological categories) in the calculation. Besides computing S comb , we can compute similarity matrices by restricting attributes to one specific category. For example, we can compute a 500 × 500 similarity matrix S ecol , with similarity indices between pairs of composers based only on attributes that focus on the 298 ecological categories (as is done in . Or we can compute a 500 × 500 similarity matrix S infl , based on personal (composerto-composer) musical influences only (as is done in Smith & Georges, 2014). The S combbased indices are now included in the CMN composers webpage through lists of top-15 most similar composers to each of the 500 composers. Table 2 provides lists for 14 major composers together with the similarity score. 53,54 The classical music website soclassiq 52 A nonzero value of the centralised cosine measure might be due to randomness or actual association between composers. As shown in , unlike in the case of the ordinary cosine measure, there is a proper statistical significance test. This test is originally proposed by Giller (2012) who summarizes the statistical properties of statistics computed from independent random bitstreams and derives the moments of the asymptotically normal approximation to the sampling distribution of the cosine similarity of independent random bitstreams. He proposes a new statistic, the support adjusted cosine similarity (where the support is the count of the non-zero bits divided by the length of the bitstream) and notes the parallel between the support adjusted cosine similarity and the Pearson correlation coefficient. See also Egghe and Leydesdorff (2009) who show that there is no pure functional relation between the Pearson correlation coefficient r (or CSC), and the ordinary cosine measure (or Salton measure, COS). However, they establish that the cloud of points (COS, r) can be described by a sheaf of increasing straight lines whose slope decreases, the higher the straight line is in the sheaf. 53 It must be re-emphasized that the similarities scores arranged here represent appraisals of correlations between pairings of composers' recorded attributes, k (i.e., personal and style influences and ecological characteristics). The CMN is constructed such that famous composers have many more recorded attributes than lesser-known or -studied composers. Hence, the chance of observing 'matches' (intersection-set elements) between famous similar composers is larger than the chance of observing matches between famous and less-well known, yet similar, composers, which is also larger than the chance of observing matches between two lesser-known similar composers. As explained in the CMN, "J.S. Bach, for example, has many more recorded attributes in the CMN database than does Johann Ludwig Krebs, a relatively minor figure greatly influenced by Bach. The result is that Bach shows up as the eighth most similar composer in the Krebs entry, but Krebs does not appear in the Bach list of the 15 most similar composers" (because of the extent of 'un-matches', that is, attributes that belong to Bach but not to the lesser-studied Krebs). Nevertheless, the lists do pass, at least largely, an eye test. 54 To these scores can be attached statistical significance levels (according to the methodology discussed in Georges and Nguyen, 2019), but for the layperson the scores themselves are easier to appreciate. Generally speaking, scores above .60 represent composer similarities that are likely to be fairly obvious, scores of 1 3 (https:// socla ssiq. com/ en/) has also recently included our index permitting to explore, for each composer, a list of the most similar composers. 55 Figure 9 visualises S comb with a heat map. 56 In this graph, composers have been ordered chronologically, from Hildegard (born in 1098) until Dun Tan (born in 1957). Along the diagonal, composers are compared to themselves, which gives a similarity score of 1, and is translated into a tiny black dot in Fig. 9. Moving off the diagonal implies comparing different composers. Dark blue reflects high similarity, while a light yellowish color suggests that the two composers are unassociated (independent) and any whiter shading implies negative values (opposition between composers). In general, as we move further away from the diagonal composers become more distinct. We can also visually detect 'intra-period' similarities and 'inter-periods' dissimilarities or independence. For example, Renaissance composers (in the upward left corner) tend to be relatively similar to each other, forming a small dark blue 'square' area, but Renaissance seems largely independent of other periods, leading to a long yellow 'rectangle' area in the rest of the rows (or columns). Note that the CMN database tends to have a larger number of composers associated with more recent periods. For example, as seen above, half (249) of the database is made of composers born since 1862. This feature is translated as larger, more diffused, blue 'square' areas (of intraperiod similarity) as we move down the diagonal.

Multidimentional scaling (MDS) and classification maps
Multidimensional Scaling (MDS) is a technique that generates a map displaying the relative position of a number of objects based on a given set of pairwise distances between them. In this section we apply the MDS methodology to the 500 × 500 matrix of similarity/ distances between composers, S comb . In this context, the MDS algorithm seeks to position (i.e., assign coordinates to) the 500 composers in an N-dimensional space according to an optimisation procedure that preserves as well as possible the initially computed bilateral distances (d i,j ) between composers. 57 . Choosing N = 2 optimizes composers location in a two-dimensional space (x,y), by generating the coordinates (x i , y i ) and (x j , y j ). The coordinates are computed so as to minimize a loss function called 'stress', which is the sum of squared errors of the actual distance between two composers d i,j , and the predicted distance d i,j * computed by the algorithm, and where the predicted distances depend on the number 55 See for example, for G. Mahler: https:// socla ssiq. com/ en/ Pe/ affin ities/ Gustav_ Mahler/ ID/ 3292/ 56 The heat map has been produced using Python Seaborn library and dependencies. 57 Our initial composers' proximity matrix does not represent pairwise distances across composers, d i,j, but pairwise similarities, s i,j . Typically similarity indices are converted into distance indices using the formula: . As s i,j given by the CSC formula in Eq. (1) is equivalent to a Pearson coefficient on Boolean vectors, we could alternatively use the Pearson distance metric (1 − s i,j ). In both cases d i,j falls between 0 and 2.
about .45 to .60 signify a considerable similarity, .30-.45 some similarity (for example, of time period or an emphasis on guitar), and below .30 less obvious connections (though many of these may be statistically significant in the greater sense).
Footnote 54 (continued)  Figure 10 shows the MDS map computed using Python. 60 A general objective of this section is to gauge whether the MDS map places composers according to our general expectations. Note that the date of birth of composers is never used directly when computing the similarity indices between composers nor is it used through MDS to locate composers on the map. 61 However, in order to assess the placement of composers, we tagged composers within 10 periods (Medieval, Renaissance, Baroque, Pre-Classical, Classical, Post-Classical, Early Romantic, Middle Romantic, and late Romantic, and Modern), using their birthdate as a criterion of decision, and the color of the dots/composers in Fig. 10 represents this tagging. Interestingly we see that the MDS map unfolds the history of classical music, starting with Medieval and Renaissance composers in the East, and, as we move counter-clockwise, progressing towards Baroque, Classical, Romantic and eventually twentieth century composers.
Although the first visual check suggests that the MDS methodology places composers on the map according to general expectations (e.g. Baroque composers together, Fig. 9 Heat map (500 composers). (Color figure online) 58 There exist several types of MDS algorithms, and they differ mostly in the loss function they use. They are at least two dichotomies that allow to structure some possibilities. 1. Distance scaling (Kruskal-Shepard MDS) versus inner product scaling (classical Torgerson-Gower MDS). 2. Metric scaling (using the actual values of the dissimilarities) versus nonmetric/ordinal scaling (interpreting dissimilarities in terms of the ordination of the data). See Buja et al. (2008) for details. Here we use distance metric scaling. 59 Following Kruskal (1964), a value of 0 is a perfect goodness-of-fit, 0.05 is good, 0.1 is fair and 0.2 is poor. More recent articles caution against using this advice since acceptable values of stress depends on the quality of the distance matrix and the number of objects in that matrix. 60 The MDS map has been computed with Python library Scikit-learn/Manifold. Although Kruskal' value is high at 0.20 for the two-dimension maps, suggesting that adding a third dimension might improve the placement, we do not pursue this route here. 61 This does not mean that there is no inherent time dimension in our musicological data set-data on composer-to-composer musical influences include contemporary composers, and the ecological data have also general references to periods.
Classical composers together, etc..) we want to produce a more convenient visual check with a painted contour around composers belonging to the same period while producing a map that is esthetically more pleasing than Fig. 10. In order to realize this, we decided to use a support vector machines classification algorithm (SVM). Typically, a classification algorithm tries to determine the class to which the object of the analysis belongs to. In the case of music composers, we could have a trained data set of composers and their features (characteristics) from which the algorithm would extract classes. 62 Thereafter, using a test data set (additional composers not included in the trained data set) the algorithm would be used to predict the probability that an additional composer belongs to a specific class or group on the basis on his/her own features (i.e., his/her positioning on the two-dimensional graph). There are several classification algorithms, both using supervised learning (SVM, K-nearest neighbors) and unsupervised learning (K-means clustering). Weiss (2017) and Weiss et al. (2018) apply several of these methods using audio features to characterise and then classify composers.
In our context, SVM should be applied normally to a set of composers described by a series of features. Here, though, we apply SVM directly to the two dimensions of the MDS map, which characterises composers according to the two axes in Fig. 10. The two dimensions do not represent directly measured features of composers but instead coordinates ultimately derived from a distance matrix computed on the basis of musical (personal and style) influences and ecological features of composers. The reason why we chose this strategy is that our interest is not in predicting to which class an additional composer (not included in the trained data set) belongs to. Instead, SVM is used here as an algorithm that draws painted contours around composers of a period in which they have been initially tagged, providing a more convenient visual check of the placement of composers on the MDS map, while also producing a more 'aesthetic' map. Finally, note from Fig. 10 that we need to use non-linear SVM because it would be impossible to draw straight contour lines to separate each groups of composers. In other words, we need to bend the lines to separate the classes. For this specific problem we decided to use a non-linear kernel, the radial basis function or Gaussian kernel. 63 Figure 11 shows the results of applying non-linear SVM to the MDS map of Fig. 10 while forcing overfitting so that we have a perfect matching between the ten classes identified by the SVM algorithm (and represented with painted contours) and the ten sets of dots of a specific color, each representing a music period wherein composers have been pre-identified. 64,65 62 To better visualise the approach, suppose that a large number of composers could each be represented with just two measurable features and plotted with dots in a two-dimensional graph accordingly. Then the support vector machine algorithm would try to draw a line between two or more classes of composers in the best possible manner. For SVM, the 'best possible manner' means, in essence, finding a separating line (a hyperplane) between any two groups of composers while producing the widest margin (distance between two parallel lines where each line touches at least a point/dot (a composer) in each class). Support vectors are those points that lie on the two separating margins. 63 In machine learning, kernels are functions that transform data from non-linear spaces to linear ones. 64 One typically faces a trade-off between fitting the training data set perfectly (high bias so that all composers initially tagged within the same music period will be classified in the same class) and how accurately the algorithm can predict the class of an additional composer drawn from a test data set (low variance and consistent predictions using different composers datasets). However, as said above, our objective here is more akin to aesthetic map drawing, with painted contours for composers classes that perfectly match music periods in which composers have been initially tagged. Hence overfitting is our objective here. 65 Figures 11-13 have been constructed using Scikit-learn/SVM and dependencies. With the radial basis function kernel, two parameters can be chosen: C (the penalty parameter of the error term) and Gamma (which defines how far the influence of a single data observation reaches). A higher Gamma means that Besides the overfitting (perfect matching), we also see, more clearly here than in Fig. 10, that the MDS methodology does a good job of positioning composers according to their periods. Relative to a chronologically arranged dictionary of composers, Fig. 11 provides a visual arrangement from which we can also infer the similarity of composers on the basis of their proximity on the map-providing some guidance on which composer to listen to next. If one likes Mozart, why not try to discover Cimarosa, Salieri, Grétry or J.C. Bach?
We now try to apply the method above to shed some light on the classical music of the twentieth century that is notoriously difficult to discover. We apply (and over-fit) the SVM algorithm assuming that each composer of the twentieth century might be tagged with a single main style, that is, mutually exclusive categories such as Impressionist, Expressionist,etc. 66 These categories are given in the legend of Fig. 12 which also includes a category 'Before' representing all those composers from earlier periods, who belong without much ambiguity, to the Common Practice Period. Unlike results in Fig. 11, Fig. 12 shows that overfitting the data creates a very complex map with many 'islands' of seemingly isolated composers. If we believe that the MDS methodology is accurate in positioning pair of composers according to their bilateral distance, so that composers closer on the map are more similar (as Fig. 10 suggests), then a conclusion can be drawn-Categorizing classical composers of the twentieth century is a rather complex task and identifying composers by one unique style and generating the painted contour to which they belong  66 As alluded before, associating a unique style (node color) to one composer is a restrictive and misleading assumption as many composers of the twentieth century explored different styles over their lifetime. The CMN provides a list of styles for most composers. When more than one style was provided, an additional research was done, reviewing the bios of composers in several sources such as Wikipedia and composers websites, to ascertain the dominant style during the composer lifetime. The approach of attributing a single style per composer is questionable. We come back to this issue shortly. points closer to the decision boundary have a close reach and the algorithm will try to fit the dataset even more exactly. In Fig. 11 C = 10 and Gamma = 10 to produce overfitting.
(through an overfitting classification algorithm) provides little pedagogical guidance in terms of communicating music and trends of the twentieth century. Perhaps in this case, the strategy of data overfitting is the problem and should be revised. Figure 13 shows the results of using a non-linear SVM algorithm on a MDS map for our list of 249 composers born since 1862 (to include Debussy, a transitional figure from The microcosms of twentieth century classical music-Support vector machines on MDS map. Note: Figure 12 is produced using SVM parameters that over-fit the musicological data. (Color figure online) late romantic to the modern period), while reducing the overfitting of the data. 67 As shown here, with reduced over-fitting there is no perfect matching between painted contours and the sets of composers (dots) tagged with a same color/style. Still, one color/style tends to dominate within each contour, from which an appropriate classification might be inferred for the whole contour. For example, the green, mauve, yellow, dark grey, orange, and pink contours represent, predominantly, composers related to Late-Romantism, Impressionism, Neo-classicalism, Neo-Romanticism, Expressionism and Serialism, and Experimentalism and Avant-Garde (of their time), respectively. Hence, the map produces a visual framework that helps in communicating classical music trends of the twentieth century, in a move from left to right that is also suggestive of the extent to which composers partially or totally changed the five pillars (music rules) of the Common Practice Period. 68 Note that choosing a higher degree of overfitting for the SVM algorithm would lead to seemingly more homogeneous contours and the appearance of nationalist and minimalist contours, at the cost of a more complex, less pedagogical, map.
To rationalize, in musical terms, non-homogenous painted contours, recall that many composers of the twentieth century did not have a unique style that can easily identify them, while their position assigned on the MDS map captures a much richer aspect of the complexity of a composer by considering the distance metric of Section "Composer similarity indices and heat maps", based on many musical (personal and style) influences and ecological characteristics. A composer tagged (through a coloured dot) with a unique style could be similar (in several different ways) to composers generally associated with another style, and therefore be regrouped with them inside a painted contour. For example, the early style of Szymanowski or Casella (dark blue dots) is typical of the impressionist current, but both composers have been, later, strongly associated with neoclassicism, which 67 In particular we set the values for the parameters of the non-linear SVM algorithm to C = 7 and Gamma = 2. 68 See again Magnuson (2008) who demonstrates through several tables how the different styles of the twentieth century have partly or totally changed one or more of these five pillars. ultimately explains their position in the yellow contour. Honegger is a Neo-romantic composer (light green dot) whose music remained strongly influenced by German romanticism, yet, he is located in the Neoclassical (yellow) contour most likely because of his inclusion to Les Six-a group of French neoclassical composers often seen as reacting against both the musical style of Wagner whose music eventually led to Late/Post Romantic current (green contour), and the impressionist music of Debussy (mauve contour). Ginastera is located in the orange contour because he extensively used serial techniques during what he referred to as his Neo-Expressionist period (in the last twenty years of his life). Yet, he was also a nationalistic composer (brown dot) and his works from the Nationalism period often integrate straightforward Argentine folk themes. It is at first surprising to see Schoenberg at the North-West edge of the neoclassical contour. His name is typically associated with Expressionism, atonality, and twelve-tone serialism so that one would expect to see him in the orange contour. Yet, as Schoenberg always claimed, he viewed his music as extending the traditionally opposed Romantic styles of Brahms and Wagner-in his words: "I do not attach so much importance to being a musical bogeyman as to being a natural continuer of properly-understood good old tradition!" 69 His position close to the border of the Late Romantic (green) contour most likely reflects this fact and shows, among other examples given above, that trying to capture similarities between composers according to a rich set of attributes is, in terms of music education and discovery, a method that complements and improves on a classification of composers based on rigid categories.
Besides visualizing twentieth century musical currents, these maps can be used to discover less well-known composers. Stravinsky is one of the most important and influential composers of the twentieth century. After his 'Primitivism' period (e.g., The Firebird and The Rite of Spring) that locates him close to Bartók's through his own Primitivism period (e.g., the Miraculous Mandarin) on Fig. 13, he eventually turned to Neoclassicism with other well-known composers such as Prokofiev and Hindemith. But Fig. 13 can help discover, say, Alexandre Tansman, at the southern edge of the yellow contour, an important musical figure of Neoclassicism even if somewhat forgotten by the public at large. To back up his position relative to neighbors on the map, we suggest a quick read of Wikipedia's description. While pursuing his musical career in Paris, Tansman was, according to Wikipedia, influenced by Stravinsky but also by Jacques Ibert and Albert Roussel (close on the map). Darius Milhaud tried to persuade him to join Les Six-the group of French neoclassical composers referred above. Of this group, Auric, Tailleferre, and Milhaud himself can be found in near vicinity on the map (while Poulenc, also of Les Six, is admittedly further away, located South of Prokofiev and close to Ravel). Still according to Wikipedia, Tansman was one of the most respected members of the international music group École de Paris, along with Bohuslav Martinů and Alexander Tcherepnin (also neighbors on the map). 70 Going over each twentieth century composer to rationalise his or her position on the MDS map goes beyond the objective of this article but musicologists could contribute by exploring further the results, leading to advances in ways one might capture the similarity/distance between composers in Section "Composer similarity indices and heat maps".
For example, the similarity matrix could be computed on the basis of the 298 ecological categories only, while eliminating the musical (personal and style) influences (in terms of Section "Composer similarity indices and heat maps", using S ecol instead of S comb ). This might improve the accuracy of the placement of composers on the MDS map, putting those who have very similar ecological niches even closer on the map. As pursued further in the conclusion, however, this might also reduce the 'endogenous' serendipity that the current map offers in terms of composer discovery, when exploring the music of a little-known composer in near vicinity of a better-known composer.

Conclusion
This article falls into the realm of an experiment in music education and exploration, not musicology. By proposing maps tracking composers' interconnections based on composerto-composer (i.e. personal) influences and composers' similarities based on musical (personal and style) influences and ecological data, our ultimate objective is to encourage classical music exploration, suggesting to users 'new-to-them' composers they might listen to. In this perspective, we strongly believe in an approach that focuses on a limited number of composers since discovering and listening to important compositions of 500 composers is a lifelong process for most while a more ambitious target might quickly become overwhelming. This rationalizes our choice of visualizing the CMN whose stated purpose is less of a musicological reference than an educational resource. Of course, the results presented here must be understood under this context and limitations. But, in principle, it is possible to extend the analysis presented here to more recent and ambitious musicological database. Of note, musiXplora has rich teacher/student relationships and musical instruments classifications. 71 The Bavarian Online Encyclopaedia of Musicians mentioned before could also be used, although this would also shift emphasis from music education towards musicology and tools serving musicologists.
In an age offering either inculcation through lists, assembled by music educators, of prescribed composers and compositions, or music recommendation algorithms, built by software engineers and data analysts, that automatically propose works to listen to next, this article offers a path that might promote the active rather than passive discovery of composers and their music (as with automatic recommender systems) in a less restrictive way than inculcation through prescription. Yet, as explained below, the active composer discovery proposed here has a few disadvantages so that it is best positioned as a tool that complements inculcation and the use of a recommender within music streaming systems.
One disadvantage of our approach to music discovery is that a user who decides to listen to a composer nearby a better-known composer still faces the challenge of discovering his or her important compositions. In this case, we argue that the CMN website (among many others) remains a source of information by proposing a list of important compositions for most of the 500 composers in the database. The overall approach would therefore promote active discovering of composers nurtered through a prescribed list of compositions. Second, a weakness of the CMN is that it does not cover the more recent developments in classical music. Extending the CMN so as to include more recent composers is possible, 71 See MusiXplora, ed. Joseph Focht, https:// musix plora. de/ musici/. For a digital humanities application analysing the relationships between musical instruments and musical pieces, see Kusnick et al. (2020).
although the process of collecting the musicological data should take advantage of new approaches in digital humanities, including the use of natural language processing (NLP) for music knowledge discovery (e.g. Oramas et al., 2018). With respect to new composers, however, inculcation remains an essential source to music education. For example, Rutherford-Johnson (2017) in his Music after the Fall, offers a retrospective of modern composition and culture since 1989. The book also compiles several lists of composers and their compositions to listen to.
Although we believe that a pre-informed search about who and what to listen to is desirable when using music streaming services, we must also acknowledge the new developments and strong potentials of music recommender systems, from simple (collaborative filtering) to more complex approaches such as audio models using convolutional recurrent neural networks. Collaborative filtering permits to make predictions about the interests of a user, and thus an eventual recommendation, by collecting listening profiles from many users and searching for common and distinct listening patterns among them. As new songs, by definition, are not in the music profile of any users, they cannot be easily recommended, however. More complex audio models typically start with loading segments of audio files, extracting timbral features from the audio signals (e.g., low-level features such as the time domain zero-crossing rate, the spectral centroid, spectral bandwidth, spectral rolloff, the mel-frequency cepstral coefficients, etc.). Once the structured audio features are extracted, a classifier is used, for example an artificial neural network (although in principle it could be a support vector machine algorithm), either for classification purpose (which music genre the different audio segments of the training set belong to), prediction (which genre a new audio file-not part of the initial training set, belongs to), and music recommendation systems (if a new composition is classified in a specific genre, it can be recommended to those who listen to this genre). An early application is Tzanetakis and Cook (2002).
Yet, audio models come with their own problems. As mentioned by Weiss et al. (2018), "[e]xtracting score-like information from audio-referred to as automatic music transcription-is a complex problem where state-of-the art systems do not show satisfactory performance in most scenarios." In other words, the audio processing algorithms needed to extract meaningful audio features are often error-prone and do not reach a high level of specificity regarding human analytical concepts. For example, notes specified by a musical score are hard to extract from an audio file. These problems eventually have led to the adoption of convolutional neural networks (CNNs), traditionally used for image classifications, by a shift of emphasis away from structured audio features towards images (e.g., the spectrogram of a music audio) to improve the accuracy of music classification and recommendation systems (see Costa et al., 2017). 72 The spectrogram, being an 'image', its use with CNNs makes sense; but its horizontal axis represents the time dimension so that recurrent neural networks (RNNs)-typically used with sequential data, may be useful for temporal summation of the spectrogram features. This idea has led to an architecture combining CNN and RNN for music classification, processing the spectrogram in sequence from CNN to RNN (Choi et al., 2016) or in parallel (Feng et al., 2017). Besides these developments, semantic techniques in natural language processing (that search for the semantic meaning of words and sentences through word 'embeddings') are also adapted 72 A spectrogram is best described as a heat map where the horizontal axis represents the time dimension of the audio segment, the vertical axis represents the frequency (pitch-related) and the color within the heat map provides the third dimension, the amplitude (loudness) of a frequency at a particular time.
to music in a way that captures meaningful tonal and harmonic relationships (see Chuan et al., 2020).
We conclude with a well-known issue that the high accuracy (relevance) of some automatic music recommender systems tends to generate the same type of music so that people get bored quickly. The much-discussed concept of serendipity is the idea of a recommender system that can (pleasantly) surprise a listener. Measuring serendipity is not easy or straightforward. One cannot simply import it from a library (e.g., Python sklearn) unlike accuracy metrics such as, say, 'precision', 'recall', 'discounted cumulative gain' or '(mean) average precision', etc. We argue however that the MDS maps presented in this article endogenously includes a degree of serendipity. First, the object of the study is the composer, not the composition (e.g., discovering the music world of Stravinsky is much more than knowing about his 'Rite of Spring'). Second, composers that are closer on the map are more similar not only because they share the same ecological characteristics but also because they share the same musical (i.e. composer-to-composer, and style) influences. However, as argued in Georges (2017), composers who are similar in their composer-tocomposer musical influences may have nevertheless produced music that might sound different in that they belong to different ecological niches (referred in that article as adaptation or music speciation and evolution). Listening to a composer that is in near vicinity to another better-known composer on the MDS maps may, in that sense, lead to new discoveries with sustained serendipity. Further research could possibly compare current MDS maps with maps based on a distance matrix that excludes composer-to-composer musical influences (i.e., excluding the information from the influence network), gaining relevance in terms of similarity accuracy at the possible cost of lower serendipity.