Understanding massive artistic cooperation: the case of Nico Nico Douga

Original Article

Abstract

Many online social networks have been studied in the last decade, giving us insights into the way people diffuse information, communicate, and organize themselves. In this article, we focus on the emergent organization in massive artistic cooperation. We study the creation process of complex music videos in a platform called Nico Nico Douga. We give insights into three aspects of emergent organization:
  • The relation between popularity (in terms of views) and influence on the cooperation process.

  • The specialization of creators.

  • The organization of the network of citation.

Keywords

Social network analysis · Massive cooperation · Artistic cooperation · Nico Nico Douga · Peer production · Online social networks

1 Introduction

Online social networks (OSN) have attracted a lot of attention from scientists in many different fields. Through the large quantity of data about human behavior that they make available, they make it possible to study interactions between individuals in a way that was not possible before. Among the tremendous amount of work published, we can cite works on information diffusion (Bakshy et al. 2011; Yang and Counts 2010; Morales et al. 2014), structural properties (Amaral et al. 2000; Clauset et al. 2009; Savage et al. 2014), community structures (Leskovec et al. 2009), and influence (Cha et al. 2010), among others. A wide variety of platforms have been studied, including Facebook, Twitter, Wikipedia and many others.

In this article, we work on a particular process, massive artistic cooperation, which, to the best of our knowledge, has not been studied before. More specifically, we study the creation process of complex music videos on a platform called Nico Nico Douga. Our goal is to better understand the emergent organization of this massive artistic cooperation, how it takes place, and what roles different users play in it.

In the next section, we introduce massive artistic cooperation and how it relates to other forms of collaborative creation. We then present our dataset: what data we extracted and how. Next, we study the relation between the popularity of a video (the number of times it has been watched) and its importance in the creation process. The following section investigates the specialization of actors, one aspect of the self-organization. Finally, the last part concerns the network structure of the creation process, in particular how different categories of creations are related to each other, and what roles videos can play in it.

2 Massive artistic cooperation

The processes we want to study constitute a phenomenon of artistic cooperation. In this section, we describe the particularities of such a creation process, and in particular why we speak of cooperation rather than collaboration. We then discuss the specific case of artistic cooperation, and why it has not been widely studied until now.

2.1 Massive cooperative creation

A distinction is usually made between two forms of collective creation: collaborative and cooperative. In the collaborative form, the actors involved are aware of being part of the same process and of sharing a common objective. These actors are able to communicate, and often use a centralized approach to distribute tasks among themselves. A good example of a collaborative process is the design of a car or a plane: many actors are involved, and tasks are distributed among them, but meetings and communication in general are essential to reach a final, collective result.

Cooperation, in contrast, occurs when many actors or groups of actors are involved in the same process without being formally in contact with each other, each of them following their own personal objective or interest. They can use the productions of other actors in the way that suits their needs, and there is no central organization or planning, but instead a form of self-organization. Actors do not act against each other, but do not act for others either. An example well known to the reader is scientific production: researchers are often inspired by (or directly use) the work of many different authors whom they often do not know personally, and with whom they are not in direct communication. Despite this, science progresses in many directions and in an efficient manner.

We speak of massive cooperation when a large number of individuals are involved in the creation process. Because of the time and communication costs, collaboration processes become less and less efficient as the number of individuals increases. Cooperation, in contrast, can exist at both small and large scales. As the two situations might have different properties, we use the specific name of massive cooperation to describe a cooperation process involving a large number of users. Although there is no well-defined threshold for this definition, we can consider that at least a hundred users are necessary to speak of massive cooperation.

On the Internet, new forms of collective work have emerged. In the social sciences, this kind of production has been studied under the terms Peer production (Benkler and Nissenbaum 2006; Wilkinson 2008; Duguid 2006; Haythornthwaite 2009) and Open collaboration (Forte and Lampe 2013; Riehle et al. 2009). Behind these terms, which cover many different things such as wikis, crowdsourcing, open source development, and many others, is the idea of a collective production in which everyone is free to join and to contribute. Many factors have been identified as important in such creation, for instance the turnover of individuals or the presence of “leaders” who catalyze the creation.

The goals of cooperative creation can be very diverse. They are sometimes obvious, as in Wikipedia or in open source development platforms such as Github, but a phenomenon such as the efficient diffusion of information can also be considered the result of a process of cooperative creation. The exact frontier between what is and is not a cooperative creation is outside the scope of this paper. It is for instance obvious that Wikipedia and open source development present some degree of communication and centralization, and are, therefore, a hybrid form of collective creation.

2.2 Artistic cooperation

Although collaboration is at the center of many artistic creations, including commercial music and movie creation, large-scale cooperation is rarer, and to our knowledge there is no successful, large-scale platform devoted to this purpose. Content databases exist, containing elements (sound, pictures, 3D models, etc.) used by artists to create new content, but this new content is usually not fed back into the database, so there is no open cooperation, or at least not in a way that can be tracked.

Some platforms exist for the cooperative creation of drawings,1 videos,2 writings3 and so on, but most of them are chiefly games or experiments, and do not produce content that is widely diffused beyond the creators themselves. Furthermore, they seem to have relatively few participants. In contrast, popular music videos published on Nico Nico Douga (NND), the platform we study, often have hundreds of thousands of unique viewers, and become well known even outside of their original platform.

One particularity of artistic creation is that it requires a certain degree of freedom: unlike in an engineering task, artists usually work better on what motivates or inspires them, and are, therefore, more comfortable in an open environment. However, creating complex works is often not possible for a single person, especially a non-professional. Massive cooperation suits these needs well; however, creations from the commercial world are usually protected by copyrights that forbid most creators from using them as they would like. In contrast, the creators of music and music videos on the platform we study in this paper, Nico Nico Douga (NND), release their creations without any form of protection, and any other creator is free to create whatever she likes based on them. This particular setup is a reasonable explanation for the wide success of cooperative creation of music videos on NND.
Fig. 1

Degree distribution of the Nico Nico Dataset

3 NND dataset

3.1 Nico Nico Douga network

Nico Nico Douga (NND) is a video sharing social network, with functionalities comparable to those of YouTube or DailyMotion. NND originated in Japan, and is extremely popular in this country, with over 20 million registered users as of 2014 and a ranking among the 15 most visited websites. Compared to these platforms, NND offers some additional possibilities that we will use in this paper, such as associating keywords with videos and making references between videos.

While any kind of video can be published on NND, this paper will concentrate on videos involved in artistic cooperation, and more particularly on Music Video creation.

3.2 VOCALOID, Hatsune Miku and music videos

VOCALOID is the name of a singing voice synthesizer, a special voice synthesizer able not only to pronounce words but also to sing them according to a defined tune. Since its introduction in 2004, this software has become hugely popular, in particular in Japan. NND played an important role in this popularity. Some users first created songs and published them on NND as music videos, usually with very simple visuals such as a static drawing. Other users liked these songs, and started to produce derived music videos, modifying visuals, voice and music in thousands of different ways. Inspired by the character represented on the packaging of the software, they started to associate all songs composed with the synthesizer with a fictional singer, called Hatsune Miku. Although other voices have been created for VOCALOID, corresponding to different characters, Hatsune Miku remains the most famous one. While the authors of the songs and derived videos were initially amateurs, they became so famous that some of them started to release commercial versions of their productions. For instance, the group of producers known as Supercell has published albums that sold hundreds of thousands of copies.

It is this cooperative creation of music videos that we will study in this article. To give an idea of the scale of the phenomenon, more than 200,000 videos have been tagged with the keyword VOCALOID in our dataset, and more than 130,000 with the keyword Hatsune Miku; these videos represent only a fraction of the total number of videos involved in the cooperative creation, as derivative productions such as dancing or singing are not usually tagged with it.

The dataset that we used has been used and described in previous papers (Hamasaki et al. 2013; Cazabet and Takeda 2014), but none of the topics we address here has been studied before. In this section, we describe its characteristics and the preprocessing we performed to study the artistic cooperation phenomenon in more depth. Although we cannot share the full dataset due to privacy and ownership issues, we share4 all data necessary to reproduce our results on the figshare repository.

The dataset was built by crawling the metadata associated with all videos published on the network between January 2007 and December 2012. It is composed of a set of 2.6 million videos with at least one keyword associated with them. For each video, the collected metadata consist of the author, associated keywords, associated description (author comment), number of views, and date of publication.

3.3 Categories of videos

The productions in NND can be of different categories, and we wanted to study the similarities and differences between these categories. Based on our knowledge of the network and on the observation of common tags, we derived a classifier, similar to the one used in Hamasaki et al. (2013), which attributes a category to a video according to the keywords associated with it, using direct matching and regular expressions. Keywords are rich on NND because any user is free to associate a keyword with a video, even if the video is not their own. As the number of keywords for a video is limited to 10, only the most relevant ones remain.

The possible categories are the following:
  • ORIGINALSONG: an original musical composition

  • SINGING: a person is singing (example: replace the original voice of a famous song)

  • VOCALOID: a person is using VOCALOID to create a voice (example: replace the original voice of a famous song)

  • INSTRUMENT: a person uses musical instruments to create this video (example: add an instrument to a famous piece of music)

  • PICTURE: a user creates one or several static pictures (example: illustrate a famous song with original drawing(s))

  • DANCING: a user films himself dancing to a famous song

  • 3DCG: a user uses 3D computer graphics software to create this video (example: animate dancers dancing to a famous piece of music)

  • ANIMATION: a user creates an animated picture (example: illustrate the lyrics of a famous song)

  • MASHUP: a mashup is a music video created by combining several original sources, such as different original pieces of music, or different versions of the same piece.

  • MAD: MAD videos are a type of video invented in Japan, involving a collage of videos and sounds from multiple sources. Compared to MASHUPs, MAD videos are freer in form, as they can be composed of people speaking, unrelated sounds or pictures, and do not usually form a single, coherent piece of music or song.

We were able to associate one of these categories with 1,427,715 of the collected videos. The remaining videos are discarded, as they are not candidates for being part of an artistic cooperative process.

A small fraction of videos (less than 1 %) can be associated with several categories. In this case, to avoid confusion, we keep only one category, the least common one.
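The keyword-based classification described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the classifier used for the paper: the patterns are hypothetical English placeholders (the actual rules are not listed in the text), and `category_counts` is an assumed mapping from each category to its global number of videos, used to keep the least common category when several match.

```python
import re

# Hypothetical patterns, for illustration only.
PATTERNS = {
    "SINGING": [re.compile(r"^tried singing")],
    "DANCING": [re.compile(r"^tried dancing")],
    "3DCG": [re.compile(r"^MMD")],
}

def matching_categories(keywords):
    """Return the set of categories with at least one pattern matching a keyword."""
    return {
        cat
        for cat, patterns in PATTERNS.items()
        if any(p.search(kw) for p in patterns for kw in keywords)
    }

def classify(keywords, category_counts):
    """Assign a single category to a video, or None if no pattern matches.

    When several categories match, the least common one is kept,
    according to category_counts (category -> number of videos).
    """
    matches = matching_categories(keywords)
    if not matches:
        return None
    # The category name is used as a tie-breaker to keep the result deterministic.
    return min(matches, key=lambda cat: (category_counts.get(cat, 0), cat))
```

A video tagged with both a singing and a dancing keyword would here be assigned to whichever of the two categories has fewer videos overall.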

3.4 Links

In NND it is a common practice, in particular among music video creators, to reference the videos they used, or by which they have been inspired. This explicit citation is done by inserting the unique identifier of the referenced video in the comments of the video. This method is recognized by the platform, and a hyperlink is automatically created to the referenced videos. We crawled the collected comments to automatically extract the links associated to videos.

A total of 7.9 million links have been identified this way. One problem is that these references have many usages. For instance, they can be used to reference videos in the same series by the same authors, such as a long video cut into several parts, or just to link to the other creations made by the same author. As we are only interested in cooperation, we filtered out all references to a later video, and all references to another video by the same author. Although this might suppress some interesting links, it allows us to focus on actual cooperation between individuals. The resulting graph, restricted to videos with a recognized category and at least one link to another video of a known category, is composed of 671,428 nodes and 960,854 edges. Its in-degree follows a typical long-tailed distribution, as illustrated in Fig. 1. Note that there are some irregularities in the out-degree distribution around \(d=20\) and \(d=40\), probably due to a platform limit.
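The filtering rules above can be sketched as follows. This is a minimal illustration: the data layout (a `videos` dictionary holding only videos with a recognized category, with hypothetical `author` and `date` fields) is an assumption, as is the choice to keep references between videos published on the same date.

```python
from datetime import date

def filter_links(links, videos):
    """Keep only the links that plausibly reflect cooperation.

    links:  iterable of (citing_id, cited_id) pairs
    videos: dict video_id -> {"author": ..., "date": ...},
            assumed to contain only videos with a recognized category
    """
    kept = []
    for citing, cited in links:
        a, b = videos.get(citing), videos.get(cited)
        if a is None or b is None:
            continue  # one end has no recognized category
        if b["date"] > a["date"]:
            continue  # reference to a later video
        if a["author"] == b["author"]:
            continue  # reference to a video by the same author
        kept.append((citing, cited))
    return kept

videos = {
    "v1": {"author": "A", "date": date(2008, 1, 1)},
    "v2": {"author": "B", "date": date(2009, 1, 1)},
    "v3": {"author": "B", "date": date(2010, 1, 1)},
}
links = [("v2", "v1"), ("v1", "v2"), ("v3", "v2"), ("v3", "vX")]
kept = filter_links(links, videos)
```

In this toy example only the reference from `v2` to `v1` survives: the others point to a later video, to a video by the same author, or to an unknown video.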

4 Relation between popularity and influence on the creation process

In this section, we study the relation between two forms of impact that the videos of our dataset can have. On the one hand, videos can be viewed by any user of the platform, whether they themselves create videos or not. By counting the number of views accumulated by a video, we can evaluate its Consumption Impact, or Co-Impact, which is a measure of the popularity of a video among the general public.

On the other hand, each video can also inspire other creators, and lead them to produce new videos based on it. By counting the number of references accumulated by a video, we can compute its Creation Impact, or Cr-Impact.
Fig. 2

Scatter plot, for each video, of its number of views compared to its number of references

Fig. 3

Same as Fig. 2, but with colors identifying some of the categories. We can observe significant differences between categories (color figure online)

4.1 Impact of individual videos

The first question we ask is how strong the linear correlation between these two forms of impact is. We intuitively expect them to be correlated, but our observation shows that this relation is not reciprocal. In Fig. 2, we plot the relation between these two variables. As we can see, the relation exists but is rather weak (Pearson’s correlation \(=\) 0.31).

We can observe empirically in Fig. 2 that the relation between these variables is asymmetric:
  • If a video has a low Co-Impact, then it also has a low Cr-Impact

  • If a video has a high Cr-Impact, then it also has a high Co-Impact

  • If the video has a low Cr-Impact or a high Co-Impact, it does not tell us much about the other variable.

This is an interesting observation: a production that is successful among the general public will not necessarily inspire other creators, but the videos that generate the most subsequent creations are also among the most popular videos in terms of number of views.

In Fig. 3, we propose a variation of this graph, in which the color of the dots corresponds to some chosen categories: CG3D, ORIGINALMUSIC and SINGING. We can observe different behaviors for each of these categories. In particular, CG3D videos tend to have a high Cr-Impact relative to their Co-Impact, while SINGING videos show the opposite behavior. The interpretation is that many SINGING videos have a lot of success among the general public, but do not inspire many other creators. In contrast, CG3D videos are rarely viewed as much as the most famous videos, but many other authors create new videos based on them.

Interestingly, the correlation coefficient of the two measures for SINGING and ORIGINALMUSIC videos in particular is considerably higher than the global one, with \({\text {cor}}=0.52\) and \({\text {cor}}=0.72\), respectively. This means that the rather low linear correlation between Cr-Impact and Co-Impact observed when taking all videos together might be explained as the combination of stronger linear correlations, with different slopes, between these two properties for videos of the same category.
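As an illustration of this per-category comparison, the following sketch computes Pearson's correlation between views and references separately for each category. It assumes raw counts are used (the text does not say whether any transformation is applied) and that each category has non-constant values; the input layout is hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson's linear correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)  # undefined if either variance is zero

def correlation_by_category(videos):
    """videos: iterable of (category, views, references) triples."""
    groups = {}
    for cat, views, refs in videos:
        xs, ys = groups.setdefault(cat, ([], []))
        xs.append(views)
        ys.append(refs)
    return {cat: pearson(xs, ys) for cat, (xs, ys) in groups.items()}
```

Pooling categories whose points lie along different slopes can yield a low global coefficient even when each category taken alone is strongly correlated, which is the effect described above.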
Fig. 4

Scatter plot for each author of her values of Cr-Impact and Co-Impact

4.2 Impact of authors

We can also define the Impact of an author as the sum of the Impact of the videos she has published. The overall relation is, unsurprisingly, rather similar to the one for the videos. In Fig. 4, we show this relation, including another element: the “specialty” of the author, defined as the most common type of video she has created. Here again, we can observe different trends according to the specialty of the author. However, compared to the video case, we can see that the authors specialized in CG3D do not present a clear pattern. A possible explanation is that these authors publish other types of videos as well, or that several types of CG3D creators exist.

4.3 Preponderance of top influencers

As is common in social networks, the influence of authors follows a long-tailed distribution. This means that a small fraction of users, whom we call top influencers, generate most of the impact, both for Creation and Consumption Impact. However, the strength of these top users is not the same for both types of influence. In Fig. 5, we show that the top users are significantly more important for Cr-Impact. If we look at the influence of the top 10, top 100 and top 1000 users, we observe that they represent, respectively, 2, 14 and 40 % of Co-Impact, and 12, 37 and 68 % of Cr-Impact.

This shows how essential a fraction of the users can be for the cooperative creation. While hundreds of thousands of users are involved in the creation process, only 1000 of them are the source of inspiration of more than two-thirds of all creations.
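The share of impact held by the top users can be computed as in the following sketch, where `impact_by_author` is a hypothetical mapping from each author to her total Cr-Impact or Co-Impact.

```python
def top_k_share(impact_by_author, k):
    """Fraction of the total impact held by the k authors with the highest impact."""
    values = sorted(impact_by_author.values(), reverse=True)
    total = sum(values)
    if total == 0:
        return 0.0
    return sum(values[:k]) / total
```

Applying it with k = 10, 100 and 1000 to each of the two impact measures would give the kind of percentages reported above.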

However, the picture is not one of famous creators attracting all the attention on one side, and creators whose work is unknown to the general public on the other. On the contrary, the number of views is distributed more evenly than the number of citations. In Fig. 6, we represent the relation between the cumulative frequency of Cr-Impact and that of Co-Impact for authors sorted by Cr-Impact. We can read this graph in the following manner: the authors that attract 50 % of all references (Cr-Impact) attract only 13 % of all views. The authors that concentrate 85 % of Cr-Impact represent only around 50 % of the global Co-Impact.
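The curve relating the two cumulative frequencies can be built as in the following sketch. The input layout (one pair of values per author) is an assumption, and both totals are assumed to be non-zero.

```python
def cumulative_shares(authors):
    """authors: list of (cr_impact, co_impact) pairs, one per author.

    Returns one (cumulative Cr share, cumulative Co share) point per author,
    with authors taken in decreasing order of Cr-Impact.
    """
    ordered = sorted(authors, key=lambda pair: pair[0], reverse=True)
    total_cr = sum(cr for cr, _ in ordered)
    total_co = sum(co for _, co in ordered)
    points, acc_cr, acc_co = [], 0, 0
    for cr, co in ordered:
        acc_cr += cr
        acc_co += co
        points.append((acc_cr / total_cr, acc_co / total_co))
    return points
```

A curve lying below the diagonal indicates that the authors concentrating the references hold a smaller share of the views.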

We can conclude that famous creators are very important in the creation process, because they inspire other creators and thereby boost the overall number of creations. These other creations, although less inspiring for other authors, nevertheless attract a lot of interest in terms of views from the general public of NND.
Fig. 5

Cumulative frequency of the impact of users, ordered by their rank. We can observe that the top users are more important in terms of number of references than in terms of number of views

Fig. 6

Relation between the cumulative frequency of Cr-Impact of authors, sorted by decreasing Cr-Impact, and the cumulative frequency of Co-Impact for these same authors

5 User specialization

When several individuals are working together to solve a complex problem, one may argue that they are more likely to succeed if they do not share the same knowledge, and if they have different backgrounds and capabilities that can complement each other. This is a common observation in economics, with the specialization of labor for instance, or in the animal world, where social animals like ants or bees are highly specialized. Because each individual is better at what she is doing than anyone could have become if they all worked on everything, each individual can take care of one aspect of the global problem, and the resulting product is better than if it were made by a similar number of non-specialists.

An interesting aspect of our dataset is that we can easily differentiate the type of contribution done by the different users. Our idea is, therefore, to test if we can observe this specialization, and how important it is.
Fig. 7

Difference in entropy between observation and a null model, averaged by authors with a same number of videos published. The lower entropy of the observation reflects the specialization of the users

5.1 Entropy as a measure of specialization

78,833 users have published two or more videos of an identified category in our dataset. We want to check whether users are more likely to publish several videos of the same category than would be the case if videos were published independently of their types. To do so, we use Shannon entropy (Shannon 1951), which can be used as a measure of diversity, to compute the average diversity in the types of videos published by users. For a given user, if all the videos she publishes are of the same type, then the Shannon entropy is 0: there is no diversity in her publications. The more different types of videos she publishes, the higher the diversity. As the maximum possible entropy for an author depends on the number of videos she published, we compute the average entropy for authors having published the same number of videos, as presented in Fig. 7. In this figure, we also display the expected value of entropy at this level under a null model, which consists of a random attribution of categories following the observed global frequencies of each category. We observe a large difference independently of the number of videos published by the author. This observation confirms the specialization of authors in terms of the types of videos they publish.
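The entropy measure and the null model can be sketched as follows. This is a minimal illustration under stated assumptions: the logarithm base (2, giving bits) is not specified in the text, and the null model is estimated here by Monte Carlo sampling with a hypothetical number of trials.

```python
import math
import random
from collections import Counter

def shannon_entropy(categories):
    """Shannon entropy, in bits, of the categories of one author's videos."""
    n = len(categories)
    counts = Counter(categories)
    return sum((c / n) * math.log2(n / c) for c in counts.values())

def null_model_entropy(n_videos, global_freq, trials=1000, seed=0):
    """Average entropy of n_videos categories drawn at random
    according to the global category frequencies (category -> weight)."""
    rng = random.Random(seed)
    cats = list(global_freq)
    weights = [global_freq[c] for c in cats]
    total = 0.0
    for _ in range(trials):
        sample = rng.choices(cats, weights=weights, k=n_videos)
        total += shannon_entropy(sample)
    return total / trials
```

For authors with a given number of videos, specialization shows up as an average observed entropy lower than the value returned by `null_model_entropy` for that number of videos.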
Fig. 8

Visualization of the specialization of famous users. Each column corresponds to a user. The height of the colored column corresponds to the fraction of the famous videos of the corresponding authors that are of the category corresponding to this color. We can observe that two-thirds of the famous users have published only one type of famous video, but that different authors can be specialized in different categories (color figure online)

5.2 Specialization of famous users

Only a small fraction of users manage to publish videos that become famous. In this section, we show that the specialization is even more obvious when studying only famous videos. We define a video as famous if it has more than 500,000 views, and we define a famous user as one with at least 5 famous videos. We identify 60 such famous users. In Fig. 8, we represent the specialization of these users. We can see that two-thirds have published famous videos of a single category, while most of the others also have most of their famous videos in a single category, often one that is not common in the dataset, in particular ORIGINALMUSIC. These famous users are, therefore, experts in a particular domain of creation.
Fig. 9

Description of the possible triads for cooperative production

6 Network of cooperation

In this section, we are interested in understanding the network of cooperation in NND. We first describe the topological properties of the cooperation network, using an adapted version of the triad census method. In a second part, we describe how different categories of videos relate to each other.
Table 1

Topological properties of the cooperation: triad census, diameter and maximal in-degree for three networks

             NND      Twitter   DBLP
Synthesis    0.0078   0.0463    0.1578
OTM          0.99     0.853     0.684
Cascade      0.0012   0.071     0.1383
Transitive   0.0006   0.0289    0.0193
Diameter     97       15        27
maxDegree    7838     987       1492

6.1 Topological properties of the cooperation

Triad census is a common method to unravel topological properties of networks (Davis and Leinhardt 1967). It is usually applied to directed social networks, in which reciprocal links are possible, and counts the relative proportion of each of the 16 possible types of triads. However, in the case of our cooperation network, we are only interested in triads involving more than one link, and without reciprocity. Figure 9 summarizes the possible triads. Our interest is to see the frequency of each type of cooperation:
  • Synthesis triads (a) appear when a production makes references to several other ones with no relations between themselves. These productions are likely to take their inspiration from several sources.

  • OTM (One-to-Many) triads (b) appear when several productions are based on the same one, and do not reference each other. The typical situation is a famous production inspiring many unrelated new ones.

  • Cascade triads (c) appear when a production A references a production B, B references a production C, but A does not reference C. The reason might be that A is not aware of C, A is only referencing the “added value” of C, or A just does not acknowledge the transitivity.

  • Transitive triads (d) can have two interpretations: they can be seen as a cascade triad acknowledging the previous work, or as a synthesis of two works, one of which references the other.
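The four triad types listed above can be counted as in the following sketch. It is an illustration rather than the implementation used for the paper: edges are assumed to be (citing, cited) pairs without reciprocal links or self-loops, and the enumeration of pairs of neighbors is only practical for moderate degrees.

```python
from itertools import combinations

def triad_census(edges):
    """Count Synthesis, OTM, Cascade and Transitive triads.

    edges: iterable of (citing, cited) pairs, assumed without reciprocity.
    """
    edges = set(edges)
    succ, pred = {}, {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)
        pred.setdefault(v, set()).add(u)

    def linked(x, y):
        return (x, y) in edges or (y, x) in edges

    counts = {"synthesis": 0, "otm": 0, "cascade": 0, "transitive": 0}
    # Synthesis: one production cites two productions unrelated to each other.
    for cited in succ.values():
        counts["synthesis"] += sum(1 for b, c in combinations(cited, 2) if not linked(b, c))
    # OTM: two unrelated productions cite the same one.
    for citing in pred.values():
        counts["otm"] += sum(1 for b, c in combinations(citing, 2) if not linked(b, c))
    # Cascade and Transitive: a cites b and b cites c.
    for b, cited in succ.items():
        for a in pred.get(b, ()):
            for c in cited:
                if a == c:
                    continue  # reciprocal pair, excluded by assumption
                if (a, c) in edges:
                    counts["transitive"] += 1
                elif (c, a) not in edges:
                    counts["cascade"] += 1
    return counts
```

Dividing each count by the sum of the four gives proportions that can be compared across networks.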

The results of a triad count are usually not used as they are, but compared with those of other networks. We selected two networks with different properties for the comparison.
  • DBLP (Ley 2002) is a well-known database of scientific papers. We used a version including references between papers, as described in Tang et al. (2008). Nodes represent papers while links represent citations

  • TwitterRT is a Twitter dataset described in Toriumi et al. (2013) and Remy et al. (2013), which contains between 80 and 90 % of all tweets published by Japanese users over a period of several days. Nodes represent tweets, and links represent a follower/followee relation between a retweeter and either the original author of the tweet, or a previous retweeter of this same tweet. This dataset is different from DBLP and NND, because it corresponds more to information diffusion than to pure cooperation. However, we propose that information diffusion on social media can be seen as a special case of cooperation in which the productions are (hopefully) not altered. A retweet by a user U can be seen as the publication of a production on the personal space of user U. This production is directly inspired by the production of a previous user, and is, most of the time, identical. In this particular case, the implicit objective of the cooperation is not to be understood as the creation of more complex creations, but as the efficient diffusion of information.

Table 1 summarizes the prevalence of each type of triad, together with the diameter and maximal in-degree of each graph. We first examine whether the cooperation is rather deep or shallow, that is, whether productions are built stone after stone, gradually, each author adding their own touch to the previous one, or whether, on the contrary, most authors directly create their own version of an existing, well-known creation. In the special case of information diffusion, this can be seen as transmission by word of mouth compared to broadcast diffusion. One could think that the diameter (the length of the longest chain) and the maximal in-degree could be indicators of this. However, this is not the case, mainly because mass cooperation, contrary to the diffusion of single pieces of information, forms “endless” chains: for instance, the last production of a series can be referenced by the first production of an otherwise unrelated, more recent series. The triad census is a more reliable indicator: OTM triads, and to a lesser extent Synthesis triads, are occurrences of shallow cooperation, directly from the source to the final usage. The two other types of triad contain an intermediary, and add depth to the cooperation.

Whereas all three networks have a majority of OTM triads, we can see that the proportions vary greatly. The citation dataset is the least shallow, which is not surprising, as it is common to cite the most recent articles, which in turn cite older articles, and so on. Our artistic cooperation dataset is by far the shallowest, an interesting observation that shows how important famous productions are for the cooperation process. We, however, want to stress that although the proportion is small, we nevertheless counted 418,111 Cascade triads and 217,643 Transitive ones, meaning that they are not rare phenomena, but are rather dwarfed by the extreme importance of OTM triads.

If we focus on the Synthesis triads, we can observe that DBLP has this time the highest proportion, which is again rather intuitive, as scientists do not conduct research based on a single previous work but rather build upon the work of many unrelated researchers. In NND, the global proportion of Synthesis triads is much lower, but, compared to Cascade and Transitive ones, it is rather high. We can, therefore, assume that creating videos inspired by several sources is also a common creation process in NND.
Fig. 10

Visualization of the relative importance of different categories in the cooperation process

Fig. 11

Heat map of the relations between categories. References go from the categories on the vertical axis to those on the horizontal axis. Left: preponderance of the citing category from the point of view of the cited category (sum of columns \(=\) 1). Right: preponderance of the cited category from the point of view of the citing category (sum of rows \(=\) 1)

6.2 References between categories

To study the cooperation behaviors in more detail, we looked at the chaining between categories. We cannot simply look at global metrics, such as the proportion of inter-category links, or the modularity of categories taken as classes, because categories are highly diverse. First, some categories are much more common than others: in particular, the SINGING category represents nearly 75 % of all videos, while other categories, such as ORIGINALMUSIC, represent less than 1 % of the total number of videos despite their importance in the cooperation process. Second, as we will see in the following sections, videos from each category have different relation patterns, which we highlight.

6.2.1 Prevalence of pairs of categories

In Fig. 10a, we show the distribution of the number of videos of each type. We see that SINGING videos are by far the most common. Figure 10b gives a complementary view of the links between these categories. It represents the probability of observing a particular type of chaining between two categories. The number of occurrences of a chain of categories \((c_1,c_2)\) is defined as:
$$\begin{aligned} {\text {occ}}(c_1,c_2) = | \{ (v_1,v_2) \in E : {\text {cat}}(v_1)=c_1,\ {\text {cat}}(v_2)=c_2 \} | \end{aligned}$$
with E the set of edges of the reference network and cat(x) the category of the video x. The graphic is otherwise a classical sunburst chart: the inner ring represents the type of the referenced video and the outer one represents the type of the referencing video. This graph allows us to see how frequent a particular type of cooperation is. In particular, it is interesting to compare it with Fig. 10a. A first observation is that ORIGINALMUSIC videos represent more than half of all cited videos, while they represent a small fraction of the total number of videos published. ANIMATION and VOICE are also overrepresented among referenced videos. We can say that videos of these types are good at generating cooperation. On the contrary, SINGING videos are the most common type in the outer ring: videos of this type tend to reference other videos, and are not referenced as much as videos of other types. Finally, we can observe that some types of cooperation represent a significant fraction of all references between videos. In particular, SINGING videos referencing ORIGINALMUSIC represent nearly half of all references.
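The count \({\text {occ}}(c_1,c_2)\) defined above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `occ`, the video identifiers and the toy edge list are hypothetical, and each edge \((v_1, v_2)\) is taken here to mean that \(v_1\) references \(v_2\).

```python
from collections import Counter


def occ(edges, cat):
    """Number of edges (v1, v2) for each ordered pair (cat(v1), cat(v2))."""
    return Counter((cat[v1], cat[v2]) for v1, v2 in edges)


# Hypothetical toy data; assumption: (v1, v2) means "v1 references v2".
cat = {
    "song": "ORIGINALMUSIC",
    "cover1": "SINGING",
    "cover2": "SINGING",
    "dance": "DANCE",
}
edges = [
    ("cover1", "song"),
    ("cover2", "song"),
    ("dance", "song"),
    ("dance", "cover1"),
]

pair_counts = occ(edges, cat)

# Share of all references accounted for by each pair of categories,
# which is the quantity a sunburst chart of this kind would display.
total = sum(pair_counts.values())
pair_shares = {pair: n / total for pair, n in pair_counts.items()}
```

Dividing each count by the total number of edges gives the probability of observing a given chaining, so rare categories can still stand out when they attract a large share of references.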

6.2.2 Heat map analysis

In Fig. 11, we present the heat maps of the relations between categories. The categories on the vertical axis correspond to categories making references, while categories on the horizontal axis are receiving references. The heat map on the left corresponds to the relative importance of the referencing category for the referenced category (sum of columns \(=\) 1) while the one on the right corresponds to the relative importance of the referenced category in the referencing one (sum of rows \(=\) 1).
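The two normalizations described above can be made concrete with a short sketch. It assumes the raw input is a table of counts indexed by (referencing category, referenced category); the function name `normalize` and the toy counts are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict


def normalize(counts):
    """Return (by_column, by_row) versions of a {(citing, cited): n} table.

    by_column: each referenced (cited) category sums to 1 over the citing ones.
    by_row:    each referencing (citing) category sums to 1 over the cited ones.
    """
    row_total = defaultdict(int)
    col_total = defaultdict(int)
    for (citing, cited), n in counts.items():
        row_total[citing] += n
        col_total[cited] += n

    by_column = {(r, c): n / col_total[c] for (r, c), n in counts.items()}
    by_row = {(r, c): n / row_total[r] for (r, c), n in counts.items()}
    return by_column, by_row


# Hypothetical toy counts: (referencing category, referenced category) -> edges.
toy = {
    ("SINGING", "ORIGINALMUSIC"): 6,
    ("DANCE", "ORIGINALMUSIC"): 2,
    ("DANCE", "DANCE"): 6,
}
by_column, by_row = normalize(toy)
```

In this toy table, DANCE accounts for only a quarter of the references received by ORIGINALMUSIC (column view), while three quarters of the references made by DANCE videos go to other DANCE videos (row view). The two views can therefore tell different stories about the same pair of categories.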

We can first observe that some categories are self-replicating, or assortative, i.e., videos of the same type tend to reference each other. This is particularly true for DANCE and CG3D. This effect is also strong for MAD movies, but mostly in one direction: MAD movies mostly reference other MAD movies, but they are also referenced by other categories of videos, in particular SINGING videos.

In the left figure, we can observe that SINGING and MASHUPS are the most common referencing categories for most other categories, except CG3D and DANCE.

In the right figure, we can observe that ORIGINALMUSIC and, to a lesser extent, VOCALOIDVOICE are the most common sources of inspiration for most other types, except CG3D, MAD and DANCE.
Fig. 12

Detail of the network of cooperation among famous videos (nbViews \(>\) 500,000)

6.2.3 Typical patterns of cooperation

In the previous sections, we studied the topological properties of the network and how categories are linked to one another. We argued that the cooperation was shallow, and we observed some common pairs of categories. By limiting ourselves to famous videos, we can generate a visual representation of the network that helps to understand these observations intuitively. We generate the network of references between the 664 videos having more than 500,000 views. Figure 12 is a partial view of this network, where the sizes of the nodes are proportional to their in-degree. We can see that it is mostly composed of small connected components (max size: 25), containing different categories. These connected components are typically star shaped, or composed of several star-shaped subnetworks. The center of the star is commonly an ORIGINALMUSIC video.
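The construction described above (keep only famous videos, then look at in-degrees and connected components) can be sketched as follows. This is an illustration rather than the pipeline used for Fig. 12: the function names, the view counts and the edges are hypothetical, and edges are assumed to point from the referencing video to the referenced one.

```python
from collections import defaultdict


def famous_subnetwork(edges, views, min_views=500_000):
    """Keep videos with more than min_views views and the edges between them."""
    famous = {video for video, n in views.items() if n > min_views}
    kept = [(u, v) for u, v in edges if u in famous and v in famous]
    return famous, kept


def in_degrees(nodes, edges):
    """Number of references received by each node."""
    degree = {node: 0 for node in nodes}
    for _, cited in edges:
        degree[cited] += 1
    return degree


def connected_components(nodes, edges):
    """Weakly connected components, ignoring edge direction."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        components.append(component)
    return components


# Hypothetical toy data.
views = {"song": 900_000, "cover": 600_000, "dance": 700_000,
         "other": 800_000, "small": 10_000}
edges = [("cover", "song"), ("dance", "song"), ("small", "song")]

nodes, kept_edges = famous_subnetwork(edges, views)
degree = in_degrees(nodes, kept_edges)
components = connected_components(nodes, kept_edges)
component_sizes = sorted(len(c) for c in components)
```

In this toy example the famous videos form one star centered on `song` plus one isolated video, which mirrors on a very small scale the star-shaped components described above.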

This visualization can help us to understand how the cooperation takes place on NND: an author proposes a new music video, which becomes successful. This video inspires other authors, specialists in their own fields, to create videos based on it. As a consequence, several variations of the original music, improved in one direction or another, also become famous among the NND public.

7 Conclusion

In this article, we investigated several aspects of a massive artistic cooperation phenomenon on an online social network platform, Nico Nico Douga.

The main contributions of the paper are insights into the properties of the emerging organization of such a cooperation process. We have shown that:
  • The relation between the popularity (in terms of views) and the influence on the cooperation process is complex, but could be better understood by studying the type of videos.

  • Users are specialized. Whereas many works have shown the different roles of users (Bridges, Hubs, etc.) in dynamic processes such as information diffusion, we were able to show that, for artistic cooperation, users were also specialized in terms of the type of their productions. This can explain the high quality of some productions on NND, despite all contributors being only amateurs. We have shown that this specialization was especially strong among famous authors of videos.

  • In the last section, we investigated network properties of the cooperation. We found that cooperation was mostly shallow, that is to say, long sequences in which video A references video B, which references video C, and so on, are uncommon. Although such long chains exist, most of the works inspired by a given video reference it directly, without intermediaries.

The results of this study allow us to better understand how mass cooperation processes occur. This form of creation is ubiquitous, but studying it is usually extremely difficult due to the lack of information. Because we have extremely rich information about both the videos and the relations between them, we were able to observe some properties that had never been observed before. Where possible, some of these properties should also be studied in other mass creation processes, to determine whether they are specific to the case studied in this paper or generic.

The observed properties could have broad implications for the way we understand these cooperation processes. For instance, if creators are specialized, then the balance of the number of specialists in each category might explain why some processes succeed and others do not. The complex relation between popularity and influence is also a factor to take into account when designing a creation-sharing platform: individuals using the platform only as consumers and individuals who are also creators could be shown different videos, as the latter might be more interested in influential videos than the former. Finally, our last observation on the network of citations is also very important. On the one hand, it shows that one cannot really understand the network of cooperation without taking into account the nature of the productions. On the other hand, it shows how concentrated the cooperation process is. Most previous works have only observed this concentration by looking at the degree distribution of nodes. In this paper, we also looked at the sequences of nodes using the triad census method, and observed that the One-to-Many pattern was extremely dominant. This is characteristic of a system in which the diffusion of creation occurs mostly directly, and not through long chains of inspiration, which tells us a lot about how the mass cooperation process takes place.

We hope our observations can be used to create new platforms for artistic cooperation and to improve existing ones, by taking into account the specificity of this form of cooperation. We also think that some of our observations are not specific to artistic cooperation but could be related to other types of cooperative creation, in particular scientific research.

In future work, we plan to study more aspects of self-organized cooperative creation processes, in particular their dynamics. Understanding the dynamics of the system could allow us to explain the observations we highlighted in this paper: how a video becomes popular or important for the cooperation, how users choose to contribute to a growing cooperation process, or how users find the source of their inspiration.

References

  1. Amaral LAN, Scala A, Barthelemy M, Stanley HE (2000) Classes of small-world networks. Proc Natl Acad Sci 97(21):11149–11152
  2. Bakshy E, Hofman JM, Mason WA, Watts DJ (2011) Everyone’s an influencer: quantifying influence on twitter. In: Proceedings of the fourth ACM international conference on web search and data mining, pp 65–74, ACM
  3. Benkler Y, Nissenbaum H (2006) Commons-based peer production and virtue*. J Political Philos 14(4):394–419
  4. Cazabet R, Takeda H (2014) Understanding mass cooperation through visualization. In: Proceedings of the 25th ACM conference on hypertext and social media, pp 206–211, ACM
  5. Cha M, Haddadi H, Benevenuto F, Gummadi PK (2010) Measuring user influence in twitter: the million follower fallacy. ICWSM 10:10–17
  6. Clauset A, Shalizi CR, Newman ME (2009) Power-law distributions in empirical data. SIAM Rev 51(4):661–703
  7. Davis JA, Leinhardt S (1967) The structure of positive interpersonal relations in small groups. Dartmouth College
  8. Duguid P (2006) Limits of self-organization: peer production and “laws of quality”. First Monday 11(10). http://firstmonday.org/ojs/index.php/fm/article/view/1405/1323
  9. Forte A, Lampe C (2013) Defining, understanding, and supporting open collaboration lessons from the literature. Am Behav Sci 57(5):535–547
  10. Hamasaki M, Goto M (2013) Songrium: a music browsing assistance service based on visualization of massive open collaboration within music content creation community. In: Proceedings of the 9th International Symposium on open collaboration, p 4, ACM
  11. Haythornthwaite C (2009) Crowds and communities: light and heavyweight models of peer production. In: HICSS’09. 42nd Hawaii International Conference on system sciences, 2009, pp 1–10, IEEE
  12. Leskovec J, Lang K, Dasgupta A, Mahoney M (2009) Community structure in large networks: natural cluster sizes and the absence of large well-defined clusters. Internet Math 6(1):29–123
  13. Ley M (2002) The dblp computer science bibliography: evolution, research issues, perspectives. In: Laender AHF, Oliveira AL (eds) String processing and information retrieval. Springer, Berlin, Heidelberg, pp 1–10
  14. Morales A, Borondo J, Losada J, Benito R (2014) Efficiency of human activity on information spreading on twitter. Soc Netw 39:1–11
  15. Remy C, Pervin N, Toriumi F, Takeda H (2013) Information diffusion on twitter: everyone has its chance, but all chances are not equal. In: 2013 International Conference on signal-image technology & internet-based systems (SITIS), pp 483–490, IEEE
  16. Riehle D, Ellenberger J, Menahem T, Mikhailovski B, Natchetoi Y, Naveh B, Odenwald T (2009) Open collaboration within corporations using software forges. Softw IEEE 26(2):52–58
  17. Savage D, Zhang X, Yu X, Chou P, Wang Q (2014) Anomaly detection in online social networks. Soc Netw 39:62–70
  18. Shannon CE (1951) Prediction and entropy of printed english. Bell Syst Tech J 30(1):50–64
  19. Tang J, Zhang J, Yao L, Li J, Zhang L, Su Z (2008) Arnetminer: extraction and mining of academic social networks. In: Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining, pp 990–998, ACM
  20. Toriumi F, Sakaki T, Shinoda K, Kazama K, Kurihara S, Noda I (2013) Information sharing on twitter during the 2011 catastrophic earthquake. In: Proceedings of the 22nd international conference on World Wide Web companion, pp 1025–1028. International World Wide Web Conferences Steering Committee
  21. Wilkinson DM (2008) Strong regularities in online peer production. In: Proceedings of the 9th ACM conference on electronic commerce, pp 302–309, ACM
  22. Yang J, Counts S (2010) Predicting the speed, scale, and range of information diffusion in twitter. ICWSM 10:355–358

Copyright information

© Springer-Verlag Wien 2016

Authors and Affiliations

  1. National Institute of Informatics, Tokyo, Japan