Digital repositories for the humanities are mainly developing around huge amounts of textual documents. However, researchers in the humanities and social sciences are also struggling to design research tools for corpora made of pictures, sounds, and objects. There is a real challenge to ensure better access to such resources, as well as to open new opportunities to represent and study them. Some humanities scholars have been using digital tools for more than 50 years: linguists, quantitative sociologists, archaeologists, etc. Other researchers have only recently discovered the possibilities offered by digital tools, exploring, for instance, the use of native data produced by digital technologies and sensors. More and more scholars are wondering about the ways they can use digital tools to transform or enhance their work methods, especially with regard to non-textual data. This is the case of Dessins de dieux (DDD), the Children’s Drawings of Gods project. Its members have been going through the production, gathering, digitization, and analysis of thousands of drawings made by children from different parts of the world. They are using extant digital tools for their research, designing new ones, and setting up a large-scale knowledge infrastructure, i.e. an infrastructure designed, built, and maintained to enhance and further the production and dissemination of knowledge. Our paper documents some aspects of this process, drawing on an ongoing ethnographic inquiry. Doing so, we examine how the notions of “equipment [équipement]” and “equipping work [travail d’équipement],” terms coined by one of us in previous studies (Vinck, 2006, 2009, 2011; Vinck & Penz, 2008), may shed some light on the shaping of this knowledge infrastructure.

The Sociological Study of Knowledge Infrastructures

If infrastructuring processes are relevant to scholars in the field of Science and Technology Studies (STS), it is because knowledge infrastructures shape the knowledge they help to produce. Insofar as knowledge depends on data and instruments, knowledge infrastructures are important to look at because they are not passive backdrops. Whether they affect theory, information, or scientific communities, they are core sites of political action bringing forth concerns of inclusion, exclusion, and marginalization (Karasti et al., 2016b). Social scientists have described the mutual shaping of scientific infrastructures, instruments, research collectives, and knowledge (Shankar et al., 2016). Among other aspects, knowledge infrastructures participate in the reconfiguration of the academic labor environment.

Studying “Infrastructures-in-the-Making”

Since the seminal work of Star and Ruhleder (1996), STS scholars have witnessed these changes in research and knowledge production (Bowker, 2005; Hine, 2006; Edwards et al., 2007, 2013; Olson et al., 2008; Jankowski, 2010; Dutton & Jeffreys, 2010; Wouters et al., 2013; Mongili & Pellegrino, 2014; Karasti et al., 2016a, 2016b, 2016c). This interest can be easily explained: digital infrastructures are a major part of contemporary scientific work, even though we often hardly notice them. Infrastructure, indeed, is best thought of as a “contextualized ‘relation’” (Star & Ruhleder, 1996) rather than as a thing in itself, as an invisible although constant work, rather than as an easily defined or delineated artifact. In other words, the notion of infrastructure refers not so much to material entities as to the wider assemblage of activities that sustain their existence. To study infrastructures thus means to account for the situated and practical work of designing, developing, building, transforming, maintaining, and using infrastructures. Such activities ordinarily remain invisible—an invisibility that contributes to their efficiency. Users do not want to be bothered with problems related to the functioning and maintenance of infrastructures. They desire immediate access to what they search for and expect the infrastructure supporting the search to be “invisible”.

Invisibility and Infrastructural Inversion

Invisibility has long been a fundamental notion in infrastructure studies (Star & Ruhleder, 1996; Star, 1999, 2002; Bowker & Star, 1999; Bowker et al., 2009) as well as in the study of knowledge infrastructures (Karasti et al., 2016b). Going beyond the usual invisibility of infrastructures is only possible under peculiar circumstances, such as during their production or when they need to be repaired, maintained, or upgraded (Star & Bowker, 2006; Bowker et al., 2009; Karasti et al., 2010; Wouters et al., 2013; Jackson, 2014). An ethnographic methodology called infrastructural inversion is required to make visible otherwise neglected things, i.e. to scrutinize “technologies and arrangements that, by design and habit, tend to fade into the woodwork” (Bowker & Star, 1999, p. 34). Of course, the interest in and even the study of infrastructures is not an ethnographer’s privilege; rather, it is constitutive of our ordinary practices (Dagiral & Peerbaye, 2016). As epistemic, institutional, professional, ethical, or political issues associated with knowledge infrastructures are brought to the forefront, some previously unnoticed characteristics of those infrastructures tend to (re)gain some visibility.

Data and Knowledge Infrastructures

Besides the production and/or gathering of machines, protocols, standards, and people, major concerns in the shaping of knowledge infrastructures also include data design, production, sustenance, and circulation. As such, the study of data production cannot be dissociated from the study of knowledge infrastructures. The process by which “raw” data are processed and “cooked” into a new resource and integrated into the infrastructure is not straightforward (Bowker, 2005). Information technology (IT) experts and invisible workers have to face ambiguity and uncertainty as they determine if data are “good enough” to be embedded in a particular infrastructure and used for a specific research purpose. Subtle terminological choices, such as the distinction between knowledge and “mere” information or raw data, are deeply significant. Indeed, they often embed competing visions of what should be seen as knowledge, what the infrastructure should be like, and what one should get from it (Gieryn, 1983; Dagiral & Peerbaye, 2016). In the case of digital knowledge infrastructures, data production often involves conflicting views of future scientific practices (Granjou & Walker, 2016). More generally, data production depends on various technical and social components, including the physical and emotional engagement of science workers, as well as the development of tacit knowledge and socialization to professional standards. This is what makes unusual divisions of cognitive labor challenging, notably in the case of crowdsourcing and the involvement of “citizen scientists” in research projects (Lin et al., 2016; Shavit & Silver, 2016).

New Challenges

As we have seen, infrastructure studies already include a large set of works, many of which target knowledge infrastructures. However, several paths of inquiry remain underexplored. One of the most obvious concerns the scientific areas under study. While most published investigations have focused on the natural, medical, and engineering sciences, studies of knowledge infrastructures in the humanities and social sciences are only beginning to emerge. Some researchers recently started to study these fields (Kleiner et al., 2013; Wouters et al., 2013; Meyer & Schroeder, 2015). However, very few of them analyze the building of digital tools, data, and infrastructures. This lack of interest is somewhat paradoxical, as digital data sharing has become more and more important in the humanities over the last few years. An increasing number of researchers collect and organize digital data in order to make them searchable. Doing so, they face conflicting situations and frictional moments (Edwards et al., 2011; Jaton & Vinck, 2016). This chapter, and more generally our inquiry into the DDD project, aims to fill this gap.

Methodological Approach

The paper draws on data gathered during an ongoing participant observation among the members of the DDD project. The team members themselves furthered our involvement. They wanted to engage with social scientists in order to enhance their reflexivity regarding digitization and interdisciplinary collaboration. As their project appeared challenging in the field of digital humanities, involving ethnographers was seen as a way to document new opportunities—and, maybe, new difficulties—facing researchers. We have followed them as they have shaped a knowledge infrastructure designed to support their own research projects, but also to provide researchers all over the world with access to their unique collection of drawings (which now encompasses more than 6500 drawings). We have also kept track of their efforts in the design and development of digital tools for textual—at least at the beginning of the project—and iconographic analysis. We have had the chance to observe many meetings and work situations, and to actually record many of them. We have conducted formal and informal interviews with the research group members and also with the IT specialists who have been involved in the project. We have gathered various working documents, including the minutes of team meetings. Some of the research members agreed to write a personal research diary that we later used to conduct interviews.

Among other aspects, we have studied the migration of the data gathered throughout the DDD project from one database to another (for more on this, see the recent work by Serbaeva, Chap. 18, this volume). Indeed, the database that had initially been produced and used soon appeared to the researchers to be obsolete, limiting, and not sustainable for long-term preservation. It supported only simple requests, such as selecting drawings according to various categories (country, year, age, and gender). After the project gained support from the Swiss National Science Foundation (SNSF) in 2014, and following the advice of IT specialists, the project members engaged in the design of a new data model and the transfer of the data to a new digital interface, still under development, aimed at the long-term preservation of humanities and social science research data. Since then, the project members have been building up a digital knowledge infrastructure enabling online deposits, visualization, and analysis of drawings. To accomplish this, they have collaborated with various groups of IT specialists (Oberhauser, 2016). These circumstances enabled us to witness and describe the various stages that led to the existing digital infrastructure. However, the descriptions and analysis presented here focus mostly on the initial phases of the project, before it was funded by the SNSF.

From Drawings to Data: Some Steps on the “Long and Winding Road” of Equipping Work

In this empirical section, we use the notions of equipment and equipping work to analyze some aspects of the Children’s Drawings of Gods project. More specifically, these notions allow us to analyze the dynamics of producing metadata at various levels. We speak of equipment in a mundane fashion to refer to entities that are progressively added to others in order to enable certain actions. As plain as it appears, this notion has a real heuristic value, helping the ethnographer to shed light on issues that would otherwise remain invisible. For instance, one of us has shown in a previous study (Vinck, 2011) how the disagreements and misunderstandings between design technicians and engineers regarding technical drawings lead them to equip these drawings with marks or codes. Such details are central for both technicians and engineers, and they sometimes talk about them with such precision as to surprise (or bore) the unsuspecting observer. The proper advancement of a project relies on these easily forgotten signs: for example, an improperly numbered drawing will jeopardize the integration of each designer’s results. Analyzing the Children’s Drawings of Gods project through the lens of equipping work, we want to highlight some of the ordinarily overlooked aspects of the building of a digital infrastructure. Due to the limited space available here, this depiction can only be incomplete. We have chosen to focus on two specific aspects of the infrastructuring process: the production of the drawings themselves and the subsequent definition of descriptors for analytical purposes.

Producing “Raw” Data

We could portray the research process as if it had started with data collection. We would then begin our description with the gathering of a first set of drawings in Japan by a graduate student (Kagata, 2006). That was back in 2003. However, even in this preliminary step, the data had already been “framed.” As many authors have pointed out, we never face “raw” data (Räsänen & Nyce, 2013). In this case, the initial framing was scientific. The student’s research took place in the psychology department of a Swiss university. It was supervised by a professor of developmental psychology (Piagetian tradition), and a professor in the epistemology and methodology of psychology. The former would become the leader of the Children’s Drawings of Gods project. Drawings had already been used to investigate children’s representation of supernatural agents, in a developmental perspective (Harms, 1944). Additional studies had pursued this developmental approach while trying to identify the influence of such factors as culture, religion, and gender on children’s representations of god (Hanisch, 1996). However, these studies only analyzed drawings from Western countries heavily influenced by Christianity. Reflecting on this gap in the existing literature, both supervisors saw the project as an opportunity to expand the focus beyond Judeo-Christian representations within Western populations. They were also interested in the use of drawings as an alternative to the quantitative methods that predominated in their field, and which were thought to both orient and limit what respondents could express. They did not want to reproduce what they saw as interpretive “subjectivity” in existing analyses of drawings. Regarding the DEA student’s project, their aim was to come up with a list of descriptors that could offer an unbiased depiction of the children’s drawings. Such objective descriptors would, for instance, differentiate types of figures in the drawings (e.g. angel, Buddha, etc.), specify their location on the sheet (in the middle or close to the lower or upper edge), or define page orientation (landscape or portrait).

In this context, the student and her main supervisor designed a protocol to produce the drawings they would use as data. The researchers were very concerned with the precise formulation of the instructions to be given to the children. Such concern, aimed at avoiding cultural and gender bias, is usual in their discipline. The children were invited to draw on a blank A4 sheet of paper using specific water-resistant wax crayons. The student researcher organized the data collection to work with small groups of children, not to exceed ten participants at a time. Each child was seated at a different desk to ensure they would not see the drawings of their peers or communicate with one another. Once the drawing task was finished, each participant was asked to write on the back of the page: the date of the drawing, his/her birthdate, his/her first name, a restatement of the instructions they received at the beginning of the drawing task, and a narrative description of the drawing s/he had made. Additional material was sometimes collected and gathered with the drawings, such as a list of religious figures provided by the participating institutions. Following this protocol, the student collected 142 drawings in 2003–2004 from children 7–13 years old, from the Chiba, Fukushima, Kyoto, and Tokyo regions of Japan. Some participants attended public (secular) schools, while others attended religious (Buddhist) schools.

After data collection began in Japan, it was carried on by other researchers. Some of them were members of the Swiss research group, one of whom was a cultural psychologist from Buryatia (Russia). Others were partners from different countries (Russia, Iran, etc.), with various scientific and religious backgrounds, interested in collaborating on the project. The production and travelling conditions of these new drawings soon became a major concern for the research team members. They engaged in multilingual translations of the instructions and questionnaire. They also made thorough investigations about the local contexts and conditions in which the drawings would be made (e.g. if children had tables to draw on or not, if the task was performed in a closed classroom or in open air, if children were talking to each other, or if they worked silently as expected, if adults or older children helped them or not, etc.). The goal was both to stabilize the protocol despite its translations and to identify variations regarding its application. Translation was not only a matter of language, but also of cultural and institutional variations. Some partners had specific research or cultural interests, which were not necessarily directly congruent with developmental psychology, and would result in some local adaptations (e.g. adding a complementary question). In some cases, parents, teachers, or school directors would want more information, or even oppose the task (for instance if they thought that the act of drawing god should not be allowed). Sometimes, children themselves objected to the task for this reason, or because they believed that drawing god was a task reserved for “specially trained artists” only. Children who objected to the task were invited to write down why they had declined, to fill in the questionnaire, and to provide a written description of what their representation would have looked like, if they had drawn one.

We want to emphasize three aspects of this initial step of the research process. First, in order to produce the drawings, decisions were made regarding psychological questions; cultural, gender, and religious influence; local institutional influence; and material conditions. The researchers kept track of these decisions, even if what led up to the actual decision has been forgotten (as is often the case). The protocol and precise instructions used in the field are preserved with the drawings. Without these data, the drawings would lose part of their meaning and usefulness. For example, it would be very difficult to interpret the drawings if the instruction “You can draw all that comes to your mind when you think of the word god” was lost or forgotten.

Second, the decisions regarding the protocol are echoed by the efforts made in the field to standardize the drawing conditions, the possibilities offered, and the constraints imposed on the children. The work of eliminating bias is complex and entails various entities: the protocol, the instructions, and the questionnaires (with their multilingual translations), of course, but also tables and chairs, A4 sheets of paper, and wax crayons. It requires adults who can make sure that the children understand what they have to do, do not speak to one another, and do not look at one another’s drawings. It requires children who are willing to obey their teacher and the investigator, etc.

Third, the drawings are not isolated. Just as they are bound to traces of the conditions in which they were produced, they are linked to a set of other heterogeneous entities (see Fig. 17.1). On the back of the sheet are written the date, the first name of the participant, the instruction s/he recalled having received at the beginning of the task, and his/her own narrative description of the drawing. To the drawing is added a document containing a questionnaire and information regarding the gender, age, school, and religious affiliation of the child. The document and drawing are linked physically, but also through an identical reference number. A specific series of drawings is sometimes associated with photographs coming from the collection site. Further contextual information is gathered regarding the school, the specific context of the task, the person collecting the data, the local language, the precise wording of the task given to children, the number of children participating in the activity, and possible comments by partners in the field.

Fig. 17.1 Raw data as an assemblage of heterogeneous traces

Regarding these three aspects of the data collection, we would like to point out that various forms of equipping work must be achieved in order to render the drawings commensurable. If the information regarding the participants was lost, or if the drawings could no longer be linked to the protocols used to create them, then they could not be compared and thus analyzed. In fact, they would no longer qualify as data, as it is such commensurability that defines a dataset: entities that cannot be compared in any defined way do not belong in such a set. We understand the equipment of the drawings to be a condition for their existence as data. This point is often missed in discussions about raw data. There is no such thing as isolated data: links must be produced that define sameness and control the differences between the entities under study. Thus, the drawings do not travel alone: they are charged with information regarding the conditions in which they were produced and accompanied by other data. They emerge as parts of a vast assemblage gathering heterogeneous traces (drawing, narration, written information on the sheet of paper, marks and words on the questionnaire, documents of the protocol, pictures of figures of the site, etc.). The drawings are thus specific components related to other entities from which they receive some attributes and properties, among others “to be a drawing of god,” even when the so-called “drawing” consists of a white sheet without any pencil marks. Ethnographically speaking, it would be a mistake to describe the drawings as raw data and sever them from these other entities. As a matter of fact, researchers are quite preoccupied with maintaining the assemblage or repairing it (recoding, looking for the questionnaires associated with the drawings if missing, etc.) when it has been damaged (lack of space on a shelf, provisional separation of a subset of data for a specific treatment, travel incidents, etc.). Among other things, specific traveling conditions are required in order to protect these assemblages and to avoid losing some elements or the connections between them. Using the notion of equipment to understand data production thus helps us see the intrinsic connectedness of data, and all that is necessary in order to produce and preserve such connections.
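The linkage described above can be made concrete with a minimal sketch. It is not the project's actual data model: the field names, the `protocol_id` key, and the sample records are all assumptions introduced for illustration. The sketch only shows the general idea that a drawing counts as data as long as its links to a questionnaire (through the shared reference number) and to a protocol can still be followed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Questionnaire:
    ref: str                    # reference number shared with the drawing
    gender: str
    age: int
    school: str
    religious_affiliation: str


@dataclass
class Drawing:
    ref: str                    # same reference number as the questionnaire
    date: str
    first_name: str
    recalled_instruction: str   # instruction as restated by the child
    narrative: str              # child's own description of the drawing
    protocol_id: Optional[str] = None  # hypothetical key to the protocol used on site


def broken_links(drawings, questionnaires, protocol_ids):
    """Map each drawing's reference number to the links it is missing."""
    questionnaire_refs = {q.ref for q in questionnaires}
    problems = {}
    for d in drawings:
        missing = []
        if d.ref not in questionnaire_refs:
            missing.append("questionnaire")
        if d.protocol_id not in protocol_ids:
            missing.append("protocol")
        if missing:
            problems[d.ref] = missing
    return problems


# Invented records, for illustration only.
drawings = [
    Drawing("D-001", "2003-01-01", "A.", "draw god", "a figure in the sky", "P-01"),
    Drawing("D-002", "2003-01-01", "B.", "draw god", "a light", None),
]
questionnaires = [Questionnaire("D-001", "girl", 9, "public", "none")]

print(broken_links(drawings, questionnaires, {"P-01"}))
# {'D-002': ['questionnaire', 'protocol']}
```

In this sketch, the second record has lost both of its links and would need the kind of repair work the researchers describe before it could be compared with the others.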

Producing “Objective” Descriptors

The team members wanted to add descriptors to the drawings before engaging in the analysis. They could have written an open-ended narrative description of each drawing. However, as the scientific preoccupation was to engage in rigorous analysis, the idea was to avoid letting each individual analyst decide what to describe in the drawings and how to describe it. The potential influence of the researchers’ religious background was particularly dreaded. The preoccupation was to avoid any form of interpretation, either psychological or religious, at this early stage of data treatment, in order to keep the datasets as open as possible to various research questions. This ideal of neutrality and openness appears to be related to various factors, including the quantitative scientific methodologies promoted by international journals in psychology, and the team members’ various religious backgrounds and psychological approaches (Piagetian developmental psychology, Vygotskian cultural psychology, cognitivist psychology, social psychology). Particularly present throughout the project has been the goal of providing an interesting and usable database that would be accessible to other researchers around the world, and through which they could engage in new and still unthought-of analysis. Thus, neutrality and openness were seen as ensuring that the database would be compatible with the scientific interests and goals of virtually any researcher studying children’s representations of supernatural agents.

Neutrality, itself, however, can be open to many interpretations. If the research team wanted to contain subjectivity and reduce biases, the idea was nevertheless to produce descriptors that “made sense”. They did not see any use for inconsequential descriptions such as “This drawing shows five colors,” “Yellow covers a total area of 8 cm²,” “There is a red circle on the upper half of the sheet and a green rectangle on the lower half,” etc. The researchers wanted to avoid such overly factual descriptions as much as they wished to avoid overly interpretive ones. In the end, they adopted a long list of descriptors regarding: the number of figures present in the drawing, the presence or absence of a supernatural agent, the presence or absence of a clear-cut distinction between the supernatural agent and the scenery, the framing and characteristics of the supernatural agent, etc. The list was composed of more than 40 descriptors organized into four categories: (a) composition of the drawing, (b) scenery, (c) objective description of the main figure (e.g. hand position), and (d) attributes of the main figure (e.g. gender).
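A closed list of descriptors of this kind can be thought of as a small coding schema. The sketch below is purely illustrative: the four categories come from the text, but the descriptor names, their assignment to categories, and the allowed values are assumptions, and the real list contained more than 40 descriptors. It shows how a fixed schema constrains what an individual coder may record about a drawing.

```python
# Hypothetical excerpt of a coding schema: descriptor -> (category, allowed values).
SCHEMA = {
    "page_orientation": ("composition", {"landscape", "portrait"}),
    "figure_position": ("composition", {"middle", "lower edge", "upper edge"}),
    "agent_present": ("composition", {"yes", "no"}),
    "agent_distinct_from_scenery": ("scenery", {"yes", "no"}),
    "hand_position": ("main figure, description", {"raised", "lowered", "not visible"}),
    "gender": ("main figure, attributes", {"female", "male", "undetermined"}),
}


def validate(coding):
    """Return a list of problems found in one coder's sheet for one drawing."""
    errors = []
    for name, (_category, allowed) in SCHEMA.items():
        if name not in coding:
            errors.append(f"{name}: missing")
        elif coding[name] not in allowed:
            errors.append(f"{name}: unexpected value {coding[name]!r}")
    for name in coding:
        if name not in SCHEMA:
            errors.append(f"{name}: not in schema")
    return errors


# Invented coding sheet, for illustration only.
sheet = {
    "page_orientation": "portrait",
    "figure_position": "middle",
    "agent_present": "yes",
    "agent_distinct_from_scenery": "yes",
    "hand_position": "raised",
    "gender": "smiling",  # an interpretive value that the schema rejects
}

print(validate(sheet))
# ["gender: unexpected value 'smiling'"]
```

The design choice worth noting is that free-text values are rejected: the schema, rather than the individual analyst, decides what can be said about a drawing.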

The researchers used these descriptors as they coded the drawings one by one (see Fig. 17.2). However, discussions continually arose regarding ambiguous cases (e.g. “Is this a human with an animal body, or an animal with a human face?”). Facing such difficulties, the researchers decided to use categories that accounted for the ambiguity, e.g. hybrid. They also relied on children’s written descriptions of the drawings as a way of interpreting them. Nevertheless, these solutions left open a vast array of questions. How many figures can the researchers code as hybrid before it becomes a problem for further analysis? Does using children’s written descriptions to interpret drawings introduce some sort of bias to the analysis? How should the researcher conceptualize the relation between drawings and children’s descriptions? Should they give priority to the drawing? How many descriptors do they need in order to properly describe a drawing? Should they add new categories when those already present seem inaccurate, or is it better to attempt to reduce their number? At what point should they prioritize speed over accuracy? Are some descriptors redundant? Such questions show that the work of defining descriptors is complex and often tiresome, grounded in the team members’ experiences regarding coding and data production as well as in ongoing discussions on scientific and analytical goals pursued by the project.

Fig. 17.2 Examples of descriptors noted on drawings

Having coded a significant share of the drawings they had gathered, the researchers began to analyze them. At first, they studied the influence of age, gender, and type of school on the representation of god among Japanese children. They constructed a typology of drawings, distinguishing nine types of drawings:

  1. Celestial figures,
  2. Celestial human,
  3. Terrestrial figures,
  4. Buddha,
  5. Monsters,
  6. Masked entities,
  7. Non-anthropomorphic entities,
  8. Relation or narration,
  9. Light.

They then produced a large array of statistics: about the children; about the drawings, using both descriptors and types; crossing children’s characteristics and type of drawing, type of school and type of drawing. These analyses led them to a new argument regarding children’s representations of supernatural agents. In an article published in 2009, they showed that there was a contrast between the drawings collected in Japan and those from Western countries. In the Western sample, almost all anthropomorphic figures were male; in Japan, however, half of the girls drew a female god. Their data also suggested that religious (Buddhist) education influenced children, the older ones mostly opting for non-anthropomorphic rather than anthropomorphic representations. They also pointed out the influence of visual culture, as drawings of human(-like) beings were deeply influenced by manga codes (Brandt et al., 2009).
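The “crossing” of children’s characteristics with types of drawing mentioned above is, in statistical terms, a cross-tabulation. The sketch below shows the operation on a handful of invented records; the records, the key names, and the helper functions are assumptions made for illustration and do not reproduce the team's data, tools, or results. Only the type labels are taken from the typology.

```python
from collections import Counter


def crosstab(records, row_key, col_key):
    """Count records for each (row value, column value) pair."""
    return Counter((r[row_key], r[col_key]) for r in records)


def row_shares(table, row):
    """Share of each column value within one row of the cross-tabulation."""
    total = sum(n for (r, _), n in table.items() if r == row)
    return {c: n / total for (r, c), n in table.items() if r == row}


# Invented records, for illustration only.
records = [
    {"gender": "girl", "school": "Buddhist", "type": "Buddha"},
    {"gender": "girl", "school": "public", "type": "Celestial human"},
    {"gender": "boy", "school": "public", "type": "Monsters"},
    {"gender": "boy", "school": "Buddhist", "type": "Light"},
    {"gender": "girl", "school": "Buddhist", "type": "Light"},
    {"gender": "boy", "school": "public", "type": "Monsters"},
]

by_gender = crosstab(records, "gender", "type")
by_school = crosstab(records, "school", "type")

print(by_gender[("boy", "Monsters")])      # 2
print(by_school[("Buddhist", "Light")])    # 2
print(row_shares(by_gender, "boy"))
```

Such a table is only possible because drawings and questionnaires have been reduced to the same kind of standardized attributes, which is the point the following paragraphs develop.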

What can we say about these next few steps of the equipment process? A classic way of analyzing these operations would be to see them as the first “links” of a “chain of re-representations” (Latour, 1995), a series of translations during which the drawings are transformed into new representations, i.e. a set of codified descriptors that will enable new associations and actions. As they use the descriptors to code the drawings, the team members in effect translate them into sets of standardized attributes. Digitized, these sets of attributes enable statistical analysis. Attaching a set of descriptors to each drawing transforms it into standardized data, opening new analytical ventures. The drawings are thus converted into data similar to and, most importantly, commensurable with those from the questionnaire. Both sets of data become homogeneous, which allows for simultaneous data treatment, e.g. cross analysis of descriptors referring to the drawings with descriptors referring to the children, their intentions, or their environment. By producing a typology of drawings, the team members perform a new translation, which leads to further translations as they shape new representations of their data in scientific talks and articles. The final links of the chain articulate statements coming from the scientific literature with the representations previously produced: drawings, descriptors, types, statistics, and graphs.

Such an analysis is interesting because it underlines the fundamental, sequential nature of scientific work. Nevertheless, what should not be overlooked is that each translation (and even more so the whole sequence) can be controversial, especially when new practices and entities are involved. Regarding the project under study, replacing the drawing with standardized descriptors for all analytical purposes raised the fear that some interpretive quality would be lost along the way. Could the researchers understand the cultural component of drawings properly through such analysis? Paradoxically, a perspective very much concerned with the temporal dimension of scientific practices tends to lose sight of the fact that chains of re-representations must be painstakingly established before they become self-evident. To put it differently, such a process either rests heavily on previously stabilized methods, involving explicit and tacit knowledge, competencies, goals, incentives, etc. distributed between the various entities engaged in research practices; or it involves heated debates and sometimes-painful disagreements. As such, these methods can be seen as a part of the ordinary equipment researchers use when creating new re-representations from the data they previously produced, i.e. when equipping those data. In the case of the Children’s Drawings of Gods project, such equipment had to be produced (almost) from scratch.

Conclusion

The new avenues opened by digital technologies for researchers in the humanities and social sciences increasingly lead them to build large-scale digital infrastructures. In this paper based on our ethnography of the Children’s Drawings of Gods project, we focused on specific aspects of the shaping of such an infrastructure. Doing so, we put forward a very basic point: the process of building a digital infrastructure is not straightforward. The path that leads to collectively usable data is complex. Never-before-thought-of questions soon become impossible to avoid. Details that once appeared as almost meaningless raise what sometimes become crucial dilemmas. Along the way, decisions must be made, and choices of various kinds are thus progressively blackboxed into data, IT tools, research protocols, organization and skills. We also tried to formulate some more theoretically informed conclusions, using the notions of equipment and equipping work.

First, analyzing how raw data (i.e. children’s drawings) are produced, we suggested that data could be seen as such by the researchers only as far as they are, and continue to be, connected to a set of various entities. Data production thus consists in a large part of equipping work. The drawings must be securely and durably linked to the protocols and precise instructions used in the field, and to other information such as the name of the participant, the instruction s/he recalled having received at the beginning of the task, his/her own description of the drawing, etc.

Second, following the way that descriptors were added to the drawings, we used the notion of equipment in two different ways. On the one hand, focusing on data equipping, we highlighted the multiplicity and complexity of the mundane tasks facing the researchers. On the other hand, focusing on the equipment available to the researchers themselves when producing and organizing their data, we emphasized that such a digital humanities endeavor is at odds with many ordinary, well-stabilized infrastructuring processes, in which methods, processes, tools, competencies, etc., can be taken for granted.

Of course, a lot more could, and should, be said about this peculiar infrastructure and the way it is being built. We hope that these few propositions will help generate further interest in the Children’s Drawings of Gods project as well as in similar undertakings in the digital humanities.