A perspective is by nature limited. It offers us one single vision of a landscape. Only when complementary views of the same reality combine are we capable of achieving fuller access to the knowledge of things. The more complex the object we are attempting to apprehend, the more important it is to have different sets of eyes, so that these rays of light converge and we can see the One through the many. That is the nature of true vision: it brings together already known points of view and shows others hitherto unknown, allowing us to understand that all are, in actuality, part of the same thing. (Grothendieck 1986)

2.1 Authenticity, Completeness and the Digital

For the past twenty years, digital tools, technologies and infrastructures have played an increasingly decisive role in framing how digital objects are understood, preserved, managed, maintained and shared. Even in traditionally object-centred sectors such as cultural heritage, digitisation has become the norm: heritage institutions such as archives, libraries, museums and galleries continuously digitise huge quantities of heritage material. The most official indication of this shift towards the digital in cultural heritage is perhaps provided by UNESCO which, in 2003, recognised that the world’s documentary heritage was increasingly produced, distributed, accessed and maintained in digital form; accordingly, it proclaimed digital heritage as common heritage (UNESCO 2003). Unsurprisingly yet significantly, the acknowledgement was made in the context of endangered heritage, including digital heritage, whose conservation and protection must be considered ‘an urgent issue of worldwide concern’ (ibid.).

The document also officially distinguished between heritage created digitally (from then on referred to as digitally born heritage), that is, heritage for which no other format but the digital object exists, and digitised heritage, heritage ‘converted into digital form from existing analogue resources’ (UNESCO 2003). Therefore, in keeping with heritage tradition, the semantic motivation behind digitisation was that of preserving cultural resources from deterioration or permanent disappearance. It has been argued, however, that by distinguishing between the two types of digital heritage, the UNESCO statement de facto framed the digitisation process as a heritagising operation in itself (Cameron 2021). Consequently, to the classic cultural heritage paradigm ‘preserved heritage = heritage worth preserving’, UNESCO added another layer of complexity: the equation ‘digitised = preserved’ (ibid.).

UNESCO’s acknowledgement of digital heritage, and in particular of digitised heritage, as common heritage has undoubtedly had profound implications for our understanding of heritage practices, material culture and preservation. For example, by officially introducing the digital in relation to heritage, UNESCO’s statement deeply affected traditional notions of authenticity, originality, permanent preservation and completeness which have historically been central to heritage conceptualisations. For the purposes of this book, I will simplify the discussion1 by saying that more traditional positions have insisted on the intrinsic lack of authority of copies, which are deprived of what Benjamin famously called the ‘aura’ of an object (Benjamin 1939). Museum culture has conventionally revolved around these rigid rules of originality and authenticity, established as the values legitimising museums as the only accredited custodians of true knowledge. Historically, such an understanding of heritage has sadly gone hand in hand with a very specific discourse, one dominated by Western perspectives. These views have been based on ideas of old, grandiose sites and objects as the sole heritage worthy of preservation, ideas which have in turn perpetuated Western narratives of nation, class and science (ACHS 2012).

More recent scholarship, however, has moved away from such object-centred views and reworked conventional conceptualisations of authenticity and completeness in relation to the digital (see for instance, Council on Library and Information Resources 2000; Jones et al. 2018; Goriunova 2019; Zuanni 2020; Cameron 2021; Fickers 2021). From the 1980s onwards, for example, the influence wielded by postmodernist and post-colonial theories has challenged these traditional frameworks and brought new perspectives to the conceptualisation of material culture (see for instance, Tilley 1989; Vergo 1989). The idea key to this new approach, and the one most relevant to the arguments advanced in this book, is that material culture does not intrinsically possess any meaning; instead, meanings are ascribed to material culture when interpreting it in the present. As Christopher Y. Tilley famously stated, ‘The meaning of the past does not reside in the past, but belongs in the present’ (Tilley 1989, 192). According to this perspective, the significance of material culture is not eternal and absolute but continually negotiated in a dialectical relationship with contemporary values and interactions. In disciplines such as museum studies, for example, this view takes the form of a critique of the social and political role of heritage institutions. Through this lens, museums are not seen as neutral custodians of material culture but as grounded in Western ideologies of elitism and power and as representing the interests of only a minority of the population (Vergo 1989).

Such considerations have led to the emergence of new disciplines such as Critical Heritage Studies (CHS). In CHS, heritage is understood as a continuous negotiation of past and present modalities, in the acknowledgement that heritage values are neither fixed nor universal; rather, they are culturally situated and constantly co-constructed (Harrison 2013). Though still aimed at preserving and managing heritage for future generations, CHS are resolutely concerned with questions of power, inequality and exploitation (Hall 1999; Butler 2007; Winter 2011), thus sharing many of the same foci of interest as the critical posthumanities (Braidotti 2019) and intersecting perfectly with the post-authentic framework I propose in this book.

The official introduction of the digital in the context of cultural heritage has necessarily become intertwined with the political and ideological legacy concerning traditional notions of the original and authentic vs copies and reproductions. Simplistically seen as mere immaterial copies of the original, digital objects could not but severely disrupt these fundamental values, in some cases going as far as being framed as ‘terrorists’ (Cameron 2007, 51), that is, destabilising instruments of what is true and real. In an effort to defend material authenticity as the sole element defining meaning, digital artefacts were at best granted an inferior status in comparison to the originals, a servant role to the real.

The parallel with DH vs ‘mainstream humanities’ is hard to miss (cfr. Chap. 1). In 2012, Alan Liu defined DH as ‘ancillary’ to mainstream humanities (Liu 2012), whereas others (e.g., Allington et al. 2016; Brennan 2017) claimed that by incorporating the digital into the humanities, its very essence, namely agency and criticality, was violated, one might say polluted. In opposition to the analogue, the digital was seen as an immaterial, agentless and untrue threatening entity undermining the authority of the original. As with digital heritage objects, these criticisms of DH did not problematise the digital but simplistically reduced it to a non-human, uncritical entity.

Nowadays, this view is increasingly challenged by new conceptual dimensions of the digital; for instance, Jones et al. (2018) argue that ‘a preoccupation with the virtual object—and the binary question of whether it is or is not authentic—obscures the wider work that digital objects do’ (Jones et al. 2018, 350). Similarly, in her exploration of the digital subject, Olga Goriunova (2019) reworks the notion of distance in Valla and Benenson’s artwork, in which a digital artefact is described as ‘neither an object nor its representation but a distance between the two’ (2014). Far from being a blank void, this distance is described as a ‘thick’ space in which humans, entities and processes are connected to each other (ibid., 4) according to the various forms of power embedded in computational processes. According to this view, the concept of authenticity is considered in relation to the digital subject, i.e., the digital self, which is rethought as a much more complex entity than just a collection of data points and, at the same time, not quite a mere extension of the self. More recently, Cameron (2021) has stated that in the context of digital cultural heritage, the very conceptualisation of a digital object escapes Western ideas of curation practices, and authenticity ‘may not even be something to aspire to’ (15).

This chapter expands on these recent positions, not because I disagree with the concepts and themes expressed by these authors, but because I want to add a novel reflection on digital objects, including digital heritage, and on both theory- and practice-oriented aspects of digital knowledge creation more widely. I argue that such aspects are in urgent need of reframing not solely in museum and gallery practices and in heritage policy and management, but crucially also in any context of digital knowledge production and dissemination where an outmoded framework of discipline compartmentalisation persists. Taking digital cultural heritage as an illustrative case of a digital object typical of humanities scholarship, I devote specific attention to the way in which digitisation has been framed and understood and to the wider consequences for our understanding of heritage, memory and knowledge.

2.2 Digital Consequences

This book challenges traditional notions of authenticity by arguing for a reconceptualisation of the digital as an organic entity embedding past, present and future experiences which are continuously renegotiated during any digital task (Cameron 2021). Specifically, I expand on what Cameron calls the ‘ecological composition concept’ (ibid., 15) in reference to digital cultural heritage curation practices to include any action in a digital setting, equally understood as bearing context and therefore consequences. She argues that the act of digitisation does not merely produce immaterial copies of analogue originals—as implied by the 2003 UNESCO statement with reference to digitised cultural heritage—but, by creating digital objects, it creates new things which in turn become alive and which are therefore themselves subject to renegotiation. I further argue that any digital operation is equally situated, never neutral, as each in turn incorporates external, situated systems of interpretation and management. For example, the digitisation of cultural heritage has been discursively legitimised as a heritagising operation, i.e., an act of preservation of cultural resources from deterioration or disappearance. Though certainly true to an extent, this framing captures only one of the many aspects linked to digitisation, and preservation is by far not the only reason why governments and institutions have started to invest massively in it. In line with the wider benefits that digitisation is thought to bring at large (cfr. Chap. 1), the digitisation of cultural heritage is believed to serve a range of other, more strategic goals such as fuelling innovation, creating employment opportunities, boosting tourism and enhancing the visibility of cultural sites including museums, libraries and archives, all together leading to economic growth (European Commission 2011).

Inevitably, the process of cultural heritage digitisation has therefore itself become intertwined with questions of power, economic interests, ideological struggles and selection biases. For instance, after about two decades of major, large-scale investments in the digitisation of cultural heritage, self-reported data from cultural heritage institutions indicate that in Europe, only about 20% of heritage material exists in a digital format (Enumerate Observatory 2017), whereas globally, this percentage is believed to remain at 15%.2 Behind these percentages, it is very hard not to see the colonial ghosts of the past. CHS have problematised heritage designation not just as a magnanimous act of preserving the past, but as ‘a symbol of previous societies and cultures’ (Evans 2003, 334). When it comes to deciding which societies and whose cultures, political and economic interests, power relations and selection biases are never far away. For example, particularly in the first stages of large-scale mass digitisation projects, special collections often became the prioritised material to be digitised (Rumsey and Digital Library Federation 2001), whereas less mainstream works and minority voices tended to be largely excluded. Typically, libraries needed to decide what to digitise based on cost-effectiveness analyses, and so their choices were often skewed by economic imperatives rather than ‘actual scholarly value’ (Rumsey and Digital Library Federation 2001). The UNESCO-induced paradigm ‘digitised = preserved’ contributed to communicating the idea that any digitised material was intrinsically worth preserving, thus in turn perpetuating previous decisions about what had been worth keeping (Crymble 2021).

There is no doubt that today’s under-representation of minority voices in digital collections directly mirrors decades of past decisions about what to collect and preserve (Lee 2020). In reference to early US digitisation programmes, for example, Smith Rumsey points out that as a direct consequence of this reasoning:

foreign language materials are nearly always excluded from consideration, even if they are of high research value, because of the limitations of optical character recognition (OCR) software and because they often have a limited number of users. (Rumsey and Digital Library Federation 2001, 6)

This has in turn had other repercussions. As most of the digitised material has been in English, tools and software for exploring and analysing the past have primarily been developed for the English language. In recent years, greater awareness of issues of power, archival biases, silences in the archives and the lack of language diversity in digitisation has certainly developed, not just in archival and heritage studies but also in DH and digital history (see for instance, Risam 2015; Putnam 2016; Earhart 2019; Mandell 2019; McPherson 2019; Noble 2019). The fact remains, however, that the composition of that 15% is the sad reflection of this bitter legacy.

Another example of the situated nature of digitisation is microfilming. In his famous investigative book Double Fold, Nicholson Baker (2002) documents in detail the contextual, economic and political factors surrounding microfilming practices in the United States. Through a zealous investigation, he tells a story involving microfilm lobbyists, former CIA agents and the destruction of hundreds of thousands of historical newspapers. He pointedly questions the choices of high-profile figures in American librarianship such as Patricia Battin, former Head Librarian of Columbia University and head of the American Commission on Preservation and Access from 1987 to 1994. From the analysis of government records and interviews with persons of interest, Baker argues that Battin and the Commission pitched the mass microfilming of paper records to charitable foundations and the American government by inventing the ‘brittle book crisis’, the apparent rapid deterioration that was destroying millions of books across America (McNally 2002). In reality, he maintains, her campaign was part of an agenda to provide content for the microfilming technology.

In advocating for preservation, Baker also discusses the limitations of digitisation and some specific issues with microfilming, such as loss of colour and quality and grayscale saturation. Over the years, such issues have had unpredictable consequences, particularly for images. In historical newspapers, some images used to be printed through a technique called rotogravure, a type of intaglio printing known for its good-quality image reproduction and especially well suited to capturing details of dark tones. Scholars (e.g., Williams 2019; Lee 2020) have pointed out how the grayscale saturation issue of microfilming directly affects images of Black people, as it distorts facial features by flattening tonal nuances. In the case of the millions of records of images digitised from microfilm holdings, such as the 1.56 million images in the Library of Congress’ Chronicling America collection, it has been argued that the microfilming process itself has acted as a form of oppression for communities of colour (Williams 2019). This, together with several other criticisms concerning selection biases, has led some authors to talk about Chronicling White America (Fagan 2016).

In this book I argue in favour of a more problematised conceptualisation of digital objects and digital knowledge creation as living entities that bear consequences. To build my argument, I draw upon posthuman critical theory, which understands matter as an extremely convoluted assemblage of components, ‘complex singularities relate[d] to a multiplicity of forces, entities, and encounters’ (Braidotti 2017, 16). Indeed, for its deconstructing and disruptive take, I believe the application of posthumanities theories has great potential for refiguring traditional humanist forms of knowledge. Although I discuss examples of my own research based on digital cultural heritage material, my aim is to offer a counter-narrative that reaches beyond cultural heritage and addresses the digitisation of society. My intention is to challenge the dominant public discourse that continues to depict the digital as non-human, agentless, non-authentic and contextless, and by extension digital knowledge as necessarily non-human, cultureless and bias-free. The digitisation of society, sharply accelerated by the COVID-19 pandemic, has added complexity to reality, precipitating processes that have triggered reactions with unpredictable, potentially global consequences. I therefore maintain that with respect to digital objects, digital operations and the way in which we use digital objects to create knowledge, it is the notion of the digital itself that needs reframing. In the next section, I introduce the two concepts that may inform such a radical reconfiguration: symbiosis and mutualism.

2.3 Symbiosis, Mutualism and the Digital Object

This book recognises the inadequacy of the traditional model of knowledge creation, but it also contends that the pervasive digitisation induced by the 2020 pandemic has added further urgency, to the point that this change can no longer be deferred. Such a refigured model, I argue, must conceptualise the digital object as an organic, dynamic entity which lives and evolves and bears consequences. It is precisely the unpredictability and long-term nature of these consequences that now pose extremely complex questions which the current rigid, single discipline-based model of knowledge creation is ill-equipped to approach.3 This book is therefore an invitation for institutions as well as for us as researchers and teachers to address what it means to produce knowledge today, to ask ourselves how we want our digital society to be and what our shared and collective priorities are, and so to finally produce the change that needs to happen.

As a new principle that goes beyond the constraints of the canonical forms, posthuman critical theory has proposed transversality, ‘a pragmatic method to render problems multidimensional’ (Braidotti and Fuller 2019, 1). With this notion of geometrical transversality, which describes spaces ‘in terms of their intersection’ (ibid., 9), posthuman critical theory attempts to capture ‘relations between relations’. I argue, however, that the suggested image of a transversal cut across entities that were previously disconnected, e.g., disciplines, does not convey the idea of fluid exchanges; rather, it remains confined within ideas of separation and interdisciplinarity, and therefore it only partially breaks with the outdated conceptualisations of knowledge compartmentalisation that it aims to disrupt. The term transversality, I maintain, ultimately continues to frame knowledge as solid and essentially separated.

This book firmly opposes notions of division, including the division of knowledge into monolithic disciplines, as they are based on models of reality that support individualism and separateness, which in turn inevitably lead to conflict and competition. To support my argument of an urgent need for knowledge reconfiguration and for new terminologies, I propose to borrow the concept of symbiosis from biology. The notion of symbiosis, from the Greek for ‘living together’, refers in biology to the close and long-term cooperation between different organisms (Sims 2021). Applied to knowledge remodelling and to the digital, symbiosis radically breaks with the current conceptualisation of knowledge as a separate, static entity, linear and fragmented into multiple disciplines, and of the digital as an agentless entity. On the contrary, the term symbiosis points to the continual renegotiation in the digital of interactions, past, present and future systems, power relations, infrastructures, interventions, curations and curators, programmers and developers (see also Cameron 2021).

Integral to the concept of symbiosis is that of mutualism; mutualism opposes interspecific competition, that is, when organisms from different species compete for a resource, with the result that only one of the individuals or populations involved benefits (Bronstein 2015). I maintain that the current rigid separation into disciplines resembles an interspecific competition dynamic, as it creates the conditions under which knowledge production has become a space of conflict and competition. As this notion is not only outdated and inadequate but indeed deeply concerning, I argue that the contemporary notion of knowledge should not simply be redefined but reconceptualised altogether. Symbiosis and mutualism embed in themselves the principle of knowledge as fluid and inseparable, in which areas of knowledge do not compete against each other but benefit from a mutually compensating relationship. When asking ourselves the questions ‘How do we produce knowledge today?’ and ‘How do we want our next generation of students to be trained?’, the concepts of symbiosis and mutualism may guide the new reconfiguration of our understanding of knowledge in the digital.

Symbiosis and mutualism are central notions for the development of a more problematised conceptualisation of digital objects and digital knowledge production. Expanding on Cameron’s critique of the conceptual attachment to digital cultural heritage as possessing a complete quality of objecthood (Cameron 2021, 14), I maintain that it is not just digital heritage and digital heritage practices that escape notions of completeness and authenticity but in fact all digital objects and all digital knowledge creation practices. According to this conceptualisation, any intervention on the digital object (e.g., an update, data augmentation interventions, data creation for visualisations) should always be understood as the sum of all the previously made and concurrent decisions, not just by the present curator/analyst, but by external, past actors, too (see for instance, the example of microfilming discussed in Sect. 2.2). These decisions in turn shape and are shaped by all the following ones in an endless cycle that continually transforms and creates new object forms, all equally alive, all equally bearing consequences for present and future generations. This is what Cameron calls the ‘more-than-human’, a convergence of the human and the technical.

I maintain, however, that the ‘more-than-human’ formulation still presupposes a lack of human agency in the technical (the supposedly non-human) and therefore a yet again binary view of reality. In Cameron’s view, the more-than-human arises from the encounter of human agency with the technical, which would therefore not possess agency per se. But agency does not uniquely emerge from the interconnections between, say, the curator (what could be seen as ‘the human’) and the technical components (i.e., ‘the non-human’), because there is no concrete separation between the human and the technical and, in truth, there is no such thing as neutral technology (see Sect. 1.2). For example, in the practices of early large-scale digitisation projects, past decisions about what to (not) digitise have eventually led to the current English-centric predominance of datasets, software libraries, training models and algorithms. Using this technology today contributes to reinforcing Western, white worldviews not just in digital practices, but in society at large.

Hence, if Cameron believes that framing digital heritage as ‘possessing a fundamental original, authentic form and function […] is limiting’ (ibid., 12), I elaborate further and maintain that it is in fact misleading. Indeed, in constituting and conceptualising digital objects, the question of whether they are or are not authentic simply does not make sense; digital objects transcend authenticity; they are post-authentic. To conceptualise digital objects as post-authentic means to understand them as unfinished processes that embed a wide net of continually negotiable relations of multiple internal and external actors and of past, present and future experiences; it means to look at the human and the technical as symbiotic, non-discriminable elements of the digital’s immanent nature, which is therefore understood as situated and consequential. To this end, I introduce a new framework that could inform practices of knowledge reconfiguration: the post-authentic framework. The post-authentic framework problematises digital objects by pointing to their aliveness, incompleteness and situatedness, and to their entrenched power relations and digital consequences. Throughout the book, I unpack key theoretical concepts of the post-authentic framework and, through the illustration of four concrete examples of knowledge creation in the digital—creation of digital material, enrichment of digital material, analysis of digital material and visualisation of digital material—I evaluate its full implications for knowledge creation.

2.4 Creation of Digital Objects

The post-authentic framework acknowledges digital objects as situated, unfinished processes that embed a wide net of continually negotiable relations of multiple actors. It is within the post-authentic framework that I describe the creation of ChroniclItaly 3.0 (Viola and Fiscarelli 2021a), a digital heritage collection of Italian American newspapers published in the United States by Italian immigrants between 1898 and 1936. I take the formation and curation of this collection as a use case to demonstrate how the post-authentic framework can inform the creation of a digital object in general, reacting to and impacting on institutional and methodological frameworks for knowledge creation. In the case of ChroniclItaly 3.0, this includes effects on the very conceptualisation of heritage and heritage practices.

Being the third version of the collection, ChroniclItaly 3.0 is in itself a demonstration of the continuously and rapidly evolving nature of digital research and of the intrinsic incompleteness of digital objects. I created the first version of the collection, ChroniclItaly (Viola 2018), within the framework of the Transatlantic research project Oceanic Exchanges (OcEx) (Cordell et al. 2017). OcEx explored how advances in computational periodicals research could help historians trace and examine patterns of information flow across national and linguistic boundaries in digitised nineteenth-century newspaper corpora. Within OcEx, our first priority was therefore to study how news and concepts travelled between Europe and the United States and how, by creating intricate entanglements of informational exchanges, these processes resulted in transnational linguistic and cultural contact phenomena. Specifically, we wanted to investigate how historical newspapers and Transatlantic reporting shaped social and cultural cohesion between Europeans in the United States and in Europe. One focus was specifically on the role of migrant communities as nodes in the Transatlantic transfer of culture and knowledge (Viola and Verheul 2019a). As the main aim was to trace the linguistic and cultural changes that reflected the migratory experience of these communities, we first needed to obtain large quantities of diasporic newspapers that would be representative of the Italian ethnic press at the time. Because of the project’s time and cost limitations, such sources needed to be available for computational textual analysis, i.e., already digitised. This is why I decided to machine-harvest the digitised Italian American newspapers from Chronicling America,4 the Open Access, Internet-based Library of Congress directory of digitised historical newspapers published in the United States from 1777 to 1963. Chronicling America is also an ongoing digitisation project which involves the National Digital Newspaper Program (NDNP), the National Endowment for the Humanities (NEH) and the Library of Congress. Started in 2005, the digitisation programme continuously adds new titles and issues through the funding of digitisation projects awarded to external institutions, mostly universities and libraries, and thus itself encapsulates the intrinsic incompleteness of digital infrastructures and digital objects and the far-reaching network of influencing factors and actors involved.
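
For readers unfamiliar with this kind of machine harvesting, the workflow can be sketched in a few lines of code. Chronicling America exposes a public JSON search API and per-page OCR text files; the sketch below assumes those endpoints as documented by the Library of Congress, while the query term and pagination limit are hypothetical examples and do not reproduce the actual selection procedure used for ChroniclItaly.

```python
# Illustrative sketch of machine-harvesting OCR text from Chronicling America.
# The JSON search endpoint and the per-page "ocr.txt" URLs belong to the
# public API; the query terms below are examples, not the ChroniclItaly query.
from urllib.parse import urlencode
from urllib.request import urlopen
import json

BASE = "https://chroniclingamerica.loc.gov"

def build_search_url(terms: str, page: int = 1, rows: int = 50) -> str:
    """URL for one page of a full-text search, requesting JSON output."""
    params = {"andtext": terms, "page": page, "rows": rows, "format": "json"}
    return f"{BASE}/search/pages/results/?{urlencode(params)}"

def harvest(terms: str, max_pages: int = 2):
    """Yield (page_id, ocr_text) for each matching newspaper page."""
    for page in range(1, max_pages + 1):
        with urlopen(build_search_url(terms, page)) as resp:
            results = json.load(resp)
        for item in results.get("items", []):
            # item["id"] is a path such as "/lccn/<lccn>/1900-01-01/ed-1/seq-1/"
            with urlopen(f"{BASE}{item['id']}ocr.txt") as ocr:
                yield item["id"], ocr.read().decode("utf-8", errors="replace")
```

Here harvesting simply stops after `max_pages` result pages; a fuller implementation would inspect the result metadata returned by the API (e.g., the total number of matches) to paginate to completion, and would throttle requests out of courtesy to the service.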

This wider net of interrelations that influence how digital objects come into being, and which equally influenced the ChroniclItaly collections, can be exemplified by the criteria for receiving a Chronicling America grant. In line with the NDNP’s main aim ‘to create a national digital resource of historically significant newspapers published between 1690 and 1963, from all the states and U.S. territories’ (emphasis mine, NEH 2021, 1), institutions should digitise approximately 100,000 newspaper pages representing their state. How this significance is assessed depends on four principles. First, titles should represent the political, economic and cultural history of the state or territory; second, titles recognised as ‘papers of record’, that is, containing ‘legal notices, news of state and regional governmental affairs, and announcements of community news and events’, are preferred (ibid., 2). Third, titles should cover the majority of the population areas, and fourth, titles with longer chronological runs and that have ceased publication are prioritised. Additionally, applicants must commit to assembling an advisory board including scholars, teachers, librarians and archivists to inform the selection of the newspapers to be digitised. The requirement that most heavily conditions which titles are included in Chronicling America, however, is the existence of a complete, or largely complete, microfilm ‘object of record’, with priority given to higher-quality microfilms. In technical terms, this criterion is adopted for reasons of efficiency and cost; however, as past microfilming practices in the United States were entrenched in a complex web of interrelated factors (cfr. Sect. 2.2), the impact of this criterion on the material included in the directory incorporates issues such as previous decisions about what was worth microfilming and, more importantly, what was not.

Furthermore, to ensure consistency across the diverse assortment of institutions involved over the years and throughout the various grant cycles, the programme provides awardees with further technical guidelines. At the same time, however, these guidelines may cause over-representation of larger or mainstream publications; therefore, to counterbalance this issue, titles that give voice to under-represented communities are highly encouraged. Although certainly mitigated by multiple review stages (i.e., by each state awardee’s advisory board, by the NEH and by peer-review experts), the very constitutional structure of Chronicling America reveals the far-reaching net of connections, economic and power relations, and multiple actors and factors influencing decisions about what to digitise. Significantly, it exposes how digitisation processes are intertwined with individual institutions’ research agendas and how these may still embed and perpetuate past archival biases.

The creation of ChroniclItaly therefore ‘inherits’ all these decisions and processes of mediation and, in turn, embeds new ones, such as those stemming from the research aims of the project within which it was created, i.e., OcEx, and the expertise of the curator, i.e., myself. At this stage, for example, we decided not to intervene on the material with any enrichment operations, as ChroniclItaly mainly served as the basis for a combination of discourse- and text-analysis investigations that could help us research the extent to which diasporic communities functioned as nodes and contact zones in the Transatlantic transfer of information.

As we explored the collection further, however, we realised that limiting our analyses to text-based searches would not exploit the full potential of the archive; we therefore expanded the project with additional grant money obtained through Utrecht University’s Innovation Fund for Research in IT. We made a case for the importance of experimenting with computational methodologies that would allow humanities scholars to identify and map the spatial dimension of digitised historical data as a way to access subjective and situational geographical markers. It is with this aim in mind that I created ChroniclItaly 2.0 (Viola 2019), the version of the collection annotated with referential entities (i.e., people, places, organisations). As part of this project, we also developed the app GeoNewsMiner (GNM)5 (Viola et al. 2019), an interactive graphical user interface (GUI) for visually and interactively exploring the references to geographical entities in the collection. Our aim was to allow users to conduct finer-grained historical analyses, such as examining changes in mentions of places over time and across titles, as a way to identify the subjective and situational dimension of geographical markers and connect them to explicit geo-references to space (Viola and Verheul 2020a).
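The kind of referential-entity annotation applied in ChroniclItaly 2.0 can be illustrated with a deliberately simplified sketch. The collection itself was annotated with trained NLP models; the toy gazetteer and the function below are hypothetical stand-ins, included only to show what tagging people and places in newspaper text amounts to in practice.

```python
# Toy sketch of referential-entity tagging. A small hand-made
# gazetteer stands in for the trained NER model actually used for
# ChroniclItaly 2.0, so this is illustrative only.

GAZETTEER = {
    "New York": "PLACE",
    "Vermont": "PLACE",
    "Cristoforo Colombo": "PERSON",
}

def tag_entities(text: str) -> list[tuple[str, str]]:
    """Return (surface form, label) pairs found in `text`."""
    found = []
    for surface, label in GAZETTEER.items():
        if surface in text:
            found.append((surface, label))
    return found

sentence = "Gli italiani di New York celebrano Cristoforo Colombo."
print(tag_entities(sentence))
# → [('New York', 'PLACE'), ('Cristoforo Colombo', 'PERSON')]
```

A real pipeline replaces the exact-match lookup with a statistical model that disambiguates entities in context, but the output shape (spans paired with entity labels) is what a GUI such as GNM then aggregates over time and across titles.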

The creation of the third version of the collection, ChroniclItaly 3.0, should be understood in the context of yet another project, DeepteXTminER (DeXTER)6 (Viola and Fiscarelli 2021b), supported by the Luxembourg Centre for Contemporary and Digital History’s (C2DH—University of Luxembourg) Thinkering Grant. This grant funds research applying the method of ‘thinkering’, a blend of the verbs tinkering and thinking: ‘the tinkering with technology combined with the critical reflection on the practice of doing digital history’ (Fickers and Heijden 2020). As such, the scheme is specifically aimed at funding innovative projects that experiment with technological and digital tools for the interpretation and presentation of the past. Conceptually, the C2DH itself is an international hub for reflection on the methodological and epistemological consequences of the Digital Turn for history;7 it serves as a platform for engaging critically with the various stages of historical research (archiving, analysis, interpretation and narrative) with a particular focus on the use of digital methods and tools. Physically, it strives to actualise interdisciplinary knowledge production and dissemination by fostering ‘trading zones’ (Galison and Stump 1996; Collins et al. 2007), working environments in which interactions and negotiations between different disciplines can happen (Fickers and Heijden 2020). Within this institutional and conceptual framework, I conceived DeXTER as a post-authentic research activity to critically assess and implement different state-of-the-art natural language processing (NLP) and deep learning techniques for the curation and visualisation of digital heritage material. DeXTER’s ultimate goal was to bring the utilised techniques into as close an alignment as possible with the principle of human agency (cfr. Chap. 3).

The larger ecosystem of the ChroniclItaly collections thus exemplifies the evolving nature of digital objects and how international and national processes interweave with wider external factors, all impacting differentially on the objects’ evolution. The existence of multiple versions of ChroniclItaly, for example, is in itself a reflection of the incompleteness of the Chronicling America project, to which titles, issues and digitised material are continually added. ChroniclItaly and ChroniclItaly 2.0 include seven titles and issues from 1898 to 1920 that portray the chronicles of Italian immigrant communities from four states (California, Pennsylvania, Vermont, and West Virginia); ChroniclItaly 3.0 expands the two previous versions by including three additional titles published in Connecticut and pushing the overall time span to cover from 1898 to 1936. In terms of size, ChroniclItaly 3.0 almost doubles the number of included pages compared to its predecessors: 8653 vs the 4810 of the previous versions. This is a clear example of how the formation of a digital object is impacted by the surrounding digital infrastructure, which in turn is dependent on funding availability and whose very constitution is shaped by the various research projects and the actors involved in its making.

2.5 The Importance of Being Digital

Understanding digital objects as post-authentic objects means acknowledging them as part of the complex interaction of countless factors and dynamics and recognising that the majority of such factors and dynamics are invisible and unpredictable. Due to the extreme complexity of interrelated forces at play, the formidable task of writing both the past in the present and the future past demands careful handling. This is what Braidotti and Fuller call ‘a meaningful response move from the relatively short chain of intention-to-consequence […] to the longer chains of consequences in which chance becomes a more structural force’ (2019, 13). Here chance is understood as the unpredictable combination of all the numerous known and unknown actors involved, conscious and unconscious biases, past, present and future experiences, and public, private and personal interests. With specific reference to the ChroniclItaly collections, for example, in addition to the already discussed multiple factors influencing their creation, many of which date back decades, the very nature of this digital object and of its content bears significance for our conceptualisation of digital heritage and, more broadly, for digital knowledge creation practices.

The collections collate immigrant press material. The immigrant press represents the first historical stage of the ethnic press, a phenomenon associated with the mass migration to the Americas between the 1880s and 1920s, when it is estimated that over 24 million people from all around the world arrived in America (Bandiera et al. 2013). Indeed, as immigrant communities grew exponentially, so did the immigrant press: at the turn of the twentieth century, about 1300 foreign-language newspapers were being printed in the United States with an estimated circulation of 2.6 million (Bjork 1998). By giving immigrants all sorts of practical and social advice, from employment and housing to religious and cultural celebrations and from learning English to acquiring American citizenship, these newspapers truly helped immigrants transition into American society. As immigrant newspapers quickly became an essential element at many stages of an immigrant’s life (Rhodes 2010, 48), the immigrant press is a particularly valuable resource not only for studying the lives of many of the communities that settled in the United States but also for opening a comprehensive window onto the American society of the time (Viola and Verheul 2020a).

As far as the Italians were concerned, it has been calculated that by 1920 they represented more than 10% of the non-US-born population (about 4 million) (Wills 2005). The Italian community was also among the most prolific producers of newspapers: between 1900 and 1920, 98 Italian titles managed to publish uninterruptedly, while at the peak of publication the total number of titles ranged between 150 and 264 (Deschamps 2007, 81). In terms of circulation, in 1900, 691,353 Italian newspapers were sold across the United States (Park 1922, 304); in New York alone, the circulation ratio of the Italian daily press has been calculated at one paper for every 3.3 Italian New Yorkers (Vellon 2017, 10). To estimate actual readership, however, distribution and circulation figures should be doubled or perhaps even tripled, as illiteracy levels were still high among this generation of Italians and newspapers were often read aloud (Park 1922; Vellon 2017; Viola and Verheul 2019a; Viola 2021).

On the whole, these impressive figures point to the influential role of the Italian-language press, not just for the immigrant community but within the wider American context, too. At a time when the mass migrations were causing a redefinition of social and racial categories, notions of race, civilisation, superiority and skin colour had polarised into the binary opposition of white/superior vs non-white/inferior (Jacobson 1998; Vellon 2017; Viola and Verheul 2019a). The whiteness category, however, was rather complex and not at all based exclusively on skin colour: Jacobson (1998, 6), for instance, describes it as ‘a system of “difference” by which one might be both white and racially distinct from other whites’. Indeed, during the period covered by the ChroniclItaly collections, immigrants were granted ‘white’ privileges depending not on how white their skin might have been but rather on how white they were perceived to be (Foley 1997). Immigrants in the United States who experienced this uncertain social identity have been described as ‘conditionally white’ (Brodkin 1998), ‘situationally white’ (Roediger 2005) and ‘inbetweeners’ (among others Barrett and Roediger 1997; Guglielmo and Salerno 2003; Guglielmo 2004; Orsi 2010).

This was precisely the complicated identity and social status of Italians, especially of those coming from Southern Italy; because of their challenging economic and social conditions and their darker skin, both other ethnic groups and Americans considered them socially and racially inferior and often discriminated against them (LaGumina 1999; Luconi 2003). For example, Italian immigrants would often be excluded from employment and housing opportunities and be victims of social discrimination, exploitation, physical violence and even lynching (LaGumina 1999; Connell and Gardaphé 2010; Vellon 2010; LaGumina 2018; Connell and Pugliese 2018). The social and historical importance of Italian immigrant newspapers lies in how they advocated for the rights of the communities they represented, crucially acting as powerful forces of inclusion, community building and national identity preservation, as well as tools of language and cultural retention. At the same time, because this advocacy role was often paired with the condemnation of American discriminatory practices, these newspapers also played a decisive role in transforming American society at large, undoubtedly contributing to the tangible shaping of the country. The immigrant press and the ChroniclItaly collections can therefore be an extremely valuable source for investigating specifically how the internal mechanisms of cohesion, class struggle and identity construction of the Italian immigrant community contributed to transforming America.

Lastly, these collections can also bring insights into the Italian immigrants’ role in the geographical shaping of the United States. The majority of the 4 million Italians who had arrived in the United States, mostly uneducated and mostly from the south, had done so as the result of chain migration. Naturally, they would settle close to relatives and friends, creating self-contained neighbourhoods clustered according to different regional and local affiliations (MacDonald and MacDonald 1964). Through the study of the geographical places mentioned in the collections, as well as the newspapers’ places of publication, the ChroniclItaly collections provide an unconventional and traditionally neglected source for studying the transformative role of migrants in host societies.

On the whole, however, the novel contribution of the ChroniclItaly collections comes from the fact that they allow us to devote attention to the study of historical migration as a process experienced by the migrants themselves (Viola 2021). This is rare, as in discourse-based migration research the analysis tends to focus on discourse about migrants rather than discourse by migrants (De Fina and Tseng 2017; Viola 2021). Through the analysis of migrants’ narratives, by contrast, it is possible to explore how displaced individuals dealt with the social processes of migration and transformation and how these affected their inner notions of identity and belonging. A large-scale digital discourse-based study of migrants’ narratives creates a mosaic of migration, a collective memory constituted by individual stories. In this sense, the importance of being digital lies in the fact that this information can be processed on a large scale and across different migrant communities. The digital therefore also offers the possibility, perhaps unimaginable before, of a kaleidoscopic view that simultaneously apprehends historical migration discourse as a combination of inner and outer voices across time and space. Furthermore, as records are regularly updated, observations can be continually enriched, adjusted, expanded, recalibrated, generalised or contested. At the same time, mapping these narratives creates a shimmering network of relations between the past migratory experiences of diasporic communities and the contemporary migration processes experienced by ethnic groups, which can then be compared and analysed from the perspectives of both active participants and spectators.

Abby Smith Rumsey said that the true value of the past is that it is the raw material we use to create the future (Rumsey 2016). It is only through gaining awareness of these spatio-temporal correspondences that the past can become part of our collective memory and, by preventing us from forgetting it, of our collective future. Understanding digital objects through the post-authentic lens entails that great emphasis must be placed on the processes that generate the mappings of these correspondences. The post-authentic framework recognises that these processes cannot be neutral, as they stem from systems of interpretation and management which are situated and therefore partial. These processes are never complete, nor can they be completed; as such, they require constant updating and critical supervision.

In the next chapter, I will illustrate the second use case of this book—data augmentation; the case study demonstrates that the task of enriching a digital object is a complex managerial activity, made up of countless critical decisions, interactions and interventions, each one having consequences. The application of the post-authentic framework for enriching ChroniclItaly 3.0 demonstrates how symbiosis and mutualism can guide how the interaction with the digital unfolds in the process of knowledge creation. I will specifically focus on why computational techniques such as optical character recognition (OCR), named entity recognition (NER), geolocation and sentiment analysis (SA) are problematic and I will show how the post-authentic framework can help address the ambiguities and uncertainties of these methods when building a source of knowledge for current and future generations.
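One of the ambiguities at stake can be anticipated here in a deliberately simplified form: OCR noise propagates into every downstream technique. The snippet below uses hypothetical data and names, not the project's actual pipeline, to show how a single OCR character error is enough for an exact-match place lookup to fail silently, one reason why NER and geolocation over digitised newspapers require the critical supervision the post-authentic framework calls for.

```python
# Minimal illustration (hypothetical data) of why OCR quality matters
# for downstream tasks such as NER and geolocation: a single character
# misread by the OCR engine makes an exact-match lookup miss the place.

GAZETTEER = {"Philadelphia": "PLACE"}

def find_places(text: str) -> list[str]:
    """Return gazetteer place names occurring verbatim in `text`."""
    return [name for name in GAZETTEER if name in text]

clean = "La comunità di Philadelphia si riunisce."
noisy = "La comunità di Phi1adelphia si riunisce."  # OCR read 'l' as '1'

print(find_places(clean))  # → ['Philadelphia']
print(find_places(noisy))  # → []
```

The missed match produces no error message: the place simply vanishes from any map or frequency count built on top, which is exactly the kind of silent, unpredictable loss that makes these methods problematic as foundations for a source of knowledge.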