Overall, the results present a nuanced picture of the TRAC audit process as one in which the actors involved agreed on a classical definition of risk, but differed about whether an audit process based on this definition can determine trustworthiness with regard to long-term digital preservation. My findings demonstrate that while standard developers, auditors, and repository staff generally shared an understanding of the major sources of potential risk that face digital repositories, and which are identified through a TRAC audit, they disagreed about whether and how these risks can be mitigated and whether the evidence required for TRAC certification was sufficient to demonstrate trustworthiness with regard to the long-term preservation of digital content.
Interviewees discussed risk in ways that were consistent with the classical definition discussed above. For example, when asked how confident he was in the accuracy and completeness of the risk information that he received from his own team members during his repository’s audit, Repository Staff 18 explained that he did not think that his colleagues understood what risk meant for digital repositories, and that while it is relatively easy to find information about risk mitigation strategies it is more difficult to understand the probability and magnitude of consequences of a potential risk. This explanation highlighted an understanding of risk as calculable, but consisting of uncertain elements:
“Do I think that large amounts of people really understand how risk is constructed and what it means? No. … I think it’s relatively easy to get information about solutions and how things are implemented, and it’s harder to put that in a framework where you’re measuring the likelihood of it happening against the potential of it happening, and what the downsides are there, and how you tie specific numbers to that.”
The view illustrated by this interviewee reflects an understanding of risk in digital preservation as a calculable figure, even though the inputs to that calculation are uncertain. As with the classical model of risk, this understanding is based on an underlying assumption that people are rational actors who will understand risk information in similar ways and behave predictably in response to that risk information.
Potential sources of risk
Standard developers, auditors, and repository staff members conceptualized risk in the TRAC audit and certification process in terms of specific potential threats or sources of risk, which I have organized into five main categories: finance, legal, organizational governance, repository processes, and technical infrastructure. In the following sections I will examine each category in greater detail.
Interviewees across all three groups described financial uncertainty as a potential source of risk to the long-term preservation of digital content and framed their understanding of this threat in terms of long-term business planning and risk identification, although each group understood this risk and appropriate measures of risk mitigation differently. While auditors and repository staff agreed with the conceptualization of financial risk presented by the standard developers, they thought that the types of evidence posited by the standard developers to mitigate financial risk were insufficient.
Standard Developers 01, 02, 03, 06, 07, 08, 09, and 10 described uncertainty about funding sources and the lack of stable long-term funding as a significant source of potential risk for digital repositories. For example, Standard Developer 03 argued that financial viability was a potential source of risk because so few repositories have managed to secure long-term funding and remain operational, “Well other than repositories that are institutionally mandated, a long-term business plan is very difficult to come by. You know, there are a few long-lived digital repositories that aren’t institutional repositories, but there aren’t many that have lasted very long. So just how do you ensure that you’ve got adequate funding over the long-term when people’s interests change so rapidly?” This explanation highlights both the importance of long-term funding for digital repositories and the difficulty of securing that funding without an institutional mandate.
The perspectives presented by standard developers about financial sustainability as a potential source of risk for digital repositories are reflected in the text of the standard itself, which governed the audit process (Consultative Committee for Space Data Systems 2012b). It is through the development process for this document that the standard developers constructed and shaped an understanding of risk that includes threats to financial sustainability, and set expectations about how repository staff could demonstrate to auditors that they sufficiently identified and addressed those threats.
Despite their emphasis on the importance of financial sustainability, standard developers also recognized that securing long-term funding was a significant challenge for digital repositories. Thus, the succession plan requirement represented a workaround, or an alternate way for repositories to demonstrate the longevity of their digital content, “All of those sorts of things, and other repositories, the difficulty is the long-term funding, so in OAIS, the 16363, we kind of get around that by talking about having a succession plan” (Standard Developer 07).
As with the standard developers, auditors described succession planning as an important and necessary measure for repositories to mitigate the risk of organizational collapse due to insufficient funding, “I think that, in terms of the organization, they need to develop a succession plan and be very explicit about what's going to happen if their grant funding dries up, and if the membership starts to drop” (Auditor 10). Taking that a step further, Auditor 01 said that while it was important to know that the repository had a succession plan, it was also necessary for the repository to have tested that plan to ensure that transfer of digital content was possible, “Has that been tested? How many times have they tested that? What kind of variety of data have they tested it with?”
Repository staff agreed with the standard developers and auditors that financial sustainability was a potential source of risk for repositories and their content, “There's always a risk in that, with whatever might happen to that organization. Either a calamity, or loss of interest, or will, or funding, or whatever. There is a succession plan it says in there, so that's obviously a significant mitigating tool for that kind of failure of the organization. I think succession is tricky” (Repository Staff 06). Echoing the sentiments of Auditors 06 and 08, Repository Staff 05 said that while funding challenges are a common and substantial threat to digital repositories, in his experience most repositories do not have a succession plan, “I think a lot of institutions have been facing significant funding challenges … Do you even have a succession plan? I think a lot of places don't.”
Repository staff disagreed with standard developers and auditors about whether a succession plan was sufficient evidence of risk mitigation. Repository Staff 03, 06, 07, 12, and 21 all expressed skepticism that having a documented succession plan would ensure the longevity of a repository’s digital content, “I wasn't necessarily convinced that writing that down necessarily meant that it would sustain it” (Repository Staff 03). Repository Staff 12 was quite blunt in her assessment of succession planning as a futile activity. In a discussion about the infrastructure and security risk management section of the vignette, she argued that succession planning did not make sense because it is unlikely that a second repository would be able to muster the funding and support the first was lacking:
“What is really going to be the reason repositories are at risk, is almost all around having enough money to take care of the material . . . a succession plan to move it someplace else, where the community isn’t going to have enough money to take care of it. Or there’s going to be a, someone who magically dumps money on the secondary repository. Why couldn’t they dump money on the first repository? … It doesn't make sense.”
Repository Staff 07 went a step further and explained that by their very nature succession plans are unenforceable because they are only enacted when a repository fails. When asked about the greatest specific risk for his repository at the time of their audit, he said that a succession plan does not ensure that the successor organization itself will be financially viable long-term:
“ ... it’s almost like that's a weak link too because if you have a succession by definition you're gone afterwards so you can put a plan in place but you're not around to make sure that it's going to be executed. Just like you're not around forever your successors aren't necessarily around forever. Our successors are primarily universities and government agencies which all claim and pretend that they will exist forever, but you can't guarantee that so the succession plan doesn't actually spell out what's going to happen from now until the end of forever it just says that there's an agreement in place, it's a time limited agreement.”
Standard developers, auditors, and repository staff all agreed that loss of funding and/or institutional support was a potential source of risk for digital repositories and their content. Standard developers and auditors viewed succession plans as more viable evidence that a repository was prepared to address financial risk than did repository staff. Repository staff understood the reasoning behind succession planning but did not agree that a succession plan provided evidence that the digital content would outlive the repository. While they were happy to provide documented succession plans in order to achieve TRAC certification, they felt that they were performing rather than demonstrating trustworthiness.
Interviewees described legal issues, such as contracts, agreements, licenses, and copyright, as potential sources of risk for digital repositories. Both auditors and repository staff members agreed with the conceptualization of risk presented by the standard developers in this area. However, the auditors and standard developers expressed a shared view that agreements among organizations governing relationships that would impact the long-term preservation of digital information should be the primary focus of concern. Repository staff members, on the other hand, were more concerned that intellectual property issues would threaten the repository itself. In short, repository staff members were primarily concerned with the ability of their own organization to carry out its work, while individuals external to the repositories were more interested in external relationships. As with the example of succession planning above, standard developers and auditors believed that it would be possible for digital content to outlive an individual repository, while repository staff were skeptical that this would be the case.
Standard developers framed legal risk to repositories in Section 3.5 of the TRAC standard as something that was of particular importance in relation to access. Through the process of creating this text the standard developers established an understanding of legal risk as one that was a threat to both the repository and the digital content, and communicated to both auditors and repository staff members that it was necessary and important to “ensure that the repository has the rights and authorizations needed to enable it to collect and preserve digital content over time, make that information available to its Designated Community, and defend those rights when challenged” (Consultative Committee for Space Data Systems 2012b, p. 31). Standard developers set expectations for auditors and repository staff that a repository could demonstrate that it met this standard through a variety of properly executed legal documents.
Standard Developer 01 explained that the legal repercussions of releasing protected data could threaten the continued existence of a repository, “There’s laws in place in the U.S. I don’t know about the rest of the world, but certainly in the U.S., depending on what your repository is storing you may have very severe penalties imposed on you if you release information that’s supposed to be protected. The HIPAA [Health Insurance Portability and Accountability Act] is one example. There’s a Title XIII, which is census data. Both of those are legal systems where keeping the data under security controls is tantamount to keeping your organization from being ground by the wheels of justice.”
Alternatively, Standard Developer 07 said that standard developers were not concerned with threats posed to a repository by legal issues, but rather with threats to the digital information, “…we didn’t care if the repository itself was sued out of existence. What we were concerned about is that they were sued out of existence before it could hand over its data, its information.” For this interviewee, the legal danger to the organization, namely that a repository would be shut down before it could enact its succession plan, was a significant threat to the digital content.
Auditors did not devote much attention to legal risk during the interviews, but their discussions tended to express a shared understanding of risk in this area with the standard developers. Drawing from their experiences conducting audits as well as their own professional backgrounds in digital preservation, they focused on one aspect of the legal risk communicated through the TRAC standard: namely, that it was important for repositories to have the appropriate legal agreements in place in order to ensure that their relationships with partners and members were secure. For example, Auditor 01 said that repositories “should probably have some legal staff on hand” to manage contracts among partner organizations because negotiating and executing things like service level agreements is complex and time-consuming. He also said that when assessing a repository it is important to understand whether those agreements are reciprocal or not in order to fully understand relationships among organizations and the potential sources of legal risk that the repository faces, “Is this a reciprocal agreement and what kind of risks does that expose them to?”
While standard developers and auditors emphasized the importance of having the necessary legal agreements in place in order to allow a repository to carry out the work necessary for long-term digital preservation, repository staff were not convinced that these legal agreements would be enough. Indeed, they were more concerned that even if these legal agreements were in place, execution of the access permissions and/or restrictions specified in, for example, intellectual property agreements would somehow fail, “a lot of the complexity came from … being able to provide access in the right ways” (Repository Staff 01).
Repository staff presented a view of legal risk that included a great deal of concern about copyright and the threat posed to repositories that provided inappropriate access to digital content. Repository Staff 06 explained that “the risk of compromise to the content that's in copyright” was an area of vulnerability for repositories. For this interviewee, the threat of providing inappropriate access to materials with copyright restrictions was a potential legal threat to a repository. He went on to argue that access in general is an area of risk for repositories, and that the push to provide repository users with meaningful ways to access and interact with data can interfere with the core mission of preservation by pulling resources away from that work, “I think access in general is complicated and getting more complicated.”
Repository Staff 01, 02, 05, and 06 all described copyright as a potential source of risk for digital repositories. For example, intellectual property rights were described as a “ticking time bomb” by Repository Staff 02, who explained that repository cost models were complex sources of potential risk for repositories, “The way that national copyright factors into the cost model, which is two-dimensional and I think very complicated, but it probably needs to be multidimensional more than that because of copyright issues.” Despite this concern, he felt that the auditors who assessed his repository had an inflated sense of the threat that copyright issues posed to his repository. He said that he disagreed with their “sense of risk” with regard to in-copyright materials, but “didn’t feel it was worthy of dispute” in the final TRAC audit report.
While standard developers, auditors, and repository staff all found legal issues, such as contracts, agreements, licenses, and copyright, to be potential sources of risk for digital repositories, the groups focused on different types of legal risk (interorganizational agreements versus copyright) and different foci of risk (repository versus digital content). Standard developers and auditors focused on relationships among partner and/or member organizations, and argued that those relationships were a potential threat to both repositories and digital content, and that agreements were necessary in order to ensure and enforce a commitment to the mission of long-term digital preservation. Repository staff, on the other hand, focused primarily on intellectual property issues and the threat that violating copyright posed to their repositories. They also spoke about the complexity of the legal agreements governing relationships among partner and/or member institutions and expressed some skepticism about whether an external party would be able to understand the legal landscape of their repositories.
With regard to legal risks, repository staff were focused on TRAC certification as a marker of whether a specific repository could be considered a trustworthy home for digital content, while standard developers and auditors focused on certification as a marker of how likely it was that digital content could outlive the repository itself.
Interviewees described organizational instability as a potential source of risk for digital repositories and discussed the ways in which internal governance structures and the positioning of the repository within larger organizations (e.g., universities, consortia, partnerships, etc.) were possible threats to both a repository and its digital content. While standard developers emphasized the ways in which the requirements laid out in the TRAC standard would mitigate potential threats to organizational stability, auditors and repository staff members were skeptical about whether policies and documentation were meaningful as risk mitigation tactics. There was additional disagreement between auditors and repository staff concerning the efficacy of mission statements and policies. Repository staff members cited TRAC-certified organizations without clear mission statements and where staff members lacked a clear understanding of the overall mission of long-term preservation.
Section 3 of the TRAC standard focuses on organizational infrastructure and includes several subsections that specifically target governance, including Section 3.1 “Governance and Organizational Viability” and Section 3.2 “Organizational Structure and Staffing” (Consultative Committee for Space Data Systems 2012b). The Governance and Organizational Viability section specifies that a trustworthy repository should have a mission statement that reflects a commitment to digital preservation, as well as a strategic plan, a succession plan, and a collection policy that all reflect the mission of long-term preservation. The Organizational Structure and Staffing section also focuses on the need for appropriate staffing, position descriptions, and ongoing professional development to carry out the mission of long-term preservation. The standard developers’ view of organizational infrastructure and governance as a potential source of risk for digital repositories reflects a view of digital repositories as organizations that are at risk of losing focus on long-term digital preservation either because of mission scope creep or because parent or partner organizations have goals that differ from those of the repository. In this sense, they articulated in the standard an expectation that repositories will need to defend their focus on long-term preservation and that repository staff members should all understand how their roles serve that mission.
Standard developers discussed three areas of organizational governance as potential sources of risk for digital repositories: (1) institutional support, (2) leadership changes, and (3) organizational structure. Loss of institutional support was described by several standard developers as a major threat to digital repositories. For example, Standard Developers 01, 02, 05, 06, 08, and 10 all emphasized the potential risk for repositories and digital content associated with loss of support for the mission of long-term digital preservation. Standard Developer 05 said that uncertainty about organizational structure and staffing was a potential source of risk for digital repositories, “I think that the main question of uncertainty is related to the low level of organizational infrastructure, more than any other thing. Because if you have good people, at the right point, and the responsibility is well developed, the uncertainty could be covered.” This attitude toward organizational infrastructure and the emphasis on appropriate staffing of people with expertise reflected the TRAC requirements.
TRAC auditors reinforced this conceptualization of governance as a source of risk for digital repositories, focusing primarily on institutional support, “the most important aspect of a repository is having an organizational commitment with a mission that aligns with the repository” (Auditor 06). Auditors 01, 03, 04, 05, 06, 07, 08, and 10 described governance and organizational stability issues as both complex and uncertain. When asked to discuss the most significant sources of uncertainty for digital repositories, Auditor 03 discussed the uncertainty of long-term institutional support:
“We don't know if libraries are going to survive. We don't know if universities are going to survive. These institutions that support the...repository, are also at risk...We've constructed this organizational structure that includes digital repositories... I don't know who's going to support it in 50 years. I don't know if it's still going to be a library or a university or it's going to be some crowd funded thing...so I think that is the biggest risk for almost everything that we're doing now is knowing what's going to happen to these institutions because a lot of things are at risk right now.”
Auditors expressed attitudes similar to those of the standard developers when discussing the importance of governance in a TRAC audit. Reflecting the requirement described in the standard that digital repositories should have explicit mission statements emphasizing long-term preservation, for example, auditors described ongoing organizational support for preservation as a challenge for repositories, “Bottom line is it's a tremendous amount of resources required to do long-term preservation. Organizational commitment to those types of resources often waxes and wanes” (Auditor 05). This auditor went on to say that he thought that the organizational infrastructure elements of the TRAC checklist were more aspirational than realistic because in practice repositories lack support for long-term preservation. Auditors described repositories as organizations with competing priorities that must continually fight for resources to support long-term digital preservation efforts, and whose parent and partner organizations may or may not share their commitment to preservation.
As with the standard developers and auditors, repository staff members described governance and organizational stability as potential sources of risk for digital repositories, “I feel like the funding, the organizational governance, all those things are inherently risky and problematic” (Repository Staff 02). Like the auditors, these interviewees questioned whether TRAC certification could assess the stability of repository governance over time, “I think it probably could be quite difficult for any kind of certification program to validate how functional a governance system is” (Repository Staff 05). Repository staff members described policies and practices at their organizations that were complex and continually evolving.
While all of the repository staff members described long-term preservation of digital content as important for their organizations, there was disagreement about whether this should be the central mission of the repository. One interviewee in particular reported that his repository did not have a mission statement, and that their long-term goals focused on meeting user needs, which happened to include providing long-term access to particular content that was of interest to their Designated Community. In the documentation that this repository provided to auditors, the goals of their preservation efforts were articulated in the description of their Designated Community as providing long-term access to specific digital content for that community, but these preservation efforts were not described as part of the repository’s mission. When asked if there were any particular parts of the checklist or of the repository documentation that were particularly time-consuming to prepare, Repository Staff 18 described the workaround that his repository used to address the criteria in the standard without creating a mission statement for the repository that focused specifically on long-term preservation:
“One thing that was interestingly difficult to get was a sort of mission vision statement.... But it turns out we and a lot of other organizations don't have that existing in that form. Rather our mandate and our vision comes out of ... well, mandate comes out of the fact that the schools continue to pay money to us to exist. And our vision comes from our governance structure. So on some level you can say that our vision is to do what our community needs us to do. But that's not really useful in the context of the audit, so figuring out a way to answer those questions with our strategic plan, which we do have, took some time and some conversation.”
By questioning a central premise of TRAC certification and asserting that a repository need not have a mission statement reflecting a commitment to long-term preservation of digital content, the repository staff conceptualization of risk mitigation ran counter to that of the standard developers and auditors.
Overall, auditors shared the standard developers’ view of organizational instability as a potential source of risk for digital repositories. While the standard developers described, through interviews as well as in the text of the TRAC standard itself, strategies for repository staff to demonstrate that they had policies and procedures in place to mitigate this risk, the auditors took a more circumspect approach to verifying that repositories were mitigating this risk. They described institutional support for digital repositories as changeable and likely to decrease over time, and explained that it was easier for repositories to secure initial support for digital preservation than to maintain support. Auditor attitudes about governance as a potential source of risk questioned the notion that a one-time audit could assess whether a repository should be considered trustworthy in its ability to preserve digital content over the long term. Auditors were enforcing requirements from the TRAC standard in order to certify a repository as trustworthy, but were also skeptical about whether long-term trustworthiness with regard to governance could be determined in this way.
While standard developers and auditors agreed that a clear mission statement supported by well-documented policies would offset potential threats to repositories and digital content by ensuring that the repository maintained a focus on the goal of long-term preservation, repository staff were skeptical about the effectiveness of this type of documentation to offset these potential threats. Indeed, repository staff members said that they were able to provide the necessary documentation to achieve certification despite the fact that their repositories lacked the governance structures that they knew the standard was meant to enforce. In the case of repository documentation such as a mission statement, the difference between standard developers and auditors on one hand, and repository staff on the other, was in part a difference in perspective on the function of such documents. Unlike standard developers and auditors, repository staff did not see policies as necessarily reflecting actual repository practices. Repository staff characterized such policies as ideals, but also described their repositories as organizations that were shaped by power struggles and lacking in the social mechanisms needed to meet the ideals represented in their documentation.
Interviewees identified processes for digital object management as potential sources of risk for digital repositories and digital content. They discussed ways that metadata creation, file format management, and processes such as content ingest threatened the longevity of digital content as well as the ability of digital repositories to carry out their mission of long-term preservation. Auditors tended to agree with the view of risk presented by standard developers, but repository staff members argued that the actual work of managing digital content over time was not as straightforward as the TRAC standard implied. Repository staff members described the section of the TRAC standard focusing on digital object management as the one that generated the most disagreement with auditors during their audits, although they were ultimately able to communicate their practices and policies, and the reasoning behind them, well enough to obtain certification.
Section 4 of the TRAC standard, “Digital Object Management,” addresses repository processes as a potential source of risk (Consultative Committee for Space Data Systems 2012b). Subsections covering ingest, preservation, management, and access of digital content make clear that potential threats exist throughout the entire lifecycle of a digital object, and suggest that repositories can demonstrate that they have sufficiently identified and addressed those threats through documentation such as policies and procedures, workflows, and curation logs. Thus, it is not surprising that standard developers discussed these repository processes (e.g., digital object management, such as ingest, transformations, capture/creation and management of metadata, and content delivery) as potential sources of risk for digital repositories and digital content. They described the goal of digital object management as “selecting and preserving the information in a way that will be useful … as part of the long-term preservation goal” (Standard Developer 01). This interviewee further explained that digital object management in the context of OAIS and TRAC was about more than “just managing digital formats,” it was “concerned about preserving the information content, not just the format” (Standard Developer 01).
Metadata creation, capture, and maintenance were discussed by Standard Developers 01, 04, 08, and 09. They explained that it was important for repositories to understand their Designated Communities in order to know what type of representation information would be needed to preserve digital content for future use. In the words of Standard Developer 04, “The greatest risk is understanding what needs to be captured now so that the data can be understood in the future.” When asked if there were any checklist criteria that repositories were commonly unprepared to provide evidence for, this interviewee went on to explain that lack of understanding about how important metadata are for long-term preservation was a threat to the long-term viability of digital content:
“[W]hat metadata they have, whether it's representational information or context information, which is necessary for the use of data, oftentimes was ignored.” (Standard Developer 04)
For developers of the TRAC standard, having sufficient, appropriate metadata was crucial for long-term preservation of digital content, and this emphasis on representation information was reinforced through the standard.
Another common theme among standard developers was the challenge that file formats posed to long-term preservation, “the more formats that you are taking in and using for your AIPs [archival information packages], the more complex that gets, the combinatorics when you start talking about multiple file formats, multiple record types, compound records, software dependence of the records” (Standard Developer 03). Standard Developers 03, 04, 06, and 08 all discussed potential threats relating to file formats, including obsolescence, difficulties in sufficiently documenting unusual file formats, and the lack of the expertise, staffing, and funding needed to sustain the amount of work necessary to support a large number of different file formats within one repository. “I think most archives have preferred formats and then they have other formats that don't get the support that they need” (Standard Developer 08).
Standard Developers 04, 05, 07, 09, and 10 identified ingest, migration, and storage, as well as processes to verify the fixity or integrity of content, as potential sources of risk, “The fixity or the integrity of the data is critical” (Standard Developer 04). Indeed, when asked to identify potential sources of risk in the digital object management section of the vignette, Standard Developer 05 explained that a number of factors during the ingest process could negatively impact the repository and/or the longevity of the digital content:
“You have to maintain, as much as possible, the control of what is going to be transformed. Some properties [have] to be transformed. And of course in this case you can accept the transformation. You must accept. Because the digital preservation is dynamic. Formats change, digital signatures cannot be verified. So you have to build a documentation system able to document which kind of transformations have been done, on which basis. Because many of [these] transformation[s] are not reversible. They are forever. You have change and you are going to lose the original things and what was.”
The standard developers presented a view of repository processes for digital object management as requiring substantial documentation in order to ensure that future custodians and users of digital content would be able to access and understand that content. While standard developers focused on the potential threat to digital content posed by repository staff failing to understand what information to capture, create, and maintain, I found that auditors were more concerned that even when repository staff knew what policies and practices they should have, repositories lacked the staffing, expertise, funding, or organizational will to carry out that work.
Auditors described the work of digital object management as something that takes place across different functional areas of a repository, and explained that coordinating and managing this work was difficult. “In terms of the actual getting the work done from ingest to storage to metadata to access and all that, those functions can be spread all across the organization, whatever kind of organization they are. Being able to coordinate those functions and have clear lines of authority about when a policy is put in place, who has to adhere to it, and where the responsibility lies, that can be very difficult to do” (Auditor 01). Auditors 05 and 06 argued that repository processes for digital object management were a potential source of risk because of the likelihood that they would be abandoned or scaled back over time, “They start off with the goal of having defined processes, workflows, and all that sort of stuff, and over time a lot of that stuff gets either dropped or the period between things like migration activities or even just repository auditing activities expands as the organizations are pressed for resources and staff” (Auditor 05). Auditor 06 referred to these processes as “a series of handoffs…That you have to continually be touching, and curating, and evaluating content and digital collections or else they really will just die.”
In terms of errors that could occur in these processes, auditors argued that the stakes were high for repositories that focused on long-term preservation because of the likelihood that errors would go unnoticed for very long periods of time. For example, Auditor 09 identified human error as the greatest threat to digital repositories, “I think human failure, or failure in human-driven processes, which include a lot of technical processes. I mean, technical processes are only as good as the humans that develop them.”
Overall, auditors understood the view of risk provided by standard developers through the TRAC standard, and agreed that repository processes for digital object management were a potential source of risk for digital repositories and their content. Yet auditors were more focused on how a lack of human resources, human error, or a loss of resources would impact a repository’s ability to carry out the processes necessary for long-term preservation, while standard developers were concerned about whether repositories would understand the needs of their Designated Communities well enough to capture appropriate representation information for preservation and reuse, and whether their workflows and procedures were comprehensive enough to capture all of the actions applied to their collections over time. Standard developers assumed that addressing this potential source of risk was a matter of having enough information and technical knowledge about digital object management, while auditors questioned whether that information was knowable and argued that it would not be possible over the short term to assess whether a repository’s digital object management processes were successful.
As with the standard developers and auditors, repository staff members also focused on metadata, file formats, and repository processes for digital object management as potential sources of risk for digital repositories. Repository staff identified metadata as an area that could pose a potential threat to both the repository and the digital content. Repository Staff 03 explained that poor metadata management practices could negatively impact the usefulness of a repository for its users, “The devil’s in the details. You can maintain preservation metadata and do it well. Or you could do it poorly. And so risks, I guess, implicit there are if it’s not normalized, if it’s not taking advantage of controlled vocabularies or authority, things like that, then the quality of the preservation metadata, if it’s poor, could present a risk to the usefulness of the repository.” In addition to maintaining metadata over time, Repository Staff 07 emphasized that metadata objects change over time and it is important for repositories to keep pace with the changes to digital objects and their metadata in order to preserve digital content: “In terms of the actual content itself what we're finding is that it all changes, and in particular the metadata about objects changes a lot more than the underlying objects themselves. Both in terms of being enriched and enhanced over time, but also in terms of just being corrected.” Repository staff agreed with the standard developers that the work of maintaining file formats over time could pose a potential risk to repositories because of the amount of time and resources required.
However, repository staff disagreed that file format obsolescence or lack of expertise would be a problem for repositories. They argued instead that as long as there was sufficient interest and knowledge in the repository or its Designated Community they would be able to make sense of the digital content. For example, Repository Staff 04 pointed to current successes with outdated formats as an example, “You know, we've worried a lot in the preservation community about Word Perfect is gone. We can't read WordPerfect anymore or these weird file formats are gone and it's actually never been the case. We've never not been able to figure out what we've got, as long as we've still got it.”
Repository Staff 08, 12, and 15 spoke at length about processes to ingest content as costly and time-consuming. “Ingest of content is the most expensive piece, and it is where almost all the resources are spent. And unfortunately,…the content that is most at risk is the most expensive to ingest” (Repository Staff 12). Similarly, Repository Staff 08 stated it was costly to ingest digital content in a way that would support her repository’s mission of long-term preservation, “More often, however, the data would come to [repository] that had not been very rigorously produced or managed, and so it was expensive and time-consuming for us to process it in a way that allowed us to be confident of our preservation commitment.”
Repository staff painted a picture of digital object management processes as ongoing, time-consuming activities that required regular actions with no guarantee of long-term success. Repository Staff 07 explained that digital content requires regular attention in order to ensure the integrity of each item and make it usable for the Designated Community, “There's so many items that can simply become obsolete as well as physically degrade and long-term digital preservation requires handling the data on a regular basis, so that you actually are continually testing your assumptions that it's not only still there but still usable and fit for a particular purpose.” On the other hand, Repository Staff 11 argued that in practice digital object management processes required making compromises in order to balance this with other repository priorities, “One of the interesting things about being a preservation organization is that on the one hand you often have very high lofty ideals, but you have to balance that. There's a risk to meeting them. You have to balance that with the practical decisions.”
These attitudes were in contrast to the attitudes expressed by standard developers and auditors, who held that it was difficult for repository staff to meet the criteria set forth in the TRAC standard for digital object management, and that the discrepancy between the ideal and what repositories were likely to be able to accomplish presented a potential threat to repositories and content. Repository Staff 07 explained that this was an area of risk because best practices for managing digital objects for long-term preservation have yet to be established, “It quickly gets mind numbingly complex and [we] have not come to any really good future-proof answers that we're comfortable with in terms of identifying objects uniquely, and perpetually, and persistently.”
This disagreement between repository staff on the one hand and standard developers and auditors on the other, about whether meeting the criteria described in the standard would ensure the longevity of digital content, surfaced during the audit of Repository Staff 04’s organization, “there were a lot of revisions we had to do in our technical section because of that. I don't mean this as an insult, but they wanted clean, formulaic answers, and there just weren't any.” He emphasized that the auditors, following the TRAC standard, wanted his repository to provide clear responses to the criteria in Section 4, but the actual work of managing digital objects was complicated and messy. Indeed, several repository staff members identified this area as one where they disagreed with auditors, or where auditors required a substantial amount of additional information before they would agree to certify the repository.
Standard developers, auditors, and repository staff members all described processes for digital object management as a potential source of risk for repositories. While standard developers and auditors characterized digital object management as relatively straightforward and held that clear documentation of digital object management processes would mitigate risks in this area, repository staff argued that the actual work of managing digital content over time was not as straightforward as the TRAC standard implies. This was the section of the TRAC standard that repository staff reported as the most contentious during the audit process, because auditors wanted clear documentation communicating repository processes, and repository staff members viewed their processes for digital object management as complex and difficult to communicate via documentation in the way that the audit process demanded.
Interviewees identified threats to the technical infrastructure of digital repositories as a potential source of risk. Standard developers and auditors both viewed threats to technical infrastructure as identifiable and manageable, and argued that repositories that engaged in the environmental monitoring required by the TRAC standard would be able to understand and respond to these threats. While some repository staff members agreed with this perspective, others questioned whether their repositories would be able to identify actual threats, and thought that even if they did identify them they might not have the resources to respond.
Among standard developers, threats to the technological infrastructure of repositories were described as a significant but manageable source of risk for repositories and their content. These interviewees identified aging hardware and software, costliness of maintenance, and the ongoing work required to sustain trustworthy infrastructure over time as potential sources of risk and proposed straightforward solutions, such as equipment replacement, software upgrades, content migration, and up-front investment in infrastructure.
Standard developers argued that the technical infrastructure of a repository was both complex and continually evolving as new digital preservation solutions emerged. “The already complex world of hardware and software platforms. The concept of the virtual computer has not proved to be very successful yet. We’re stuck right now in, really, taking baby steps in terms of our hardware and our software approaches to digital preservation. We’ve got to get some kind of more universal, more virtual approach, to how we can preserve all formats of digital materials” (Standard Developer 06). This complexity, they explained, required continual monitoring in order to keep abreast of changes in the environment. “For example, one of the areas of the audit and certification standard is concerned with regular monitoring of changes in the environment, and that's complex because it can mean hardware obsolescence” (Standard Developer 09).
Section 5 of the TRAC standard, “Infrastructure and Security Risk Management,” echoes this belief in the importance of ongoing monitoring in order to maintain up-to-date hardware and software and cites the importance of tracking “when hardware or software components will become obsolete and migration is needed to new infrastructure” (Consultative Committee for Space Data Systems, 2012, p. 65). Through this document, the standard developers frame threats to technical infrastructure as identifiable, often predictable, and as something that can be addressed before it becomes a problem.
While standard developers framed threats to technical infrastructure as manageable, they did point out that people were one of the biggest challenges in mitigating these threats, “The difficulty is always people. The hardware and software is always going to be much easier” (Standard Developer 07). For example, while it might be relatively simple to set a timeframe for hardware replacement, it may be difficult to secure the necessary funding to follow that replacement schedule, “You can't tell a resource allocator … that you're going to basically wipe out everything and replace it all in three years. That what is brand new and spiffy and perfect now will all be gone in three years because it will be inadequate. Resource allocators don't like to hear that” (Standard Developer 06).
In contrast, when asked what he considered to be the greatest risk or threat that digital repositories face, another standard developer argued that the cost of storage decreases exponentially over time, and that securing funding for long-term preservation was more about the ongoing work of digital object management than about infrastructure:
“So whereas initially [a] petabyte may be on one or two, maybe it's two tapes, in three years time it’ll be on a small part of one tape. In another three years, it'll be on a tiny part of one tape, and in another three years it'll be next to nothing on a tape, and so the management of it will be negligible from then on because it's just this much of a tape and that's nothing in terms of the cost of the tape and the processing to check these things. So all of that is significant in terms of the costs, so then the costs come to actually making sure the data is usable.” (Standard Developer 07)
Standard developers framed threats to the technical infrastructure as ongoing and manageable. They articulated a view of long-term preservation in which digital content is expected to survive but the technologies used to store, preserve, and access it are not. Through the text of the TRAC standard, they communicated to both auditors and repository staff that a TDR should be able to demonstrate a firm understanding of the limitations of its infrastructure, and an ability to preserve digital content beyond the lifespan of any given part of that infrastructure.
Auditors agreed with standard developers about the importance of technical infrastructure and the notion that threats were significant but manageable for digital repositories, “A technical infrastructure is not difficult. It may cost you a bunch of money, but it's a solvable problem and you kind of assume it's robust given that there are processes and checks and all sort of things in place to verify that it's robust” (Auditor 09). The expectation that money could solve problems relating to technical infrastructure was shared by several auditors, “Everything else from a, comes down to the challenge of technological change, but a lot of the technological change can be mitigated with sufficient resources” (Auditor 05). Similarly, auditors argued that in addition to having sufficient resources, having appropriate staffing with the right kinds of expertise was also important for mitigating threats to the technical infrastructure of a repository, “The biggest thing I learned is that the human factors are more important than the technology factors. Because the technology factors, as long as you have good people and support for the technology, you can do that” (Auditor 08). Auditor 08 went on to explain that both the hardware and software of a repository require specialized knowledge and expertise, but that in general technologies for digital repositories are well known.
Implicit in this perspective is the assumption that with enough resources and the right kind of expertise, potential sources of risk to a repository’s technical infrastructure can be ameliorated. In the context of a TRAC audit, one auditor explained that an important goal of the site visit is to inspect the physical infrastructure, including equipment, software, and facilities in order to confirm that the documentation provided by repository staff accurately represents the repository, “You're there to gather evidence of facts, so yes, there is a data center and its doors are locked and under alarm. There is earthquake monitoring. So you know, one responsibility was to see things, okay? And I think that's really important. You see staff, you see equipment, you see servers, you're shown auditing software, and audit reports, and system logs, and all kinds of things. You see them live. So you're bringing evidence yourself, you're a witness” (Auditor 10).
Overall, auditors agreed with the view communicated by standard developers that although threats to the technical infrastructure of a repository were serious, they were also knowable and manageable. Both standard developers and auditors developed a view of potential sources of risk in this area as issues that repositories seeking TRAC certification should be able to identify and mitigate.
While repository staff members agreed with standard developers and auditors that threats to technical infrastructure were potential sources of risk for digital repositories, repository staff expressed mixed attitudes about the manageability of those threats. Some repository staff members agreed with the view of technical infrastructure as a potential source of risk that was manageable while others argued that technical issues could not be separated from other aspects of repository management, such as funding and staffing, and that problems in those areas had the potential to make threats to technical infrastructure intractable.
Several repository staff members described examples from their own experience in which staffing issues compounded potential sources of risk relating to the technical infrastructure. For example, one interviewee explained that staffing issues, including turnover, created instances where repository software was not understood by repository staff. “[We have] various generations of software and they've been developed by different people. We're not a huge organization, obviously, so it's not that big a deal, but we certainly have pieces of software that people are like, I have no idea what that is. Or, I know what that is, but I didn't write it. So I think that's really where most of our complexity lies” (Repository Staff 04). The stakes of not understanding repository software can be particularly high in instances where repository staff think that they understand their infrastructure and fail to catch problems until it is too late, “So that's a vulnerability. Especially software. You think it's doing one thing. Everybody thinks it's doing one thing, and then you find out if it's doing something else, and then maybe it's too late” (Repository Staff 05). When asked how his role and experience influenced his understanding of the risks that his repository faced at the time of its audit, Repository Staff 03, an IT manager, described his approach to managing technical infrastructure as being driven by a desire to prevent the repository from being affected by a failure:
“As far as the technical infrastructure too, I never wanted us to be impacted by failures. I never wanted to say, ‘We had some sort of system failure but, we think everything's okay.’ Or ‘Service was down for this time because of some unplanned thing that we didn't understand.’ I really tried to keep everything to a high bar in terms of those kinds of technical considerations. Redundancy for all of the – Also, I didn't want to respond to crisis. I didn't want my staff to have to respond to crises. You know?”
In addition to questioning whether breakdowns in technical infrastructure would be identified, repository staff also argued that repositories could not assume that they would always have the staffing levels to support their infrastructure and respond to potential threats, “We have three now. But we're still doing the work that we did when we were seven. So there's things that are not happening that I wish were happening. You know, even on a systems side” (Repository Staff 16).
Interviewees largely identified threats to the technical infrastructure of repositories as a potential source of risk that was straightforward and within the power of repository staff to address. While repository staff shared the understanding of this potential source of risk as communicated through the TRAC standard, they disagreed about whether responding to threats in this area would be as clear-cut for their repositories as the standard developers and auditors assumed it would be. While some repository staff members agreed with standard developers and auditors in their characterization of technical infrastructure as manageable, other repository staff members argued that other areas of repository management such as funding and staffing would prevent their repository from maintaining the level of expertise needed to identify and mitigate threats to their technical infrastructure.