
1 Introduction

A key records management standard, ISO 15489-1:2016, defines records management (RM) as the “field of management responsible for the efficient and systematic control of the creation, receipt, maintenance, use and disposition of records, including processes for capturing and maintaining evidence of and information about business activities and transactions in the form of records” (ISO 15489-1:2016). To ensure evidence and to systematically control the life span of records in a digital environment, we need additional data about the records’ background, history, and the actions that created them. In archives and records management, the transition from the paper era to the digital environment led to a paradigm shift in which archives could no longer be managed and considered as physical objects and entities. It was realized that managing existing records alone was not enough: in a digital environment, the premise for managing records had to be the process that produced them. Archival theorist Terry Cook (2001, 4) said that:

For archivists, the paradigm shift requires moving away from identifying themselves as passive guardians of an inherited legacy to celebrating their role in actively shaping collective (or social) memory. Stated another way, archival theoretical discourse is shifting from product to process, from structure to function, from archives to archiving, from the record to the recording context, from the “natural” residue or passive by-product of administrative activity to the consciously constructed and actively mediated “archivalisation” of social memory.

This shift to processes and functions can be seen in records management metadata, whose roots are in the 1990s. By then, it had become obvious that recordkeeping professionals (records managers and archivists) must also manage electronic information and abandon their traditional role as custodians of physical documents only (Bearman, 1994; Cook, 1994; Gilliland-Swetland, 2005; Sprehe, 2000). This raised the questions of what it means for something to be a record in an electronic environment and what the requirements are for systems that manage electronic records. These questions were studied in research projects of the University of Pittsburgh and the University of British Columbia (for details, see, e.g., Marsden, 1997). The projects formed the basis for later national and international specifications for electronic records management systems (Gable, 2002; Wilhelm, 2009). The specifications typically set requirements for the metadata and functionality that an electronic records management system must have (e.g., the system must capture the date and time when information is stored in it and prevent unauthorized destruction and modification of records). Because of the projects, it became an axiom in archival science that records systems must link records to the business activity or transaction from which they arose (Lappin et al., 2021). This is done by assigning records a place in a functional classification scheme. Today, the international standard on records management metadata (ISO 23081-1:2017) shows broad consensus about the content of metadata.

Creation of metadata is resource consuming. For the person receiving or creating a record, adding metadata manually is often a superfluous step for which there is no motivation, because it usually does not benefit the work task at hand and makes the process slower. One solution to this problem is to hide and automate records management processes. In this chapter, we examine how this has been done in Finnish public administration. Firstly, the chapter contributes to the discussion about the description of records management processes and adds to the understanding of the possibilities for adding metadata to records. Secondly, we aim to stir up interest in the use of the concept of paradata in recordkeeping and invite discussion of the benefits of understanding some recordkeeping metadata as paradata.

2 Information Control as Part of Information Governance

Controlling information via automated records management processes and metadata serves the goal of information governance (IG) or records/information management, which is, according to Brooks (2019, 14), “supporting an organization to manage, secure, access and exploit its information in complex digital environments across a myriad of locations.”

Today, alongside the relatively narrow concepts of RM and records and information management (RIM), the more holistic and broader-reaching view of IG has attracted extensive interest among recordkeeping professionals. Whereas RM and RIM focus on control of the creation, receipt, maintenance, use, and disposition of records, the concept of IG covers a more wide-ranging area of organizations’ information needs.

“In short, IG is about information control and compliance” (Smallwood, 2014, 6). Smallwood (2014) sees information governance as a subset of corporate governance. It is about standardizing and systematizing the handling of information. It focuses on access, control, management, sharing, storing, preserving, and auditing of information. Organizations’ policies, processes, and technologies to manage and control information must be complete, current, and relevant. Further, this includes “[…] who is able to access what information, and when, to meet external legal and regulatory demands and internal governance policy requirements” (Smallwood, 2014, 6).

Yet, the concept of IG is still vague and there is no one commonly accepted definition of it. A high-level nature and a broad scope characterize the various definitions of the concept (Brooks, 2019). A decade ago, Hagmann (2013, 229) stated that “The RIM community tries to capitalize this term [information governance] in order to get a seat at the table of senior executives and to get out of the dusty image of records administration in a paper environment.” Lately, other aspects based on a genuine need to broaden the focus of current administrative recordkeeping have also arisen (Brooks, 2019). Smallwood’s definitions of IG presented above, which point to information control, fit well in the Finnish public sector recordkeeping context, in which a proactive recordkeeping strategy based on organizational functions has traditionally been dominant.

The Finnish Act on Information Management in Public Administration (906/2019), which was published after, and in part as a consequence of, the GDPR (General Data Protection Regulation) of the European Union, has been crucial for the wider understanding of what records management is about. The act made it obvious that in a digital environment one needs a broader approach to information management, of which records management is only one part. Although it is possible to manage records without automatic information control, information control allows the automation of records processes and their connection to the information resources of the organization.

In Finland, the National Archives has traditionally played a strong role in guiding public sector organizations’ records management. In 2005, the National Archives’ SÄHKE specification started to stipulate the requirements and features for the digital archiving of records in information systems. The Finnish SÄHKE2 specification includes a metadata model whose purpose is to ensure the evidentiality, integrity, and usability of records (Mäkiranta, 2020). Following the GDPR and subsequent regulation, and in accordance with the role of the National Archives, the SÄHKE2 specification has served only as a recommendation for agencies since the beginning of 2023.

3 Records Management Metadata and Paradata

The concept of paradata is ambiguous. Current definitions of paradata often reflect a certain perspective, for example the context of education or research methodology (Pomerantz, 2015), surveys (Kreuter, 2013), or heritage visualization (Baker, 2012). Sköld et al. (2022) discuss paradata in different information domains as well as the close connection and overlap between the concepts of paradata, metadata, and provenance data. Only recently has the concept of paradata been introduced to the archival and recordkeeping sphere, in studies focusing on paradata in AI-based automation (Davet et al., 2022, 2023). As Davet et al. (2023) state, there is conceptual overlap between the development of paradata for AI and that of contextual metadata and explainable AI.

Hence, in archives and records management, the concept of paradata is only emerging in theoretical and practical discussions and thus mostly represents uncharted territory. The concept is barely mentioned in studies in this research area. Studies focusing on metadata or data processes (see, e.g., Bak, 2016; Sundberg, 2013) do not use the term paradata. In similar fashion, the Finnish SÄHKE2 specification calls all data describing the context, content, structure, management, and handling of information metadata (Arkistolaitos, 2008a). Nevertheless, the concept of paradata might be applied in this area, too.

In a digital environment, adding metadata is inevitable and an established practice in the handling and archiving of records. It is questionable whether we should even call it adding, since in a digital environment most metadata is added automatically by the recordkeeping system. Some of it, though, is still added explicitly by a human. Metadata is part of the record, part of its content. Meta and data can no longer be separated the way they are, or were, in the world of paper records (Bak, 2016).

If metadata is defined as information that helps in semantic interpretation of the data, and paradata is all other information about the background, administration, and use of the data, records management metadata belongs almost exclusively to the category of paradata. Although records management metadata may help to interpret the records, it is not generally about the meaning or content of data.

The metadata can be broken into the following components (ISO 23081-1:2017, 16):

  1. metadata about the record itself;

  2. metadata about the business rules or policies and mandates;

  3. metadata about agents;

  4. metadata about business activities or processes;

  5. metadata about records management processes.

Figure 1 gives an example of the diversity of metadata in archives and records management area. It describes entities in records management metadata. For example, there are metadata about agents, mandates, and business (McKemmish et al., 1999).

Fig. 1
A flow diagram. Agents or people, businesses, and records integrate with each other, further integrating with business recordkeeping. Mandates govern business and business recordkeeping and establish competencies for agents or people. Records provide an account for the execution of mandates.

Coverage of recordkeeping metadata (McKemmish et al., 1999, 15)

In records management, metadata often has a temporal triptych structure: the metadata gives information about the current status of records, but also about future and past actions. For instance, metadata may tell that access to records is now restricted, but that the access restrictions will be removed in the future. Once there are no more access restrictions, the metadata will show what restrictions there have been in the past. This reflects the basic conception of records as evidence of past actions.
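The temporal structure described above can be illustrated with a minimal Python sketch. The `AccessRestriction` class, its field names, and the example dates are hypothetical and are not taken from ISO 23081-1 or SÄHKE2; the sketch only shows how a single piece of metadata can describe a future, current, or past state depending on when it is read.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class AccessRestriction:
    """Hypothetical access-restriction metadata with a validity period."""
    reason: str
    valid_from: date
    valid_until: Optional[date]  # None means no planned end date


def restriction_status(restriction: AccessRestriction, today: date) -> str:
    """Classify the restriction as 'future', 'current', or 'past'."""
    if today < restriction.valid_from:
        return "future"
    if restriction.valid_until is not None and today >= restriction.valid_until:
        return "past"
    return "current"


# Invented example values, for illustration only.
restriction = AccessRestriction(
    reason="personal data",
    valid_from=date(2020, 1, 1),
    valid_until=date(2045, 1, 1),
)
```

The same stored values answer three different questions: read in 2024 they describe a restriction in force, read after 2045 they document a restriction that once applied.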

Metadata accumulates throughout the records’ life span. As discussed above and shown in Table 1, there are different types of metadata in records management. Metadata is largely about the context of records: their background, administration, and use. Some metadata is added at the point of capture, that is, when the records are stored in a records management system. After capture, metadata is complemented, and this continues even when the record has been archived. A study of records in an electronic records management system showed that 65% of metadata was about event history (Kettunen & Henttonen, 2010).

Recordkeeping metadata describes records provenance and relationships that define authenticity, reliability, accountability, and accessibility of digital records throughout the records’ life span (Fig. 1 and Table 1). All this data is called metadata. Some of it is various contextual information that is needed to understand the record’s provenance and its connections to other records. Much of the data, however, is something else, information about the process and various agents that are related to the record during its life span.

Table 1 Examples about types of metadata in records management. Created from ISO 23081-1:2017

4 Cost of Metadata Creation

Studies have shown that capturing metadata about data context is generally expensive and labor intensive (Faniel et al., 2019). Records management metadata is no exception.

Metadata schemes in records management are broad. For instance, the first version of the Finnish SÄHKE metadata specification includes over 120 elements, many of which can be used at different levels of the hierarchy (Records Creator, Collection of Records, Record Series, Matters, Transactions, Records). Altogether there are about 280 possible combinations of metadata element and entity. A study showed that in one electronic records management system more than half of the metadata elements were unused (Kettunen & Henttonen, 2010). A reason for this may be that, while a records management metadata scheme must be prepared for all eventualities, not all parts of the scheme are necessary when it is concretely applied. For instance, if the agency does not take part in eGovernment service processes, elements supporting eGovernment services are unnecessary.

The same study showed that optional metadata elements in the scheme were generally ignored, and only mandatory elements had values. Metadata values come from the system, are default values based on user selection, or are free-text values given by the user. A closer inspection of the elements suggested that human intervention was minimal: it seems that users avoided inputting metadata (if they had a choice), and they also preferred to accept default values as such (Kettunen & Henttonen, 2010).

Altogether, the fact that only mandatory values were given, and that they were generally generated by the system, reveals the high cost of metadata creation. The metadata guarantees the authenticity, reliability, and usability of records in the long run, but generating it generally makes work processes slower and does not benefit the immediate work task at hand. In addition, users who are not professionals in information management or recordkeeping may find it difficult to assign records a place in the organization’s functional classification scheme. A Finnish study showed that even experienced professionals face difficulties in using functional classification schemes (Packalén, 2015). Therefore, one possibility is to hide records management processes from the users and automate them as much as possible. This can be done as part of the integration of records and business systems, which can be carried out in several ways (see, e.g., DLM Forum Foundation, 2011, 16–18).

5 Principles of Information Control

The conclusion of the research projects of the University of Pittsburgh and the University of British Columbia in the 1990s was that electronic records management requires (among other things) contextualization of records by preserving information about their functional context and relationships between records.

Finnish recordkeeping had elements supporting contextualization already before digitalization. As in the Nordic countries in general, the practice of keeping registries has been common in Finnish administration for centuries. In other words, incoming and outgoing letters have been recorded in a registry book, card file, or database. A registry entry has joined records together and linked them to a common process, and even to a function (if the registry classification scheme is function based). In short, a registry has given information a context. Another noteworthy characteristic of Finnish recordkeeping is that the functional approach was adopted as a starting point for records management at the beginning of the 1980s. Thus, many agencies had a functional classification scheme even before digitalization. This classification scheme formed the core of the records management plan that every agency was required by law to have, and which listed record types by function and gave instructions for their retention and management. The first national specification for electronic records management systems (known as SÄHKE1) from 2005 required that this plan, now in digital form, be the source of metadata for electronic records (Henttonen, 2023). Besides registry information, the functional classification scheme was thus another source of information about the context of the records.

The next phase took place in 2008, when the National Archives Service of Finland (currently the National Archives) introduced a new approach to improve information management processes. According to the SÄHKE1 specification, the records management plan was to be included in the electronic records management system. The plan contained information about functions and the record types generated in them, and gave default metadata values for the retention and management of record types. The next phase brought two changes. Firstly, the records management plan was now separated into a system of its own (the Information Control System) and complemented with information about the process steps that are taken in the function. Information control was defined as the management of the information management process in an information system (JHS 191, 2015). Secondly, the idea was that the plan, now called the Information Control Plan (ICP), would be the source of records management metadata across the information systems of an agency: as the process goes forward, information systems get metadata values from the agency’s ICP. This is shown in Fig. 2, which gives an example of a process from a recordkeeping perspective. The organization’s ICP provides the metadata needed in records handling in the organization. These metadata are then stored in the organization’s information system.

Fig. 2
A process flow diagram. It starts from the customer to the organization to the I C P, and then to the information system. It involves recording, receiving confirmation of reception, processing, creating a record, requesting rectification, storing metadata, and making a decision.

An example of a recordkeeping process using ICP. Translated from Arkistolaitos (2008b, 3, appendix 1)

Besides the SÄHKE specification, legislation about information management, archives, and freedom of information affects information control. There are guidelines and instructions that help to identify and describe business processes in an Information Control Plan. Legislation also defines what information must be included in a registry.

At the core of the Information Control System is a plan with an enumerative functional classification scheme, which lists all the functions of the agency and, in addition, the process steps and record types that are generated or received in each function. This plan has default metadata values for controlling access to information, managing information security and data privacy, and supporting e-services. The Information Control System that contains the plan interacts via Application Programming Interfaces (APIs) with electronic records management systems, electronic archives, and other information systems that process and create records (Kuntasektorin arkkitehtuuriryhmä, 2016). As shown in Fig. 3, an Information Control System may control several information systems that, to varying degrees, also interact with each other. Not every information system necessarily stores records in an archival system.

Fig. 3
An illustrated flow diagram. Information systems 1, 2, and 3 with information control functionalities integrate and create, receive, process, or store records. These further integrate with electronic archives, which store data. All are connected to an I C S that manages information management rules.

Interaction of the Information Control System with other systems. Translated from JHS 176, 2012

Consequently, in practice, the content of an ICP consists of a functional classification describing the functions of the organization and descriptions of the organization’s operational processes. A process description describes an operational process from the recordkeeping point of view. It includes administrative (or other) process stages and the transactions that take place in the process stages, as well as the record types involved. A process stage is an entity that includes one or more transactions. In administrative processes there are common administrative process stages (such as initiation, preparation, and decision making) that are similar in every process. A transaction is a single task that takes place as part of a process. The JHS 191 recommendation for public agencies (JHS 191, 2015) gives three alternative ways of structuring the content of an Information Control Plan. Thus, the plan may list:

  • Record types by process stage,

  • Record types by process stage and supplementary transaction(s) specified by the organization, or

  • Record types by transactions that are grouped by process stages.

In short, the organization may choose how it describes its processes in the ICP and how detailed the process descriptions will be. Process descriptions are manually generated data about the processes of the organization. When a process is changed, the description must be updated accordingly. The Information Control System at hand and the agency’s other information systems may set limitations on the descriptions and dictate their appropriate level.
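The most detailed of the three alternatives (record types by transactions grouped by process stages) can be sketched as a small data structure. This is only an illustration: the class names are hypothetical, the stage names follow the common administrative stages mentioned above, and the transactions and record types are invented rather than taken from JHS 191 or any real plan.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Transaction:
    """A single task that takes place as part of a process."""
    name: str
    record_types: list[str] = field(default_factory=list)


@dataclass
class ProcessStage:
    """A stage that groups one or more transactions."""
    name: str
    transactions: list[Transaction] = field(default_factory=list)


@dataclass
class ProcessDescription:
    """A process description as it might appear in an ICP (hypothetical shape)."""
    title: str
    stages: list[ProcessStage] = field(default_factory=list)

    def record_types_by_stage(self) -> dict[str, list[str]]:
        """Collapse the description to the coarsest alternative."""
        return {
            stage.name: [rt for t in stage.transactions for rt in t.record_types]
            for stage in self.stages
        }


# Invented example content, for illustration only.
opinion_process = ProcessDescription(
    title="Giving Opinions",
    stages=[
        ProcessStage("Initiation", [
            Transaction("Receiving a request for opinion", ["Request for opinion"]),
        ]),
        ProcessStage("Preparation", [
            Transaction("Drafting the opinion", ["Draft opinion"]),
        ]),
        ProcessStage("Decision making", [
            Transaction("Answering the request for opinion", ["Opinion"]),
        ]),
    ],
)
```

Calling `opinion_process.record_types_by_stage()` yields the first alternative (record types by process stage), which shows that the coarser descriptions can be derived from the detailed one but not the other way around.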

Agencies do not have concrete instructions for creating an ICP besides general-level requirements for metadata in legislation and sparse instructions, rules, and regulations given by the National Archives. The SÄHKE2 specification used to require (currently it only recommends) describing the agency’s processes, but it does not define a sufficient level of detail for the descriptions. This is at the discretion of the agency. It is generally assumed that agencies already have a functional classification scheme and management data for the record types that are generated or received in the functions. This information is in the records management plans that preceded information control. Thus, if an agency wants to implement information control, it basically only needs to add process descriptions. To create a plan for information control, one needs information from laws and statutes, regulations and standing orders, strategies, quality handbooks, and process descriptions. One needs to consult the SÄHKE2 specification and the recommendation for the structure of the Information Control Plan, JHS 191. One must also consult the professionals who are responsible for the functions and the processes.

If the agency has analyzed its business processes for other (i.e., business) purposes, those descriptions naturally help in drafting the ICP. However, in an ICP, processes are described from recordkeeping viewpoint. There are no studies about whether and to what extent the result differs from process descriptions that have been created for Business Process Re-engineering, for example. The process in an ICP follows the phases that are carried out in organization’s information system, including every possible record type, when handling the matter, e.g., in a recruitment process.

What this means in practice can be seen by looking at administrative procedures, which have been the most common targets for information control. For instance, one thing that agencies do is give opinions. The ICP has a hierarchy of classes in the functional classification scheme:

  • 00 General Administration

  • 00 00 Steering and Development

  • 00 00 02 Opinions

  • 00 00 02 00 Giving Opinions
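The hierarchy above can be read as a chain of prefixes: each class code extends its parent by one segment. A minimal sketch of resolving a class code to its full path follows, assuming that codes consist of space-separated segments as in the example; the function name `class_path` and the dictionary layout are hypothetical.

```python
from __future__ import annotations

# Class codes and names copied from the example hierarchy.
SCHEME = {
    "00": "General Administration",
    "00 00": "Steering and Development",
    "00 00 02": "Opinions",
    "00 00 02 00": "Giving Opinions",
}


def class_path(code: str, scheme: dict[str, str]) -> list[str]:
    """Return the class names from the top level down to the given code."""
    segments = code.split()
    # Every prefix of the code is itself a class in the scheme.
    return [scheme[" ".join(segments[:i])] for i in range(1, len(segments) + 1)]
```

Because the functional context is encoded in the code itself, a record that carries only the value `00 00 02 00` can still be placed in the whole hierarchy.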

The lowest level in the scheme is the name of the process. This lowest level groups together process stages, transactions, and related record types. Process stages are common for all administrative processes. Therefore, a process 00 00 02 00 Giving Opinions may be presented in ICP as exemplified in Table 2. Default metadata values governing retention time, access restrictions, etc. of the record types are omitted here. Process stages, transactions, and record types as such are one sort of contextual information (paradata) about records that belong to the process. They help users of the systems to proceed consistently and provide them essential information about the past and forthcoming steps in the process.

Table 2 A possible process description of the process of Giving Opinions

Ideally, the agency has SÄHKE2 compatible Information Control System with the appropriate Information Control Plan, the plan is integrated with the electronic records management system and other information systems, and these systems have been adapted to information control. In that case the records process would go as follows:

The user (or personnel in the registry office, depending on agency policy) opens a new matter in the system and chooses the right class for the matter from the functional classification scheme. To support this, the Information Control Plan may include additional information (metadata) that helps the user to make the right choice. The user checks the default metadata values and corrects them when necessary. The user may also add supplementary information, such as the title of the matter or the civil servants handling the matter. Finally, the user adds the matter to the registry, and it is assigned a unique registry identification number.
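The opening of a matter can be sketched as a merge of default values from the plan with the user's corrections. The `Registry` class, the metadata keys, the default values, and the `REG-00001` identifier format are all assumptions made for illustration; they do not reproduce SÄHKE2 or any actual system.

```python
from __future__ import annotations


class Registry:
    """Toy registry that opens matters using default metadata from an ICP."""

    def __init__(self, icp_defaults: dict[str, dict[str, object]]) -> None:
        self._icp_defaults = icp_defaults
        self._next_number = 1
        self.matters: dict[str, dict[str, object]] = {}

    def open_matter(
        self,
        class_code: str,
        title: str,
        overrides: dict[str, object] | None = None,
    ) -> tuple[str, dict[str, object]]:
        if class_code not in self._icp_defaults:
            raise KeyError(f"Class {class_code!r} is not in the plan")
        # Defaults come from the plan; the user's corrections take precedence.
        metadata = {**self._icp_defaults[class_code], **(overrides or {})}
        metadata["class_code"] = class_code
        metadata["title"] = title
        registry_id = f"REG-{self._next_number:05d}"  # hypothetical ID format
        self._next_number += 1
        self.matters[registry_id] = metadata
        return registry_id, metadata


# Invented default values, for illustration only.
registry = Registry({"00 00 02 00": {"access": "public", "retention_years": 10}})
```

In this sketch the user supplies only the class, the title, and any corrections; everything else is inherited from the plan, which mirrors the idea of keeping manual metadata entry to a minimum.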

When the Information Control Plan contains process steps, the only path forward allowed is to follow the process description. Although the Information Control Plan has default steps, the user may skip some steps if necessary and, e.g., go from Commencement directly to Decision making if there is no need for preparatory phases. The user cannot add any new steps.
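The rule that steps may be skipped but not added can be expressed as a simple check against the ordered list of steps in the plan. The sketch below is an assumption-laden illustration: the intermediate step name "Preparation" is hypothetical, and the choice to reject moving backward is an interpretation of "the only path forward", not something the specification is quoted as requiring.

```python
from __future__ import annotations

from typing import Optional

# Ordered steps as they might appear in a plan; "Preparation" is a hypothetical name.
PLAN_STEPS = ["Commencement", "Preparation", "Decision making"]


def advance(plan_steps: list[str], current: Optional[str], target: str) -> str:
    """Move a matter to `target`, allowing skips but no steps outside the plan."""
    if target not in plan_steps:
        raise ValueError(f"{target!r} is not a step in the plan")
    if current is not None and plan_steps.index(target) <= plan_steps.index(current):
        # Assumption: a matter cannot move backward or stay in place.
        raise ValueError(f"Cannot move from {current!r} to {target!r}")
    return target
```

For example, `advance(PLAN_STEPS, "Commencement", "Decision making")` succeeds because skipping the preparatory step is allowed, while a step name that is absent from the plan is rejected.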

The user then selects the transaction. After selection of the type of transaction the user creates or attaches a record to it (when necessary) and selects the appropriate record type from the selection list. The user checks the default metadata values and may update them (e.g., define a record that has by default no access restrictions as partly confidential). The user may also add some metadata, like date of receipt, if the system does not supply it automatically. Users’ capability to edit metadata is limited by their role. Most users cannot change retention time, for instance. This can be done only by recordkeeping professionals.
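Role-based limits on metadata editing, as described above, might look like the following sketch. The role names, field names, and the permission table are hypothetical; the only point taken from the text is that ordinary users can adjust some values while retention time is reserved for recordkeeping professionals.

```python
from __future__ import annotations

# Hypothetical permission table: which metadata fields each role may edit.
EDITABLE_FIELDS = {
    "user": {"title", "access_restriction", "date_of_receipt"},
    "records_professional": {
        "title",
        "access_restriction",
        "date_of_receipt",
        "retention_years",
    },
}


def update_metadata(metadata: dict, role: str, field_name: str, value: object) -> dict:
    """Return a copy of `metadata` with one field changed, if the role allows it."""
    if field_name not in EDITABLE_FIELDS.get(role, set()):
        raise PermissionError(f"Role {role!r} may not edit {field_name!r}")
    return {**metadata, field_name: value}
```

Returning a new dictionary instead of changing the old one is a design choice of the sketch: it makes it easy to keep the previous values as event history.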

When the user creates a new document in the system, the document is first marked as draft, and its visibility is limited. When the document is signed electronically it becomes final and locked to prevent any further changes. The system may include rules that (for instance) mark the matter automatically as closed when the process reaches a particular phase.

During the process, two things may happen simultaneously. Firstly, the records gather paradata about their background and the function or process of which they are a result. Secondly, this paradata is used as a basis for further actions. For instance, if a record’s retention time is calculated from the completion of the process, i.e., when the matter is closed, the date of process completion is recorded in the para-/metadata and used to assign the date on which the record is to be removed from the system.
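The retention example can be made concrete with a short calculation. This is a sketch under stated assumptions: the retention period is taken to be a whole number of years counted from the exact closing date, and the function name `disposal_date` is hypothetical. Actual rules (for instance, counting from the end of the calendar year) may differ.

```python
from datetime import date


def disposal_date(closed_on: date, retention_years: int) -> date:
    """Date on which a record becomes due for removal.

    Assumes retention runs for whole years from the day the matter was closed.
    """
    target_year = closed_on.year + retention_years
    try:
        return closed_on.replace(year=target_year)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; fall back to 28 February.
        return closed_on.replace(year=target_year, day=28)
```

Here the closing date is paradata gathered during the process, the retention period is a default value from the plan, and the computed date is new metadata that drives a later action.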

An Information Control Plan can be just a guidebook to an agency’s functions, processes, and information, but it is stated that the full benefit comes only if the plan is used to manage and automate information systems. There are no studies on whether and to what extent this goal of information control has been reached, but the SÄHKE2 specification and the concept of information control have today established their position in Finnish public sector records management (Mäkiranta, 2020). Other benefits may include improved usability of information systems because of default metadata values and process descriptions (JHS 191, 2015). Metadata enables tracking of processes and facilitates responding to information requests. Entries in the registry or system document the content of a record, its processing (when it arrived, who created it, etc.), and what transactions (like answering a request for opinion) have taken place. An Information Control Plan not only describes the records accumulating in the course of an organization’s business activities but also provides a tool for managing the processing stages of the information and for information security measures. When fully exploited, several information systems would be controlled by one ICP via APIs. If the systems are used without exceptions, they provide trustworthy evidence of the flow of information in an organization. Persons who access information later may convince themselves of the authenticity and reliability of the information by looking at the para-/metadata about it.

A prerequisite for information control is the existence of a functional classification scheme. The functional approach to records management is today widely accepted among recordkeeping professionals in the Finnish public sector (Packalén & Henttonen, 2016a). However, people understand basic concepts, such as function, differently, and the plans are sometimes difficult to use. The functional approach needs a more rigorous theoretical basis (Packalén, 2017). Functional schemes are heterogeneous, and analysis of class names shows ambiguity and varying conceptual structures in the schemes (Packalén & Henttonen, 2016b).

Creating a workable ICP and integrating the Information Control System with recordkeeping and other systems requires collaboration between several stakeholders of the organization: records managers, data protection specialists, lawyers, IT personnel, system suppliers, and specialists in various subject areas such as personnel management. In addition, collaboration with the National Archives of Finland is necessary to define records with archival value. Once created, an Information Control Plan needs constant updating.

Implementation of information control requires financial, technical, and human resources. Even with appropriate resourcing (which is often lacking), the goal is difficult to reach. An unpublished report on electronic archiving in municipalities three years ago revealed that implementation of information control, at least in a rigid form, has not been feasible in most information systems or databases, and that information in the municipalities is generally hybrid, with complete digitalization achieved only exceptionally (Hänninen, Heli: Sähköisen arkistoinnin tilannekuvan selvitys kunnan toimialoilla 2020). A study in 2020 found that in only one state agency out of ten did the ICP control more than one information system. For some, information control concerned only a part of the organization’s functions (Mäkiranta, 2020). No university had entirely digital processes for records with permanent value (Kokkinen, 2020, 10).

6 Discussion and Conclusions

Information control brings together different ideas. Information Control Plans incorporate traditions of Finnish recordkeeping: registry practice, and proactive planning of records’ life span. They fulfill internationally recognized requirements for electronic records management and, in addition, serve implementation of Freedom of Information legislation. An Information Control Plan can be accessed by anyone according to Finnish Freedom of Information legislation. Thus, the plan increases transparency by showing what information is gathered and processed in public administration (Lybeck et al., 2006). Information control is a combination of functional approach in records management planning, information governance perspective to information management, local recordkeeping traditions, and need to increase efficiency and automate processes.

SÄHKE2 continues to exist as a recommendation. In the future, the National Archives will primarily focus on archives, that is, records that have value for permanent preservation, and will have less authority in records management. For the same reason, a recent draft for archival legislation contains no requirement for agencies to implement information control (Luonnos hallituksen esitykseksi eduskunnalle arkistolain ja Kansallisarkistosta annetun lain muuttamisesta, 2022). Thus, formally there will be fewer constraints and obligations for public sector records management in Finland to fulfil. Nevertheless, proper metadata and the goals of information governance are still considered important. While agencies will have more freedom in their records and information management, it is likely that, for practical reasons, information control and the SÄHKE2 specification will still form the basis for future development. Although implementations may change, organizations still need to describe their processes and have procedural information about them. Metadata are meant to ensure the later understandability and usability of digital records. Understanding paradata as part of this metadata may bring new, useful insights to the future discussion of process descriptions.

Clearly, information control carried out through an ICP has benefits. Adding metadata (paradata) automatically to records and the processes they originate from accelerates the management of information. It shortens the time an agency spends handling a matter, which might lead to increased customer satisfaction. However, there are no studies that empirically show the benefits of information control. For instance, automation of records processes may save human resources, but there is no research showing how significant these savings are. On the other hand, an ICP must be constantly kept up to date with the information and processes in the organization, which is a laborious and resource-intensive task. Nor are functional classifications without problems. Previous studies have shown several challenges in organizing records by function. Some of them are a result of conceptual confusion and heterogeneous classificatory structures, which in turn result from a lack of theoretical background and guidance for creating classifications (Packalén & Henttonen, 2016b). Information Control Plans involve similar challenges.

An Information Control Plan is itself a record that is regularly updated. Old and new versions of the ICP are preserved. They provide an enormous amount of information about an organization’s processes and records management. Information Control Plans are not paradata themselves. They are only descriptions of an organization’s planned functions, processes, and records. When the plans are put into practice in the organization’s daily operations, the information in the plans becomes paradata about the records. The InterPARES research group defined paradata as “information about the procedure(s) and tools used to create and process information resources, along with information about the persons carrying out those procedures” (Davet et al., 2022). All this information is saved in the records management and information control systems in use.

It is important to understand that in a digital environment the starting point for the information gathered is the records-creating process and not a single document or record. Typically, Finnish recordkeeping practices and procedures are not based on theories but have been constructed as solutions to practical needs (Kilkki, 2004). The same applies to information control and its applications. It rests on the rather weak theoretical base of the functional approach to records organization. Recordkeeping activities should be based on theoretical and conceptual understanding. Therefore, the archives and records management discipline needs to examine the potential of the concept of paradata from various perspectives. Finding out what paradata has to offer to archives and records management, and contributing the recordkeeping viewpoint to the paradata discussion, is a start. While paradata is not an established term in archives and records management, it is a fitting concept for describing information that is gathered about records during their life span. How we name things shapes our understanding. It is a matter of understanding what kinds of data we add to records and to the records-creating processes, and of understanding the foundation of our actions.