TVS: a trusted verification scheme for office documents based on blockchain

To realize the encryption of document information, authority authentication, and traceability of historical records, we propose a trusted verification scheme (TVS) for office documents. The scheme is built on timestamps, smart contracts (or chaincode), and other blockchain technologies, and it relies on blockchain features such as security, credibility, immutability, and traceability of network behavior. The TVS stores user and document information on the blockchain and monitors the state changes of office documents in real time by setting the trigger conditions of smart contracts. The experiment indicates that the scheme realizes real-time monitoring of data and traceability of historical records. Moreover, it achieves document encryption and authority authentication, ensures the authenticity and objectivity of data, and prevents illegal tampering by malicious users, thereby realizing trusted verification for documents.


Introduction
Office documents, which enable information sharing and the paperless office, are used in all walks of life; they improve both accuracy and working efficiency in the office. However, hacker attacks, network intrusions, and Trojan horses threaten document transmission. They can cause vital documents to be lost or tampered with, or even bring down the entire transfer system, so ensuring document security is crucial to users [1].
Blockchain provides a shared, tamper-resistant, and transparent transaction record, supporting the construction of applications that include trust, responsibility, and transparency. However, blockchain is mainly used for verifiable public transactions; it provides no privacy for individuals. In existing works, the blockchain is used to develop safe and reliable data sharing systems that ensure the privacy, integrity, and fine-grained access control of shared data. Mohammad et al. [2] established a data notarization system based on the blockchain, verifying the authenticity of shared documents in real time. Yuan et al. [3] proposed a scheme that combines blockchain and CP-ABE; it stores data changes on the blockchain and implements different access rights through ABE, realizing effective data supervision and privacy protection. Zhu et al. [4] used the decentralized feature of the blockchain, identity-based cryptography, and electronic signatures to achieve electronic contract signing and certification of important documents. Wang et al. [5] proposed a secure and high-performance multi-party computing model that combines the characteristics of the blockchain with secure multi-party computation. It uses proxy re-encryption for data sharing and an improved consensus algorithm to ensure consistency between nodes, so that users can control their data independently. Tian et al. [6] proposed a secure and efficient public auditing scheme for user operation behavior logs based on blockchain. It can remotely verify the integrity of log data in the cloud, resist collusion attacks between malicious users and the cloud service provider, and prevent cloud data from being damaged or tampered with. Datta et al. [7] combined an edge computing paradigm with a blockchain setting and proposed a detailed three-layer edge computing architecture to improve the security of data delivery for forest fire surveillance. In addition, they proposed an energy-efficient prospective leader selection algorithm and an optimal, dynamic drone trajectory algorithm to minimize energy consumption.
Author affiliation: The School of Computer Science, Qingdao University, Qingdao, Shandong, China
Blockchain is also used in document certification, asset trading, medical information protection [8][9][10][11], and IoT interaction. It not only closes security loopholes in traditional technical solutions and optimizes business models, but also improves transaction efficiency and ensures security [12]. To ensure the integrity of medical records and prevent them from being tampered with, Brihat et al. [13] proposed a method based on the Merkle tree; it avoids mining to simplify operations and replaces traditional audit trails with encrypted and secure counterparts, which improves the robustness of medical record storage. Jia et al. [14] proposed an identity-based cross-domain authentication scheme for IoT that uses blockchain as a decentralized trust anchor and identity-based self-authentication algorithms instead of PKI to achieve decentralized authentication; it ensures the autonomy and initiative of the security domain. Krishnan et al. [15] improved verifiable and immutable repositories by employing blockchain to design authentication in the IoT onboarding process. They combined blockchain with SxC security contracts, MUD-based behavioral fingerprinting, and software-defined networking (SDN) to design an integrated framework for managing the security of IIoT ecosystems. The framework can effectively ensure the proper functioning and performance of industrial-grade IoT (IIoT) devices in Industry 4.0 networks.
To prevent office documents from being tampered with by illegal users, to monitor the state changes of office documents in real time, and to trace historical records, we establish a trusted verification scheme for office documents based on blockchain. Specifically, we use Hyperledger Fabric as the experimental blockchain platform, represent users as nodes, digitize user and document information, and divide users into groups by category. Then, the smart contract is designed according to the consensus rules reached between users. By uploading and calling the smart contract, we realize authority authentication, sharing, and historical record tracing for documents.
The main contributions of this manuscript are summarized as follows:
1. This manuscript proposes a trusted verification scheme for office documents based on blockchain. In this scheme, we represent users as nodes, digitize user and document information, and divide users into groups by category. In addition, we have designed a channel to provide privacy protection for members. The scheme realizes the mutual cooperation and supervision of peer nodes in the blockchain network, and ensures the user's identity authentication and authority control.
2. Algorithms for initializing, querying, and modifying office document information using blockchain are proposed, and we implement them using smart contracts. Since the operation of the smart contract is not affected by external factors, it can obtain objective and accurate results, and realize the real-time storage of office documents and historical record tracing.
3. We use Hyperledger Fabric as an experimental platform to verify the proposed scheme and algorithms. First, we build the experimental environment to provide sufficient conditions for algorithm verification. Second, we deploy the chaincode in the blockchain environment and initialize the office documents. Finally, we realize all the functions of the proposed scheme and achieve the purpose of monitoring the state change of office documents in real time.
The rest of this manuscript is organized as follows. Section Related work reviews the related works. Section Overall model introduces the proposed overall model. Section Function modules gives the main function modules in detail. The experiment testing and evaluation are given in Section Experiment and evaluation. Section Conclusion and future work concludes this manuscript.

Related work
While office documents enable information sharing, the sensitive information they contain may be leaked; existing studies adopt identity authentication and access control techniques to guarantee document security. In this section, we survey techniques that control user access to information based on data processing and identity verification, and then briefly state the advantages and challenges of these technologies.
Due to the explosive growth of data, most users store data in the cloud or with public organizations. For this application environment, Shen et al. [16] proposed a remote data integrity audit scheme based on identity encryption, so that documents stored in the cloud can be shared and used by others while sensitive information is hidden, and remote data integrity audits can still be performed effectively. In data processing, security and confidentiality are important for database administrators and users. Segundo et al. [17] proposed a hybrid blockchain model to address security models that suffer computer attacks because of security management vulnerabilities.
Personally identifiable information (PII) serves as a gateway to personal and organizational information when performing identity verification. Rima et al. [18] evaluated different PII options for user authentication in blockchain-based solutions, helped users make informed decisions based on the identity ecosystem model, and encouraged providers to introduce better choices to users. Wang et al. [19] designed a practical authentication scheme for mobile devices, which solved the problem of user authentication in mobile networks. In an identity-based cryptosystem, the public key can be obtained from the user identifier. Lin et al. [20] constructed a linear homomorphic signature scheme based on newID; they used a random oracle model to perform linear calculations on authenticated data, and effectively prevented the forgery of messages and IDs by avoiding the use of public-key certificates.
These technologies can prevent user identities from being maliciously tampered with and ensure data security, but they cannot guarantee the authenticity and objectivity of data or trace historical data. From the perspective of data management, blockchain can be regarded as a tamper-resistant ledger maintained by many untrusted nodes in a distributed environment [21]. Its characteristics generally include decentralization, trustlessness, publicity and transparency, tamper resistance, and anonymity [22].
The blockchain realizes the credible sharing of data among parties that distrust each other, and the smart contract running on the blockchain realizes the credible execution of business logic [23]. In 1994, N. Szabo first proposed the concept of smart contracts [24]. In essence, a smart contract is a modular, reusable, and automatically executed script that runs on the network, allowing developers to customize development for specific protocols, businesses, or logic. The purpose of a smart contract is to provide a safer method and reduce transaction costs [25,26]. The blockchain provides a good platform for the application of smart contracts because of its decentralized and distributed features. However, because smart contracts have limited capabilities and are insensitive to information changes, there are many difficulties when they are used as a security mechanism in security scenarios. Fan et al. [27] classified smart contracts according to the operating environment, discussed the application scope of their operating platforms, and analyzed in depth the current situation and challenges of smart contracts in terms of security, scalability, and maintainability. To promote multi-user collaboration and trace document modification records in a trusted and secure way, Nizamuddin et al. [28] took advantage of IPFS to store documents on a decentralized system. At present, the application scenarios of smart contracts are limited, and more in-depth discussion and analysis are needed of the challenges smart contracts face, including security [29,30], scalability [31], and maintainability.
The existing studies seldom address document encryption, authority authentication, and historical record tracing, which makes it impossible to effectively guarantee the real-time sharing of documents and the authenticity and objectivity of data. In response to these problems, we propose a trusted verification scheme for office documents based on blockchain. The scheme addresses the authenticity and objectivity of data from the perspective of document encryption and authority authentication. We use blockchain to store user and document information, formulate the conditions for triggering the smart contract, and design a reasonable document uploading and monitoring process. Finally, the scheme can monitor the state change of office documents in real time, and achieve the purpose of document encryption and authority authentication.

Related concepts
Blockchain is a new application pattern of computer technologies. As a distributed shared ledger, it features decentralization, tamper resistance, collective maintenance, transparency, and traceability of data throughout. It uses core technologies, including distributed storage, peer-to-peer transfer, consensus mechanisms, and encryption algorithms, to provide participants with a credible, reliable, and transparent logic framework. The blockchain structure diagram is shown in Fig. 1.

Blockchain-The blockchain is a sequence of blocks that holds a complete list of transaction records. The block, composed of a header and a body, is the structural unit of data storage in the blockchain. The header includes the version number, the hash of the previous block, the Merkle root, and the timestamp. The body is composed of transactions and counters; it is mainly used to store transaction summaries and verify transactions. The chain structure is formed by connecting the blocks through hash pointers in the order of generation time. The timestamp serves as a notarized proof of existence: as long as the document exists, any unauthorized modification made after that time can be detected.
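The hash-pointer structure described above can be illustrated with a minimal sketch. It is not the Fabric block format: the class `Block`, the helpers `append_block` and `chain_is_valid`, and the use of SHA-256 over JSON are assumptions made for illustration, and the `root` method is a simple stand-in for the header's root field.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a string as a hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class Block:
    version: int
    prev_hash: str            # hash pointer to the previous block header
    timestamp: float
    transactions: List[str]   # block body

    def root(self) -> str:
        # Stand-in for the header's root field: a digest over the body.
        return sha256_hex(json.dumps(self.transactions))

    def header_hash(self) -> str:
        header = {
            "version": self.version,
            "prev_hash": self.prev_hash,
            "root": self.root(),
            "timestamp": self.timestamp,
        }
        return sha256_hex(json.dumps(header, sort_keys=True))


def append_block(chain: List[Block], transactions: List[str],
                 timestamp: float) -> Block:
    """Link a new block to the current tail of the chain."""
    prev_hash = chain[-1].header_hash() if chain else "0" * 64
    block = Block(1, prev_hash, timestamp, list(transactions))
    chain.append(block)
    return block


def chain_is_valid(chain: List[Block]) -> bool:
    """Check that every stored pointer matches the recomputed previous hash."""
    return all(
        chain[i].prev_hash == chain[i - 1].header_hash()
        for i in range(1, len(chain))
    )
```

Because each pointer is recomputed from the previous header, changing a transaction in an earlier block makes `chain_is_valid` return `False` for any chain that has a later block.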
Smart Contract-The smart contract, or chaincode, is a piece of code deployed on the blockchain to describe related business logic. Once deployed, the smart contract cannot be modified, and its execution is completely determined by the code and not interfered with by human factors. It periodically checks whether the related events and trigger conditions exist; events that meet the conditions are pushed to the verification queue. The verification nodes on the blockchain first verify the signature on the event to ensure its validity; after the nodes reach consensus, the smart contract is executed and users are notified. Generally speaking, participants use smart contracts to specify their rights and obligations, as well as the conditions that trigger the contract and the results. Once the smart contract runs on the blockchain, objective and accurate results can be obtained.
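The trigger, queue, and consensus steps above can be sketched as follows. This is an illustrative model only: the class `ContractSketch`, the event dictionaries, and the simple-majority rule are assumptions, and real signature verification is replaced by caller-supplied predicate functions.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

Event = Dict[str, str]


class ContractSketch:
    """Toy model of trigger checking, a verification queue, and execution."""

    def __init__(self, trigger: Callable[[Event], bool],
                 verifiers: List[Callable[[Event], bool]]) -> None:
        self.trigger = trigger        # trigger condition of the contract
        self.verifiers = verifiers    # one predicate per verification node
        self.queue: Deque[Event] = deque()
        self.executed: List[Event] = []

    def check(self, events: List[Event]) -> None:
        """Push every event that meets the trigger condition to the queue."""
        for event in events:
            if self.trigger(event):
                self.queue.append(event)

    def process(self) -> None:
        """Execute queued events approved by a majority of verifiers."""
        while self.queue:
            event = self.queue.popleft()
            approvals = sum(1 for verify in self.verifiers if verify(event))
            if 2 * approvals > len(self.verifiers):  # assumed majority rule
                self.executed.append(event)
```

An event that does not meet the trigger condition never enters the queue, and a queued event that fails verification on most nodes is dropped without being executed.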
The blockchain guarantees the authenticity of the authentication process, and the smart contract ensures that the certified contract is executed objectively and compulsively without any interference from external factors. Authority authentication and document encryption are realized through blockchain and customizing smart contracts, and the authenticity and objectivity of data are effectively ensured.

Model introduction
In this manuscript, we propose the TVS for office documents based on the features of blockchain, such as security, credibility, immutability, and traceability of network behavior. The scheme monitors the state changes of office documents by setting the trigger conditions of smart contracts, and it uses blockchain to store users and documents information in real time.
The overall model of the TVS for office documents is shown in Fig. 2; it includes the following modules:
• Node network-Establish a complete blockchain system node network topology.
• Information initialization-Provide a method to create and record users' access information.
• Information query-Provide methods for querying users and documents information and verifying correctness.

Function modules
To precisely describe the technical solution proposed in the manuscript, the main function modules will be described in detail below.

Node network
The node network module represents users as nodes, digitizes user and document information, and divides users into groups by category. Each user acts as a peer in the network; peers supervise and cooperate with one another to establish a complete blockchain system node network topology. The system includes multiple nodes; each node stores complete data and establishes communication connections with the others. These nodes include the endorser, the orderer, the committer, and the certificate authority (CA). The endorser and the committer are peers in the decentralized blockchain network. The endorser verifies the transaction proposal, simulates execution, and endorses the result. The orderer sorts the transactions sent by each node, determines the transaction sequence and reaches a consensus, sends the result back, and saves it in the ledger. The committer checks the legitimacy of transactions, and updates and maintains the data and ledger status of the blockchain. The CA provides system members with identities based on digital certificates and realizes the authority control of the blockchain system on the basis of clear membership. The topology of the node network is shown in Fig. 3.
In this module, different nodes belong to different organizations, forming a peer-to-peer decentralized network. Each organization has its own clients, peers, and CA, and it can create one or more nodes of different types as needed. The orderer is a component maintained by all nodes in the organization. In addition, we have designed a channel to provide privacy protection for members. The channel is an independent communication path between network members, and only the members that belong to the channel can see the transactions sent in it. Each channel has its member organizations, an anchor, and an orderer; each organization has multiple nodes joining the same channel, and peer0 is the anchor by default. The anchor conducts node interactions between organizations to discover all nodes in the channel. Besides, each channel elects or appoints a leader, which is responsible for receiving blocks sent from the orderer and forwarding them to the other nodes in the organization. The leader ensures that the entire network remains stable even when the number of nodes is constantly changing.
In this manuscript, user nodes are added to the node network module by creating a channel. Multiple nodes in the channel jointly maintain a ledger, thus constructing a complete node network topology of the blockchain system. The procedure of building a node network is detailed in Algorithm 4.1.
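The privacy property of a channel, namely that only members can submit or see its transactions, can be illustrated with a small sketch. It does not reproduce Algorithm 4.1 or any Fabric API; the class `Channel` and the node names used with it are hypothetical.

```python
from typing import List, Set


class Channel:
    """Toy channel: a ledger that only member nodes may write to or read."""

    def __init__(self, name: str, members: Set[str]) -> None:
        self.name = name
        self.members = set(members)
        self.ledger: List[str] = []   # ledger jointly maintained by members

    def _require_member(self, node: str) -> None:
        if node not in self.members:
            raise PermissionError(
                f"{node} is not a member of channel {self.name}"
            )

    def submit(self, node: str, transaction: str) -> None:
        """Record a transaction sent by a member node."""
        self._require_member(node)
        self.ledger.append(transaction)

    def read(self, node: str) -> List[str]:
        """Return a copy of the ledger to a member node."""
        self._require_member(node)
        return list(self.ledger)
```

A node outside the channel receives a `PermissionError` from both `submit` and `read`, which mirrors the rule that transactions are visible only inside the channel.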

Initialization
The initialization module provides a method for creating and recording user and document information. User information refers to the userID and individual user profiles. Document information includes the title and content. These records form a blockchain transaction, from which a hash can be generated. A new block can then be generated using the block hash, block identifier, and timestamp. We define the initialization of information as an event; the smart contract discovers the event and automatically executes the storage request, after which a new block is generated and added to the blockchain. The smart contract periodically checks whether the related events and trigger conditions exist. If the conditions are met, the event is pushed to the queue to be verified. Meanwhile, the verification nodes first perform signature verification on the event to ensure its validity; when most verification nodes reach a consensus, the smart contract is executed and the user is notified. The basic verification and consensus process is shown in Fig. 4.
Fig. 4 Verification phase sequence diagram
Through the above operations, each newly created record is hashed and stored in the blockchain. Meanwhile, the operations create a specific space for the hashed document and store it together with the timestamp. This not only protects document privacy and data security but also saves data space and transaction overhead. Once the transaction is completed, a transactionID is returned, including the transaction hash, block number, and timestamp, which is used to verify the specific operation. Since only the small hash is stored in the blockchain, retrieval takes little time, so the verification can be completed in a few steps. In short, the initialization module can quickly realize storage and verification, and effectively guarantee the authenticity and objectivity of data. The algorithm used for initializing information is detailed in Algorithm 4.2.
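A minimal sketch of this store-the-hash idea is given below. It is not Algorithm 4.2: the class `Ledger`, the function `record_digest`, the receipt field names, and the choice of SHA-256 over a JSON encoding are all assumptions for illustration.

```python
import hashlib
import json
from typing import Dict, List


def record_digest(user_id: str, title: str, content: str) -> str:
    """Hash a user/document record into a fixed-size digest."""
    payload = json.dumps(
        {"userID": user_id, "title": title, "content": content},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class Ledger:
    """Toy ledger that keeps only the record hash and its timestamp."""

    def __init__(self) -> None:
        self.blocks: List[Dict[str, object]] = []

    def init_document(self, user_id: str, title: str, content: str,
                      timestamp: int) -> Dict[str, object]:
        digest = record_digest(user_id, title, content)
        block_number = len(self.blocks)
        self.blocks.append({"hash": digest, "timestamp": timestamp})
        # The receipt plays the role of the returned transactionID.
        return {"tx_hash": digest, "block_number": block_number,
                "timestamp": timestamp}

    def verify(self, receipt: Dict[str, object], user_id: str,
               title: str, content: str) -> bool:
        """Recompute the digest and compare it with ledger and receipt."""
        block = self.blocks[receipt["block_number"]]
        digest = record_digest(user_id, title, content)
        return block["hash"] == digest == receipt["tx_hash"]
```

Verification needs only one hash computation and one lookup by block number, which is consistent with the claim that storing the small hash keeps retrieval short.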

Query
The query module provides a method for querying and tracing historical records based on the traceability of blockchain. The information that can be queried includes user information, document titles, and content. Because the blockchain is decentralized, records are stored without a central node. We treat all participating devices as nodes; each node stores data independently and has the same status. All nodes rely on a consensus mechanism to ensure the consistency of storage, and no single node can record or modify data on its own. These settings ensure data security and largely avoid information loss caused by a single equipment failure.
When users create, modify, or query information on office equipment, the system monitors the office document changes in real time and records each status completely. Meanwhile, the system links the generated blocks in chronological order based on the chain structure. In this way, the entire network data are backed up to provide a complete record of user and document information. The query module lets users query documents at any time and realizes the traceability and auditability of historical data. The algorithm used for querying is detailed in Algorithm 4.3.
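The difference between querying the current state and tracing history can be sketched with an append-only log. This is an illustration rather than Algorithm 4.3; the class `DocumentHistory` and its method names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Version = Tuple[int, str]  # (timestamp, document content)


class DocumentHistory:
    """Append-only store: every write is kept, none is overwritten."""

    def __init__(self) -> None:
        self._log: Dict[str, List[Version]] = {}

    def put(self, doc_key: str, content: str, timestamp: int) -> None:
        """Record a new status of the document in chronological order."""
        self._log.setdefault(doc_key, []).append((timestamp, content))

    def query(self, doc_key: str) -> Optional[str]:
        """Return the latest content, or None if the key is unknown."""
        versions = self._log.get(doc_key)
        return versions[-1][1] if versions else None

    def history(self, doc_key: str) -> List[Version]:
        """Return every recorded status, oldest first."""
        return list(self._log.get(doc_key, []))
```

Keeping every version is what makes auditing possible: `query` answers what the document is now, while `history` answers how it reached that state.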

Modification
The modification module provides a method for modifying documents based on the tamper-resistant feature of the blockchain. The information that can be modified includes the document name, title, and content of a certain user. Since each block is linked by a hash pointer, which is obtained by hashing the previous hash value and the block body, the pointer changes whenever the data change. Therefore, the system realizes a modification by adding a block at the end of the blockchain, so the historical data remain stored. We can view the historical records from logs. This process ensures that documents cannot be illegally tampered with while historical documents remain traceable. The algorithm used for modifying is detailed in Algorithm 4.4.
In the above function modules, the blockchain is responsible for saving user personal information, document content, and other data. Every initialization, query, and modification operation of the user is regarded as a transaction. After we digitize the operation, the blockchain system calculates its hash value, incorporates it into the root of the Merkle tree, and adds the root to the block header. The specific block content is shown in Fig. 5.
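How a set of transactions is condensed into a single Merkle root can be sketched as follows. The pairing rule used here (SHA-256, duplicating the last node when a level has an odd number of nodes) is one common convention and is an assumption; the manuscript does not specify the exact construction, and the function name `merkle_root` is hypothetical.

```python
import hashlib
from typing import List


def _h(data: bytes) -> bytes:
    """SHA-256 digest of raw bytes."""
    return hashlib.sha256(data).digest()


def merkle_root(leaves: List[bytes]) -> str:
    """Fold transaction payloads pairwise into one root hash (hex)."""
    if not leaves:
        raise ValueError("at least one transaction is required")
    level = [_h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumed rule: duplicate the last node
        level = [
            _h(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()
```

Changing any single transaction changes the root, so comparing the root stored in a block header with a recomputed one is enough to detect a modified operation record.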

Experiment and evaluation
The Hyperledger project was initiated by the Linux Foundation in 2015; the institutions and companies involved come from the financial, manufacturing, and logistics industries. The Hyperledger Fabric blockchain platform is a distributed ledger framework for enterprise solutions and applications [32,33]. Fabric is based on a modular design; its architecture executes transactions first, orders them next, and validates them last. Nodes are divided into endorsers, orderers, and committers; transactions are executed on trusted endorsers, and the results are propagated to all nodes to achieve a consistent state. The smart contract runs in parallel on different endorsers to improve the transaction throughput of the system [34,35]. In this manuscript, we use Hyperledger Fabric as the experimental platform to conduct simulation experiments. During the experimental test, we set the consensus mode to Solo, in which a single orderer serves all nodes. Although the Solo mode lacks high availability and scalability and is not suitable for production scenarios, it is entirely feasible for development and testing. The Gossip protocol is used to propagate blocks among peers in the blockchain network. At the same time, we use the Gossip protocol so that the ledgers of different nodes in the organization eventually become consistent. The following subsections introduce the simulation experiments from four aspects: building the environment, system configuration, chaincode deployment, and testing and evaluation.
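The execute, order, and validate flow of Fabric described in this section can be illustrated with a simplified sketch. It is not Fabric code: the functions `endorse`, `order`, and `commit` are hypothetical, and the validation step is reduced to an assumed version check on the key each transaction read during simulation.

```python
from typing import Dict, List, Tuple

Versioned = Tuple[str, int]        # (value, version)
State = Dict[str, Versioned]


def endorse(state: State, key: str, new_value: str) -> dict:
    """Simulate a transaction without changing the state."""
    _, version = state.get(key, ("", 0))
    return {"key": key, "value": new_value, "read_version": version}


def order(transactions: List[dict]) -> List[dict]:
    """Single orderer (Solo-like): keep the arrival order as the block."""
    return list(transactions)


def commit(state: State, block: List[dict]) -> List[bool]:
    """Validate each ordered transaction, then apply the valid ones."""
    results = []
    for tx in block:
        _, current = state.get(tx["key"], ("", 0))
        valid = tx["read_version"] == current   # assumed version check
        if valid:
            state[tx["key"]] = (tx["value"], current + 1)
        results.append(valid)
    return results
```

When two transactions are simulated against the same version of a document and then ordered into one block, only the first passes validation in this sketch; the second was built on a stale read and is marked invalid rather than applied.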

Build environment
To build the Hyperledger Fabric platform, we need to prepare the runtime environment of the system and install the software that Fabric depends on, in dependency order. The system uses Hyperledger Fabric v1.4.0. We pre-install the environment dependencies according to the requirements, as shown in Fig. 6.
Following common practice on Ubuntu, the related software is installed from the command line; we rely on shell scripts during installation, which call one another and interact with users as far as possible. After the dependencies are installed, we switch the working directory to the first-network folder and run byfn.sh to start the Fabric network. If the interface appears as shown in Fig. 7a, b, the Hyperledger Fabric environment is set up and the Fabric network has started successfully.

System configuration
Fig. 6 The environment dependencies that need to be deployed to install Hyperledger Fabric in Ubuntu
Building the environment relies on a series of configuration files, including the configuration files for generating certificates, the system genesis block, blockchain nodes, and databases.

Crypto-config.yaml
Fig. 7 The first-network start interface. a Interface that just started. b Interface that successfully started
The cryptogen tool generates cryptographic material for network entities and uses certificates to represent various identities. The crypto-config.yaml is used in conjunction with the configtx.yaml. It is used to generate and verify certificates and keys and to set user-defined organizations. The configurations include the organization name, domain name, CA settings, and the number of nodes under the organization. The parameter settings of the crypto-config.yaml are shown in Table 1.

Configtx.yaml
The configtxgen tool is used to create four network artifacts, including genesis.block, channel.tx, Org1MSPanchors.tx, and Org2MSPanchors.tx [32]. The parameter settings of the configtx.yaml are shown in Table 2.

Orderer and peer
The orderer configuration file defines basic information, including the name, environment parameters, working directory, and ports of the orderer. The peer configuration file defines the general configuration of the endorser and parameter settings of the dockers. The parameter settings of the orderer and the peer configuration files are shown in Table 3.

Deploy chaincode
Firstly, we formulate a reasonable chaincode according to user requirements, including conditions for initializing users, and creating, saving, and accessing documents. Given initialization conditions, after the chaincode is uploaded, users' numbers and permissions are set by calling the initialization function. Given creation conditions, the system will automatically trigger the creation function. The contents to be stored are documents, users' access information, and block location.
The storage request is executed automatically once the chaincode detects it; this operation creates a block and sends the hash of the records to the blockchain. For query conditions, the users who may obtain the documents are set in advance, when the chaincode is formulated, by setting user permissions. The system outputs data according to the constraints of the chaincode, transmits the data to the corresponding users, and thereby completes user access to documents. The procedure of chaincode deployment is shown in Fig. 8. After the chaincode is installed, instantiated, and invoked, it can realize functions such as initialization, query, creation, and modification. However, it should be noted that when a new chaincode is uploaded, the latest code needs to be reinstalled, renamed, and instantiated on the nodes of the channel before it takes effect. In this way, the source chaincode is still stored in the blockchain even if it is modified.

Testing and evaluation
Next, we test the system functions, including initializing user and document information, querying all information, and creating and modifying the documents of a certain user.
After initializing the user and document information, the operation is recorded in the ledger, and we can output the information by calling the query function. We need to indicate the chaincode name, channel name, and function name. The specific input is as follows:
peer chaincode query -C $CHANNEL_NAME -n mycc -c '{"Args":["queryAllUsers"]}'
The output result of the ledger is shown in Fig. 10.
Similarly, when querying a certain user's documents in the system ledger, it is necessary to indicate the channel name, the chaincode name, and the userID to call the query function. There are two possible situations: if the userID exists, all the document information of the user is printed, as shown in Fig. 11a; if the userID does not exist, an error prompt is printed, as shown in Fig. 11b. The specific input is as follows:
peer chaincode query -C $CHANNEL_NAME -n mycc -c '{"Args":["queryUserandDoc", "USER1"]}'
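When such queries are scripted, the JSON passed to the -c flag is easy to get wrong by hand. The sketch below assembles the same command line from its parts; the helper names `build_query_command` and `as_shell` are hypothetical, and the channel name "mychannel" used in the usage note is an assumed value for $CHANNEL_NAME.

```python
import json
import shlex
from typing import List


def build_query_command(channel: str, chaincode: str,
                        function: str, *args: str) -> List[str]:
    """Assemble the argument list for a chaincode query."""
    # The -c flag takes a JSON object whose "Args" list starts with the
    # function name, followed by that function's arguments.
    payload = json.dumps({"Args": [function, *args]})
    return ["peer", "chaincode", "query",
            "-C", channel, "-n", chaincode, "-c", payload]


def as_shell(command: List[str]) -> str:
    """Quote each part so the command can be pasted into a shell."""
    return " ".join(shlex.quote(part) for part in command)
```

For example, `build_query_command("mychannel", "mycc", "queryUserandDoc", "USER1")` yields an argument list whose last element is the JSON payload for the user query, and `as_shell` turns that list into a quoted command string.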
Based on the features of blockchain, a series of operations on users and documents information in the system can be queried. We can view the historical records from the ledger in system logs. Some logs are shown in Fig. 14.
In summary, all functions of the system could be successfully implemented. The evaluation is shown in Table 4.
The analysis of the experimental results shows that the proposed model has high security and can effectively prevent documents from being tampered with. Specifically, as shown in Fig. 1, new blocks are generated based on the generation principle of the Merkle tree and the irreversibility of the hash operation; when document information is modified, the new information must be saved in a new block instead of directly modifying the original records, which ensures that all historical records are recorded and traceable. Accordingly, as shown in Figs. 12 and 13, when document information is created or modified, all endorsers in the user group are required to verify the transaction, simulate its execution, and endorse it, which enables the mutual cooperation and supervision of peer nodes in the blockchain network and ensures identity authentication and authority control. And as shown in Fig. 14, based on the characteristics of blockchain, namely decentralization, tamper resistance, and traceable historical records, a series of operations on document information in the system can be queried in the system log. This effectively ensures the security of document information.
Additionally, in terms of complexity, the time complexity of the document information initialization, query, and modification algorithms is O(n). During the simulation experiment, we mainly focus on the query and modification of document content in the blockchain, and realize the corresponding functions through the formulated smart contract. The completion time of initialization in Algorithm 4.2 is mainly related to the number of document information keywords that are set, and the completion time of information query and modification in Algorithms 4.3 and 4.4 is mainly related to the number of users that need to be queried.

Conclusion and future work
In this manuscript, we propose a trusted verification scheme for office documents based on blockchain. The scheme sets the trigger conditions for smart contracts and uses the blockchain to store user and document information. Because the data are stored in multiple distributed nodes, the decentralized feature of the blockchain enables each node to have a complete record. Therefore, the scheme can largely avoid the information loss caused by a single equipment failure, and it also provides complete records of user and document information for post-audit and traceability. Meanwhile, the execution conditions of the smart contract are triggered once users operate on the document, so that the smart contract is automatically executed. Consequently, the scheme can monitor the state change of office documents in real time, and achieve the purpose of ensuring the authenticity and objectivity of data.
In future work, we will implement the proposed scheme in the application platform of office documents to ensure the historical records' tracing. Through the application in specific scenarios, we will continue to optimize the relevant smart contracts, and compare the proposed approach with other schemes in terms of block generation time vs. number of blocks, computational cost vs. number of blocks, and verification time vs. number of users, to further improve the security of the scheme and the query efficiency of document information.
Funding
The study was funded by the Tai Shan Industry Leading Talent Project (Grant number tscy20180416).

Conflict of interest
There is no declaration of conflict of interest by the authors.
Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.