HistoTrust: tracing AI behavior with secure hardware and blockchain technology

In fields where accountability matters, the adoption of artificial intelligence (AI) is limited by the opacity of its behavior and the difficulty of understanding it, all the more so in the embedded domain, where neural networks are compressed and executed on microcontrollers. While NIST introduced in 2021 several principles for AI explainability, this paper introduces a novel scheme, HistoTrust, that combines secure hardware and blockchain technology to bring trust in the traceability of AI behavior and to enable its explainability. HistoTrust attests in an Ethereum ledger all the relevant data produced by a physical device, especially the heuristics inferred by the AI. Auditing the ledger thus enables security verifications and the analysis of AI behavior.


Christine 1 Univ. Grenoble Alpes, CEA-Leti, Grenoble, France

Introduction
From the perspective of the factory of the future, smart robots are increasingly incorporating vision capabilities based on an on-board camera. From the pictures, embedded artificial intelligences (AI) make decisions that affect the tasks performed by the robot within the industrial process. The AI is trained beforehand to recognize learned patterns in the image. The resulting classifier is a Neural Network (NN) which, given an image as input, infers a probability for the recognition of the learned pattern. A high probability provides trust in the recognition of the learned pattern. With AI, this trust is based on a probabilistic process.
The adoption of AI in industry is slowed down by the opacity of decision making when an AI is involved in the decision process. This is why, in September 2021, NIST published the report [1], which sets out four principles for AI explainability. Among these principles, the transparency of AI behavior is a key factor of trust, along with accountability and resiliency.
When an anomaly is detected on a production line, the causes and accountabilities must be determined. When the production process involves AIs, implementing the means to trace events and audit the digital system is a requirement. HistoTrust [2] aims to provide such a tool, ensuring the protection of embedded AIs against malicious intentions and enabling the explainability of their behavior. HistoTrust combines the probabilistic trust provided by AI with the deterministic trust provided by the blockchain. Trust in the blockchain is based on a consensus protocol between the actors involved, enabling them to agree on the transactions recorded in the ledger [4]. Once recorded, the transactions form a history considered immutable: they can no longer be deleted, swapped or modified. The integrity of the information recorded in the ledger is also ensured by design, as are the ordering of events and the authentication of issuers. Blockchain technology is therefore relevant to trace, in a non-repudiable way, the activity of smart robots and embedded NNs.
HistoTrust introduces a device-centric [5] solution based on Ethereum technology that reconciles the need for security and privacy with the trust required between stakeholders. HistoTrust provides an architecture that ensures end-to-end security and privacy by design while enabling the traceability of embedded NN inferences. The authenticity of the issuer device is attested through secure hardware components such as a Trusted Platform Module (TPM) and ARM TrustZone technology as Trusted Execution Environment (TEE). These hardware components serve as the root of trust for the digital data processed by the embedded NN.
Thus, each of the smart robots operating on the production line sends to the ledger the attestations of the digital data it produces. An attestation includes the cryptographic fingerprint of a set of raw data, the authentication of the issuing embedded applications, and the timestamp of the record. The ledger maintains the history of transactions received from the smart robots distributed around the production line. In a context where several stakeholders cooperate in the manufacture of a product, each protecting its own interests, business and personal data, sharing attestations through the ledger brings trust between them. Meanwhile, each stakeholder keeps and protects its raw data, and must be able to explain the behavior of its embedded AI if requested.
HistoTrust has several objectives: (1) to protect the embedded NN from logical and physical attacks by ensuring the cyber robustness of the AI, (2) to protect the data produced by the embedded applications and processed by the NN in order to allow the explainability of the AI behavior, (3) to attest and trace the data produced in a blockchain in order to provide authentic non-repudiable attestations shared between the different stakeholders.
The following section positions the work done in HistoTrust in relation to existing solutions. The use case is described in Section 3. Section 4 presents the embedded NN used in HistoTrust. Section 5 outlines the process of attesting the data produced to the ledger. The integration with the embedded NN and the deployment are discussed in Section 6. A security analysis is conducted in Section 7, followed by the audit process in Section 8, before concluding this work.

Secure data history with trusted hardware
The added value of blockchain technology to meet the specific features of a smart manufacturing use case has been shown in [6]. Compared to a centralized solution based on digital certificates and PKI, the Ethereum-based solution offers a more refined management of security and privacy at the expense of performance. In [2], HistoTrust demonstrates that performance can meet the needs of a real-time usage when using a blockchain.
The EmLog framework [7] is presented as "the first attempt at preserving off-the-shelf ARM development board hosting OP-TEE". EmLog implements an end-to-end secure logging system between constrained embedded devices and a remote database. HistoTrust introduces an architecture design and an on-board implementation design using off-the-shelf secure hardware components, such as OP-TEE and TPM 2.0 [8], that go beyond the EmLog solution and achieve the perspectives identified for EmLog. Although they preserve forward security thanks to the one-way hash chain scheme introduced by Schneier and Kelsey [9], EmLog and SGX-Log [10] are not designed for multi-stakeholder contexts and may suffer data loss in case of power failure.
In the EngraveChain logging system detailed in [11], the data history is encrypted and then registered in a Hyperledger Fabric ledger. This implementation lacks agility because a blockchain is not designed to store large volumes of data, nor confidential data, even encrypted. Moreover, encrypting the data recorded in a ledger implies complex key management. Blockchain technology provides by design the tamper-resistance of the recorded transaction history forming the ledger. HistoTrust provides an attestation scheme securing the history of data issued from distributed devices.
An Ethereum ledger maintains the history of cryptographic attestations of data produced by distributed devices owned by multiple stakeholders. Blockchain technology allows this cryptographic evidence to be shared between the stakeholders, ensuring mutual trust. In addition, the raw data is kept by its owner, who ensures its persistence and confidentiality.
Based on an Ethereum blockchain, BlockPro [12] presents a decentralized architecture of IoT devices. The authenticity of the data-emitting devices is established through a challenge submitted to the PUF (Physical Unclonable Function) of the IoT device. However, the paper does not explain how the address of the account issuing the transactions is built, nor how it is linked to the PUF. The paper [13] shows that the dissociation between IoT devices and validation nodes is a powerful architecture, which HistoTrust exploits.

Attestation scheme
Attestation schemes based on the use of a TPM offer standard solutions allowing the authentication of a platform by a remote device [14,15]. The authors of [16] highlight the question of the certification of sensor data, even by a trusted platform. The tension is tangible between privacy on the one hand and trust on the other. Privacy requires the protection of confidential data, while trust requires guarantees between the stakeholders working in a given ecosystem.
The principle of remote attestation is described in depth in [15]. The Trusted Platform Module (TPM) is the targeted device enabling the endorsement of attestation keys that the manufacturer, the vendor or the owner may own. The attestation scheme follows recommendations and standards provided by the Trusted Computing Group (TCG) [14]. Attestation aims at proving a property of a target to a remote verifier by supplying evidence over a network. It consists of three stages: (1) key provisioning, (2) attestation process and (3) verification process.

Explainability of embedded artificial intelligence
The field of eXplainable AI (XAI) attracts considerable attention as an important concept that increases trust in AI-based systems and applications. The need for both interpretability and explanation methods has recently been highlighted by NIST [1]. A large variety of approaches have been proposed to shed light on the black-box paradigm of deep NN models [17], even for modern architectures.
The purpose of our work is not to introduce a new methodology to explain the intrinsic behavior of a Machine Learning (ML) model, but to frame the implementation of an AI in an embedded device in such a way that confidential data, presented to a third party, can be trusted to explain the behavior of an embedded NN. Our contribution is rather in the area of cyber robustness of embedded AI in the presence of multiple distributed NNs.

Context
In a factory, many actuators participate in the assembly of a product on a production line (see Fig. 1). Physical devices that embed inference engines, i.e., a NN previously trained to recognize determined patterns in an image, generate the digital commands sent to the actuators. The device may integrate several sensors and a camera. A picture of the product is taken before acting. This picture is presented as input to the NN to request an inference that contains heuristics, i.e., probabilities that the pattern recognized in the image corresponds to the learned patterns. This inference guides the decision about the next action the actuator should perform.
In the event of an incident causing a financial loss, it is necessary to find the causes and possibly to charge the costs to the accountable stakeholder. However, the presence of AI makes the decisions difficult to reproduce. So, how can one determine who is accountable for the damage? In particular, who is accountable for the decisions that command the actuators? If the NN recognizes the digit "2" instead of the digit "8", is the error attributable to the learning quality? A configuration and/or system integration fault? A lack of operator guidance? Noisy input data? A physical or logical attack on the electronic devices? A network attack?

Digit recognition
Smart robots are often equipped with cameras that allow them to photograph the part of the product on which they will operate. The image is then analyzed, potentially with a classifier, and the action is determined depending on the patterns recognized. For this work, we use a classical digit recognition task with the MNIST dataset [18], as it represents one of the most popular benchmarks in the ML literature with which many architectures can be tested (from shallow fully-connected networks to deeper convolutional NNs). MNIST is composed of 60,000 training images of handwritten digits and 10,000 test examples. Each sample is a grayscale 28x28 image (784 pixels) with an associated label from "0" to "9". This dataset offers a textbook case with a known and qualified open-source model. The integration made for the use case can be generalized to other computer vision tasks, specific to the problem to solve.
For a given input image of the NN, the output inference is composed of 10 heuristics that correspond to the recognition probability of each digit from "0" to "9". An example is shown in Fig. 2 with the recognition of the digit "2" with the probability 0.99 (99%).
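As a minimal sketch of how such an inference could be turned into a decision, the following Python fragment selects the digit with the highest heuristic. The function name `recognized_digit` and the `threshold` parameter are hypothetical and only serve the illustration; the probability values are made up to mirror the example of the digit "2".

```python
from typing import List, Optional


def recognized_digit(heuristics: List[float], threshold: float = 0.5) -> Optional[int]:
    """Return the digit with the highest probability, or None below the threshold."""
    if len(heuristics) != 10:
        raise ValueError("expected one heuristic per digit from 0 to 9")
    best = max(range(10), key=lambda digit: heuristics[digit])
    # The threshold is an assumption: it rejects inferences that are too uncertain.
    return best if heuristics[best] >= threshold else None


# Illustrative values only: the digit "2" is recognized with probability 0.99.
example = [0.0, 0.0, 0.99, 0.01, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(recognized_digit(example))
```

Returning `None` for a weak maximum is one possible design choice; a real integration might instead forward all ten heuristics to the decision logic.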

Formalism
In this work, we consider a deep NN model that performs a supervised classification task with the following formalism. Input-label pairs (x, y) ∈ X × Y are sampled from a distribution D. The NN model M_θ : X → Y, with parameters θ, classifies an input x ∈ X to a label M_θ(x) ∈ Y. The parameters θ are optimized during the training phase in order to minimize a loss function L(M_θ(x), y) (e.g., the cross-entropy loss) that evaluates the quality of a prediction compared to the ground-truth label. For the sake of readability, the model M_θ is simply noted M.
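To make the training objective concrete, here is a small sketch of the cross-entropy loss and of its average over sampled pairs. It assumes that the model exposes a vector of class probabilities before the final label is chosen; the function names `cross_entropy` and `empirical_risk` are illustrative.

```python
import math
from typing import List


def cross_entropy(probabilities: List[float], label: int) -> float:
    """Cross-entropy loss of one prediction against its ground-truth label."""
    eps = 1e-12  # avoids log(0) when the model assigns a zero probability
    return -math.log(max(probabilities[label], eps))


def empirical_risk(predictions: List[List[float]], labels: List[int]) -> float:
    """Average loss over sampled (x, y) pairs; training seeks to minimize it."""
    losses = [cross_entropy(p, y) for p, y in zip(predictions, labels)]
    return sum(losses) / len(losses)


# A confident correct prediction costs little, an uncertain one costs more.
print(empirical_risk([[0.9, 0.1], [0.5, 0.5]], [0, 1]))
```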
We distinguish a model M, as an abstract algorithm, from its physical implementations. One model M (e.g., a CNN trained on MNIST for digit recognition) can be implemented for inference purposes in a microcontroller or in an FPGA. Functionally, the embedded models rely on the same abstraction M, but they differ strongly in terms of implementation and of their respective hardware environments. Thus, there is no equivalence between M and its embedded variants.
Embedding deep NN models on a constrained platform such as a 32-bit microcontroller usually requires model compression techniques to fit the model complexity to the hardware requirements [19]. In particular, the memory footprint is usually an important challenge: for a typical Cortex-M MCU, the trained parameters are stored in the Flash memory and, at inference time, the internal computations (mainly multiply-accumulations and non-linear activations) are processed in SRAM. Two classical approaches are used to fit state-of-the-art models: quantization and pruning. Although the learning process may require 32-bit floating point computations, at inference time a low bitwidth representation of the parameters is sufficient and does not alter the performance of the model. Thus, most of the tools that enable NN embedding on MCU (such as STM32Cube.MX AI) propose an 8-bit quantization of the parameters. Pruning refers to techniques that cut useless connections in the network and rely on the fact that most models are over-parametrized. Both approaches can also help speed up the inference process.
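The two compression approaches can be illustrated with a short sketch. It assumes a symmetric quantization with a single scale per weight list and a magnitude-based pruning rule; these are common textbook variants chosen for the example, not necessarily the schemes applied by the embedding tools mentioned above.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map float weights to signed 8-bit integers sharing one scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from their 8-bit representation."""
    return [q * scale for q in quantized]


def prune(weights: List[float], threshold: float) -> List[float]:
    """Magnitude pruning: connections with a small weight are cut (set to 0)."""
    return [0.0 if abs(w) < threshold else w for w in weights]


weights = [-1.0, 0.003, 0.42, 1.0]
q, scale = quantize_int8(prune(weights, threshold=0.01))
print(q, dequantize(q, scale))
```

Each quantized weight differs from the original by at most half a quantization step, which is why the accuracy is usually preserved.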

Neural network
Two model architectures working on the MNIST dataset have been used: an MLP and a CNN. Both needed to be small to fit the hardware limitations. The MLP is composed of an input layer (784 points, since the images must be flattened to be used) and an output layer (10 neurons, corresponding to the number of labels). This model has only 7850 trainable parameters, which makes it quite small compared to others that perform the same task with additional intermediate hidden layers. However, the model accuracy is just below 92%. Although state-of-the-art MLP models can reach a higher accuracy on MNIST classification, this accuracy remains acceptable in light of the reduced architecture.
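The figure of 7850 trainable parameters can be checked with a short computation, assuming one weight per input-output connection and one bias per output neuron (the helper name `dense_parameters` is illustrative):

```python
def dense_parameters(inputs: int, outputs: int) -> int:
    """Weights plus one bias per output neuron of a fully-connected layer."""
    return inputs * outputs + outputs


# 28x28 flattened input pixels connected to 10 output neurons.
mlp_parameters = dense_parameters(28 * 28, 10)
print(mlp_parameters)  # 7850
```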
On the other hand, a CNN is also considered. This kind of model is divided into two parts with distinct goals: the first layers perform feature extraction (convolution, max pooling), and the last layers perform the classification. CNNs are particularly efficient and well suited to image recognition and classification, as shown in Fig. 3. Indeed, despite its reduced size, the model reaches an accuracy slightly over 96% for MNIST image classification.

Learning
In order to implement deep NN models on microcontrollers such as the STM32, we first generate the model with Google TensorFlow [20]. The model architecture (number of neurons, layers, activation functions used) is created according to the specification of the target, an ARM Cortex-M4. Then, an empty model is trained with labelled data corresponding to the task to perform, the digit recognition, following a supervised learning paradigm. Validation and test steps complete the training. The validation adjusts the hyper-parameter values and detects overfitting. The test qualifies the model performance with examples that have not been seen during the training phase. This allows the simulation of the real model behavior while having the ground-truth class for each example of the dataset. At the end, TensorFlow provides an accuracy score. The trained model characteristics (architecture, parameter and hyper-parameter values) compose the embedded NN, stored in a ".h5" file.

Attestations to ledger
The attestation scheme follows the 3 phases depicted in Fig. 4:
1. The secrets and the trusted applications (apps) are provisioned in the embedded device by the device's owner in its private office. Once the secrets are protected by secure hardware, the device is delivered to the factory.
2. On the factory floor, during execution, the device is supervised by an operator. It produces data attested by a trusted app to a distributed ledger.
3. Any stakeholder may verify the authenticity of the involved devices thanks to the information registered in the shared ledger, which is available to all. An accredited and independent auditor may also verify the tamper-resistance of the data produced.

Provisioning of the secret keys
The goal is to provision the private key sk in the TPM2 vault, while enabling its secure access from the TrustZone for the attestation phase, and the verification of its authenticity for the verification phase. Thus, the private key sk is created by the device's owner in a private location. sk should have a high entropy and be a valid key for the elliptic curve secp256k1. To endorse sk, the owner generates a certificate for sk signed with the owner's master key ok. The owner has previously created this master key ok, which may be supported by a PKI. Both the certificate of the owner's master key ok and the certificate of the endorsed device key sk are in the ledger and available to all the stakeholders.
To avoid the eavesdropping of sk when it is accessed from the TrustZone, sk is ciphered with a symmetric key noted symKey. Once ciphered, the key sk_c is written in the TPM permanent memory. The symmetric key symKey is also hidden in the TrustZone, in order to decipher sk_c in a TEE when it is used.

Provisioning of the trusted apps
The Ethereum technology requires that the incoming transactions be signed with a private key of the elliptic curve family secp256k1. However, this asymmetric cryptosystem is not supported by the TPM 2.0 standard and is not integrated in the TPM crypto-accelerator. This is why, for HistoTrust, the cryptographic functions dedicated to the compliance with Ethereum technology are implemented in the TrustZone of an ARM microcontroller.
Two trusted apps are developed in HistoTrust:
• industrial app: This application is the "business" application, as it realizes the task required. It produces digital data that may be of great value.
• attestation app: This application builds the cryptographic elements included in the transactions sent to the Ethereum blockchain to attest the data produced.
The attestation app is composed of a part executed in the normal world of the microprocessor, and another part protected during its execution in the TrustZone. In order to carry out the measurement process (Section 6.2), a fingerprint of the binary code of each app is computed and stored in the TPM Platform Configuration Registers (PCR).

Attestation of the data produced
During the production phase, the cryptographic attestations are registered in the Ethereum ledger through a smart contract. The attestation process, detailed in Fig. 5, consists in computing the fingerprint of the latest dataset produced, which is included in the data field of an Ethereum transaction (Fig. 12). This transaction is signed in the TrustZone with sk, which is also used to build the account address of the issuer device. To produce the signature, the ciphered private key sk_c is read from the TPM permanent memory through the SPI bus and deciphered in the TrustZone. The signed transaction is sent to the blockchain, and a receipt is returned if the registration in the ledger is confirmed. Implementing this attestation process is tricky because it must respect both its own temporal constraints and the real-time constraints of the industrial app that produces new data. No data should be lost due to the processing time of the attestation app, a power failure of the physical device, or the latency of recording in the remote blockchain. In fact, the use of secure hardware components, such as TPM and TEE, adds an overhead to the computing time needed to generate the attestation. The paper [2] presents a detailed study of the performance of HistoTrust according to the security level of the private key sk. On the one hand, on the blockchain side, a large latency may be observed due to the time interval between two consecutive blocks. This interval varies greatly from one blockchain to another. Ethereum deployed as a private blockchain with the Clique algorithm [3] as consensus protocol provides by default a time interval of around 12 s between two consecutive blocks. By comparison, two consecutive blocks are about 10 min apart in the Bitcoin blockchain. On the other hand, the rate of data production by the real-time industrial app can be very high.
To circumvent this problem, HistoTrust uses the receipt that confirms the registration of an attestation in the ledger to trigger the reading of a new dataset from the industrial app.
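The receipt-triggered reading can be sketched as a small simulation. It assumes that every entry produced while a transaction is pending is grouped into the next dataset, and it uses SHA-256 as the fingerprint function; both points, as well as the names `fingerprint` and `attest_batches`, are assumptions made for the illustration.

```python
import hashlib
from typing import List


def fingerprint(entries: List[str]) -> str:
    """Hash of a dataset (SHA-256 is an assumption, not taken from the paper)."""
    digest = hashlib.sha256()
    for entry in entries:
        digest.update(entry.encode("utf-8"))
        digest.update(b"\n")  # separator so that entry boundaries are hashed too
    return digest.hexdigest()


def attest_batches(production_times_ms: List[int], tx_time_ms: int) -> List[List[int]]:
    """Group entry indexes into transactions: a new dataset is read only
    once the receipt of the previous transaction has been received."""
    batches, cursor, now = [], 0, 0
    while cursor < len(production_times_ms):
        # Wait until at least one entry that is not yet attested exists.
        now = max(now, production_times_ms[cursor])
        end = cursor
        while end < len(production_times_ms) and production_times_ms[end] <= now:
            end += 1
        batches.append(list(range(cursor, end)))
        cursor = end
        now += tx_time_ms  # the receipt arrives and triggers the next read
    return batches


# One entry every 50 ms, one transaction processed in 156 ms.
batches = attest_batches([0, 50, 100, 150, 200, 250, 300], tx_time_ms=156)
for batch in batches:
    print(batch, fingerprint([f"entry {i}" for i in batch]))
```

With this gating, a slow ledger only makes the datasets larger; the industrial app is never blocked by the attestation.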

Verification
The attestation history is available in the shared ledger and transparent to all stakeholders. It does not include confidential information, only the cryptographic attestations enabling the verification. Each record is a transaction signed with sk, emitted from the account of the issuing device, and sent to the smart contract. It includes the fingerprint of the attested dataset.
Two types of verifiers are distinguished:
• involved stakeholder: any actor is able to access the information present in the shared ledger. The registered attestations make it possible to authenticate the acting devices and their owner in a given time interval.
• independent auditor: an independent auditor, such as an insurance expert or a bailiff, may be accredited to request the raw data from the authenticated device's owner, based on the information registered in the shared ledger.
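The check performed by an independent auditor can be sketched as follows, assuming that the fingerprint is a SHA-256 digest stored as a hexadecimal string in the ledger. The function name `verify_dataset` is hypothetical, and the verification of the transaction signature and of the key certificates is deliberately left out.

```python
import hashlib
import hmac


def verify_dataset(raw_dataset: bytes, attested_fingerprint_hex: str) -> bool:
    """Recompute the fingerprint of the raw data handed over by its owner
    and compare it with the fingerprint registered in the ledger."""
    recomputed = hashlib.sha256(raw_dataset).hexdigest()
    # Constant-time comparison, a common precaution when comparing digests.
    return hmac.compare_digest(recomputed, attested_fingerprint_hex.lower())


raw = b"1 1700000000 file:///data/img_0001.png ..."
ledger_value = hashlib.sha256(raw).hexdigest()  # stands in for the ledger record
print(verify_dataset(raw, ledger_value))
print(verify_dataset(raw + b"tampered", ledger_value))
```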
Embedded design

The IoT device: a system-on-module
This section briefly presents the IoT platform design. An STM32MP157-EV1 evaluation board is associated with an STPM4RasPI TPM expansion board. The STM32MP157 is a single-board computer built around a dual-core ARM Cortex-A7 processor operating at 650 MHz, forming a System-on-Module (SoM). The processor also integrates an ARM Cortex-M4 coprocessor, which makes it suitable for real-time tasks. The dual-core ARM Cortex-A7 is a very low-power processor designed for smartphones and edge devices. It includes both a normal world running a rich OS and a secure world, the TrustZone, running OP-TEE OS. The world in which the processor executes is selected by the NS bit of the SCR register: NS is set to 1 in the normal world and cleared to 0 in the secure world. The code executed in the secure world remains confidential and is protected against logical attacks.
The ARM Cortex-M4 coprocessor offers a real-time environment accessible from the normal world of the ARM Cortex-A7 to extend its computing capabilities and increase its performance while preserving a low power consumption. The functions embedded in the ARM Cortex-M4 are built upon the dedicated Hardware Abstraction Layer (HAL). STMicroelectronics provides a protocol called RPMSG [21] to ensure the communication between the ARM Cortex-A7 microprocessor and the ARM Cortex-M4 microcontroller.
The STPM4RasPI daughter board completes the STM32MP157 with a TPM 2.0 from STMicroelectronics. This board is connected through the GPIO, making the TPM accessible from the OP-TEE environment via the SPI bus. An Ethernet connection and a serial link enable the monitoring of the SoM. A small screen displays some information about the hardware configuration.

Secure boot and measurement
The ARM Cortex-A7 includes an open-source Trusted Execution Environment (OP-TEE) implementing the ARM TrustZone technology. At startup, a secure boot is performed according to the application note [22], relying on the Elliptic Curve Brainpool-256 cryptosystem. At startup and during the execution in production mode, the integrity of the two embedded trusted apps is checked through the measurement process. To enable this, the fingerprint of the binary code of the apps is provisioned beforehand in the TPM PCR, as explained in Section 5.1.2.
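A measurement process of this kind is commonly built on the TPM extend operation, in which a PCR accumulates the hashes of the measured binaries. The sketch below simulates it in software with SHA-256; whether HistoTrust chains the two apps in exactly this way is an assumption, and the byte strings stand in for the real binaries.

```python
import hashlib
from typing import Iterable


def measure(binary: bytes) -> bytes:
    """Fingerprint of the binary code of one app."""
    return hashlib.sha256(binary).digest()


def pcr_extend(pcr_value: bytes, measurement: bytes) -> bytes:
    """TPM-style extend: new PCR value = H(old PCR value || measurement)."""
    return hashlib.sha256(pcr_value + measurement).digest()


def expected_pcr(binaries: Iterable[bytes]) -> bytes:
    """Replay the measurements from a PCR assumed to start at all zeros."""
    pcr = bytes(32)
    for binary in binaries:
        pcr = pcr_extend(pcr, measure(binary))
    return pcr


# Placeholder contents standing in for the two trusted apps.
provisioned = expected_pcr([b"industrial app v1", b"attestation app v1"])
at_runtime = expected_pcr([b"industrial app v1", b"attestation app v1 (modified)"])
print(provisioned == at_runtime)
```

Because each value depends on all previous measurements, any change in an app or in the measurement order yields a different final value.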

Integration
The integration consists in making the industrial app and the attestation app work together in the SoM, as depicted in Fig. 5, while respecting the real-time constraint of the industrial app.
The industrial app is embedded in the normal world of the ARM Cortex-A7, which runs a Linux kernel as rich OS, with the part including the NN isolated in the ARM Cortex-M4. It handles the pictures coming from the attached camera in the ARM Cortex-A7. The pictures are transmitted to the NN in the ARM Cortex-M4 to request an inference. As output, the NN provides 10 heuristics, one per digit from "0" to "9". The heuristics are returned to the ARM Cortex-A7. Generally, the recognized digit corresponds to the highest probability.
The communication protocol between the ARM Cortex-A7 and the ARM Cortex-M4 is suggested by STMicroelectronics in [21]. It implements a virtual interface, noted ttyRPMSG, that enables the exchange of small messages and low data flows. The transmission of even small images to the ARM Cortex-M4 with this protocol leads to a loss of information because the throughput is not sufficient. This is why HistoTrust implements a new communication scheme between the ARM Cortex-A7 and the ARM Cortex-M4 on the SoM. The virtual interface ttyRPMSG is used to signal the presence of data in a shared memory, accessible to both processors, and the direction of the communication.
Several buffers are implemented in the shared memory in order to handle full-duplex communications without data loss. The data to attest compose the new entry written in the file #1. For the use case considered, the format of each new entry is as follows:

[index timestamp url hash inference]
The field url is a pointer to the raw data given as input to the NN, while the field hash is the hash of the raw data. The field inference is composed of the 10 heuristic values, one for each digit from "0" to "9". Each heuristic is a floating-point value coding a probability between 0 and 1.
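One possible in-memory representation of such an entry is sketched below. The space-separated text encoding, the integer timestamp and the absence of whitespace in the url are assumptions made for the example; the paper only specifies the list of fields.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Entry:
    index: int
    timestamp: int          # assumed to be an integer (e.g. seconds)
    url: str                # pointer to the raw input data, assumed without spaces
    hash: str               # hexadecimal hash of the raw data
    inference: List[float]  # 10 heuristics, one per digit from "0" to "9"

    def to_line(self) -> str:
        """Serialize the entry as one space-separated line."""
        fields = [str(self.index), str(self.timestamp), self.url, self.hash]
        fields += [repr(p) for p in self.inference]
        return " ".join(fields)

    @classmethod
    def from_line(cls, line: str) -> "Entry":
        """Parse a line written by to_line()."""
        parts = line.split()
        if len(parts) != 14:
            raise ValueError("expected 4 fields followed by 10 heuristics")
        return cls(int(parts[0]), int(parts[1]), parts[2], parts[3],
                   [float(p) for p in parts[4:]])


example = Entry(
    index=1,
    timestamp=1700000000,
    url="file:///data/img_0001.png",
    hash="ab" * 32,
    inference=[0.0, 0.0, 0.99, 0.01, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
)
restored = Entry.from_line(example.to_line())
print(restored == example)
```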
The industrial app writes in real time in the file #1 all the data produced that needs to be attested. The size of this buffer is not limited in practice, as it is stored on an SD card of several GB. Only the industrial app is authorized to write in this file, while the attestation app is authorized to read it. The receipt received from the blockchain confirms the registration of the attestation of the previous dataset in the ledger. This receipt triggers the reading of the next dataset in the file #1. The file #1 is stored in persistent memory. If a power failure occurs, the data is preserved and the attestation process resumes where it left off when the power returns. The file #1 may be exfiltrated by its owner.
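The resume-after-power-failure behavior can be sketched with a persisted cursor that records how many entries have already been attested. The cursor file, its atomic update and the function names are assumptions for the illustration; the paper does not describe how the resume position is stored.

```python
import os
import tempfile
from typing import List, Tuple


def read_pending(log_path: str, cursor_path: str) -> Tuple[int, List[str]]:
    """Return the number of attested entries and the entries still to attest."""
    done = 0
    if os.path.exists(cursor_path):
        with open(cursor_path) as handle:
            done = int(handle.read().strip() or 0)
    with open(log_path) as handle:
        lines = handle.read().splitlines()
    return done, lines[done:]


def commit_cursor(cursor_path: str, attested_count: int) -> None:
    """Persist the cursor once the receipt is received (atomic replace)."""
    temporary = cursor_path + ".tmp"
    with open(temporary, "w") as handle:
        handle.write(str(attested_count))
        handle.flush()
        os.fsync(handle.fileno())
    os.replace(temporary, cursor_path)


with tempfile.TemporaryDirectory() as folder:
    log = os.path.join(folder, "file1.log")
    cursor = os.path.join(folder, "file1.cursor")
    with open(log, "w") as handle:
        handle.write("entry0\nentry1\nentry2\n")
    done, pending = read_pending(log, cursor)      # nothing attested yet
    commit_cursor(cursor, done + 2)                # receipt for two entries
    done_after, pending_after = read_pending(log, cursor)  # after a restart
    print(done_after, pending_after)
```

Updating the cursor only after the receipt means that an entry may be attested twice after a crash, but never skipped.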
The attestation app includes a part located in the normal world and another part located in the secure world of the ARM Cortex-A7. The TPM is only accessed from the secure world, thanks to the integration of the SYS layer of the TPM stack in the OP-TEE environment. The lightweight mbedTLS library is also integrated into the OP-TEE environment. It provides cryptographic primitives and allows dedicated functions, such as the Ethereum digital signature, to be built. In the normal world, low level commands

Deployment
All the devices are distributed on a local network following a star topology around an access point. A proxy allows the communication with the outside to enable raw data exfiltration. A consortium Ethereum blockchain is locally deployed. Each stakeholder of the use case owns a validator node with a complete copy of the ledger, and has one vote in the consensus protocol. The validator nodes are depicted as computers in Fig. 6. Thus, the governance of the system is ensured with equity and fairness by all the stakeholders.
The devices acting in the production line are provided with the embedded apps that enable them to send transactions to the validator nodes. Thus, each device is the root-of-trust of the data it produces, forming a distributed root-of-trust network. The provisioning is done independently by each device's owner, prior to the deployment of the hardware in the factory. The management of the access rights and authorizations is done through smart contracts.

Performances
The blockchain is a time-stamping system consisting of a sequence of blocks spaced out in time. The recording of new transactions in the ledger is performed at a low rate. We want to show that, with our implementation of HistoTrust, the security and privacy properties brought by the use of the blockchain have no impact on the industrial process flow or on the inference rate of the embedded NN.
To carry out the performance measurements, we consider the processing time of a transaction and its recording in the ledger by using the Ethereum Ganache simulator configured in automining mode, i.e., the transaction is recorded as soon as it arrives, without any latency due to the consensus protocol between the network validator nodes. The processing time of a transaction, from the computing of the hash of the dataset to the receipt of the registration proof, is estimated at 156 ms.
We also considered the inference rate of the NN, and determined the processing time of an inference at the output of the NN, given an image presented at the input. This is estimated at 12.3 ms.
So, the highest input rate of the NN is 1 picture every 12.3 ms. We then chose the following measurement points: 1 picture every 20 ms, 30 ms, 50 ms, 100 ms, 150 ms, 200 ms and 300 ms. For each rate considered for the input data, we determined the number of inferences contained in a transaction recorded by the Ganache simulator.
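As a back-of-the-envelope estimate, and not a reproduction of the measurements, the number of inferences grouped in one transaction can be approximated by the ratio between the transaction processing time and the picture period. The sketch assumes that a new transaction starts as soon as the previous receipt is received and that at least one inference is attested per transaction.

```python
TX_TIME_MS = 156.0         # processing time of a transaction, from the text
INFERENCE_TIME_MS = 12.3   # processing time of an inference, from the text


def expected_inferences_per_tx(period_ms: float) -> float:
    """Rough estimate of the inferences accumulated during one transaction."""
    if period_ms < INFERENCE_TIME_MS:
        raise ValueError("the NN cannot process pictures faster than 12.3 ms")
    return max(1.0, TX_TIME_MS / period_ms)


for period in (20, 30, 50, 100, 150, 200, 300):
    print(period, round(expected_inferences_per_tx(period), 1))
```

Under these assumptions, slower picture rates converge to one inference per transaction, while faster rates only increase the size of each attested dataset.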
The results are presented on the graph in Fig. 7. For each measurement point on the x-axis, we presented 5000 images as input to the NN, and plotted on the y-axis the average number of inferences contained per transaction.
This graph illustrates that the security and privacy properties, presented in the following section and ensured with the blockchain technology, are implemented without impacting the performance of the industrial application.

The security model
In 1990, Reason [23] introduced the Swiss cheese model to analyze the causality of an incident and manage risks. The physical device that embeds the NN integrates several security layers to protect against and detect attacks or malfunctions, as depicted in Fig. 8.
The first layer is a physical protection that prevents access to the components embedded in the smart robot, and that is left visibly damaged in case of intrusion. In this way, succeeding in a physical attack on the electronic components that support the NN is difficult and leaves marks. The second layer is the cyber protection against logical attacks. The use of secure hardware components like TPM and OP-TEE to protect the cryptographic keys and seeds is the foundation of this protection. The third layer is the detection of intrusions or tampering. At this layer, secure boot and measurement are deployed to monitor the integrity of the embedded firmware and software. The fourth layer concerns the traceability needed to understand what happens when the previous layers are bypassed. A blockchain is used to register the traces as attestations of the logged data produced by the embedded apps.

The asset
The assets to protect are the business-relevant data for the stakeholders: the logged data, including all the relevant data produced by the physical devices that contribute to the decisions on the digital commands sent to the actuators. This includes the inferences produced by the embedded NN (see Fig. 9). Their authenticity must be ensured, as well as their integrity and completeness. Traceability is a valuable service to understand the origin and sequence of the events, while the raw data produced remain confidential to their owner. In order to reduce the attack surface on the electronic board, the different protection layers of Fig. 8 integrate several countermeasures. The goal is to fulfill these security requirements:
• R1: AI explainability: The behavior of the embedded AI should be explainable.
• R2: forward integrity: The data attestation history must be immutable and transparent to the stakeholders. The raw data must be persistent and intact.
• R3: public authentication: Any stakeholder should be able to authenticate the devices issuing data in a given time interval through the attestation history.
• R4: power failure: No raw data or attestations should be lost in the event of a power failure.
• R5: privacy-preserving data: The raw data shall not be exposed to the other devices.
• R6: verifiability: An accredited auditor must be able to verify the data attestations.
• R7: multiple stakeholders: The scheme shall support multiple stakeholders owning multiple devices issuing data concurrently.

The threat
The threat events are the tampering of the data produced, the production of fake or dysfunctional data, the spoofing of data or issuing devices, and the theft of data. The main sources of risk come from the following profiles:
• Negligence: this threat arises from an unintentional human error that nevertheless causes a failure,
• Ransacking: this threat corresponds to a malicious action with the intention to destroy, tamper with, spoof or modify valuable data,
• Concurrence (competition): this threat comes from a competitor, who may seek to destroy data like the ransacker, but also to steal valuable data for analysis.
The main stakeholders involved in the smart manufacturing use case are:
• the provider of the smart robot, who by default is the owner of the logged raw data produced by his devices,
• the expert who trains the embedded AI,
• the manufacturer of the product (e.g., the car) for which the robot performs tasks,
• the operator of the smart robot during production,
• the maintenance agent who intervenes on the smart robot,
• the accredited and independent auditor mandated in case of litigation.
Table 1 shows the role that each stakeholder can play. The provider of the smart robot may be negligent in providing an unreliable device, poorly configured, or in which bugs remain. In the event of litigation, he must provide the data requested by the auditor with their integrity preserved. Thus, it is the provider's responsibility to maintain the tamper-resistance and confidentiality of his data. As there are usually several suppliers of smart robots in a factory, they are potential competitors. This may be an incentive to obtain confidential data from a competitor for analysis, in order to gain market share. The expert is responsible for the learning of the AI and the decision of the embedded NN. He must be able to explain how the heuristics are derived. The manufacturer is physically present in the factory and has access to the smart robots. He may take any profile of attacker in order to hide a problem for which he is responsible and pass the blame on to another stakeholder. An operator or a maintenance agent may make a human error, and possibly seek to cover it up by destroying elements.
The auditor's mandate is in the legal field, which gives him legal accreditation and independence from other stakeholders.

Security and privacy review
R1: AI explainability. Explaining the behavior of an AI requires measures to be implemented at the design stage. Blockchain technology provides traceability by design. However, the blockchain does not manage the confidentiality of the traced data. This is why HistoTrust proposes a scheme combining the use of a blockchain to transparently guarantee the properties of immutability, authenticity and ordering, and the use of private storage of raw data, under the responsibility of their owner.
R2: forward integrity. The blockchain ensures by design the forward integrity of the information recorded in the ledger. The ledger maintains the history of cryptographic attestations, each one being a pointer to a raw dataset stored outside the blockchain. Thus, any tampering or removal of raw data is detectable.
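As an illustration of why tampering is detectable, the sketch below recomputes the fingerprint of a stored raw dataset and compares it with the attested one. It is a minimal sketch under stated assumptions: SHA-256 is assumed as the fingerprint function, the dataset content is hypothetical, and the function names are not taken from HistoTrust.

```python
import hashlib


def fingerprint(raw_dataset: bytes) -> str:
    """Return a hex fingerprint of a raw dataset (SHA-256 is an assumption)."""
    return hashlib.sha256(raw_dataset).hexdigest()


def is_intact(raw_dataset: bytes, attested_fingerprint: str) -> bool:
    """True if the stored raw data still matches the attested fingerprint."""
    return fingerprint(raw_dataset) == attested_fingerprint


# Hypothetical raw dataset logged by a device; its fingerprint stands for
# the value that would have been recorded in the ledger at creation time.
raw = b'{"device": "robot-1", "inference": {"4": 0.68, "7": 0.16, "9": 0.05}}'
attested = fingerprint(raw)

# Any later modification of the stored data changes the fingerprint.
tampered = raw.replace(b"0.68", b"0.98")

print("original intact:", is_intact(raw, attested))
print("tampered intact:", is_intact(tampered, attested))
```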
R3: public authentication. The recorded attestation authenticates the issuing device, and all genuine devices are endorsed by their owner. Consulting the ledger allows any stakeholder to know which devices were acting in a given time interval, and the order of the actions performed.
R4: power failure. Resilience when a power failure occurs implies that no raw data or cryptographic attestations are lost. The use of a file buffer stored in permanent memory ensures data persistence in case of power failure.
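A file buffer of this kind can be sketched as an append-only file that is forced to storage after each record. This is only an illustration of the idea, not the embedded implementation: the JSON-lines format, the file name and the function names are assumptions.

```python
import json
import os
import tempfile


def append_record(path: str, record: dict) -> None:
    """Append one record as a JSON line and force it to permanent storage."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
        f.flush()             # push Python's buffer to the operating system
        os.fsync(f.fileno())  # ask the OS to write it to the storage device


def recover_records(path: str) -> list:
    """Reload buffered records after a restart, stopping at a torn last line."""
    records = []
    if not os.path.exists(path):
        return records
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError:
                break  # incomplete write interrupted by the power failure
    return records


# Hypothetical buffer of attestations waiting to be sent to the ledger.
buffer_path = os.path.join(tempfile.mkdtemp(), "pending_attestations.jsonl")
append_record(buffer_path, {"seq": 1, "fingerprint": "ab12"})
append_record(buffer_path, {"seq": 2, "fingerprint": "cd34"})
pending = recover_records(buffer_path)
print(pending)
```

Syncing after every record trades throughput for durability; batching several records per sync would be a possible design alternative when the input rate is high.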
R5: privacy-preserving data. This requirement covers raw data at rest and in transit. The physical protection of the device in the factory makes access to the board peripherals difficult and detectable. The raw data are transferred out of the device through a VPN.
R6: verifiability. HistoTrust distinguishes two roles of verifiers. All the stakeholders can play the first role, having access to the attestations recorded in the ledger. The second role is reserved to an accredited auditor, under a legal mandate, to request the raw data.
R7: multiple stakeholders. By using blockchain technology as a complement to existing technologies, HistoTrust provides a solution in which the number of stakeholders is not limited. The stakeholders ensure the governance together, each operating a validator node.

Audit
The audit is launched when an incident occurs. Its goal is to determine the cause and the accountabilities with maximum transparency for the stakeholders involved. The audit takes place in two phases: the first traces the events in a given time interval before the incident, and the second analyzes the behavior of the AIs involved.

Traceability of the events
The blockchain provides an immutable history, shared among all stakeholders, of all past events (see Figs. 10 and 11). Each block includes a tree of recorded transactions, as shown in Fig. 12. The sender address authenticates the issuer device, while the contract address authenticates the recipient smart contract. The data field includes the fingerprint of the raw dataset produced by the issuing device at the given time, whereas the gas field indicates the computational effort required to execute the targeted smart contract in the blockchain. This value is an indicator of the energy consumed to execute an instance of the smart contract.
Up to the point where confidential raw data are requested, any stakeholder that is a member of the ecosystem can perform the verification. The first step consists in retrieving from the shared ledger the attestations recorded in the considered time interval. Each attestation authenticates the issuer device, as well as the owner who has endorsed his devices.
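This first step can be sketched as a filter over the recorded transactions. The record below mirrors the fields named above (sender, contract address, data, gas); the timestamp field, the class and function names, and all values are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attestation:
    """Simplified view of a recorded transaction (illustrative fields)."""
    timestamp: int  # time of the containing block, seconds since the epoch
    sender: str     # address authenticating the issuer device
    contract: str   # address of the recipient smart contract
    data: str       # fingerprint of the raw dataset
    gas: int        # gas reported for the transaction


def attestations_in_interval(ledger, start, end):
    """Return attestations with start <= timestamp <= end, in ledger order."""
    return [a for a in ledger if start <= a.timestamp <= end]


def devices_in_interval(ledger, start, end):
    """Return issuer addresses in order of first appearance in the interval."""
    seen = []
    for a in attestations_in_interval(ledger, start, end):
        if a.sender not in seen:
            seen.append(a.sender)
    return seen


# Hypothetical ledger extract.
ledger = [
    Attestation(1000, "0xDeviceA", "0xContract", "fp-1", 50000),
    Attestation(1020, "0xDeviceB", "0xContract", "fp-2", 50000),
    Attestation(1040, "0xDeviceA", "0xContract", "fp-3", 50000),
    Attestation(1100, "0xDeviceC", "0xContract", "fp-4", 50000),
]

window = attestations_in_interval(ledger, 1010, 1050)
devices = devices_in_interval(ledger, 1010, 1050)
print([a.data for a in window], devices)
```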

Explainability of the AI
Each device's owner may be requested to provide the raw data associated with the recorded attestations. As these data are confidential, only an accredited and independent auditor is authorized to perform this task, which is regulated by law. The provided data must be complete and intact; otherwise, the owner is held accountable, with the suspicion of hiding a fraud. Each owner is responsible for keeping and protecting his logged raw data.
Once the completeness and the integrity of the attested data are established, the analysis of the raw data is conducted, in particular the analysis of the AI behavior. Each owner is responsible for providing an explanation of the behavior of his embedded NN.
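The completeness and integrity check performed by the auditor can be sketched as a comparison between the attested fingerprints and the datasets handed over by the owner. This is a minimal sketch, assuming SHA-256 fingerprints and hypothetical dataset contents; the function name and return format are illustrative.

```python
import hashlib


def audit(attested_fingerprints, provided_datasets):
    """Compare the owner's raw datasets with the fingerprints in the ledger.

    Returns (missing, unattested):
    - missing: attested fingerprints for which no provided dataset matches,
    - unattested: provided datasets whose fingerprint is not in the ledger.
    A tampered dataset shows up in both lists.
    """
    provided = {hashlib.sha256(d).hexdigest(): d for d in provided_datasets}
    attested = set(attested_fingerprints)
    missing = [fp for fp in attested_fingerprints if fp not in provided]
    unattested = [d for fp, d in provided.items() if fp not in attested]
    return missing, unattested


# Hypothetical history: three datasets were attested by the device.
original = [b"dataset-1", b"dataset-2", b"dataset-3"]
ledger_fingerprints = [hashlib.sha256(d).hexdigest() for d in original]

# The owner hands over one intact dataset and one modified dataset,
# and withholds the third.
handed_over = [b"dataset-1", b"dataset-2-modified"]

missing, unattested = audit(ledger_fingerprints, handed_over)
print(len(missing), "attestations without matching data")
print(len(unattested), "datasets without matching attestation")
```

The data are complete and intact only when both lists are empty.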
At this stage, the analysis relies on the tools and methods that the expert used to explain the behavior of the NN, and on human expertise. For example, the picture presented in Fig. 13 is labelled "9". However, the embedded NN infers the digit "4" with a probability of 68%, the digit "7" with 16% and the digit "9" with 5%. In a textbook case with a labelled image, one knows that it is a "9". But the pictures acquired by the smart robot's on-board cameras are not labelled, and only the explainability of the learning model and human expertise can remove the doubt about the most likely pattern. In a factory, the smart robots are supervised by human operators. Thus, one can consider that if the inference does not return any heuristic above a certain threshold, e.g., 71%, the decision is the responsibility of the human operator. On the other hand, when the error is obvious, for example when the NN recognizes a "3" with 95% certainty although the digit is a "0", the human operator will not be solicited, which can potentially lead to an incident on the production line. This may be due to an adversarial attack, i.e., an attack on the NN affecting the cyber protection layer (see Fig. 8) and not detected by the embedded system. The traceability implemented with HistoTrust makes it possible to discover the cause.
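The threshold rule discussed above can be written as a short decision function. This is an illustrative sketch, not the embedded logic: the function name and the return labels are assumptions, and 71% is the example threshold quoted in the text.

```python
def decide(probabilities: dict, threshold: float = 0.71):
    """Return the top-ranked label and who is responsible for the decision.

    The decision is "automatic" only when the highest inferred probability
    is above the threshold; otherwise it is left to the human operator.
    """
    label, probability = max(probabilities.items(), key=lambda kv: kv[1])
    if probability > threshold:
        return label, "automatic"
    return label, "human operator"


# Inferences quoted in the text for the picture labelled "9".
ambiguous = {"4": 0.68, "7": 0.16, "9": 0.05}

# Confident inference: the operator is not solicited, even if it is wrong.
confident = {"3": 0.95}

print(decide(ambiguous))
print(decide(confident))
```

The second case shows the limit of the rule: a confident but wrong inference passes the threshold, which is why the recorded history is needed to investigate afterwards.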

Conclusion
This paper introduces HistoTrust, a robust scheme using TEE and TPM secure components to trace the behavior of embedded AI. It begins with the challenge of embedding a learnt NN in an ARM Cortex-M4 microcontroller. Next, based on an attestation scheme to an Ethereum ledger, an embedded design is proposed to secure the NN, ensure its robustness and enable the explainability of its behavior. Then, several devices, following a distributed architecture, are deployed around a blockchain. The security analysis and the audit process provide verification tools that bring trust and fairness between the stakeholders involved in the use case. In future work, the preservation of data privacy will be investigated further, and some cryptographic processes will be ported to the TPM.
Funding This work is a collaborative research action that is partially supported by (CEA-Leti) the European project ECSEL InSecTT 2 and by the French National Research Agency (ANR) in the framework of the Investissements d'avenir program (ANR-10-AIRT-05, irtnanoelec).

Conflict of interest Not applicable
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.