AURIX Architecture
The Infineon AURIX TC399XP multicore microcontroller is designed for safety-critical applications [1, 14, 15]. Figure 2 shows its schematic structure. The derivative used here has six proprietary TriCore processor cores, which run at a clock frequency of 300 MHz and are based on a modified Harvard architecture. Accordingly, each processor core has separate interfaces for code and data. Each of these interfaces consists of a scratchpad, which must be managed by the developer, and a two-way associative cache, which is managed by the core itself. In addition to these local memory units, each processor core has a Static Random-Access Memory (SRAM) and a separate flash memory. All memory units of the Infineon AURIX can be used for both data and code, but the local memory units reach their maximum access speed only when used as specified. The sizes of all memory units in the system are listed in Table 1. A crossbar connects the remaining memory units and enables communication between the processor cores. The peripheral modules are connected via a bus system for which all cores compete. This bus system also connects the Hardware Security Module (HSM), which thereby gains access to all memory units in the system.
Table 1 Memory sizes of the Infineon AURIX TC399XP
Memory Protection
Because the AURIX microcontroller family focuses on safety-critical applications, the TC399XP has a large number of Memory Protection Units (MPUs) that protect against unauthorized access. The MPUs were primarily integrated into the chip to isolate functions of different criticality from each other, but they can also be used to implement access authorization. In general, each memory unit in the AURIX has a separate MPU that can exclude processor cores from access. Access authorization can be defined for the complete memory unit or for selected memory areas. The type of access can also be restricted, e.g., by allowing only read access. The MPU is configured via specially protected registers, which can themselves be protected against manipulation. In addition to the MPUs of the memory units, each processor core has its own MPU with which different tasks running on that core can be isolated from each other. For this purpose, different configuration sets can be stored for these MPUs, so that the configuration can be changed, e.g., at a context switch. Another way to prevent memory manipulation is to use the irreversible One-Time-Programmable (OTP) feature of the AURIX. Using this feature, memory areas can be permanently configured in such a way that they are locked against further modification. A special feature of the AURIX’s OTP capability is that it differentiates between write and read operations.
Privilege Modes
Each processor core of the Infineon AURIX has a rudimentary rights management system consisting of three authorization levels, which differ in their access to critical registers. In User Mode 0, only tasks that do not require interaction with peripherals or configuration registers can be performed. In User Mode 1, access to peripherals is possible, but system-critical properties such as the MPU configuration cannot be manipulated. Only Supervisor Mode allows unrestricted access to all registers and peripherals. However, critical configuration registers can also be locked in such a way that, even in Supervisor Mode, access is only possible again after a restart.
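The three levels can be summarized in a small behavioural model. The following Python sketch is illustrative only: the resource classes `compute`, `peripheral` and `mpu_config` and the function `may_access` are hypothetical names chosen for this example and do not correspond to AURIX registers.

```python
from enum import IntEnum


class Mode(IntEnum):
    """The three authorization levels, ordered by increasing privilege."""
    USER0 = 0
    USER1 = 1
    SUPERVISOR = 2


# Minimum mode required per resource class (hypothetical classes, for illustration).
REQUIRED_MODE = {
    "compute": Mode.USER0,          # no peripherals or configuration registers
    "peripheral": Mode.USER1,       # peripheral access
    "mpu_config": Mode.SUPERVISOR,  # system-critical configuration
}


def may_access(mode, resource, locked_until_restart=frozenset()):
    """Return True if `mode` may access `resource`.

    Resources in `locked_until_restart` model registers that stay
    locked until the next restart, even in Supervisor Mode.
    """
    if resource in locked_until_restart:
        return False
    return mode >= REQUIRED_MODE[resource]
```

In this model, `may_access(Mode.SUPERVISOR, "mpu_config", frozenset({"mpu_config"}))` is rejected, which mirrors the lock that is only released by a restart.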
Hardware Security Module
For the computation of security-relevant functions, the TC399XP has an integrated HSM based on a Cortex-M3 running at 100 MHz, which corresponds to the "full" level of the EVITA classification [6]. The HSM is connected to the host system via the peripheral bus and has separate, specially protected memory areas which are provided by the host system. Communication between the HSM and the host system is realized via a special bridge module, which enables the transmission of commands. Furthermore, the HSM can trigger interrupts which are routed to the host system. A special feature of the HSM is that it has full read access to all memory units in the host system; this access cannot be restricted by an MPU. To accelerate cryptographic operations, the HSM has special hardware units for the calculation of AES 128, PKC ECC 256 and SHA2 256 as well as an AIS 31 compliant True Random Number Generator (TRNG). Finally, it is important to note that the HSM provides no integrated protection against side-channel attacks. Implementing this protection is the task of the firmware in use.
Secure Boot
The system start of the Infineon AURIX is performed on processor core 0, which starts the HSM if configured accordingly. The HSM is used to implement the secure boot process, which validates the memory contents of the whole system. Only after confirmation by the HSM is the start process of the host system continued and, depending on the configuration, the debug interface initialized. Processor core 0 also activates the other processor cores.
Implementation of an FSM
Core 0 is chosen as the FSM. The advantage of this choice is that core 0 also handles the initial start-up of the whole system and is thus the first core to become active. This means that no other core can change the FSM’s configuration before the FSM is started, or prevent it from booting at all. This section discusses the implementation of all components that are defined in the framework.
Secure Boot
Secure boot is implemented analogously to the concept used by Infineon in the HSM. For this purpose, the function for verifying the memory content is called directly after system startup. Since the FSM is a software solution, different variants for verifying the memory are possible. In addition to simpler implementations such as verification using hashes, variants based on signatures are also supported. It should be noted, however, that complex cryptographic operations significantly extend the start time, which is acceptable only to a limited degree in automotive applications. To effectively protect the secure boot functionality from manipulation, the corresponding memory area is marked as OTP. The same applies to the memory area that defines the start address of processor core 0, so that skipping the secure boot is prevented. In addition to validating the memory contents, the secure boot also checks whether the debug interface and the MPU are configured correctly. Only if all of these conditions are fulfilled does the secure boot start the application code.
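The hash-based variant can be sketched as follows. This is a minimal Python model under stated assumptions, not the firmware itself: the function names, the region names and the two boolean inputs for the debug and MPU checks are hypothetical, and the reference digests are assumed to reside in OTP-protected memory.

```python
import hashlib
import hmac


def measure(region: bytes) -> bytes:
    """SHA-256 digest of one memory region."""
    return hashlib.sha256(region).digest()


def secure_boot(regions, reference_digests, debug_locked, mpu_configured):
    """Return True only if every boot condition is fulfilled.

    regions:           name -> current content of the memory region
    reference_digests: name -> expected digest (assumed to be OTP-protected)
    debug_locked:      result of the debug interface check
    mpu_configured:    result of the MPU configuration check
    """
    if not (debug_locked and mpu_configured):
        return False
    # Every region must have a reference digest, and vice versa.
    if regions.keys() != reference_digests.keys():
        return False
    for name, content in regions.items():
        # Constant-time comparison of measured and expected digest.
        if not hmac.compare_digest(measure(content), reference_digests[name]):
            return False
    return True
```

A signature-based variant would replace the digest comparison with a signature check over the measured digest, at the cost of a longer start time.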
Secure Update
Secure software updates are implemented in the same way as the secure boot procedure. The required boot loader is also located in memory marked as OTP. Again, different variants of a secure software update are possible, because all cryptographic operations can be provided in software. The variant with the highest security level is authentication using asymmetric cryptography. For this purpose, the manufacturer of the ECU signs the software update with a private key, while the corresponding public key is stored in the protected memory of the FSM. To ensure that the public key is not altered, the OTP feature can also be used here. After a successful software update, a restart is executed, during which the secure boot validates the updated software.
Permanent Lock
This feature can be implemented using the OTP functionality of the flash. A schematic illustration can be found in Fig. 3. The secure boot code is stored in sector 0 of core 0 and verifies all memory units directly after the system start. Since only core 0 is running at this time, the other processor cores are not able to manipulate this process. If the check is successful, the software jumps to the normal startup code, which is located in sector 2 and also initializes the other processor cores. If, however, an error is detected during the check, a jump is made to the following sector 1, which contains an endless loop. It is important to note that sector 0 and sector 1 are implemented as OTP memory. As flashing these memory areas is irreversible, they can be assumed to be secure. Since the verification of the memory is not only executed during the system start, but is also repeated cyclically at runtime within the scope of tuning protection, a detected manipulation must not be forgotten after a restart. For this reason, a flag is set in a free and FSM-exclusive memory area and then marked as read-only using the OTP feature. This flag is also evaluated by the secure boot, and the endless loop is started accordingly. As the authors have shown in [13], specific security checks can be skipped by means of targeted failures, which could also bypass the implementation of the permanent lock. The safety functionalities of processor core 0, namely the dedicated lockstep core and the memory correction, help to detect and prevent this kind of manipulation. If a corresponding failure is detected during the execution of the permanent lock, a restart is initiated, which results in the execution of the endless loop.
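The control flow of the permanent lock can be modelled in a few lines. The Python sketch below is a behavioural illustration only: `OtpFlag`, `boot_target` and `runtime_check` are hypothetical names, and the write-once flag stands in for the OTP-protected memory word. It is assumed here that the flag is set by the cyclic runtime check, while a failed check at boot leads directly to the endless loop.

```python
class OtpFlag:
    """Write-once flag: it can be set but never cleared (models an OTP word)."""

    def __init__(self):
        self._set = False

    def set(self):
        self._set = True

    @property
    def is_set(self):
        return self._set


def boot_target(flag, memory_ok):
    """Decide where the secure boot code in sector 0 jumps to."""
    if flag.is_set or not memory_ok:
        return "sector1_endless_loop"
    return "sector2_startup"


def runtime_check(flag, memory_ok):
    """Cyclic verification at runtime; a detected manipulation is made permanent."""
    if not memory_ok:
        flag.set()
```

Once `runtime_check` has set the flag, every later call to `boot_target` selects the endless loop, even if the memory content has been restored in the meantime.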
As explained in the framework, the system core used for the FSM is more powerful than the cores of comparable HSMs. Furthermore, the AURIX core has a direct connection to the crossbar, which allows quicker memory accesses for the FSM than is the case for a peripheral HSM.
Cryptographic Methods
The cryptographic algorithms chosen for this FSM implementation are AES 128, SHA 256 and ECC 256, which together provide 128-bit (pre-quantum) security. These algorithms are also implemented in the AURIX HSM, which eases the comparison of the achievable performance in Sect. 5. For asymmetric cryptography, elliptic curve cryptography is used, with Curve25519 as the implemented curve. This curve is considered secure and allows for efficient computations [16], making it well suited for deployment on smaller processors [17].
Private Keys
To protect the device’s private keys, all FSM services run in User Mode 0, with the only exception being the Advanced Encryption Standard (AES) module, which runs in Supervisor Mode. This does not restrict the computations done by those services, since they require no access to peripherals or system configuration registers. By default, any access to the device keys is prohibited by the core’s MPU, making them inaccessible to all User Mode 0 services. When the AES module requires access to one of those keys, it changes the MPU’s configuration, reads the key, and then resets the MPU’s configuration to its default setting.
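The open, read, restore sequence of the AES module can be illustrated with the following Python model. It is a sketch under assumptions: `KeyStore`, `key_window` and `aes_module_fetch_key` are hypothetical names, and the boolean `readable` stands in for the MPU configuration of the key region. The point of the sketch is that the default setting is restored on every exit path, including errors.

```python
from contextlib import contextmanager


class KeyStore:
    """Models the MPU-protected key region of core 0."""

    def __init__(self, keys):
        self._keys = dict(keys)
        self.readable = False  # default MPU setting: key region blocked

    def read(self, key_id):
        if not self.readable:
            raise PermissionError("key region blocked by MPU")
        return self._keys[key_id]


@contextmanager
def key_window(store):
    """Temporarily open the key region, as only the Supervisor Mode AES module may."""
    store.readable = True
    try:
        yield store
    finally:
        # Always restore the default setting, even if reading fails.
        store.readable = False


def aes_module_fetch_key(store, key_id):
    """Change the MPU configuration, read the key, restore the default."""
    with key_window(store) as opened:
        return opened.read(key_id)
```

Any direct call to `store.read(...)` outside the window raises `PermissionError`, which corresponds to a User Mode 0 service being blocked by the MPU.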
Random Number Generator
Deterministic RNGs require a seed, which is used to generate pseudo-random numbers. For the FSM, a truly random seed is stored in its protected flash memory at programming time. To avoid the output of the same sequence of numbers after each reboot, the number of times the system has been booted is stored in non-volatile memory and incremented during the boot process. This number and the random seed are then used to derive a temporary seed, which the deterministic RNG uses until the next reboot. Additionally, the truly random seed can be replaced via encrypted updates to introduce new randomness.
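The text does not specify how the stored seed and the boot counter are combined. The Python sketch below assumes one plausible construction, hashing both with SHA-256, and a simple hash-counter generator on top of it. The names `temporary_seed` and `HashDrbg` are hypothetical, and the generator is an illustration, not a vetted DRBG design.

```python
import hashlib


def temporary_seed(stored_seed: bytes, boot_counter: int) -> bytes:
    """Derive the per-boot seed (assumed construction: SHA-256(seed || counter))."""
    return hashlib.sha256(stored_seed + boot_counter.to_bytes(8, "big")).digest()


class HashDrbg:
    """Illustrative deterministic generator: SHA-256 over seed and block counter."""

    def __init__(self, seed: bytes):
        self._seed = seed
        self._block = 0

    def random_bytes(self, n: int) -> bytes:
        out = bytearray()
        while len(out) < n:
            block = self._block.to_bytes(8, "big")
            out += hashlib.sha256(self._seed + block).digest()
            self._block += 1
        return bytes(out[:n])
```

Because the boot counter enters the derivation, two consecutive boots with the same stored seed yield different output sequences, while the output within one boot remains reproducible from the temporary seed.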
Memory Protection
The local memory units of core 0 are exclusively assigned to the FSM by means of their MPUs, making them inaccessible to all other cores. This configuration is applied and locked during the start-up of the system, before any untrusted core is started. Protecting the configuration registers used ensures that no core can change them during runtime [9]. Using the same mechanisms as the HSM ensures strong isolation of the FSM’s resources. Since the cache of core 0 is not shared, it requires no explicit protection against cache attacks.
Debug Setup
In the second AURIX generation, the debug interface is deactivated using special configuration registers. These registers are read out by the FSM at system startup and are checked for a correct configuration. If this check fails, all of the system’s memory units are deactivated using the OTP capability and locked against readout. If the system is already under the control of the debugger at this time, a readout cannot be prevented [1].
Event Log
In the implementation of the FSM framework, all events are recorded in a ring buffer with 256 entries. The time for the timestamps is provided by the internal system timer 0, which is implemented as a free-running 64-bit counter whose current value cannot be manipulated by a program. The only way to manipulate the system timer is to disable it. Therefore, the corresponding registers are protected by core 0 against access by the untrusted processor cores directly after system startup. An external Real-Time Clock (RTC) is deliberately not used, since it would offer significantly more potential attack vectors. Because timer 0 is reset at each restart, the FSM records every restart in the event log.
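A ring buffer of this kind can be sketched as follows. The Python model is illustrative: the class name `EventLog` and the `read_timer` callback, which stands in for reading system timer 0, are assumptions made for this example.

```python
class EventLog:
    """Fixed-size ring buffer; the oldest entry is overwritten when full."""

    CAPACITY = 256

    def __init__(self, read_timer):
        self._read_timer = read_timer       # stand-in for reading system timer 0
        self._entries = [None] * self.CAPACITY
        self._count = 0                     # total number of events ever logged

    def log(self, event):
        slot = self._count % self.CAPACITY
        self._entries[slot] = (self._read_timer(), event)
        self._count += 1

    def entries(self):
        """Return the stored (timestamp, event) pairs, oldest first."""
        stored = min(self._count, self.CAPACITY)
        first = self._count - stored
        return [self._entries[i % self.CAPACITY] for i in range(first, self._count)]
```

Since the timer restarts from zero after every reset, a `"restart"` event logged at boot lets a reader of the log order timestamps across restarts.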
Secure Deletion
The FSM implementation provides a function to be called every time a cryptographic computation completes. It first zeros the memory regions holding variables during the computations and afterwards clears the caches of core 0.
Unique ID
In the Infineon AURIX, a Target ID is stored in the chip during production and then protected against manipulation by using the OTP feature [1]. This ID is used by the FSM, which provides it to other applications via an interface.
Bridge Module
A special characteristic of real-time capable systems is the requirement for absolute determinism. Only when this is given can a real-time system meet the required deadlines under all conditions. For this reason, memory is allocated statically and functions are executed within a plannable time period. This behavior also applies to the FSM, which polls the requests of the other processor cores on a fixed time schedule and processes them accordingly. All access periods and the number of queues and requests are stored in the request configuration. The memory units used for all communication are the global SRAM memory units of the respective processor cores. These provide request queues at fixed addresses in their memory, which the bridge module periodically checks for new requests (see Fig. 4). As multiple applications of different criticality can be executed on one processor core, the bridge module supports a flexible number of request queues per core. The queues are polled by the request manager, which obtains the corresponding cycles from the request configuration. For each query, the validity of all contained requests is checked against the request configuration. All valid requests are transferred to the request queue of the bridge module, which holds them for processing inside the FSM. The activity manager of the operating system reads the request queue cyclically and processes it accordingly. The results are only written back at a time defined by the request configuration. This procedure ensures that the FSM behaves in an absolutely deterministic way towards the outside system, making inferences about the current workload of the FSM impossible.
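The polling scheme can be illustrated with a small tick-based model in Python. All names are hypothetical: the configuration keys `period`, `allowed`, `max_requests` and `response_delay` are assumed fields of the request configuration, and discarding requests beyond the configured maximum is an assumption of this sketch rather than documented behaviour.

```python
from collections import deque


class BridgeModule:
    """Polls per-core request queues on a fixed schedule (behavioural model)."""

    def __init__(self, config):
        # config: queue name -> {"period", "allowed", "max_requests", "response_delay"}
        self.config = config
        self.core_queues = {name: [] for name in config}  # filled by other cores
        self.request_queue = deque()                      # consumed inside the FSM
        self.pending_results = []                         # (due tick, queue, result)

    def poll(self, tick):
        """Request manager: check each queue only in its configured cycle."""
        for name, cfg in self.config.items():
            if tick % cfg["period"] != 0:
                continue
            batch = self.core_queues[name][: cfg["max_requests"]]
            self.core_queues[name] = []  # assumption: excess requests are discarded
            for service, payload in batch:
                if service in cfg["allowed"]:  # validity check against the config
                    self.request_queue.append((name, service, payload))

    def process(self, tick, handler):
        """Activity manager: process requests, but defer the visible result."""
        while self.request_queue:
            name, service, payload = self.request_queue.popleft()
            due = tick + self.config[name]["response_delay"]
            self.pending_results.append((due, name, handler(service, payload)))

    def write_back(self, tick):
        """Release only the results whose configured write-back time has come."""
        ready = [r for r in self.pending_results if r[0] <= tick]
        self.pending_results = [r for r in self.pending_results if r[0] > tick]
        return [(name, result) for _, name, result in ready]
```

Because a result becomes visible only at its configured tick, the requesting core observes the same response time regardless of how quickly the FSM actually finished the computation.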