1 Introduction

In a software-defined radio (SDR), only part of the radio and signal processing chain is realised in hardware; as the name suggests, the remainder is implemented in software. These software-implemented parts can be adapted or changed later by changing the configuration. Owing to this flexibility, an SDR can be used for various applications with relatively little or, in some cases, no hardware redesign effort.

An SDR can be used for a variety of applications. These range from simple reconfigurable transceivers in different frequency bands, with the flexibility to change coding schemes and data rates during flight [14], to measurement setups such as satellite ranging [15] or passive reflectometry [16], which is the primary purpose of the PRETTY mission [17]. In general, almost any setup based on the reception or transmission of radio frequencies can be realised. For example, with relatively little change to the hardware, the system can be used to measure ice and sea heights and passively measure the temperature of the Earth’s surface [18].

SDR platforms are also of high interest for small satellite and CubeSat missions, as their high reconfigurability allows various mission goals to be fulfilled. At the Institute of Communication Networks and Satellite Communications, a 3U CubeSat called PRETTY (Passive REflecTometry and DosimeTrY) [1] is currently being designed, built and tested together with RUAG Space Austria and Seibersdorf Laboratories. After TUGSAT‑1 [2] and OPS-SAT [3], it is the third small satellite to be built at Graz University of Technology. It will host two scientific payloads, one dedicated to passive reflectometry and one to dosimetry.

The PRETTY satellite will host such an SDR module (Fig. 1) in order to operate the passive reflectometry payload of the satellite. Its task will be to receive and sample the direct signals from the GPS satellites and the reflected signals from the Earth. Due to distortions in the reflection path, the correlation of both signals will provide information on the surface characteristics (e.g. measuring ice and sea heights).

Fig. 1

A CAD model of the PRETTY spacecraft is shown on the left. The reflectometer antenna for the main payload is visible on the front. The SDR platform of the PRETTY mission is shown on the right

2 The challenge of FDIR introduction on CubeSat missions

CubeSat missions can be characterised as small, light and cost-optimised. For large and much more expensive missions, Failure Detection, Isolation and Recovery (FDIR) is often realised by installing several redundant modules and comparing the results between these identical modules, or even by switching between them [4]. Because CubeSats are limited in size and budget, little or no redundancy can be realised in these missions, which means that error prevention has to be done on a different level.

CubeSat missions generally have a very high failure rate. With a confidence interval of 95%, it can be assumed that within the first year in orbit, between 26.76% and 41.06% of CubeSats become defective and are no longer functional, for a wide variety of reasons [5].

Of all failures within the first 30 days, 44% are due to the Electrical Power Supply (EPS) [5]. If these power supply problems can be eliminated, almost half of all failures in the initial period of a mission could be prevented, thereby significantly increasing mission success. This applies to the EPS of the entire satellite as well as to the electrical power supply of each submodule.

3 Innovative FDIR approach and its realisation

Although most of the failures of CubeSat missions were related to the power system, additional areas of improvement were detected and investigated thoroughly.

An FDIR concept based on four layers was developed and implemented onboard the PRETTY spacecraft. It consists of

  • Layer 1: Electrical Power Supply

  • Layer 2: Power bus switches

  • Layer 3: Power sequencer and supervisor

  • Layer 4: Monitoring of voltage converter’s power

A detailed description of the different layers is given in Sect. 3.1.

Fig. 2 shows a schematic architecture diagram of the four layers of our concept. In layers 1–3, the power supply to the SDR can be interrupted directly; layer 4 uses the interruption option of layer 3. The concept enables a finer breakdown and measurement of the individual voltages and currents of the SDR, so that overcurrent, overvoltage and undervoltage situations can be resolved better than in standard CubeSat hardware. Although the current demand fluctuates during operation, nominal operation can thus be delimited more precisely from the error case. For comparison, the power supply concept that is standard on many COTS CubeSat components is shown in the lower section of the diagram. Here, layers 2–4 are usually omitted, and the measurement of current and voltage is the sole responsibility of the EPS. Since a single limit must cover all operating cases, the current limit, for example, must be set high enough to accommodate every operational scenario. Consequently, a fault in an electronic component that increases the current consumption cannot be detected in such a system as long as the total stays below this limit. With all four layers, the probability of detecting such failures is higher due to the finer breakdown.
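The difference between a single EPS-level limit and per-converter limits can be illustrated with a small sketch. The rail names, current windows and measured values below are purely hypothetical and are not PRETTY design values; the sketch only shows why a fault on a lightly loaded rail can hide below a limit that has to admit the worst case of all rails.

```python
# Illustrative sketch only: rail names, currents and limits are hypothetical,
# not PRETTY design values.

# Expected operating window (min, max) of each converter output in amperes.
RAILS = {
    "rail_a": (0.05, 0.30),
    "rail_b": (0.20, 1.20),
    "rail_c": (0.10, 0.80),
}

# A single EPS-level limit has to admit the worst case of all rails at once.
EPS_LIMIT = sum(high for _, high in RAILS.values())


def eps_detects(currents):
    """Single-threshold check on the total module current (EPS only)."""
    return sum(currents.values()) > EPS_LIMIT


def per_rail_faults(currents):
    """Per-converter check: return the rails outside their own window."""
    return [
        name
        for name, value in currents.items()
        if not RAILS[name][0] <= value <= RAILS[name][1]
    ]


# rail_a draws three times its maximum while the other rails are lightly loaded.
measured = {"rail_a": 0.90, "rail_b": 0.25, "rail_c": 0.15}

print(eps_detects(measured))      # the total stays below the module-level limit
print(per_rail_faults(measured))  # the per-rail check flags rail_a
```

In this made-up case the total current remains well below the module-level limit, so only the finer per-converter check reports the fault.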

Fig. 2

FDIR concept of the PRETTY SDR power supply in comparison to the design of most common COTS CubeSat hardware

3.1 FDIR Layer description

Layer 1: Electrical power supply

The first layer of the FDIR concept involves the Electrical Power Supply (EPS). The EPS of the PRETTY satellite is a P60 system, which is a COTS product from GomSpace [6]. This EPS offers two power distribution units (PDUs), which in turn offer nine supply channels each [7]. The SDR board is connected to one of these supply channels. The channels are monitored by the software of the P60 module and provide a first, basic protection against overcurrent situations and thus also basic latch-up protection.

Layer 2: Power bus switches

The power bus switches are the second layer of the FDIR concept of the PRETTY SDR system. The principle of applying a power bus switch on a CubeSat was first developed for the SEPP (Satellite Experimenters Processing Platform) [8], a module onboard the OPS-SAT spacecraft, which was built at our institute and has been operational in orbit since late 2019. The principle has now been adapted and reused for the PRETTY payload systems (SEPP, SDR and the dosimeter).

An Analog Devices LTC4281 is used as the power bus switch on the SDR board. It has the following essential properties [9]:

  • current, voltage, and power monitoring

  • overcurrent and latch-up protection

  • overvoltage/undervoltage protection

  • inrush current ramping

  • storage of minimum/maximum values and fault logging

Layer 3: Power sequencer and supervisor

A power sequencer and supervisor provides the third layer of the SDR frontend FDIR system. For the PRETTY SDR system, an Analog Devices LTC2937 chip is used. All DC/DC and LDO converters of the board are connected to this chip, which offers the following key features [10]:

  • overvoltage and undervoltage detection

  • sequenced switching on and off of the individual voltage regulators

  • fault logging

The circuitry of the voltage supervisor is shown in Fig. 3.

Fig. 3

Schematic of the LTC2937 voltage supervisor as implemented on the SDR PCB of the PRETTY satellite

Layer 4: Monitoring of voltage converter’s power

The fourth layer of the FDIR system onboard the PRETTY SDR hardware monitors the power of each voltage converter. The chip used for this is the Analog Devices LTC2945, which measures the voltages and currents of the converters with a resolution of 12 bits. Currents are determined by measuring the voltage drop across a sense resistor R_SNS; the full-scale range corresponds to a voltage drop of 102.4 mV [11].

When selecting the sense resistors, several factors must be taken into account. On the one hand, the resistors must be large enough that the current of each converter is resolved as finely as possible. On the other hand, they must be small enough that the full-scale range of the ADC is not exceeded even when the maximum current is drawn and that the voltage drop across the sense resistor does not become too large.

In the PRETTY case, the R_SNS values were selected so that a current resolution between 0.5 mA per LSB and 2.5 mA per LSB is achieved at the converters. Considering the manufacturing tolerances of the sense resistors and the temperature behaviour of the entire electronics, this resolution is sufficient, especially given that component values vary and the current consumption varies with them.

The sense resistors were chosen such that, relative to its expected current range, each converter is measured with approximately the same LSB resolution.
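The trade-off described above can be made concrete with a short calculation. The sketch below uses only the 12-bit resolution and the 102.4 mV full-scale sense voltage stated in the text, which give 25 µV per LSB. The two resistor values are back-calculated assumptions that would produce the quoted 0.5 mA and 2.5 mA per LSB; they are not taken from the PRETTY parts list, and the function names are hypothetical.

```python
# Sketch of the sense-resistor trade-off. The 12-bit resolution and the
# 102.4 mV full-scale sense voltage are taken from the text; the resistor
# values below are back-calculated assumptions, not the PRETTY parts list.

ADC_BITS = 12
FULL_SCALE_SENSE_V = 0.1024  # full-scale voltage drop across R_SNS in volts

# Sense voltage represented by one ADC count (25 uV for the values above).
LSB_SENSE_V = FULL_SCALE_SENSE_V / 2**ADC_BITS


def current_lsb(r_sns_ohm):
    """Current represented by one ADC count, in amperes."""
    return LSB_SENSE_V / r_sns_ohm


def full_scale_current(r_sns_ohm):
    """Approximate largest current that still fits the ADC range, in amperes."""
    return FULL_SCALE_SENSE_V / r_sns_ohm


def max_r_sns(i_max_a):
    """Largest sense resistor that keeps i_max_a within the ADC range, in ohms."""
    return FULL_SCALE_SENSE_V / i_max_a


for r_sns in (0.050, 0.010):  # assumed example values in ohms
    print(
        f"R_SNS = {r_sns * 1e3:.0f} mOhm: "
        f"{current_lsb(r_sns) * 1e3:.2f} mA/LSB, "
        f"full scale {full_scale_current(r_sns):.2f} A"
    )
```

A larger resistor gives a finer current step but a smaller measurable range and a larger voltage drop, which is the compromise the selection has to balance for each converter.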

The voltage at the SENSE pin of the power monitor, i.e. after the voltage drop across the sense resistor, is used as the voltage feedback to the converter. The converter thus raises its output voltage to compensate for the drop across the sense resistor, so that the desired voltage is present directly before the filter for the generated voltage.

One feature worth highlighting is the automatic detection and isolation of a fault on a dedicated converter. The LTC2945 power monitor measures the power drawn from each converter. If the power is higher than the configuration allows, various actions can be taken. In the case of the PRETTY power supply system, the ALERTn pin is pulled to ground. This pin is directly connected to the enable pin of the converter (label LDO_+1V3_EN), so that if a fault occurs in the circuitry connected to the converter, the converter is deactivated by its enable pin being pulled to ground. As a result, the converter no longer generates an output voltage. This has two effects. Firstly, the subsequent circuit is no longer supplied with voltage, so that any damage caused by an unwanted current flow can be prevented. Secondly, the voltage supervisor detects that no voltage is generated at the converter and registers an undervoltage situation. Depending on the configuration of the voltage supervisor, several predefined actions can be executed. On the PRETTY SDR board, the supervisor is configured so that the entire voltage converter chain is sequentially deactivated and the supplies are discharged. After discharging, an attempt is made to reactivate the converters sequentially. This is automatically attempted up to five times. If the error persists and an automatic restart is therefore no longer possible, the satellite operator can take manual action at the next ground station pass. The power monitor circuitry is shown in Fig. 4.

Fig. 4

Schematic of the 3.3 V converter. The 3.3 V converter is shown on the left (IC14), the subsequent power monitor is on the right (IC15)

In addition, the board’s temperature near the power supply section is monitored by an external Texas Instruments TMP175 temperature sensor [12]. The TMP175 is configured in comparator mode: if the temperature exceeds a pre-configured limit, it pulls its ALERT pin to ground. In the PRETTY SDR, this pin is connected to the FAULTB pin of the power sequencer. When the temperature sensor pulls the FAULTB pin to ground, the sequencer executes the action predefined in its FAULT_RESPONSE register, for which we have likewise selected a discharged retry.
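The fault response described above, whether triggered by a power monitor alert or by the temperature sensor, follows a fixed sequence: power-down, discharge and a limited number of restart attempts. The following behavioural sketch models that sequence. It is not firmware and does not use any device register; the function name, the log messages and the way the fault is modelled are assumptions for illustration, and only the order of events and the retry count of five come from the text.

```python
# Behavioural sketch of the fault response, not firmware. Function and
# parameter names are hypothetical; only the sequence of events and the
# retry count of five come from the text.

MAX_RETRIES = 5


def run_fault_response(fault_present_on_attempt, log):
    """Simulate the supervisor reaction after a converter or temperature alert.

    fault_present_on_attempt: callable taking the 1-based attempt number and
    returning True if the fault is still present after that restart.
    Returns True if the supply chain came back up, False if operator action
    is required.
    """
    log.append("alert: affected converter disabled")
    for attempt in range(1, MAX_RETRIES + 1):
        log.append("supervisor: sequenced power-down")
        log.append("supervisor: discharge supplies")
        log.append(f"supervisor: sequenced power-up, attempt {attempt}")
        if not fault_present_on_attempt(attempt):
            log.append("supplies nominal")
            return True
        log.append("fault still present")
    log.append("retries exhausted: wait for operator at next pass")
    return False


# A transient fault that is still present on the first two restarts.
transient_log = []
recovered = run_fault_response(lambda attempt: attempt < 3, transient_log)

# A permanent fault never clears, so all five attempts are used up.
permanent_log = []
stuck = run_fault_response(lambda attempt: True, permanent_log)

print(recovered)
print(stuck)
```

Modelling the fault as a callable keeps the sketch independent of how a real fault evolves; a transient condition that disappears after a power cycle recovers automatically, while a persistent one ends with the supplies off until the operator intervenes.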

If an error occurs, the automatic fault detection triggers an immediate switch-off of the affected converter and a sequenced switch-off of all other converters. However, this also means that the corresponding protection mechanisms are no longer supplied with power and can no longer be read out. So that the error can be investigated afterwards and the ground operator can tell from the telemetry why the automatic shutdown was carried out, components were chosen that store the error status in persistent registers. On the PRETTY SDR board, the power bus switch LTC4281 and the voltage supervisor LTC2937 store the fault indicators in a so-called fault log register in an EEPROM.

With the successful completion of the environmental tests and the flight of the hardware on the PRETTY mission, the SDR platform gains so-called flight heritage. This is an important quality indicator, especially for CubeSats, whose hardware primarily consists of relatively inexpensive industrial-class COTS electronic components rather than space-qualified components, as it demonstrates the system’s survivability in the harsh space environment and minimises the risk of failures both in the launch vehicle and during in-orbit operations [13].

4 Analysis and implementation

As integrated circuits (ICs) become smaller, the distances between the transistors within an IC shrink as well. The smaller these distances are, the more easily a latch-up can occur. A latch-up is an effect in which a low-impedance path is created between the supply and ground. A trigger, such as ionisation, can cause this condition. However, once the path between the supply and ground is present, it usually persists even if the trigger condition is no longer present. This low-resistance path can lead to system malfunctions or catastrophic damage due to excessive current in an unwanted region of the electronic component and might result in a total mission loss. The latch-up condition usually requires a power cycle to restore the original state of the component and eliminate the low-resistance path [23].

Protons, usually trapped in the Earth’s radiation belts or emitted from solar flares, can cause direct ionisation SEEs (Single-Event Effects) in susceptible devices (e.g. CMOS technology), or more typically, produce an indirect ionisation effect that can cause an SEE. The integrated circuits (ICs) that use this technology range from complex microprocessors to dense Static Random-Access Memory [22]. In addition, cumulative long-term ionisation damage by protons and electrons can lead to components attaining reduced functionality, as the long-term effects can change component parameters such as threshold values, time behaviour, or similar [21].

On PRETTY, the probability of a failure of the whole SDR platform due to a single fault is, to a first approximation, the sum of the failure probabilities of all electrical components installed on the PCB. Unfortunately, the exact failure probabilities of the individual components are unknown, since CubeSats mainly use COTS components for which often no failure analysis is available.

This probability is contrasted with the quasi-known probability of a bus switch failure. The bus switches were tested following ESCC 22900 [25] as part of the OPS-SAT project up to a total ionising dose (TID) of 222.6 kGy [24]. In addition, the bus switches were already used on the SEPP of the OPS-SAT project. As part of that test campaign, they were also tested for SEEs at the Paul Scherrer Institute in Switzerland. In 2019, the OPS-SAT satellite was launched into space. Since then, the SEPP has been in operation, and no errors have occurred with the bus switches. For this reason, the error probability of these components can be classified as very low. Purely because there are far fewer bus switches than other components, the probability of a failure caused by a defective bus switch is smaller than the probability of a failure caused by a defect in one of the other electronic components, whose failure probabilities are unknown.

So the current hypothesis is that

$$p\left(\text{1 bus switch failure} \mid n \text{ bus switches}\right) < p\left(\text{1 failure} \mid m \gg n \text{ components on the PCB}\right)$$

where p is the probability of failure, n is the number of bus switches, and m is the total number of electronic components on the SDR.
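The counting argument behind this hypothesis can be illustrated numerically. The sketch below assumes independent components that all share the same failure probability, which is deliberately pessimistic for the bus switches, since their tested failure probability is argued to be very low. The per-component probability and both component counts are made-up values, not PRETTY figures. For small probabilities, the result is close to the simple sum of the individual probabilities used in the text.

```python
# Illustration of the hypothesis with made-up numbers: the per-component
# failure probability and the component counts are hypothetical.


def p_at_least_one_failure(p_component, count):
    """Probability that at least one of `count` independent, identical
    components fails, given a per-component failure probability."""
    return 1.0 - (1.0 - p_component) ** count


p_component = 1e-4   # assumed per-component failure probability
n_switches = 3       # assumed number of bus switches
m_components = 500   # assumed number of components on the PCB

p_switches = p_at_least_one_failure(p_component, n_switches)
p_board = p_at_least_one_failure(p_component, m_components)

print(f"bus switches: {p_switches:.6f}")
print(f"whole board:  {p_board:.6f}")
print(p_switches < p_board)
```

Because the probability of at least one failure grows with the number of components, the inequality holds for any m greater than n under these assumptions, even before the better-characterised reliability of the bus switches is taken into account.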

The concept presented here has already been implemented on the PRETTY SDR and is being tested in the course of the PRETTY unit-level tests. The launch of the PRETTY satellite is scheduled for the second half of 2022.

5 Conclusion

For the specific mission objective of PRETTY, the application of the FDIR concept, the fault-tolerant implementation of the SDR system in general and the power supply section of the SDR in particular means a significant increase in the in-orbit lifetime of the payload hardware. This increases the possibility of measuring changes in sea and ice levels [19] and ocean surface currents [20] over a longer time and detecting temporal differences in them. As a result, an essential input for climate research is generated, as it allows the scientific community to analyse climate change impacts more precisely and gain a better understanding of them.

The implemented FDIR system is based on four different layers for monitoring or interrupting the supply to the SDR in the event of a fault. Firstly, currents and voltages are measured over the entire module on the EPS. As a second layer, so-called power bus switches at the SDR provide the possibility of an interruption. This additionally protects the EPS against high power draw and prevents a fault from spreading from the SDR via the EPS to other modules. A power sequencer and supervisor as the third layer and additional monitoring of each voltage converter’s power as the fourth layer further improve the failure detection resolution. Compared to conventional hardware systems on CubeSats, this enables more accurate and faster detection of faults in the module’s power supply, which means that the remaining electronic components of the SDR can be disconnected from the power supply more quickly in the event of a fault.