Introduction

A tokamak device is always equipped with a large number of different diagnostic, actuator and control systems. The working conditions of these systems are very demanding for their constructors. A few of the factors influencing system construction can be mentioned:

  • High magnetic fields (e.g. above 0.3 T [1]): special care should be taken when using mechanical equipment (relays) as well as typical PC components. Special shielding components can also be designed [2]

  • Neutron radiation: depending on the intensity, special firmware for FPGAs (e.g. redundancy techniques) [3] as well as proper selection of integrated circuits (ICs) suitable for operation under radiation [4] may need to be considered

  • Electromagnetic interference (EMI) from various sources: high or low frequency EMI with variable power. In this case, proper shielding, electrical connections and/or (reconfigurable) signal filters should be considered

  • Temperature: this depends on the components' working conditions. Industrial grade (or better) ICs should be installed in regions with elevated temperature or poor air circulation. Simulations using specialized software should be performed to properly select cooling systems

  • Limited or restricted installation space: it may be necessary to divide the system [5] into modules. The communication between the parts can also be difficult to design (e.g. due to EMI)

All of these issues need to be analyzed for proper system design. The installed diagnostics can be divided into the following groups:

  • Providing real-time results—these systems need to produce data in the shortest possible time (e.g. several ms) and deliver it to the user. Such a system might be connected to real-time networks, sharing the data with real-time updates over multiple computing modules. In more advanced designs, systems can be installed in the real-time feedback network in order to dynamically change parameters of other devices. The technology used strongly depends on the system requirements.

  • Providing offline results—the output data can be computed within minutes rather than several ms. In tokamak plasma diagnostics, the time limit is usually the time remaining before the start of the next plasma, including the systems' preparation stage. The technology used strongly depends on the system requirements

The CCFE JET tokamak can be used as an example of a plasma device equipped with numerous diagnostic, actuator, control and protection systems, as shown in Fig. 1. A similar scenario will also apply to the ITER tokamak; however, its plasma reaction is planned to be sustained many times longer than is currently done. This will directly impact the systems' design, due to the increase in the amount of data to be processed during the reaction.

Fig. 1

CCFE JET tokamak equipped with various diagnostics and high performance systems [6]

Special care should be taken with proper data quality monitoring algorithms. When working in difficult environmental conditions, low-quality data (e.g. interferences) can easily saturate the input channels, resulting in a complete data bandwidth stall of the system, during which no useful information can be registered. Another problem may occur in systems responsible for controlling certain actuators, where a decision could be based upon an incorrect result from the corresponding measurement system. In the literature there is no standard approach to the data quality topic, nor are there guidelines for designing such modules. Only a few papers describing data quality analysis can be found [7,8,9,10].

The importance of proper data quality evaluation (DQE) can be visualized with a typical measurement process, as shown in Fig. 2. The scheme is especially valid for the on-site installation of a new system and for commissioning stage tests (ITER can be taken as an example). The results from the laboratory system validation need to be verified once again upon installation, since the working environment is new to the system. The typical steps are as follows: the person in charge of the system performs the measurement with the new system. The unit collects data related to the physical phenomena, for example specific parameters of the plasma reaction. Next, the scientists (often an interdisciplinary team) try to interpret the obtained results. If the results are not satisfactory, e.g. with respect to theory, the leader tunes the parameters of the system (e.g. changes the trigger level, modifies the supply voltage levels) and re-runs the measurement.

Fig. 2

Typical measurement process during system commissioning or data analysis

This can lead to an infinite process. Often, the results are analyzed in the scope of physics expectations, instead of by interpretation of the raw (source) signals. The physical results could in fact originate from malformed signals registered by the detector or by the electronics section. Therefore, the analysis cannot be done without proper DQE techniques. DQE can answer whether the results are based on the signals registered by the detector, or whether the received signals are malformed by some external phenomena (e.g. EMI) injected into the analog and data processing parts of the system like typical correct signals.

The discussion about data quality monitoring (DQM) in modern systems is especially important from the electronics point of view—that of the system engineers. The ideas presented further are based on the authors' experience in the construction of different plasma diagnostics systems, as well as high energy physics systems. At the current stage, only a few papers present specific implementations of a DQM model in a system [11,12,13].

The main goal of this paper is to start a general discussion about the necessity of data quality monitoring, presenting a general overview and examples of the influence of DQM techniques on system results.

Typical measurement system layout

Before discussing the DQM system construction, it is necessary to provide a brief description of the most fundamental sections of a typical system together with their functionality. The properties of each can produce important data in the scope of DQM. A typical system that can work as a tokamak diagnostic is often composed of several basic blocks with different functionality, as illustrated in Fig. 3.

Fig. 3

Typical measurement system layout in scope of dataflow

The detector is responsible for registering the physical phenomena, generally in the form of an electrical signal. The unit is often exposed to extreme environmental conditions. The sensor can be highly irradiated or work in strong electromagnetic fields during the plasma. These conditions may lead to the generation of malformed signals that can be hard to detect, since they come directly from the detector or from the first stages of the signal input path. It is worth noting that proper sensor electrical signals are often very subtle (e.g. detection of charge in the scope of fC) and fragile to noise. Additionally, unexpected events like random sparks in gas detectors can provide a large number of seemingly good pulses, which easily add to otherwise proper spectra.

The next part is the transmitter unit, which handles the raw electrical signals (assuming the detector does not already include a data processing stage). This section is mostly installed close to the detector, since it is important to minimize the external influence on the signal caused by long signal paths, attenuation, interferences, etc. The design of this unit is especially important when working with very low intensity signals, like charge signals from a GEM detector or signals for various triggering systems (used in high energy physics). The transmitter unit can implement various functionality, like signal-to-signal conversion (current to voltage), signal amplification, differential transmission, etc.

The data acquisition block needs to handle multiple real-time data streams (channels) and perform the necessary signal conversions (e.g. analog-to-digital). The unit also needs to redistribute the data to the processing unit. At this stage, it is necessary to select the data that is suitable for further analysis. The simplest solution is acquisition based on a trigger level, like in a typical oscilloscope. The system's limiting factor is bandwidth, since it is impossible to send all of the data from multiple channels at the same time. The algorithms therefore need to be designed in a way that provides an optimal amount of information.
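
As an illustration only, the trigger-based selection mentioned above can be sketched as follows (a simplified software model in Python; in a real system this logic is typically implemented in FPGA firmware, and the channel trace, window length and threshold used here are hypothetical):

```python
import numpy as np

def extract_triggered_windows(samples, threshold, pre=8, post=56):
    """Return fixed-length acquisition windows around samples that
    cross the trigger level, mimicking oscilloscope-like selection.

    samples   : 1-D array of ADC samples from a single channel
    threshold : trigger level in ADC counts (hypothetical value)
    pre/post  : number of samples kept before/after the crossing
    """
    windows = []
    i = pre
    while i < len(samples) - post:
        if samples[i - 1] < threshold <= samples[i]:   # rising-edge crossing
            windows.append(samples[i - pre:i + post].copy())
            i += post                                  # skip past this event
        else:
            i += 1
    return windows

# Example with synthetic data: noise plus one injected pulse
rng = np.random.default_rng(0)
trace = rng.normal(0, 5, 4096)
trace[2000:2010] += np.linspace(0, 120, 10)            # injected pulse
events = extract_triggered_windows(trace, threshold=50)
print(f"{len(events)} event(s) selected")
```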

A continuation of the data acquisition block is the processing unit. It can be a part of the data acquisition block (e.g. an FPGA) or a separate unit connected through communication interfaces. The design depends on the system requirements. Two implementation types can be considered:

  • Simple data processing—limited number of channels, simple algorithms

  • Multichannel data streaming systems—working with a large number of channels (e.g. above 100), where it is impossible to process the data in one data acquisition unit. Data synchronization algorithms between the data acquisition blocks need to be considered. Depending on the streaming function, high throughput interfaces (PCI Express, Gigabit Ethernet, etc.) can be involved.

In the second type of system, the unit must combine and synchronize the input data streams from the preceding data acquisition part. Due to the large amount of data, the components used for the processing unit need to meet high-performance criteria. Therefore, high-performance CPUs/GPUs (Graphics Processing Units) or FPGAs (Field Programmable Gate Arrays) are used. It should be noted that the data processing path can be divided into two stages: preprocessing and postprocessing. The first one is often composed of real-time algorithms, while the second stage can be done either in a real-time or offline scheme, depending on the product needs. When no DQM algorithms are implemented at the data acquisition stage, additional computations need to be done by local algorithms in the processing unit block, at least to check that the boundary conditions of the input data are met. In addition, overflowing the system with a large amount of malformed data can easily exhaust the system's resources regarding storage and memory reservation, additionally increasing the computation time significantly.
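
A minimal sketch of such boundary-condition checks is given below (an illustrative Python model; the limits and field names are hypothetical and would come from the detector and ADC specification):

```python
def check_event_boundaries(event, adc_min=0, adc_max=4095,
                           min_len=16, max_len=1024):
    """Return a list of violated boundary conditions for one event.

    event           : dict with 'samples' (list of ADC counts) and 'channel'
    adc_min/adc_max : valid ADC code range (12-bit ADC assumed here)
    min_len/max_len : accepted acquisition window lengths
    """
    problems = []
    samples = event["samples"]
    if not (min_len <= len(samples) <= max_len):
        problems.append("bad_window_length")
    if min(samples) < adc_min or max(samples) > adc_max:
        problems.append("out_of_range_sample")
    if event.get("channel") is None:
        problems.append("missing_channel_id")
    return problems

# Events failing any check can be rejected or only marked for offline review
event = {"channel": 17, "samples": [100, 230, 4100, 90] * 8}
print(check_event_boundaries(event))   # -> ['out_of_range_sample']
```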

The last stage is the final output product that is expected from the system. The outputs can be, e.g., topology or energy spectra of the plasma radiation, counts over time for a particle detector, and many others. The data, as mentioned before, can then be distributed in a real-time or offline way. For the first one, the system needs to be equipped with additional networking equipment that allows data to be distributed over various nodes in real-time. This can be done by dedicated network cards, in order to provide real-time feedback to other systems. The implementation of this mode strictly depends on the whole tokamak infrastructure. The second mode is offline data storage. Its implementation is also not trivial, since there is often an expectation to store as much data as possible. This leads to problems with temporary data storage, as well as with data distribution to the local storage centers freely available to tokamak users. The large amount of registered data can result in the need to modify the network infrastructure used to access local servers, e.g. the creation of new physical links for especially demanding high-performance systems.
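
For illustration, typical output products such as an energy spectrum or counts over time reduce to simple histogramming of per-event quantities (a sketch only; the binning and the way event energies and timestamps are obtained are assumptions):

```python
import numpy as np

def build_output_products(energies, timestamps, n_energy_bins=256,
                          time_bin_s=0.001):
    """Build two example output products from per-event quantities:
    an energy spectrum and a counts-over-time histogram (1 ms bins)."""
    spectrum, e_edges = np.histogram(energies, bins=n_energy_bins)
    t_edges = np.arange(0.0, max(timestamps) + time_bin_s, time_bin_s)
    counts_vs_time, _ = np.histogram(timestamps, bins=t_edges)
    return spectrum, e_edges, counts_vs_time, t_edges

# Synthetic example: 10k events spread over 0.1 s
rng = np.random.default_rng(1)
spectrum, _, counts, _ = build_output_products(
    energies=rng.normal(5.9, 0.3, 10_000),     # keV, Fe55-like line (illustrative)
    timestamps=rng.uniform(0.0, 0.1, 10_000))  # seconds
print(spectrum.sum(), counts.sum())            # both equal the number of events
```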

This section presented the most fundamental elements of a measurement system. A characteristic property of the data path is that low-quality data introduced at any stage will propagate further to the linked blocks. It is therefore especially important to provide algorithms that are able not only to register data, but also to distribute real-time information about the quality of the registered data. Neglecting this point can lead to scenarios resulting in: complete saturation of the system's bandwidth, injection of malformed data into the final output products, long data computation times and large storage requirements for the measurements.

Since real-time DQM subsystems can be resource demanding, the system should be constructed based on feasibility studies regarding the scope of the DQM implementation. The next section discusses the overall approach to signal analysis for high-performance systems.

Real-time signal analysis

To complete the previous discussion, a few methods of signal analysis should be mentioned. The detector by itself can produce signals either in electrical or already preprocessed form. The sensor interface can be provided as: one channel output (raw electrical signal) for each sensor channel, a digital interface, or a preprocessed electrical signal output. The data flows (especially in terms of raw signals) in real-time, forming a high bandwidth stream, often easily several Gbps or more. Due to the very high data throughput, specialized elements, like FPGAs, need to be used to acquire and reduce the stream. The signal analysis can be split into two sections: preprocessing and postprocessing. The features of each mode are presented in Table 1.
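
The order of magnitude of such a stream can be estimated from the channel count, sampling rate and sample width; the numbers below are purely illustrative and not taken from any particular system:

```python
# Illustrative raw-stream bandwidth estimate (hypothetical parameters)
n_channels   = 128          # detector readout channels
sample_rate  = 50e6         # 50 MS/s per channel
bits_per_smp = 12           # 12-bit ADC codes

raw_bps = n_channels * sample_rate * bits_per_smp
print(f"raw stream: {raw_bps / 1e9:.1f} Gbps")   # -> raw stream: 76.8 Gbps
```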

Table 1 Comparison of preprocessing and postprocessing modes in real-time measurement systems from the detector part

It can be noted that the preprocessing stage mostly focuses on the detection of valid/interesting data, while the postprocessing stage works mostly on the extracted (reduced) data from the signal stream. There is also a significant difference in the scope of the programming languages used. Hardware Description Languages (HDLs) are difficult to work with, due to their non-standard implementation methodology and the fact that they are strongly oriented towards hardware resources. The postprocessing algorithms are much easier to implement, since C/C++ are well-known languages for most engineers. It is worth indicating that the time needed for recompilation and testing is also much shorter.

To complete the discussion, the advantages of using the raw signal and the preprocessed data as the sensor output for analysis are presented in Table 2.

Table 2 Advantages of using the raw signal and preprocessed data for computation from detector part

The main advantage of raw signal analysis is the possibility of providing the most accurate results due to the full signal information. Data quality monitoring algorithms can be implemented in order to remove unwanted (e.g. malformed) signals. This mode is suitable for real-time measurement systems. On the contrary, the preprocessed data analysis is much simpler, since it mostly relies on the frontend units. These are mostly commercial products, e.g. frontend detector interfacing chips, where most of the data processing is already done. The system designer mostly needs to implement the readout path together with, eventually, the final algorithms (e.g. spectra). It is important to note that the input arguments are then already processed in the frontend unit.

It can be noted that raw signal analysis will result in much higher quality data and a wider range of implementation possibilities. However, this mode is the most demanding from the system construction side. The designers need to be highly experienced in hardware design and embedded hardware programming, and need to know how to program high-performance CPUs/GPUs in correlation with the implementation of algorithms for real-time data streams. Such systems mostly rely on FPGAs and high-throughput interfaces like PCI-Express (PCIe) or MultiGigabit Transceiver (MGT) units (various protocols), which are hard to verify. This approach also has the longest time-to-market due to the custom design of multiple components, often including advanced or state-of-the-art hardware.

The preprocessed data-based computations are rather easy to interface. However, at the same time this approach can provide the worst data quality. The signals from the detector are processed by the frontend electronics with manufacturer-embedded algorithms used to compute the output values. Therefore, it is impossible to modify the computation path (in order to tune it to user needs or to correct improper results). Additionally, the intermediate values from the frontend can include low quality or corrupted signals. In such a system construction it is very hard or almost impossible to detect such phenomena, since there is mostly no access to the raw signals used for the computation.

An example can be provided, based on signal analysis with high intensity soft X-ray sources. The laboratory test was done in the scope of simulating the system behavior during an intense plasma flux. Using DQM techniques, it turned out that about 50–90% (depending on the radiation parameters) of all events could be pile-up affected. A typical system working either with frontend chips or with simplified (cost-effective) algorithms would either reject those events, resulting in output data with very low statistics (e.g. spectra), or include them in the output products, resulting in completely different results than expected.
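
A very simplified model of such pile-up detection is counting the local maxima above the noise level inside one acquisition window; anything beyond one peak is flagged. The sketch below is not the algorithm used in the referenced measurement, only an illustration of the principle (the thresholds are hypothetical):

```python
import numpy as np

def is_pileup(window, noise_level=20, min_separation=5):
    """Flag an acquisition window as pile-up affected when more than one
    pulse-like local maximum above the noise level is found."""
    w = np.asarray(window, dtype=float)
    peaks = []
    for i in range(1, len(w) - 1):
        if w[i] > noise_level and w[i] >= w[i - 1] and w[i] > w[i + 1]:
            if not peaks or i - peaks[-1] >= min_separation:
                peaks.append(i)
    return len(peaks) > 1

single = [0, 0, 40, 90, 40, 0, 0, 0, 0, 0, 0]
piled  = [0, 0, 40, 90, 40, 0, 0, 30, 80, 30, 0]
print(is_pileup(single), is_pileup(piled))   # -> False True
```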

Therefore, it is important to select the data processing approach at the planning stage. The choice is strongly based on the system purpose and working conditions.

System construction considerations involving advanced DQM

It is not an easy task to design a measurement system that will provide high quality output data, especially when it is required to work with high-bandwidth real-time streams. The system designer first needs to identify the scope of the work based on the project specification, considering e.g.:

  • Measured sensors' output values, e.g.:
    o Raw voltage signals
    o Raw current signals
    o Other control-measurement systems outputs
    o Digital frontend stage of the detector (preprocessed data)
    o Signal frequency spectra

  • Number of channels:
    o One output per channel—high pinout receiver device
    o One output per multiple channels (muxed, serialized)
    o Multiple outputs per channel (e.g. Serializer/Deserializer, SERDES, interfaces from ADCs)

  • Frequency and type of events:
    o Few kHz—low throughput
    o > 100 MHz—high throughput
    o Signal form: raw/preprocessed

  • Output product type:
    o Binary decisions (on/off)
    o Histograms, spectra

  • Output products specification:
    o Resolution
    o Response time
    o Latency

  • Connection with the tokamak infrastructure:
    o Triggering section—electrical interface type, types of triggers
    o Control section—framework used from the control room
    o Data distribution section—dedicated data transmission links, custom hardware, system requirements, etc.

The answers to the above questions will provide an estimation of the resources needed to implement the project. They also provide information to the system constructors about the required data bandwidth, conclusions about the data processing approach (as introduced in Tables 1 and 2), and the system limitations, e.g. regarding the operating systems used or the data distribution form.
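
Such a specification can be captured in a simple machine-readable form and used to derive first resource estimates; the structure, field names and example figures below are hypothetical, intended only to show how the checklist translates into numbers:

```python
from dataclasses import dataclass

@dataclass
class DiagnosticSpec:
    """Hypothetical project specification entry (fields mirror the checklist)."""
    n_channels: int          # number of sensor channels
    sample_rate_hz: float    # per-channel sampling rate
    sample_bits: int         # ADC resolution
    output_products: tuple   # e.g. ("energy_spectrum", "counts_vs_time")
    realtime: bool           # real-time feedback vs. offline results
    max_latency_ms: float    # required response time for real-time mode

    def raw_bandwidth_gbps(self) -> float:
        return self.n_channels * self.sample_rate_hz * self.sample_bits / 1e9

    def needs_fpga_preprocessing(self, link_limit_gbps: float = 64.0) -> bool:
        # crude rule of thumb: reduce the stream in firmware if the raw input
        # exceeds what the (hypothetical) host link budget can carry
        return self.raw_bandwidth_gbps() > link_limit_gbps

spec = DiagnosticSpec(n_channels=256, sample_rate_hz=40e6, sample_bits=14,
                      output_products=("energy_spectrum",),
                      realtime=True, max_latency_ms=10.0)
print(f"raw input: {spec.raw_bandwidth_gbps():.1f} Gbps, "
      f"FPGA preprocessing needed: {spec.needs_fpga_preprocessing()}")
# 256 channels x 40 MS/s x 14 bit = 143.4 Gbps -> True
```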

In order to have a complete view of the system requirements, the target environment also needs to be analyzed in the scope of:

  • System radiation exposure

  • Magnetic fields

  • Electromagnetic interference (EMI) and EMI-generating equipment in the surroundings

  • Temperatures at the installation location

  • Space available for the system installation

  • Placement of the installation: at the tokamak vessel, inside the hall, in a room, etc.

  • Type of isolation between the sensor and the measured values (e.g. vacuum port, wall, diffraction crystal, etc.)

The constructed parameter map will provide a view of the installation site and its possible difficulties, and will influence the design of the diagnostics systems. A few examples can be provided:

  • System exposure to radiation significantly narrows the choice of electronic components for the platform design; design redundancy (or redundant components) should be considered

  • The temperatures, correlated with the free space for the system installation, can introduce the necessity of specially designed cooling, impacting the electronics and the whole mechanical design of the system

  • The location of the system can significantly increase or decrease the complexity of the system construction together with the data quality monitoring analysis. Depending on the project properties, dividing the system into several (mostly two) parts could be considered

  • EMI or magnetic fields can influence signals and data; therefore, a more complex analysis should be performed in the scope of shielding, cabling distribution and type, and additional processing stages (filters)

Based on the gathered information, the designer can select the data processing approach. In the case of interfacing with elements providing electrical values, e.g. currents, voltages or signals (detection of a pulse in noise with a low signal-to-noise ratio, SNR), the focus will be put on raw signal processing, probably introducing FPGAs into the design. In the case of parsing intermediate products or working with data from other systems, more focus will be put on the CPU/GPU side, with the processing algorithms implemented in those units.
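
The mentioned detection of a pulse in noise can be illustrated by a simple baseline-and-threshold model (a software sketch only; a real FPGA implementation would use pipelined fixed-point logic, and the 5-sigma factor is an arbitrary choice here):

```python
import numpy as np

def detect_pulses(trace, n_sigma=5.0, baseline_samples=200):
    """Detect pulse candidates in a noisy trace.

    The baseline mean and noise RMS are estimated from the first
    `baseline_samples` samples (assumed pulse-free); samples exceeding
    baseline + n_sigma * rms are reported as pulse candidates.
    """
    trace = np.asarray(trace, dtype=float)
    baseline = trace[:baseline_samples].mean()
    noise_rms = trace[:baseline_samples].std()
    threshold = baseline + n_sigma * noise_rms
    above = trace > threshold
    # indices where the trace crosses the threshold upwards
    starts = np.flatnonzero(above[1:] & ~above[:-1]) + 1
    return starts, threshold

rng = np.random.default_rng(2)
trace = rng.normal(100.0, 3.0, 2000)      # baseline 100 ADC counts, 3 counts RMS
trace[1200:1210] += 60.0                  # one injected pulse
starts, thr = detect_pulses(trace)
print(f"{len(starts)} pulse(s) above {thr:.1f} ADC counts")
```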

The analysis performed in the scope of: installation environment, detector type, signal processing stages, and the total bandwidth required for the baseline implementation, will define the basic system components and the possibly free resources. The free resources can be used for the DQM implementation and will define the overall implementation complexity. The DQM can work either on the input data, providing efficient filtering or data marking, or on semi-output products, providing time and topology correlations. An efficient DQM implementation is resource consuming and can raise the overall costs of the system. An example of the necessity of DQM components in diagnostics systems can be provided from measurements performed by the authors. Figure 4 presents the topology spectra of an Fe55 isotope, registered by a Gas Electron Multiplier (GEM) detector [14]. Since the isotope was placed in the middle of the GEM window, the spectra should be Gaussian-like in shape. The blue line presents the measurement result without DQM, which matches the expected spectra. When applying DQM techniques, it can be observed that the result is in fact not perfect. An advanced DQM component can provide even more data about the mentioned phenomena. The applied DQM techniques provided information about:

  • The spatial distribution of event malfunctions over time using signal classifier models (several ms resolution with classifiers like underflow, overflow, etc. for each channel)

  • Marking and recording events with active classifiers for offline browsing

Fig. 4

Fe55 topology spectra registered by the SXR GEM-based measurement system with and without real-time DQM algorithms
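
A minimal sketch of such per-channel signal classifiers is given below; the flag names follow the ones mentioned in the text (underflow, overflow), while the thresholds and data layout are assumptions made for illustration, not the implementation described in [13]:

```python
import numpy as np

# Hypothetical 12-bit ADC limits used by the classifiers
ADC_MIN, ADC_MAX = 0, 4095

def classify_event(window):
    """Return the set of DQM classifier flags raised for one event window."""
    w = np.asarray(window)
    flags = set()
    if w.min() <= ADC_MIN:
        flags.add("UDF")            # underflow: sample(s) at/below the ADC floor
    if w.max() >= ADC_MAX:
        flags.add("OVF")            # overflow: sample(s) saturating the ADC
    return flags

def flag_rate_per_channel(events, flag, time_bin_ms=5):
    """Count events with a given flag per channel and per time bin,
    giving the 'spatial distribution of malfunctions over time'."""
    rates = {}
    for ev in events:
        if flag in classify_event(ev["samples"]):
            key = (ev["channel"], int(ev["t_ms"] // time_bin_ms))
            rates[key] = rates.get(key, 0) + 1
    return rates

events = [{"channel": 51, "t_ms": 3.2, "samples": [0, 120, 800, 60]},
          {"channel": 12, "t_ms": 3.9, "samples": [40, 300, 900, 50]}]
print(flag_rate_per_channel(events, "UDF"))   # -> {(51, 0): 1}
```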

Based on the analysis of the spatial distribution of events with the underflow (UDF) flag active, it turned out that a few channels produce more events of this class than the others. Using the DQM raw signal browser functionality with the UDF flag and certain channels selected, it turned out that:

  • The signal is similar to a valid event from the detector, forming a Gaussian-like pulse—this shape actually triggers the electronics to register the event

  • Just before the regular pulse, an inverted pulse (towards negative values) is registered, not always reaching "0" on the ADC scale (but still within the acquisition window)

  • The complete event acquisition window in fact contains two signals: the regular pulse and the inverted one

  • The energy output is in the low energy range (both pulses are more or less similar)

Without DQM, those channels provide events like normal ones; in particular, they are counted by the spectra computation software as regular events, increasing the counts in the topology spectra. With the applied DQM techniques it is possible to filter out the signals with the UDF DQM flag on, revealing the malformed topology spectra. The effect is attributed to slightly damaged electronic channels (numbers 51 and 55), resulting in such output. More details and a description of the implemented DQM techniques are provided in [13].
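
The filtering step itself reduces to excluding (or separately histogramming) the flagged events when the topology spectrum is built; a hedged sketch with a hypothetical underflow criterion follows:

```python
import numpy as np

ADC_MIN = 0   # hypothetical ADC floor, matching the classifier sketch above

def topology_spectra(events, n_channels=128):
    """Build per-channel hit counts with and without UDF-based DQM filtering.

    events : list of dicts with 'channel' and 'samples' (ADC counts)
    Returns (counts_all, counts_dqm_filtered), each of length n_channels.
    """
    counts_all = np.zeros(n_channels, dtype=int)
    counts_ok = np.zeros(n_channels, dtype=int)
    for ev in events:
        ch = ev["channel"]
        counts_all[ch] += 1
        udf = min(ev["samples"]) <= ADC_MIN        # underflow classifier
        if not udf:
            counts_ok[ch] += 1                     # kept in the DQM-filtered spectrum
    return counts_all, counts_ok

events = [{"channel": 51, "samples": [0, 150, 700, 40]},   # UDF-flagged event
          {"channel": 30, "samples": [35, 400, 900, 50]}]
all_cnt, ok_cnt = topology_spectra(events)
print(all_cnt[51], ok_cnt[51], all_cnt[30], ok_cnt[30])    # -> 1 0 1 1
```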

Therefore, the basic principle of efficient DQM is to provide accurate data filtering in order to avoid the registration of low quality signals. The scope of the DQM implementation can be derived from the already performed analysis and from the determination of the systems' working conditions and properties in the range of:

  • Difficulty of installation: on-site or laboratory stand

  • Types of interferences

  • Amount of data registered: raw signals, processed output data

  • Working mode: online or offline

The working mode significantly influences the hardware resources needed for the DQM implementation. One can expect the following DQM challenges in the online mode:

  • FPGA or GPU units used for stream data processing with additional DQM features

  • Extra resources for DQM mechanisms implementation

  • Fast, complex readout interfaces (e.g. serialized ADCs, fiber links) to be analyzed in parallel by the DQM subsystems

  • Extra data distribution channels with DQM outputs

  • DQM module with data path integration

The offline working mode allows a somewhat more relaxed approach to the system design, due to lower timing constraints. However, it should still be noted that this basically depends on the requirements; for example, offline diagnostics should provide the output results after each plasma, which leaves an approx. 15–20 min time slot for data transfer and computation (a simple transfer-time estimate is sketched after the list below). The following DQM implementation challenges in the offline mode can be considered:

  • Enough space for large volume data storage of proper performance (Non-Volatile Memory Express (NVMe), Solid State Disks (SSD), etc.)

  • Communication links suitable for large data transfers

  • Finding common programming language platform where different specialists can work together (physicists with electronics teams)
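
As a rough illustration of the time-slot constraint mentioned above, the transfer part of the budget can be estimated as follows (all figures are hypothetical, for illustration only):

```python
# Rough offline transfer-time check against the plasma-to-plasma time slot
data_per_pulse_gb = 500          # raw + DQM-marked data recorded per plasma (assumed)
link_gbps        = 10            # effective throughput of the storage link (assumed)
slot_min         = 15            # available plasma-to-plasma window

transfer_min = data_per_pulse_gb * 8 / link_gbps / 60
print(f"transfer: {transfer_min:.1f} min of the {slot_min} min slot")
# 500 GB over 10 Gbps -> about 6.7 min, leaving the rest of the slot for computation
```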

The last point of the above list, regarding a common programming platform, is actually a novel approach to data analysis. In the offline mode the data analysis is relaxed with respect to timing and can be implemented without an advanced optimization model. It is therefore possible to create a common platform, based on a well-known programming language, where both the physics and electronics teams can actively cooperate on the registered data in the scope of DQM. Mostly, the algorithms are created by the physicists, who know how to correlate the electric signals with the output products related to the physical phenomena. On the other hand, they are not specialized in embedded hardware data acquisition, processing and algorithm implementation. This significantly reduces the ability to upload new algorithmic solutions into the system for verification through new measurements. In the suggested approach, in the offline mode there is no need to implement timing-optimized algorithms. Finding a common programming platform can improve the efficiency of the verification and implementation of new computation algorithms, as well as the implementation of DQM techniques. Both the electronics team and the physicists can upload new code (algorithms, etc.) and actively iterate and verify the system outputs. This approach requires a proper system construction and design.
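
One possible, purely illustrative, realization of such a common platform is a small plugin convention in a high-level language, where each analysis or DQM step is a function registered under a name, so that both teams can add and exchange algorithms without touching the acquisition code; the registry, function names and calibration figures below are hypothetical:

```python
# Minimal plugin-style registry for offline analysis/DQM steps (illustrative)
ANALYSIS_STEPS = {}

def analysis_step(name):
    """Decorator: register a function as a named offline processing step."""
    def register(func):
        ANALYSIS_STEPS[name] = func
        return func
    return register

@analysis_step("dqm_underflow_filter")        # contributed by the electronics team
def drop_underflow_events(events):
    return [ev for ev in events if min(ev["samples"]) > 0]

@analysis_step("energy_calibration")          # contributed by the physics team
def calibrate(events, gain=0.0029, offset=0.0):   # hypothetical keV/ADC gain
    for ev in events:
        ev["energy_kev"] = gain * max(ev["samples"]) + offset
    return events

def run_pipeline(events, step_names):
    """Run the selected steps in order on the event list."""
    for name in step_names:
        events = ANALYSIS_STEPS[name](events)
    return events

events = [{"samples": [0, 300, 1200, 80]}, {"samples": [40, 500, 2100, 60]}]
out = run_pipeline(events, ["dqm_underflow_filter", "energy_calibration"])
print(len(out), round(out[0]["energy_kev"], 2))
```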

Once the system specification is done, it is necessary to properly implement the DQM functionality, according to the evaluation of the system functionality in the previous section. The most important dataflow stages of the system need to be modeled, as described in [12]. Then, a critical analysis should be performed in order to properly identify the most probable places where erroneous data can be injected into the system. A properly placed DQM data filtering component will result in:

  • Data flow reduction due to the rejection of malformed events

  • Marking of low quality data for further offline analysis

  • Redesigning of the algorithms to properly compute the more complex data events

The DQM component, depending on its features, can influence the overall system construction, adding new requirements for memory storage, latency reduction, communication channels, etc.

The last stage of the system design should include a full review of the system construction in the scope of the basic diagnostics requirements and compliance with the additional requirements coming from: the installation environment, integration with the on-site infrastructure and the implementation of the DQM components.

Summary

High-performance systems working with raw data and high throughput data streams on tokamaks are often advanced or state-of-the-art constructions. They need to process large amounts of data from multichannel detectors and produce results with millisecond resolution in real-time. Often, the data quality factor is not properly considered in the installed systems. However, this is an important topic, especially for future projects like ITER, which is planned to generate long plasma pulses. Another aspect is on-line feedback to other systems, especially for the control of actuator systems, where high quality data is required.

In order to properly design a system with high quality output data, it is necessary to review all of the requirements and working conditions in order to have an idea of the amount and quality of the data to be processed. The analysis also provides a preliminary hardware construction of the system, including the real-time requirements. The designers need to select the approach to data processing: raw signal analysis or preprocessed data computation.

At the next stage, a proper system model needs to be constructed with an indication of the most significant error injection places. The possibility of malformed data filtration needs to be analyzed with a proper filtering model. Upon this analysis, additional hardware requirements can be defined to properly implement the DQM model. The working mode of the diagnostics, online or offline, for the DQM module can influence the whole hardware system construction.

Having performed all of the analysis, it is possible to provide the complete hardware design of the system with data quality monitoring algorithms for the most optimal data flow with high quality output products. This approach can be especially interesting for newly designed or planned systems, focused on ITER or other modern high energy physics experiments.