Introduction

The Exploration of energization and Radiation in Geospace (ERG) project investigates radiation belt and geospace storm dynamics. The ERG (Arase) satellite was launched in December 2016 and began observations in March 2017. The satellite includes nine scientific instruments that provide various types of data for measuring plasma and particles over a wide energy range, as well as fields and waves over wide frequency ranges (Miyoshi et al. 2018).

However, for a comprehensive geospace investigation, ground-based observations and multi-point observations from the satellite are equally important. The ERG project, therefore, includes both satellite and ground-based observation teams. Ground-based observations are obtained from Super Dual Auroral Radar Network (SuperDARN) high-frequency (HF) radars, the European Incoherent Scatter Scientific Association (EISCAT) radar, magnetometers, very low-frequency (VLF)/extremely low-frequency (ELF) loop antennas, riometers, VLF/low-frequency (LF) radio wave receivers, and optical imagers (Shiokawa et al. 2017).

Comparing simulations with observations is important to determine causal relationships and increase our quantitative understanding of various geospace phenomena. Hence, the project also involves a team that manages simulation and integrated studies (Seki et al. 2018).

It is not always easy, however, to employ multiple datasets to their full potential, because it is difficult to learn the different types of data file formats and develop the necessary tools to access and process the relevant data for detailed analysis. Using common data formats and analysis tools can solve this problem. For example, the National Aeronautics and Space Administration (NASA)/Common Data Format (CDF) has standardized data storage and data access considerably. The solar–terrestrial physics community has developed common data analysis software, including the Space Physics Environment Data Analysis Software (SPEDAS), to analyze different data types for integrated studies. SPEDAS was originally developed for the Time History of Events and Macroscale Interactions during Substorms (THEMIS) mission and was known as the THEMIS Data Analysis Software (TDAS) (Angelopoulos 2008). This software is a suite of scientific analysis routines written in the Interactive Data Language (IDL). To promote accessibility for the space research community, other satellite projects, including the Magnetospheric Multiscale (MMS) mission (Burch et al. 2016) and Van Allen Probes (Mauk et al. 2013), provided their own programs as SPEDAS plug-ins, developing and customizing the SPEDAS code to enable seamless collaborative data sharing among missions.

Considering these advantages, the ERG project has archived its project data in CDF format and made these files accessible on the Internet. The ERG project has cooperated with the Inter-university Upper Atmosphere Global Observation NETwork (IUGONET) (Hayashi et al. 2013; Tanaka et al. 2013) to develop SPEDAS plug-ins for ground-based observation data. It has also cooperated with the THEMIS project to develop SPEDAS plug-ins to serve as common data analysis software across project teams (Hori et al. 2015). These include not only software to download data files from the remote data server, but also several tools to aid in data analysis, such as plasma dispersion solvers.

In order to coordinate observations made by the ERG satellite, on the ground, and by other satellites, it is necessary to properly organize the observation modes of the ERG satellite. The plasma wave experiment (PWE)/waveform capture (WFC) (Kasahara et al. 2018b; Matsuda et al. 2018) and the software-type wave–particle interaction analyzer (S-WPIA) (Katoh et al. 2017; Hikishima et al. 2018) have intermittently performed waveform observations along a satellite orbit. Waveform data of plasma waves are important for studying wave–particle interactions through detailed comparisons with the ground-based optical and wave observations (Shiokawa et al. 2017). The waveform data would accumulate to more than 6 GB if recorded continuously along a satellite orbit. In the nominal case, the downlink budget available for the waveform data is approximately 700 MB per day. Therefore, it is necessary to strategically select limited periods of waveform observations, taking into account the satellite orbit, conjunction periods with ground-based observations and other satellites, the size of the data recorder, and the telemetry downlink plan.

The ERG project established the ERG Science Center (ERG-SC), which is operated by the Institute of Space and Astronautical Science (ISAS)/Japan Aerospace Exploration Agency (JAXA) and the Institute for Space-Earth Environmental Research (ISEE), Nagoya University. The main tasks of the ERG-SC are (1) to archive and distribute data to both research teams and the public, (2) to develop research software as SPEDAS plug-ins, and (3) to plan the operation of the ERG satellite and arrange for coordinated studies among ERG, other satellites, and ground-based observations. Figure 1 shows a summary of ERG-SC tasks in the ERG project. In this paper, we provide an overview of the ERG-SC and several examples of its tasks.

Fig. 1 Overview of the tasks of the ERG-Science Center

Design of ERG data archive

The ERG satellite has continuously conducted observations and produced scientific and engineering data, which are downlinked to ground tracking stations every day. Figure 2 summarizes how these data are processed and delivered to end users. All raw data packets received by the multiple ground stations are first merged into the telemetry database of the Scientific Information Retrieval and Integrated Utilization System (SIRIUS) of ISAS (Nomura et al. 1983). An ISAS server then extracts the raw data for each onboard scientific instrument and converts them to regular files in predefined formats, storing observed data in instrument-origin units. After data processing by the instrument teams, the data files are transferred to the data processing system of the ERG-SC, where multiple datasets are combined with additional information to produce scientific data at several processing levels. During this file rearrangement, scientific data are converted to physical units. At this stage, necessary data evaluation and calibrations are conducted, resulting in datasets that can be used for scientific data analyses. Finally, the generated data products are made available to users on the ERG-SC Web site (https://ergsc.isee.nagoya-u.ac.jp).

Fig. 2 Overview of the data processing flow for the ERG satellite data

Over the course of the data production pipeline described above, data are categorized into several predefined processing levels. Raw packet data received from the satellite and archived on SIRIUS are referred to as Level-0. Level-1 data are converted to regular files on the reformatting system at ISAS. Level-1 data carry time labels converted to Coordinated Universal Time (UTC), while observed data values are stored in instrument-dependent units. Further processes at the ERG-SC, including data calibration, generate Level-2 data. Unlike Level-1 data, Level-2 data have been assigned standard physical units as well as some geophysical coordinates. Data up to Level-2 are generated solely from Level-1 data of a single instrument. Level-3 data are the merged products of multiple Level-2 datasets from different instruments. A typical example is pitch angle distributions of particle fluxes, which merge particle flux data from a particle instrument and magnetic field data from the magnetic field experiment (MGF) (Matsuoka et al. 2018). Another example is electron density data that are deduced using frequency traces of upper hybrid resonance observed by the PWE instrument (Kasahara et al. 2018b; Kumamoto et al. 2018) and MGF data. Level-3 particle data also go through inter-instrument calibrations to produce particle flux and phase space density data of electrons over a combined energy range of ~ 20 eV to more than 10 MeV. These data are referred to as Level-4 data.
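The level hierarchy described above can be sketched as a small data model. This is only an illustrative sketch: the class and function names (`Level`, `Product`, `merge_to_level3`) and the product names are hypothetical and are not part of the ERG-SC pipeline or of SPEDAS.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Processing levels as described in the text."""
    L0 = 0  # raw packets archived on SIRIUS
    L1 = 1  # regular files, UTC time labels, instrument-dependent units
    L2 = 2  # calibrated, physical units, derived from a single instrument
    L3 = 3  # merged from Level-2 data of different instruments
    L4 = 4  # inter-instrument calibrated particle data


@dataclass(frozen=True)
class Product:
    name: str           # hypothetical product name
    level: Level
    instruments: tuple  # instruments contributing to this product


def merge_to_level3(name, *inputs):
    """Combine Level-2 products from different instruments into one Level-3 product."""
    if any(p.level != Level.L2 for p in inputs):
        raise ValueError("Level-3 products are merged from Level-2 inputs")
    instruments = tuple(sorted({i for p in inputs for i in p.instruments}))
    if len(instruments) < 2:
        raise ValueError("Level-3 products combine more than one instrument")
    return Product(name, Level.L3, instruments)


# Example from the text: pitch angle distributions need particle flux and MGF data.
flux = Product("electron_flux", Level.L2, ("MEP-e",))
bfield = Product("magnetic_field", Level.L2, ("MGF",))
pad = merge_to_level3("pitch_angle_distribution", flux, bfield)
```

Encoding the rule that Level-3 inputs must be Level-2 products from at least two instruments makes an out-of-order processing step fail early rather than produce a mislabeled file.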

Figure 3 outlines the data production pipeline for the ERG scientific data arranged by data processing level. In principle, scientific data are processed in order of increasing level. The instrument teams and the ERG-SC share tasks regarding the development and actual operation of these pipeline processes. Each instrument team is primarily responsible for the production of Level-1 data, while the ERG-SC is responsible for the production and archiving of Level-2 and higher-level data.

Fig. 3 Outline of the scientific data production pipeline for the ERG satellite data

After necessary calibrations and data evaluation, Level-2, 3, and 4 data are made available on the ERG-SC Web site, which users can access online, as shown in Fig. 4. Integrated data analysis tools developed by the ERG-SC using SPEDAS, which are described in detail in the next section, can also be used to download these data files from the ERG-SC Web site.

Fig. 4 Screenshot of a web browser viewing the ERG satellite data repository on the ERG-SC Web site (https://ergsc.isee.nagoya-u.ac.jp/)

Level-2 and higher-level data are archived as CDF data files (Hori et al. 2015). CDF supports various data types and data structures and allows data files to carry various forms of metadata. It also guarantees interoperability independent of the operating system and the byte order (endianness) of variables. In addition to its functionality, another benefit of archiving the project data in CDF is that CDF has become one of the de facto standard file formats of the solar–terrestrial physics community, so data in CDF can be read easily by several software programs.

The ERG-SC has defined the standardized contents of data variables and metadata for data files archived in CDF. These standardized contents are used for both satellite observation data and ground-based observation data. In the past, many projects used their own data formats and data variable structures, which made it difficult for data users to combine data from different types of instruments in their data analyses. For example, the time label format often varies from project to project, and different observational data carry numerical values in different forms and are grouped into data files of different lengths. In fact, ERG satellite data and ground-based observational data were also originally provided in different formats, styles, and granularities by each instrument and observation team. To overcome this difficulty, the ERG-SC has unified them into a standardized data format with common metadata: the time label format is standardized to CDF epoch, data values are converted to physical units that can be used directly for data analysis, and the data are rearranged in a simple sequential order. As a result, the data files contain, as metadata, the ancillary information that is necessary not only for SPEDAS but also for other data analysis software. On the basis of discussions with the instrument and observation teams, the ERG-SC identified the necessary ancillary information and defined a common metadata list to accommodate it. Part of the designed metadata list was reported by Hori et al. (2015). The metadata also include the names of the original data files and the programs, with version IDs, used to generate the CDF files. This information guarantees the traceability of the data version for both data users and the data archive center. Hence, these unified data archives, with the integrated data analysis tools described in the next section, allow data users to combine and analyze different types of data in a truly seamless way.
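To illustrate what a standardized time label involves, the following sketch converts UTC times to and from a CDF epoch value. It assumes that "CDF epoch" refers to the CDF_EPOCH type of the CDF library (milliseconds since 0000-01-01T00:00:00, without leap seconds); the actual ERG files may use another CDF time type. The function names are hypothetical, and real analyses would normally rely on the CDF library or SPEDAS routines instead.

```python
from datetime import date, datetime, timedelta, timezone

MS_PER_DAY = 86_400_000

# Days from 0000-01-01 to 1970-01-01 in the proleptic Gregorian calendar.
# Python ordinals start at 0001-01-01 (ordinal 1); year 0 is a leap year (366 days).
DAYS_0000_TO_1970 = date(1970, 1, 1).toordinal() - 1 + 366
UNIX_ORIGIN_MS = DAYS_0000_TO_1970 * MS_PER_DAY
_UNIX_ORIGIN = datetime(1970, 1, 1, tzinfo=timezone.utc)


def utc_to_cdf_epoch(t):
    """Return milliseconds since 0000-01-01T00:00:00 for a UTC datetime."""
    if t.tzinfo is None:
        t = t.replace(tzinfo=timezone.utc)  # assumption: naive times are UTC
    delta = t - _UNIX_ORIGIN
    whole_ms = UNIX_ORIGIN_MS + delta.days * MS_PER_DAY + delta.seconds * 1000
    return float(whole_ms) + delta.microseconds / 1000.0


def cdf_epoch_to_utc(epoch_ms):
    """Inverse conversion: CDF_EPOCH-style milliseconds back to a UTC datetime."""
    return _UNIX_ORIGIN + timedelta(milliseconds=epoch_ms - UNIX_ORIGIN_MS)


# Time label for the start of the day shown in Fig. 5 (June 21, 2017).
t0 = datetime(2017, 6, 21, tzinfo=timezone.utc)
epoch0 = utc_to_cdf_epoch(t0)
```

Once every dataset carries time labels in one numeric convention like this, data from different instruments can be placed on a common time axis without per-project parsing.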

In general, the data center must overcome common difficulties, such as data file preservation and data transfer issues. For the ERG project data, the total size of possible data from both satellite and ground-based network observations is expected to be several tens of TB, and they can be processed without serious difficulty by utilizing commonly used computers and disk systems. Furthermore, the speed of commonly installed inter-university networks such as the Science Information NETwork (SINET) is not a bottleneck for data transfer from the ERG-SC to other universities and institutes.

As mentioned, the ERG-SC archives many types of data files and must control their versions. For example, there are more than one hundred types of satellite data products and several hundred types of ground-based observation data products. Because these data are transferred to the ERG-SC irregularly from each instrument and observation team, it is not easy to manage, archive, and deliver such a variety of data files to users. To overcome these difficulties, the ERG-SC developed a system that automates data production to the maximum extent possible. As shown in Fig. 3, we have developed many data processing modules and connected them into a pipeline system that handles all data processing, from the raw observation data of each instrument and observation team to the scientific data files. The system is implemented as much as possible with common routines of SPEDAS and IDL. Using a single programming language, which is also widely used for scientific data analyses, enables efficient development of program codes and substantially reduces the time spent on development and maintenance work.

For data file preservation, a standardized data file format is helpful because a single data file carries all the information necessary for scientific analysis. The practical information contained in the data files, such as the names and version numbers of the source data files and the computer codes used to generate them, enables automatic processing of data files in many maintenance tasks for the data archive. In addition, the data files are synchronized between the ERG-SC at Nagoya University and ISAS/JAXA for redundancy.
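As a minimal sketch of how version information can support automatic archive maintenance, the following code keeps only the newest version of each data file. The file-naming pattern (`<stem>_vMM_NN.cdf`) and the example file names are assumptions made for illustration; they are not the naming rules of the ERG-SC archive.

```python
import re

# Hypothetical naming convention: <stem>_v<major>_<minor>.cdf
VERSIONED_NAME = re.compile(r"^(?P<stem>.+)_v(?P<major>\d+)_(?P<minor>\d+)\.cdf$")


def latest_versions(filenames):
    """Map each file stem to the name of its highest-version file."""
    best = {}
    for name in filenames:
        match = VERSIONED_NAME.match(name)
        if match is None:
            continue  # skip files that do not follow the assumed pattern
        # Compare versions numerically so that v01_10 ranks above v01_09.
        version = (int(match["major"]), int(match["minor"]))
        stem = match["stem"]
        if stem not in best or version > best[stem][0]:
            best[stem] = (version, name)
    return {stem: name for stem, (version, name) in best.items()}


files = [
    "sat_inst_l2_20170621_v01_09.cdf",
    "sat_inst_l2_20170621_v01_10.cdf",
    "sat_inst_l2_20170622_v02_00.cdf",
    "sat_inst_l2_20170622_v01_10.cdf",
    "notes.txt",
]
current = latest_versions(files)
```

The same version fields, when stored as metadata inside each file, allow such checks to run even if a file has been renamed.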

Data analysis tools for the ERG project

The ERG-SC has also released software to load and analyze data obtained by the ERG satellite, as described in “Design of ERG data archive” section, in the form of plug-in software libraries for SPEDAS (Hori et al. 2015). Figure 5 shows an example of visualizing multiple data using SPEDAS and the plug-ins. Figure 5a shows magnetic field data in the solar magnetic (SM) coordinate system obtained by the MGF instrument, omnidirectional electron flux data observed by the medium-energy particle experiments—electron analyzer (MEP-e) instrument (Kasahara et al. 2018a) onboard ERG, geomagnetic field data at Kagoshima (KAG: 31.48N geographic latitude and 130.72E longitude, Yumoto et al. 1996), and the provisional AE index on June 21, 2017. This plot combining multiple data is created using only a few commands of SPEDAS and the plug-in shown in Fig. 5b. Using “erg_init” in the command line, the environment for loading ERG data is set up. The “timespan” command included in the SPEDAS core routines is used for designating the date. The magnetic field and electron flux data saved in CDF files are loaded and stored as “tplot variables” using load procedures “erg_load_mgf” and “erg_load_mep,” which are available in the ERG plug-ins. The geomagnetic field data observed by a fluxgate magnetometer at KAG, and the provisional AE data, which are provided by the World Data Center for Geomagnetism, Kyoto, are loaded using “erg_load_gmag_mm210” and “kyoto_load_ae.” All “tplot variables” are plotted using the “tplot” command, one of the SPEDAS core routines. Using only seven commands of SPEDAS and the plug-ins, we can seamlessly visualize data observed by the ERG satellite and ground-based instruments. Thus, this tool encourages collaborative studies using not only ERG satellite data but also ERG-ground network and other project data.

Fig. 5 Example of showing data with SPEDAS and plug-ins. a The magnetic field data in the SM coordinate system observed by the ERG MGF instrument, the electron flux observed by the ERG MEP-e instrument, the geomagnetic field data at KAG, and the AE index. b Command lines of SPEDAS and plug-ins for Fig. 5a

As a new approach in space plasma physics, the ERG-SC developed ISEE_3D, an interactive visualization tool for the three-dimensional plasma velocity distribution function (Keika et al. 2017). This tool provides a variety of visualization methods for the distribution function of space plasma: scatter, volume, and iso-surface visualization. ISEE_3D has been included in the bleeding-edge version of SPEDAS, as it is capable of loading plasma data from the MMS mission. Examples of 3-D ion distribution visualizations are shown in Fig. 6. For these 3-D distributions, ion data with energies of 10 eV to 30 keV observed by the fast plasma investigation (FPI) instrument (Pollock et al. 2016) and magnetic field data observed by the fluxgate magnetometer (FGM) (Russell et al. 2016) are used. Figure 6a, b presents the 3-D ion distributions in scatter mode and the 2-D slice of the distributions for 02:13:19.916 UT on November 18, 2015, respectively. Cyan and yellow arrows represent the directions of the magnetic field and velocity vector, respectively. Note that this tool will be applied to data obtained from the ERG satellite after L2 particle data are released.

Fig. 6 Examples of ISEE_3D. a Scatter plot and b 2-D slice plot for ion velocity distributions (phase space density vs. velocity) in the magnetic field coordinates observed by MMS1/FPI (Keika et al. 2017)

The ERG-SC has also implemented a SPEDAS GUI plug-in package for the Kyoto University Plasma Dispersion Analysis Package (KUPDAP) (Sugiyama et al. 2015), which is a plasma dispersion solver. The original KUPDAP code, which serves as the main engine, was developed by the space group of the Research Institute for Sustainable Humanosphere (RISH), Kyoto University. Figure 7a shows a control panel of KUPDAP in IDL. Figure 7b shows an example of the wave frequency versus wave number diagram (ω–k diagram) and the linear growth rate as a function of frequency and wave number for electromagnetic ion cyclotron (EMIC) waves. This plug-in package has been used as a SPEDAS function in various studies of plasma waves (Uchino et al. 2017; Shoji and Omura 2017).

Fig. 7 a Control panel of KUPDAP GUI package. b Example of output panel. The frequency–wave number, the growth rate–frequency, and the growth rate–wave number plots are displayed

In addition to the SPEDAS software, the ERG-SC has developed two tools that run in a web browser. One is the ERG Web Analysis Tool (ERGWAT), which can be used for interactive data visualization and analysis in a web browser, as shown in Fig. 8a (Umemura et al. 2017). The other is the Conjunction Event Finder (CEF), shown in Fig. 8b (Miyashita et al. 2011). CEF provides orbits and footprints of ERG and other satellites and information on ground-based instruments, which can be used to confirm the relative locations of satellites and ground-based observations and to produce the operation schedule.

Fig. 8 Screenshots of the web browser tools provided by the ERG-SC. a ERGWAT. b Conjunction Event Finder showing orbits and footprints of ERG and other satellites and information on ground instruments

Science operation planning

The ERG satellite operates nine science instruments that observe plasma/particles and fields/waves as its nominal observations. Each instrument has several observation modes that are selected according to factors such as the L-shell and magnetic latitude of the satellite position, so the science operation must be planned on the basis of the predicted orbit. In addition to the nominal observations, the ERG satellite has burst operation modes for PWE/WFC and S-WPIA. Data from the burst observations of PWE/WFC and S-WPIA are first stored in the mission data recorders (MDR) and subsequently downloaded to the ground through the system data recorder (SDR) (Takashima et al. 2018). Note that the capacities of the MDR and SDR are 32 GB and 2 GB, respectively.

The observation schedules for the ERG satellite are first drafted by the ERG-SC using information on the satellite orbit, the Earth’s shadowing, and the electric power available onboard. The sampling frequency for PWE and MGF instruments and the time resolution of particle instruments vary for each orbit. Intermittent chorus and EMIC burst observations of PWE/WFC and S-WPIA are also conducted, and plans are scheduled for the burst mode operations.

Besides the default observation schedules, conjugate observations with ground-based instruments and other satellites such as Van Allen Probes are often planned, in which PWE/WFC and S-WPIA are included. The drafted schedule files are reviewed by the instrument teams and then the finalized files are sent to the satellite tracking center in ISAS where they are converted to a command plan used to operate the satellite. If any problem is found in the schedule files, they are updated to satisfy the feasible operation conditions.

As mentioned in the “Introduction” section, the expected amount of waveform data is more than 6 GB if recorded continuously along a satellite orbit, while the potential amount of downlink data for waveform data is approximately 700 MB per day. Thus, it is necessary to select periods for the waveform observations, taking into account the size of the data recorder and the telemetry downlink plan. The frequency–time spectrogram data of the PWE/onboard frequency analyzer (OFA) are quickly reviewed to select time periods for the burst observations of PWE/WFC and S-WPIA that require downlinking for further scientific analyses. Figure 9 shows an example of the tool for data selection. This tool shows PWE/OFA data, which give an overview of the plasma wave observations, and is used to identify the periods when the ERG satellite observes plasma waves of interest, such as chorus, hiss, and other types of plasma waves. Data from those periods of interest are preferentially selected and then transferred to the SDR.
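The constraint stated above can be made concrete with a short calculation. This is a rough sketch: it assumes 1 GB = 1000 MB, and the `orbits_per_day` parameter is a hypothetical input because the text does not give the orbital period. With one orbit per day the result is an upper bound on the downlinkable fraction.

```python
def downlinkable_fraction(recorded_mb_per_orbit, downlink_mb_per_day, orbits_per_day=1.0):
    """Fraction of continuously recorded waveform data that fits the daily downlink."""
    if recorded_mb_per_orbit <= 0 or orbits_per_day <= 0:
        raise ValueError("recorded volume and orbits per day must be positive")
    recorded_mb_per_day = recorded_mb_per_orbit * orbits_per_day
    return min(1.0, downlink_mb_per_day / recorded_mb_per_day)


# Nominal case from the text: more than 6 GB per orbit, about 700 MB per day of downlink.
nominal = downlinkable_fraction(6000.0, 700.0)
print(f"At most {nominal:.1%} of one orbit of waveform data can be downlinked per day")
```

Under these assumptions, less than 12% of the waveform data from even a single orbit can reach the ground each day, which is why the selection of observation periods is unavoidable.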

Fig. 9 GUI panels for selecting the burst mode data from OFA Level-1 prime data. The top four color maps in the main panel show the HFA, OFA-E, OFA-B, and EFD dynamic power spectrum, observed on April 15, 2017. The bottom four plots show the burst observation record times for each burst mode (chorus, EMIC, EFD, and S-WPIA). The subpanel appears when we select the time range (indicated by red colored bars in the main panel) for the decision of the priority of the corresponding data

The data size that can be transferred from the MDR to the SDR depends on the downlink plan from the satellite to the ground; therefore, the selection of burst observations of PWE/WFC and S-WPIA has to consider actual estimates of the downlink data size. Occasionally, the potential amount of downlink data for the waveform observations is small, less than 500 MB per day. If the downlink data amount can be estimated before the schedule file is uploaded to the satellite, we can reduce the operation periods for WFC and S-WPIA in the schedule file. On the other hand, if the possible downlink data amount becomes known only after the schedule file has been uploaded, the observations of WFC and S-WPIA are made as scheduled and stored in the MDR. In this case, because the potential amount of downlink data to the ground is smaller than the typical amount, careful data selection is required, which involves examining the PWE/OFA data and deleting some waveform data from the MDR.
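The selection under a downlink budget can be sketched as a simple priority-based procedure. This is a hypothetical illustration, not the algorithm of the ERG-SC selection tool: the `BurstSegment` fields, the priority values, and the segment sizes are all invented for the example, and in practice the priorities are decided by inspecting the PWE/OFA spectra.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BurstSegment:
    label: str      # hypothetical identifier of a recorded burst interval
    size_mb: float  # data volume stored in the MDR
    priority: int   # larger value means higher scientific interest


def select_for_downlink(segments, budget_mb):
    """Greedily pick segments by priority (smaller first on ties) within the budget."""
    selected, used_mb = [], 0.0
    for seg in sorted(segments, key=lambda s: (-s.priority, s.size_mb)):
        if used_mb + seg.size_mb <= budget_mb:
            selected.append(seg)
            used_mb += seg.size_mb
    return selected, used_mb


segments = [
    BurstSegment("chorus_a", 300.0, 3),
    BurstSegment("emic_b", 250.0, 3),
    BurstSegment("hiss_c", 200.0, 2),
    BurstSegment("quiet_d", 400.0, 1),
]
nominal_pick, nominal_used = select_for_downlink(segments, budget_mb=700.0)
reduced_pick, reduced_used = select_for_downlink(segments, budget_mb=500.0)
```

A greedy pass does not guarantee the best use of the budget, but it mirrors the idea that periods of interest are transferred preferentially and that the remaining segments are candidates for deletion from the MDR.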

Concluding remarks

This paper provides an overview of the ERG-SC. The tasks of the ERG-SC include: (1) archiving data from the ERG satellite, ground-based instruments, and modeling/simulation; (2) developing SPEDAS plug-in software for data analysis and visualization of various data file types; and (3) scheduling observations and data downlinks from the satellite.

The data processing pipelines in this system employ commonly used software/languages, such as IDL, C, and UNIX shell. Thus, implementing these processes in various cloud-computing systems is technically feasible. In fact, the Center for Integrated Data Analysis Science (CIDAS), ISEE, Nagoya University, has been operating a cloud system in which users can use SPEDAS to analyze the ERG satellite data, ground-based observations, and other satellite data by connecting from their remote terminals. Use of a cloud system has great merit for users in that they do not need computer resources with large memory and a high CPU clock speed. Thus, cloud systems will be a useful computer environment for the large data quantity and varied data types included in the ERG-SC data archive.

The ERG-SC works as a hub for the ERG project by unifying the management of CDF data files for different observations and modeling/simulation and by providing integrated data analysis software to seamlessly visualize and analyze different types of data. As integrated data analysis is a key for a comprehensive understanding of phenomena that occur simultaneously at different locations, which are observed by several satellites and ground-based instruments, the ability to develop tools for such analyses is essential.

Because SPEDAS is a standard software package used across the space physics community, SPEDAS plug-ins contribute to providing a seamless data analysis environment across different projects. The ERG project has collaborated with the THEMIS team to develop and enhance SPEDAS by generating plug-in software that allows researchers to not only download files from the ERG project, but also perform advanced analyses with tools such as ISEE_3D and KUPDAP. It is worth mentioning science centers of other geospace projects. As an example of related geospace missions, the NASA/Van Allen Probes mission comprises science operation centers (SOC) for each instrument, and each SOC provides its own database and software for data analysis (e.g., Kletzing et al. 2013; Spence et al. 2013). Similar to the ERG-SC, science data files are archived in CDF format and these CDF files can be accessed via the Internet, so that users can use appropriate software for their analysis. This is a good example of the benefits of using standardized data files. Software like SPEDAS can read and manipulate data in CDF files from both the ERG project and Van Allen Probes, which is helpful for analyzing phenomena from various angles by integrating different types of data. The developed software, combining different data types, will be an important legacy for future space science missions. The ERG-SC thus contributes to achieving the scientific objectives of the ERG project, providing new insights into the dynamics of radiation belts and the inner magnetosphere.