1 Introduction

A range of phenomena originating from the Sun can produce effects throughout the heliosphere, such as coronal mass ejections (CMEs), solar energetic particle (SEP) events, high-speed solar wind (HSSW) streams, and co-rotating interaction regions (CIRs). One example is the 2003 “Halloween Storm” period (e.g., see the summary article of Gopalswamy et al. 2005) in which several CMEs erupted from an active region on the solar surface, resulting in impressive aurorae on Earth, and continuing on to intersect the Voyager spacecraft near the edge of the heliosphere (Intriligator, Rees, and Horbury 2008). Heliospheric phenomena are commonly studied using both remote sensing and in situ instruments, each allowing a unique perspective on the physics of these events. The more complete the picture of these phenomena is, the better we can model and ultimately make predictions of space weather conditions. The beauty of heliophysics is that its understanding requires a diversity of perspectives. However, this causes an inherent problem, because the diversity of disciplines involved results in knowledge gaps between the different disciplines (e.g., coordinate systems, definitions of phenomena, terms and acronyms, data standards, descriptions, and analysis languages).

In astronomy, data standards have been unified with the creation of virtual observatories (VOs) and their associated access tools. These tools allow scientists to obtain public data for any object using only its celestial coordinates. A similar approach has begun in heliophysics with the creation of the Virtual Solar ObservatoryFootnote 1 to gather many of the available solar data (N.B. NASA’s Planetary Data SystemFootnote 2 is the VO for planetary sciences). The European Grid of Solar Observations (EGSO; Bentley and EGSO Consortium 2002) and more recently the Heliophysics Event Knowledgebase (HEK; Hurlburt et al. 2012)Footnote 3 go a step further, adding event lists and other functionality that is intended to aid users in finding the available data linked to a given event or feature on the Sun. AstroGrid,Footnote 4 through its HelioScope service, provides tools and services to access data archives and catalogues, find data sets, as well as a limited infrastructure for remote data processing. Finally, CDAWebFootnote 5 and Automated Multi Dataset Analysis (AMDA)Footnote 6 offer data access and basic remote processing capabilities for planetary scientists. Currently solar and planetary data access and analysis are disjointed, although both are necessary to study the inter-disciplinary field of heliophysics.

The HELiophysics Integrated Observatory (HELIO; Bentley et al. 2011b) addresses these issues. It combines many features of previous projects and aims to act as a centralised access point for as many heliospheric data as possible and to associate catalogued events with observations in which they can be found. Creating a VO for heliophysics is more complex than for astronomy because of the range of dynamic spatial and temporal scales involved. Instead of the 2D position vector that is generally used to uniquely identify astrophysical objects, time and distance are also required to associate data with heliospheric events. As such, the main interface of HELIO is a step forward by allowing 4D searches and providing automatic association between events, features, and data in time and space using both matching and built-in propagation models.

In this paper, we use HELIO to obtain catalogued event entries and data by running a simple propagation model to perform three heliospheric event case studies. Using the output from specific HELIO services (each described in Section 2), conventional methods are utilised to analyse each event. The results and discussion for each use case are presented in Section 3. Conclusions on the performance and future directions for HELIO are presented in Section 4.

2 The Heliophysics Integrated Observatory

The principal objective of HELIO is to create a collaborative environment where scientists from different communities can discover connections between solar phenomena, interplanetary disturbances, and their effects on the planets. HELIO consists of a set of services that provide access to event and feature catalogues and remote processing services, as well as access to observational data and descriptions of the instruments. All services are available from a centralised web interface (http://hfe.helio-vo.eu) that allows the user to pass information seamlessly from one service to another, without requiring the user to have a deep knowledge of how the system is built. However, the nature of the system architecture allows all of the services to be individually accessible. This enables advanced use of the system through SQL queries or by means of a programming language (e.g., IDL or Python). A complete and detailed description of the HELIO architecture and services can be found in both Bentley et al. (2011a) and the online documentation.Footnote 7

The general problem that HELIO aims to solve can be split into three parts:

  i) Identify the occurrence of an interesting event or phenomenon.

  ii) Review the availability of suitable observations.

  iii) Locate, select, and retrieve the required observations.

These parts are shown graphically in Figure 1, which will be used later to describe the execution of different use cases. As shown in Figure 1, there are three collections connected to the Search service – Events, Features, and Data. This indicates that users may choose to start by searching the various HELIO databases for a particular type of event, solar feature, or certain data characteristic (e.g., format, heliophysical location, etc.). When the phenomena under consideration are in the same spatial location (i.e., no propagation through the heliosphere is required) the user can combine any number of these three Search formats until a complete data set is selected. Note that the output of each Search query can be passed to additional Search queries, as indicated by the dual-directional arrows in Figure 1. However, if the user is instead interested in studying phenomena at different locations in the heliosphere, then a propagation model is required. This is indicated by the Propagation service in Figure 1, which may be applied in either direction between each pairing of the Events, Features, and Data collections.

Figure 1

Simplified diagram of the services offered by HELIO and their connections. From the top, the Search service connects with the Events, Features, and Data services with bi-directional arrows, meaning that searches can be applied to these collections or their search results may be passed to additional search services. The Events, Features, and Data services are inter-connected with the Search service for the analysis of phenomena in the same location in the heliosphere. Alternatively, the Propagation service is used when the location of the event changes with time. Note that Data is shown at the bottom as its access is generally the final product for the user.

A more detailed description of the different services within HELIO and the properties of the propagation model used in this paper are shown in the following subsections.

2.1 HELIO Environment

HELIO consists of a number of services that are combined into a single unified front-end interface, but each service can be independently queried using its own interface. This kind of design follows a service-oriented architecture, avoiding a single monolithic system and allowing the services to be used individually or combined together through a workflow environment.

This study only makes use of four services provided by HELIO – three Search services (Event, Feature, and Data) and one that runs a propagation model. The other services provide valuable information that is of interest in different workflows. This includes services to search for instruments based on capability or position in the heliosphere, and a service that provides contextual information (e.g., GOES X-ray flux or flare positions on full-disk solar images). In addition to those, HELIO also has a catalogue of all non-full-disk instruments. This is aimed at providing more detailed information when searching for instruments that may have observed a particular event. Finally, data mining on in situ measurements is offered through AMDA, while large processing capabilities and storage resources are provided by Grid-Ireland.Footnote 8 A brief description of the event and feature catalogues, as well as the data provider service, is presented below.

The event catalogue includes several separate catalogues (created both manually and automatically) of CMEs, flares, energetic particles, and solar wind events. The feature catalogue is an evolving collection of automated feature detection algorithms that are run within HELIO. These cover filaments, coronal holes, active regions, sunspots, and both Type II and III radio bursts. Table 1 lists the codes available at the moment and their references. The data collection service is a centralised mechanism that connects with different data providers and has a fall-back or contingency capability determined by provider accessibility. This differs from the event and feature catalogues, in that data-search queries are enhanced by other services that allow queries on the location of the instrument in space and time or queries on instrument capability (e.g., remote sensing or in situ measurements, time series/imaging/spectral data, etc.) to be run.

Table 1 List of automated feature detection algorithms available in HELIO.

2.2 Solar–Heliospheric Event Ballistic Algorithm

The first propagation model implemented in HELIO is the Solar–Heliospheric Event Ballistic Algorithm (SHEBA). Its main purpose is to determine if the Sun, a planet or spacecraft can be associated with an event and, if so, to provide a time interval for this interaction. It has been split into three different functions: CMEs, SEPs, and CIRs. SHEBA can be executed in both a forward and backward sense – i.e., to estimate the time and position of an event on the Sun based on the detection of an event at a certain time and position in the heliosphere and vice versa. SHEBA is a 2D ballistic model, in which events propagate at a constant speed and are confined to propagate in the solar equatorial plane (Burlaga 1984). All objects (spacecraft and planets) are projected onto this plane.

The three different scenarios implemented in SHEBA are based on the same ballistic model; however, they are adapted to the different properties of the events. The CME case considers a blob of plasma that moves at a constant radial speed with a fixed longitudinal width (i.e., a circular arc, centred on the Sun, that expands radially). The inputs for the CME case are starting time, starting heliographic longitude, longitudinal width, and speed of the CME (with an optional uncertainty). Flare start times are often used as an estimation for the CME start time, as CMEs are only detected after a few solar radii. Likewise, the CME starting longitude may be obtained from the active region location or position of a disappearing filament on the Sun. A CME listed in one of the catalogues available in HELIO will have an associated speed that can be used as the input to SHEBA. The remaining parameters (CME width and uncertainty in speed) must be provided by the user. The model determines which spacecraft and planets (taking into consideration their own motion) would be hit by a CME using the input parameters. The error in the speed would provide an interval of time for the possible impact. The top panel in Figure 2 shows the output for one of the CMEs, the 2000 “Bastille Day” event, with the input parameters and results shown in Table 2.
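
The CME scenario described above can be illustrated with a minimal sketch. It assumes a stationary target in the solar equatorial plane, whereas SHEBA itself accounts for the motion of planets and spacecraft. The function names, the 1 AU target distance, and the example longitudes and width are illustrative assumptions and are not taken from the HELIO or SHEBA code.

```python
from datetime import datetime, timedelta

AU_KM = 149_597_870.7  # astronomical unit in km


def within_cme_arc(target_lon_deg, source_lon_deg, width_deg):
    """True if the target longitude lies inside the CME's longitudinal extent."""
    # Signed separation wrapped into [-180, 180) degrees.
    sep = (target_lon_deg - source_lon_deg + 180.0) % 360.0 - 180.0
    return abs(sep) <= width_deg / 2.0


def arrival_window(start, distance_au, speed_kms, dspeed_kms):
    """Earliest and latest arrival for a constant radial speed v +/- dv."""
    dist_km = distance_au * AU_KM
    fastest = timedelta(seconds=dist_km / (speed_kms + dspeed_kms))
    slowest = timedelta(seconds=dist_km / (speed_kms - dspeed_kms))
    return start + fastest, start + slowest


# Illustrative values only: a 60-degree-wide CME launched at longitude 10 deg,
# travelling at 800 +/- 300 km/s towards a target at 1 AU and longitude 0 deg.
start = datetime(2001, 1, 1, 0, 0)
hit = within_cme_arc(0.0, 10.0, 60.0)
early, late = arrival_window(start, 1.0, 800.0, 300.0)
print(hit, early, late)
```

The uncertainty in speed maps directly onto the width of the arrival interval, which is why a large speed error yields a window of several days at 1 AU.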

Figure 2

Examples of the three possible scenarios in SHEBA: CME (top panels), SEP (middle panels), and CIR (bottom panels). Each example shows the positions of the planets at the start time (open circle ∘) and at the end time (filled circle •; i.e., time of impact). The spacecraft are shown only at their impact locations. The CME is represented as a front of ascribed angular width with colours showing the number of days it takes to travel through the solar system. SEP events are shown as a spiral of a certain width (input as the uncertainty in $v_{\mathrm{sw}}$) through which energetic particles would travel. CIRs are represented in a similar way to the SEPs, with the difference that, as CIRs are considered to be long-lived features, SHEBA finds the closest impact time by rotating the CIR by ±180° (i.e., ±12.5 days, half a solar rotation). The inputs and results of each of the events displayed in these panels are detailed in Table 2.

Table 2 SHEBA inputs and outputs for the three possible scenarios: CME, SEP, and CIR shown in Figure 2. The inputs for these runs come from the 2000 “Bastille Day” event for the CME, a flare from the 2003 “Halloween Storm” period for the SEP, and the edge of one of the CHs shown in Section 3.3 for the CIR. Satellites orbiting planets are approximated by the planets' positions.

The SEP and CIR scenarios are closely related as they both use the solar wind speed to determine the shape of the Parker spiral (Parker 1958). The shape of the 2D spiral is given by,

$$ r=\frac{v_{\mathrm{sw}}}{\Omega_{\odot}}(\theta_{0}-\theta), $$
(1)

where $r$ is the distance from the Sun, $v_{\mathrm{sw}}$ is the speed of the solar wind, $\Omega_{\odot}$ is the angular rotational speed of the solar surface near the equator (14.4° day−1), $\theta_{0}$ is the source longitude, and $\theta$ is the longitude under consideration. The input parameters are similar to those of the CME model, with the exception of the longitudinal width and speed, which in the SEP and CIR cases refer to the solar wind. In the case of SEPs, the model calculates the time it takes for energetic particles from a certain time and position on the Sun to travel along the Parker spiral and reach any objects that the spiral intersects. The SEP model also offers the possibility to choose the fraction of the speed of light (β) at which the energetic particles travel along the Parker spiral. For CIRs, SHEBA models a Parker spiral from a source longitude that instantaneously extends out to 60 AU. The model then calculates the time taken for this spiral to rotate around to each object in the heliosphere (N.B. this assumes the prior existence of the CIR and hence it assumes the corona is long-lived and static). The user may choose the position and time of a flare for the SEP case and the western-most edge of a coronal hole (CH) at a certain time for the CIR case. The speed of the solar wind could be obtained from in situ data or chosen by the user. The middle and bottom panels in Figure 2 show the output for the SEP and CIR models, respectively, using the input shown in Table 2.
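
The geometry of Equation (1) can be explored with a short sketch. It treats the spiral as an Archimedean spiral starting at the origin rather than at the solar surface, and it uses the closed-form arc length of such a spiral to estimate the SEP path. All function names are illustrative assumptions and do not correspond to the SHEBA IDL routines.

```python
import math

AU_KM = 149_597_870.7          # astronomical unit in km
C_KMS = 299_792.458            # speed of light in km/s
OMEGA_DEG_PER_DAY = 14.4       # equatorial rotation rate quoted in the text
OMEGA_RAD_PER_S = math.radians(OMEGA_DEG_PER_DAY) / 86_400.0


def spiral_angle(r_au, v_sw_kms):
    """Angle (theta_0 - theta) in radians swept by the spiral out to radius r."""
    return r_au * AU_KM * OMEGA_RAD_PER_S / v_sw_kms


def spiral_path_length_km(r_au, v_sw_kms):
    """Arc length of the Archimedean spiral from the origin to radius r."""
    a = v_sw_kms / OMEGA_RAD_PER_S  # km per radian
    phi = spiral_angle(r_au, v_sw_kms)
    return 0.5 * a * (phi * math.sqrt(1.0 + phi**2) + math.asinh(phi))


def sep_travel_time_min(r_au, v_sw_kms, beta):
    """Travel time in minutes for particles moving along the spiral at beta*c."""
    return spiral_path_length_km(r_au, v_sw_kms) / (beta * C_KMS) / 60.0


def cir_rotation_delay_days(delta_lon_deg):
    """Time for the spiral to co-rotate through a given longitude offset."""
    return delta_lon_deg / OMEGA_DEG_PER_DAY


footpoint_offset_deg = math.degrees(spiral_angle(1.0, 400.0))
travel_min = sep_travel_time_min(1.0, 400.0, 0.5)
print(footpoint_offset_deg, travel_min, cir_rotation_delay_days(180.0))
```

Under these assumptions, a 400 km s−1 wind connects an observer at 1 AU to a source roughly 62° to the west, and particles at β = 0.5 need roughly 19.5 min to cover the spiral path. The exact values returned by SHEBA depend on the actual ephemeris and implementation details.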

SHEBA is implemented in HELIO as a web service that connects a front-end interface with a set of IDL routinesFootnote 9 that run on the grid system at Trinity College Dublin. SHEBA produces two context images, one for the inner solar system and another for the outer (as shown in Figure 2). The input and output parameters are additionally saved as a VOTable that is useful for the interaction with the rest of the HELIO system and other VO tools. The three use cases presented in the following section make use of SHEBA. However, in the future HELIO will incorporate more sophisticated analytical models, or even numerical simulations, which could be chosen by the user as an alternative to SHEBA.

3 Event-Specific Case Studies

HELIO is not designed to carry out scientific data analysis but instead facilitates this goal by providing a central resource to search the various events lists, types of observation, spacecraft locations and capabilities, and to allow these products to be downloaded for analysis. As a result, some of the functions HELIO provides require a priori knowledge or certain user inputs. For example, in the case of automatic detection algorithms, the user should first be familiar with the algorithm limitations as only events or features with characteristics predetermined by the methods or parameters used by the algorithm will be detected (e.g., CACTus relies on Hough transforms and so its algorithm limits detected CMEs to those with constant speed; Robbrecht and Berghmans 2004).

In the following subsections, we investigate three HELIO use cases that highlight each of the applications of the SHEBA propagation system. The first case study investigates CMEs impacting both Earth and Mars (Section 3.1). The second case study tracks a SEP event resulting from a solar flare out through the heliosphere (Section 3.2). The third case study aims to determine if the source of a HSSW stream that sweeps past Earth is due to a CH (Section 3.3).

3.1 CME Use Case

CMEs propagating through the heliosphere can be detected in several forms of remote sensing observations and in situ measurements. Close to the Sun, CMEs are observed by solar imaging instruments such as extreme ultraviolet (EUV) imagers and further out by coronagraphs, heliospheric imagers, space-borne radio instruments and many ground-based radio observatories. CMEs may be detected (directly) through in situ measurements of interplanetary plasma properties or (indirectly) by their effects on planetary magnetospheres. Planetary effects can also be observed through remote observations, e.g., EUV and radio emission from auroral activity triggered by the passage of a CME. Connecting such disparate observations through time and across disciplinary domains is a complex and cumbersome task. HELIO facilitates this effort by providing a central resource to search the various events lists, types of observation, and spacecraft locations and capabilities to allow this mixture of data to be downloaded for analysis.

In this use case the aim is to examine the propagation of CMEs from the Sun into the heliosphere. In particular, we use a study by Falkenberg et al. (2011), which investigates CMEs interacting with Earth and Mars, as a template to demonstrate some of the capabilities of the HELIO infrastructure. The study of Falkenberg et al. analyses a time frame when the Mars Global Surveyor (MGS; Acuna et al. 1992, 1998) was in operation while Earth and Mars were separated by less than 80° in longitude.Footnote 10 This corresponds roughly to the years 2001 and 2003. These time frames could be identified using the HELIO framework to search for this combination of planetary and spacecraft arrangements, but we proceed using the following time ranges: April 2001 – January 2002, and May 2003 – December 2003, based on the events in the paper of Falkenberg et al. (2011).
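
The longitudinal criterion used to select these time frames can be expressed as a small helper that handles the wrap-around at 360°. This is a sketch under the assumption that heliographic longitudes of both bodies are already available in degrees; the function names are hypothetical and are not part of the HELIO interface.

```python
def longitudinal_separation_deg(lon_a_deg, lon_b_deg):
    """Smallest angular separation between two heliographic longitudes."""
    diff = (lon_a_deg - lon_b_deg) % 360.0
    return min(diff, 360.0 - diff)


def within_separation(lon_a_deg, lon_b_deg, limit_deg=80.0):
    """True if the two longitudes are closer than the chosen limit."""
    return longitudinal_separation_deg(lon_a_deg, lon_b_deg) < limit_deg


# Illustrative longitudes only, not ephemeris values.
print(longitudinal_separation_deg(350.0, 20.0), within_separation(0.0, 100.0))
```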

The use case shown in Figure 1 is complicated because the study of Falkenberg et al. begins by looking for shock events at Mars. However, no shock event list exists based on data from Mars (other than the list of Falkenberg et al., which would not be a rigorous test of the system) so it is necessary to begin from Earth. Shock events detected at Earth were back-propagated to the Sun and associated with source events. These source events were forward-propagated into the heliosphere, providing time intervals of possible event arrival at Mars.

The first task in this use case was to search for candidate events at Earth (L1). An initial search of the SOHO/CELIAS/MTOF/PM Interplanetary shock listFootnote 11 during the two time periods indicated above yields a list encompassing all of the Falkenberg et al. events. Table 3 collates the matching results from the first time period. Refining this list requires the user to have a basic understanding of the relevant science, in this case that halo CMEs are a possible indication of Earth-directed events and additionally that CMEs and flares are often associated. This is not a weakness of the HELIO system, as its aim is only to facilitate scientific study and not to actually carry out the investigation. For the current goal, a review of the comments associated with each event (i.e., association with a halo CME, proximity to disk centre, association with a flare and its position) resulted in nearly one-to-one correspondence between the HELIO output and the results of Falkenberg et al., as indicated in Table 3. At this point the near-Earth in situ data could be downloaded for analysis, e.g., to obtain estimates of speeds and more rigorous timing. HELIO facilitates quick-look inspection of data as well as data mining, where mathematical criteria can be applied to the data in order to obtain relevant time intervals for some in situ instruments. However, as this has already been done by Falkenberg et al. (2011), we will proceed using the information from their results.

Table 3 Extraction of the merged results from the HELIO event catalogue query on the SOHO/CELIAS/MTOF/PM Interplanetary shock list and Falkenberg et al. events for April 2001 – January 2002 where starting times at Earth match. Columns give the time of the event on Earth, comments from the Interplanetary shock list (“?” indicates missing or undetermined measurements as recorded in the catalogue), and the match event number from Falkenberg et al. (2011).

The next task is to back-propagate a specific event to the Sun and search for related events. The time and speed of the shock at Earth (L1) reported by Falkenberg et al. are input into the SHEBA propagation service, which gives an estimated source time interval for the CME as indicated in Figure 3. Event 1 gives a time and speed of 4 April 2001 15:00 UT and 800 km s−1 at L1, which yields a source time between 1 April 2001 03:28 UT – 3 April 2001 01:06 UT using an uncertainty in speed of ± 300 km s−1. This time range was used to search for events either at (i.e., flares) or close to (i.e., CMEs) the Sun. Some of the catalogues relevant to this search include the GOES Soft X-ray Flare List,Footnote 12 the CDAW SOHO/LASCO CME Event List,Footnote 13 and the CACTus SOHO/LASCO CME List.Footnote 14 An initial search of these catalogues in the time range resulted in 15 flare detections, 10 CDAW CME detections, and 11 CACTus CME detections. These results can be narrowed down to three CDAW and three CACTus detections by looking for halo (or very wide, i.e., larger than 180°) CMEs, because the events are known to be Earth-directed. This can be further refined to two CMEs that are present in both detection lists using the event speed and timing information. The resulting CMEs were detected at 1 April 2001 11:26 UT and 2 April 2001 22:06 UT in both lists.Footnote 15 These CMEs have reported speeds of 1475 km s−1 (CDAW) and 774 km s−1 (CACTus) for the first, and 2505 km s−1 (CDAW) and 1300 km s−1 (CACTus) for the second one. The different speeds from the two CME catalogues result from the differing detection methods, and this demonstrates that the user must apply appropriate knowledge for HELIO to produce sensible results. Each of the CMEs has only one flare close to its time: a GOES M5.5 flare with an initial detection time of 1 April 2001 10:55 UT and a GOES X20 flare with a starting time of 2 April 2001 21:32 UT. At this point the event has been tracked from Earth back to the Sun and associated with two halo CMEs, each with one flare.
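
The back-propagation step for Event 1 can be approximated with a constant-speed calculation. The sketch below assumes a Sun–L1 distance of exactly 1 AU, which is an assumption made here for simplicity; SHEBA uses the actual positions, so its interval differs slightly. The function name is illustrative.

```python
from datetime import datetime, timedelta

AU_KM = 149_597_870.7  # astronomical unit in km


def back_propagate(arrival, distance_au, speed_kms, dspeed_kms):
    """Source-time interval for a constant-speed front observed at `arrival`."""
    dist_km = distance_au * AU_KM
    longest = timedelta(seconds=dist_km / (speed_kms - dspeed_kms))
    shortest = timedelta(seconds=dist_km / (speed_kms + dspeed_kms))
    return arrival - longest, arrival - shortest


# Event 1: shock at L1 on 4 April 2001 15:00 UT with 800 +/- 300 km/s.
# The 1 AU distance is an assumption, not the ephemeris value used by SHEBA.
earliest, latest = back_propagate(datetime(2001, 4, 4, 15, 0), 1.0, 800.0, 300.0)
print(earliest, latest)
```

With these assumptions the interval lies within about half an hour of the 1 April 2001 03:28 UT – 3 April 2001 01:06 UT range returned by the propagation service.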

Figure 3

CME use case workflow for the 4 April 2001 event shown in Table 3. The figure is separated into three columns based on heliospheric locations: Sun, Earth, and Mars. Each time series provides contextual data, with the start time (vertical red line, as shown in the corresponding catalogues detailed below) and the prediction (blue, as calculated by SHEBA) marked. The arrows connect the different locations by the use of the propagation model and using the parameters shown on their sides. The top panel shows the solar wind speed measured near-Earth from ACE/SWEPAM at the starting time of the shock (vertical red line) as listed by the Interplanetary Shock catalogue (± 3 days for context). That event is back-propagated to the Sun using SHEBA, which provides a time range where the CME should have started. The time range obtained is then filtered by finding which CMEs have large width (explained in text). The selected CMEs are associated with a GOES M5.5 flare detected on 1 April 2001 10:55 UT, and a GOES X20 flare on 2 April 2001 21:32 UT. The GOES soft X-ray light curve and snapshots of the CME detections by CACTus (CME A on the left, CME B on the right) are shown on the left-hand side. The vertical red lines on the GOES light curve show the flare start times that are subsequently used in the CME propagation model to find the expected time of arrival at Mars. The SHEBA diagram is shown linking the GOES light curve at the Sun and the measurement at Mars by MGS instruments (pressure proxy at Mars calculated from magnetic field measurements at the top, and the background count rate from the Electron Reflectometer at the bottom). The time interval predicted by SHEBA for CME B is shown with vertical blue lines, whereas the red line indicates the time measured by Falkenberg et al. (2011).

The next task is to forward-propagate both CMEs from the Sun into the heliosphere. For clarity we refer to the CME observed on 1 April as CME A and the one observed on 2 April as CME B. The propagation service was employed to estimate the CMEs' arrival times at Mars and at Earth (although the Earth arrival time could seem redundant for this particular use case, it helps to distinguish between the two candidate CMEs) using the time of the flare, speed of the CME, and the CME source position. Figure 3 shows one possible output of the propagation service using the associated flare start time and a CME speed of 800±300 km s−1. This speed is chosen, as it is already known from the previous step to provide the correct time range at Earth. The estimated times of arrival at Earth are 3 April 2001 01:17 UT – 4 April 2001 22:42 UT for CME A and 4 April 2001 00:49 UT – 7 April 2001 03:06 UT for CME B. Comparing both intervals with the ACE measurements of the bulk speed of the solar wind (top panel in Figure 3) indicates that CME B is more likely to correspond to the one recorded by the SOHO/CELIAS/MTOF/PM Interplanetary shock list on 4 April 2001 14:21 UT (the starting point of the use case). Also, close inspection of the CME and flare catalogues and their data products shows some indications that CME A may have lifted off on the east side of the Sun, making it less likely to hit Mars. Taking CME B as the relevant event, the estimated time of arrival at Mars is in the range 5 April 2001 07:00 UT – 8 April 2001 03:59 UT, and a search for instruments located at Mars in this interval yields the MGS Electron Reflectometer (ER) and magnetometer (MAG). The instruments and time range were then used to identify available data and provide download links. These data were subsequently analysed as described in Falkenberg et al. and references therein. The results are displayed in Figure 3, where a shock is detected at Mars in the predicted time range.

Thus, a shock event detected at Earth (L1) was successfully back-propagated to the Sun and associated with two flares and CMEs. These CMEs were forward-propagated first to Earth, which allowed us to identify which of the two corresponded to the back-propagated shock, and then to Mars, where the estimated time of arrival led to the identification of a shock. Each of the events identified above from consideration of the Falkenberg et al. paper could be treated in a similar manner to obtain all relevant data (e.g., X-ray and coronagraph observations, in situ measurements at Earth and Mars).

3.2 SEP Use Case

SEP events are significant increases (both sudden and gradual) in the flux of high-energy particles following solar flare or CME events. These particles can be accelerated up to near-relativistic speeds and travel out into the heliosphere, spiraling along magnetic field lines that are connected to the source region. When a planet or spacecraft intersects the path of a SEP event, the high energy of the particles can damage satellite equipment and endanger the health of astronauts. The investigation of SEPs requires knowledge of the time–space locations of active regions, flares, and planets. The SHEBA propagation model (described in Section 2.2) is used to associate the differing spatial and temporal locations, assuming some properties of the solar wind and the accelerated particles.

We present the case study of a combined flare and CME event occurring on 7 June 2011, which also produced a SEP event at Earth. This event was chosen for its extensive press coverage and popularity within the community when the use cases were proposed. This event resulted from a relatively small flare that accompanied one of the most spectacular CMEs so far in Solar Cycle 24. This was well observed by a myriad of instruments including SDO/AIA, which has exceptional spatial and temporal resolutions.

A GOES M2.5 flare is observed at around 7 June 2011 06:00 UT. Searching the GOES soft X-ray Flare List in the event catalogue for flares on the day of interest yields a precise start time (06:16 UT) and provides a region of origin (NOAA AR 11226 at W64 longitude). The time and location of the flare are used as input for the propagation service, with the assumption of a quiet solar wind with a radial speed of 400±20 km s−1 (typical conditions taken from user experience).

The model output predicts that Earth is directly in the path of the SEP event and that protons traveling at 0.5c would reach Earth in 19.76 min (06:27 UT), assuming the protons were instantaneously accelerated at the catalogued flare start time minus the X-ray light travel time (≈ 8.5 min). A particle speed of 0.5c was chosen because it corresponds to a kinetic energy for protons of 145 MeV, which is a reasonable value to compare with the highest energy channel measured by GOES proton flux detector (> 100 MeV). This predicted arrival time is cross-referenced with the GOES Proton Event List, resulting in a SEP event with a start time of 08:20 UT. Figure 4 shows the workflow used for this case study. This event was well isolated in time, with no other flares occurring on the same UT day, providing a stable environment in the heliosphere into which the event propagates. This contributes to the success of SHEBA in predicting SEP planetary arrival times, as the solar wind magnetic field lines follow a simple Parker spiral geometry. The expected arrival times could be less accurate when analysing a more complicated event or during a more active period, due to the simplicity of SHEBA. As a result, the HELIO propagation service would benefit from the inclusion of a model incorporating more advanced physics.
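
The two numbers quoted in this paragraph, the proton energy at 0.5c and the predicted arrival time, can be checked with a few lines of arithmetic. The sketch below uses the relativistic kinetic energy $(\gamma - 1) m_p c^2$ and simply combines the catalogued flare start time, the light travel time, and the SHEBA travel time given in the text. The function names are illustrative.

```python
import math
from datetime import datetime, timedelta

PROTON_REST_ENERGY_MEV = 938.272  # proton rest-mass energy, m_p c^2


def proton_kinetic_energy_mev(beta):
    """Relativistic kinetic energy (gamma - 1) m_p c^2 for a speed beta*c."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return (gamma - 1.0) * PROTON_REST_ENERGY_MEV


def predicted_arrival(flare_start, light_travel_min, particle_travel_min):
    """Arrival time if particles leave the Sun when the X-rays were emitted."""
    release = flare_start - timedelta(minutes=light_travel_min)
    return release + timedelta(minutes=particle_travel_min)


energy = proton_kinetic_energy_mev(0.5)
arrival = predicted_arrival(datetime(2011, 6, 7, 6, 16), 8.5, 19.76)
print(round(energy), arrival.strftime("%H:%M"))  # prints "145 06:27"
```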

Figure 4

A workflow used to obtain data relevant to the 7 June 2011 SEP event. The top panel shows GOES X-ray light curves for the high-energy (0.5 – 4.0 Å; black curve) and low-energy (1.0 – 8.0 Å; blue curve) channels, with the catalogue flare start time indicated by a red vertical line. The middle panel shows the output of SHEBA for the event, assuming a solar wind speed and flare source location. The particle velocity is assumed to be 0.5c (145 MeV) with the grey band representing the uncertainty in solar wind speed. The bottom panel shows the energetic proton flux curves at Earth for a range of energies, with the flare start time and SHEBA predicted SEP start times indicated by red and blue vertical lines, respectively.

3.3 HSSW Stream Use Case

CHs are large-scale density depletions in the corona and are the source of HSSW streams (Altschuler, Trotter, and Orrall 1972). Fast solar wind accelerated in these regions catches up with slow solar wind at larger radii as the Sun rotates. The region where the slow and fast streams interact is a CIR (Pizzo 1978). In this case study, we aim to connect an in situ solar wind event observed at Earth (L1) with a CH at the Sun. Here the event in question is the detection of a CIR as it sweeps past a solar wind particle detector. We utilise both a catalogue of in situ events (“Stream Interaction Regions from Wind and ACE Data” as described in Jian et al. 2006), and a catalogue of CH detections (data products of CHARM) that relies on EUV images. HELIO services are used to obtain data associated with the event. Conventional analysis techniques are then used to characterise the complete event from the Sun to Earth.

The workflow illustrated in Figure 5 is complicated by the SOHO spacecraft entering “key-hole”Footnote 16 during the event. Analysing an event with missing data highlights the benefit of HELIO in searching for alternate data sources. This workflow incorporates these data to help define the event more completely. When the HELIO front-end interface is used, the steps shown are executed automatically: after the user supplies the model parameters, time ranges are communicated between services and the resulting data sets are delivered.

Figure 5

A workflow describing the HSSW stream use case. A HSSW event detected near Earth is back-propagated to the Sun using the speed and times provided by the “Stream Interaction Regions from Wind and ACE Data” catalogue overplotted on ACE/SWEPAM data (process 1). The HELIO propagation service outputs a longitude where the CH should be found that day (process 2), but in this case no CH detections are available from CHARM as SOHO is in “key-hole”. Alternative data sources include ground-based He I 10 830 Å observations from the CHIP instrument at Mauna Loa Solar Observatory, which show the presence of CHs as regions of enhanced brightness (process 3). In addition, the previous solar rotation (process 4; −27 days) and following solar rotation (process 4+; +27 days) can be queried in the CHARM catalogue, as CHs are often long-lived (CHARM detections are highlighted in red on SOHO/EIT 195 Å images). The propagation service is used to forward-propagate the westernmost edge of the detected coronal holes (processes 5 and 5+), with the results compared to the in situ catalogue. In the ACE/SWEPAM plots, red vertical lines indicate catalogued event start and end times, while blue vertical lines indicate CIR arrival time predictions from the HELIO propagation service.

The first step in the workflow is to choose an event of interest. Here we select the event with the highest speed in the catalogue of stream interaction regions reported by Jian et al. (2006). The chosen event began on 25 March 2004, and a summary is presented in Table 4. The event properties are used as input for the propagation service (process 1 in Figure 5) and the event is back-propagated to the solar surface. This results in a heliographic position on the Sun (process 2) that is matched against detections from CHARM. Unfortunately, there are no catalogued detections from CHARM because of a SOHO/EIT data gap corresponding to SOHO being in “key-hole”.
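The back-propagation step can be illustrated with a constant-speed, radial (ballistic) approximation. This is a minimal sketch under stated assumptions and is not the HELIO propagation service itself: the function name, the 600 km/s stream speed, and the time of day are hypothetical (the Table 4 values are not reproduced here), and the 27-day rotation period is the one used in the text.

```python
from datetime import datetime, timedelta

AU_KM = 1.495978707e8     # astronomical unit [km]
R_SUN_KM = 6.957e5        # solar radius [km]
ROTATION_DAYS = 27.0      # solar rotation period as seen from Earth [days]


def back_propagate(t_arrival, v_km_s, r_km=AU_KM):
    """Ballistic back-propagation of a solar wind stream observed at r_km.

    Returns the time at which the plasma left the solar surface and the
    longitude (degrees west of central meridian, as seen at t_arrival)
    to which its source region has since rotated.
    """
    travel_s = (r_km - R_SUN_KM) / v_km_s
    t_depart = t_arrival - timedelta(seconds=travel_s)
    lon_west_deg = 360.0 * travel_s / (ROTATION_DAYS * 86400.0)
    return t_depart, lon_west_deg


# Hypothetical inputs: 00:00 UT on the event date and a 600 km/s stream.
t_depart, lon_west = back_propagate(datetime(2004, 3, 25), 600.0)
print(f"Departure from the Sun: {t_depart:%Y-%m-%d %H:%M} UT")
print(f"Source longitude at arrival: {lon_west:.0f} deg west of central meridian")
```

Faster streams spend less time in transit, so their source region has rotated less by the time the stream is detected; this is why the in situ speed is a required input for the propagation.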

Table 4 A summary of the Jian et al. (2006) Stream Interaction Region catalogue for February 2004 – April 2004 obtained from HELIO. Events studied in the use case are shaded.

As there are no alternate EUV instruments observing during the time range of interest, we search for instruments observing at the He I 10 830 Å wavelength (process 3), which is commonly used to detect CHs in ground-based observations (Henney and Harvey 2007). Although data are available, there are no catalogues confirming the presence of a CH on disk at this time. However, because CHs may persist for multiple solar rotations, we check both 27 days before (process 4) and 27 days after (process 4+) the propagated time (i.e., 25 February 2004 and 22 April 2004, respectively). Fortunately, CH detections produced by CHARM are available for both days, as shown in Figure 5.

The longitude of the westernmost edge of the CHARM detection is taken to represent the forward edge of the HSSW stream, and hence the location of the CIR. This longitude and the detection time are combined with a user-defined solar wind speed to forward-propagate the CIR Parker spiral (processes 5 and 5+) and generate a predicted travel time for the CIR to reach Earth (i.e., the time taken to catch up with Earth’s orbit due to the rotation of the Sun). A good match is found between the predicted arrival time and the catalogue of Jian et al. (2006), as indicated by the dark shaded rows in Table 4 and the vertical lines in the ACE/SWEPAM plots of Figure 5.
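The forward propagation described above can be sketched with the same ballistic approximation: radially flowing wind from the CH edge reaches Earth only if it leaves while the edge is at central meridian, so the predicted arrival is the central-meridian crossing time plus the radial travel time. The code below is an illustrative sketch, not the HELIO implementation; the function name, sign convention, longitudes, and the 500 km/s wind speed are assumptions.

```python
from datetime import datetime, timedelta

AU_KM = 1.495978707e8     # astronomical unit [km]
R_SUN_KM = 6.957e5        # solar radius [km]
ROTATION_DAYS = 27.0      # solar rotation period as seen from Earth [days]


def forward_propagate(t_detect, lon_west_deg, v_km_s, r_km=AU_KM):
    """Predict when a stream rooted at a CH edge reaches a target at r_km.

    lon_west_deg is the edge longitude at t_detect in degrees west of
    central meridian (negative values lie east of central meridian).
    """
    # Time until the edge crosses central meridian (negative if already past).
    to_meridian_s = -lon_west_deg / 360.0 * ROTATION_DAYS * 86400.0
    # Constant-speed radial travel time from the solar surface to the target.
    travel_s = (r_km - R_SUN_KM) / v_km_s
    return t_detect + timedelta(seconds=to_meridian_s + travel_s)


# Hypothetical inputs: edge 20 deg east of central meridian, 500 km/s wind.
t_arrive = forward_propagate(datetime(2004, 2, 25), -20.0, 500.0)
print(f"Predicted CIR arrival: {t_arrive:%Y-%m-%d %H:%M} UT")
```

In this picture, the predicted time can be compared directly with the catalogued start time of the in situ event; a real CIR forms where fast wind overtakes slow wind, so a constant-speed estimate should be treated as approximate.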

Users can proceed to download associated data sets and begin detailed analysis now that appropriate time ranges and lists of instruments have been determined using HELIO. External services such as the HEK could have been used to determine the presence of a CH during the SOHO “key-hole” period. In the future, it is expected that the HEK catalogues will be accessible through HELIO.

4 Conclusions and Future Directions

In this paper, we have presented three case studies that rely on the event, feature, and data-search capabilities of HELIO.Footnote 17 The effort required to perform these case studies was significantly reduced by this system in comparison to relying on the original data sources for each data set. We have focused on the role of the propagation service in connecting the three search services offered by HELIO. However, other important services aided this work, such as the “context service” that allows the visualisation of contextual information at each workflow step and data mining capabilities offered by AMDA through HELIO. This work illustrates how HELIO aids in the location and acquisition of data across the boundaries of traditional scientific domains.

The chosen case studies involve three different workflows that exhibit the potential of the HELIO infrastructure. The CME use case presented in Section 3.1 shows that a shock event observed at Earth can be used to determine the source location of a CME and predict its arrival time at Mars. The SEP use case, detailed in Section 3.2, shows that HELIO can be used to determine which planets or spacecraft were hit by high-energy particles by taking the Parker spiral into account. Finally, the HSSW use case outlined in Section 3.3 shows the connection between HSSW streams and CHs over different solar rotations. This example additionally showcases the flexibility of HELIO in finding supplementary data sources or alternate times for which the desired data are available.

HELIO is an evolving tool that is constantly being improved. Planned improvements include:

  • Development of new visualisation functions – e.g., the ability to preview data.

  • Addition of more sophisticated propagation models – e.g., semi-empirical and magnetohydrodynamic simulations that are physically more complete.

  • Addition of new algorithms to detect the same features in different data – e.g., the code described in Henney and Harvey (2007) to detect CHs in He I 10 830 Å images.

  • Addition of algorithms to detect the same features in the same data to allow inter-comparison of algorithm outputs – e.g., the Spatial Possibilistic Clustering Algorithm (SPoCA; Barra et al. 2009), which detects CHs in EUV images (for comparison with CHARM).

These and other improvements will be implemented soon.

The primary point of access to HELIO is intended to be the web front-end interface, but it can also be accessed by other means, e.g., the HELIO branch in the SolarSoftWareFootnote 18 (SSW; Freeland and Handy 1998) tree of the IDL environment and workflow tools such as Taverna.Footnote 19 Using HELIO through the Taverna workflow application allows users to share their workflows with the rest of the community using the MyExperiment infrastructure.Footnote 20 The operations described in the CME,Footnote 21 SEP,Footnote 22 and HSSWFootnote 23 use cases have been converted into Taverna workflows and are accessible online.

It is important that the consortia engaged in the development of heliophysics VOs collaborate, in order to accelerate the progression of the field and avoid replication of work. A goal of HELIO is to incorporate knowledge from additional projects, e.g., the Solar-Terrestrial Investigations and Archives (SOTERIA), the Europlanet Research Infrastructure, the Space Weather European Network (SWENET), and the HEK.Footnote 24 This will be achieved by creating inter-catalogue links and utilising algorithms that have been developed for these projects.