Visual analytics of high-frequency lake monitoring data

A case study of multiple stressors on a large inland lake system
  • Mark P. Wachowiak
  • April L. James
  • Renata Wachowiak-Smolíková
  • Dan F. Walters
  • Krystopher J. Chutko
  • James A. Rusak
Regular Paper

Abstract

In recognizing the cumulative effects of multiple stressors on altering aquatic ecosystem function, scientists have become increasingly interested in capturing high-frequency response variables using a variety of sensors. This practice has led to a demand for novel ways to visualize and analyze the wealth of data in order to meet policy and management goals. Time series data collected as part of these monitoring activities are not easily analyzed with traditional methods. In this paper, a visual analytics system is described that leverages humans’ innate capability for pattern recognition and feature detection. High-frequency monitoring of weather and water conditions in Lake Nipissing, a large, shallow, inland lake in northeastern Ontario, Canada, is used as a case study. These visualizations are presented as Web-based tools to facilitate community-based participatory research among scientists, government agencies, and community stakeholders. These analytics techniques contribute to collaborative research endeavors and to the understanding of the response of lake conditions to environmental change.

Keywords

Environmental monitoring · Visual analytics · Data analytics · Web systems · Community-based participatory research · High-resolution data

1 Introduction

With the increasing influence of multiple stressors (e.g., point and diffuse pollution from urban, agricultural, and industrial development, climate change) on freshwater systems, the ability to effectively harness environmental monitoring data is critical in assessing and implementing meaningful policy and decision making for long-term sustainable water management [11]. As a result, many researchers have highlighted the importance of advanced techniques for high-frequency data analysis [10]. Initiatives such as the Global Lake Ecological Observatory Network (GLEON), consisting of teams of limnologists, ecologists, computer scientists, and engineers, represent one such research community where this need is particularly acute [9, 10, 41]. The pooled data and computational toolboxes generated by these and other networked observatories, which provide important insights into key lake processes and their associated stressors, would benefit greatly from improved visualization techniques.
Fig. 1

Lake Nipissing and surrounding region. Buoy sites for 2014 data collection are identified by yellow triangles. (Color figure online)

Analysis of environmental time series is particularly challenging, as advances in sensor technology have facilitated high-frequency data acquisition for a large number of parameters. In addition to their size, the data may be heterogeneous, as they are often collected by multiple sensors at potentially varying sampling frequencies. Consequently, detecting patterns and other useful information is important, as environmental systems often exhibit characteristics that are not well understood and difficult to uncover with standard data mining or classification approaches [33]. To address these problems, new approaches from the emerging areas of data science and data analytics, and in particular spatiotemporal visual analytics, are being harnessed to uncover subtle patterns, trends, and anomalies that may inform process-level understanding [5]. As visual approaches make no assumptions about the data, and utilize the inherent pattern recognition abilities of human users, they provide a powerful complement to traditional statistical, data mining, pattern recognition, and time-frequency techniques [33]. Implemented online, such a system also encourages direct collaboration among environmental scientists, and provides a mechanism for quickly sharing, examining, and assessing data for multiple years using a common interface. Further, this Web-based solution has a longer-term goal of encouraging community-based participatory research approaches and increasing the public’s environmental awareness of water quality data [34].

The current paper applies visual analytics (VA) to environmental monitoring. VA is the science of analytical reasoning assisted by interactive user interfaces [36]. Specifically, VA is applied to the study of Lake Nipissing, the third largest lake in Ontario, Canada (see Fig. 1), and presented through an intuitive Web-based interface. As environmental monitoring is increasingly recognized to be location-specific [9], this work contributes new information specific to a regionally important lake, and presents a case study inquiry for discovering patterns related to in-lake processes and identifying stressors that may contribute to the formation of harmful algae blooms.

In the short term, such an online system encourages direct collaboration among environmental scientists, and provides a mechanism for quickly sharing, examining, and assessing data for multiple years using a common interface. By making these data accessible online, a longer-term goal is to encourage community-based participatory research approaches and to increase the public’s environmental awareness [34]. The VA system described in this paper is a prototype to extend the current public interface (http://iwrc.nipissingu.ca).

2 Motivation for the Study

2.1 Study area and scientific importance

Lake Nipissing is a large (\(873 \,\hbox {km}^{2}\)), shallow (mean depth = 4.5 m) lake, collecting water from a 13,100 \(\hbox {km}^{2}\) watershed of predominantly Precambrian Shield terrain and flowing westward into Georgian Bay via the French River. The lake is of economic, cultural, and ecological importance to businesses, industries, governmental policy-makers, First Nations communities, and residents of North Bay (pop. 54,000) and surrounding communities. Eight major river inflows have been evaluated as either impaired or marginal in water quality designation [21]. In two bays with upstream agricultural activity, total phosphorus concentration in streamflow has been found to exceed provincial water quality objectives, and there is evidence of eutrophic conditions [21]. In recent years, the lake’s walleye (Sander vitreus) population has been reported as highly stressed, with ecosystem change listed as one potential cause of the decline. Increased blue-green algae blooms and invasive aquatic species are also causes for concern [26].

In response to these problems, monitoring of water and weather conditions in several bays on Lake Nipissing was initiated in 2013 by researchers at Nipissing University (North Bay), in collaboration with the Ontario Ministry of Environment and Climate Change and the North Bay-Mattawa Conservation Authority, by installing lake buoys in three bays during the ice-off season.

The visual analytics system described in this paper provides a platform in which to study weather and water quality conditions in Lake Nipissing that may contribute to the occurrence of harmful algae blooms. Processes of lake mixing, thermal stratification and development of low oxygen conditions (e.g., below 2 mg/l) within the water column are of particular interest due to links with delivery mechanisms of phosphorus from lake sediment back into the water column (termed internal loading) [29]. The 2014 observations are the first high-frequency water quality data of their kind collected on Lake Nipissing, and were specifically used to examine how meteorological conditions (including episodic rain/wind events) influenced oxygen concentrations in the water column and the spatial variability among the three bays.

Furthermore, high-frequency collection of lake and meteorological properties, and their exploration through VA, is vital in the construction of descriptive and predictive system models for studying responses to multiple stressors [14], and in validating deterministic ecological models [9].
Fig. 2

Linked map interface (left), user-selectable properties (center), and selectable diverging (first nine color maps), standard, and grayscale color maps

2.2 Visualization and visual analytics

Visualization is a widely studied discipline encompassing not only computer graphics, but also the role of users, human–computer interaction, efficiency, and computational resource management (see [28] for an overview of visualization research and state-of-the-art techniques). In addition, software engineering and validation are now integral considerations in this emerging field. To facilitate the design of effective visualizations, a nested model [27], as well as enhancements to this model [24], has been proposed. This model consists of four nested layers: (1) characterizing the problem domain, in which target users are engaged, and developers become familiar with the problem domain; (2) mapping the problem (in terms of the vocabulary of the domain users) to that of computer and visualization scientists, and defining generic operations and data abstractions; (3) designing visual encoding and interaction, which are particularly important tasks in visual analytics systems where non-specialist users are to benefit from exploratory tools; and (4) developing the algorithms to carry out the visual encoding and interaction. The most important challenges at this level are guaranteeing algorithm robustness and efficiency.

The interactive visualizations used in VA allow subtle trends to be seen, which can be further investigated with rigorous statistical and machine learning methods. Such visualizations are particularly important for real-time analysis of geospatial data, and data that are collected at high frequency [3, 23]. For time series visualization, many new approaches have been proposed to meet specific needs. Many of these techniques explicitly consider analytics (e.g., clustering, principal components analysis) and interaction [2].

2.3 Visual analytics in water quality monitoring

Many examples of water quality visualization and VA are found in the literature. For instance, Accorsi et al. [1] developed a visual analytics approach combining spatiotemporal data mining and interactive map-based visualizations to assess river quality. Smith et al. [34] describe a Web-based platform, built with JavaScript and the Dygraphs library, for visualizing and analyzing multiple time series of the Laurentian Great Lakes region. An approach to continuous monitoring of water quality using mini-boats loaded with sonde probes and a wireless sensor network-based monitoring system is described in [38]. This group has expanded their work to include a Web-based interface to show real-time updates of water quality from the wireless sensor network [39].

3 Methods

3.1 Data acquisition

Three buoys were installed on Lake Nipissing during the 2014 ice-off season to target specific at-risk areas of the lake (Cache Bay, Callander Bay, West Bay). NexSens™ data buoys were placed in Callander Bay (10.5 m depth) and West Bay (8.5 m depth), while a smaller buoy was deployed in Cache Bay due to its shallower depth (3 m). YSI EXO2 water quality sondes (YSI Inc., Yellow Springs, OH, USA) were installed on both Callander Bay and West Bay buoys, controlled by a Campbell Scientific (Edmonton, AB, Canada) CR1000 data logger housed inside each buoy, to record a suite of water quality parameters at a depth of 1 m below the water surface on a 10-min interval.

At each site, an Onset HOBO dissolved oxygen (DO) logger was installed 1 m above the bottom sediment. Buoys were deployed in July and retrieved in late October 2014. Weather data (wind speed, rain, air temperature) for the period of record were collected in the nearby town of Sturgeon Falls by Nipissing University’s Environmental Monitoring Network.

3.2 Data characteristics

The data are continuously valued high-frequency (collected every 10 min) water quality parameters: temperature, conductivity, pH, optical dissolved oxygen, turbidity, chlorophyll, and blue-green algae. Water temperature was also recorded at 1 m (Callander Bay and Cache Bay) and 2 m (West Bay) intervals throughout the water column using a chain of Onset HOBO Pendant™ temperature/light loggers. Additional DO measurements were collected at depth (specific to each site). Data were subjected to a quality control process, in which data outside the sensor range were removed.
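The range check mentioned above can be sketched as follows. This is a minimal illustration in Python rather than the authors' actual procedure: the function names are hypothetical, and the limits in `SENSOR_RANGES` are placeholder values chosen for the example, not manufacturer specifications. Out-of-range readings are replaced with `None` instead of being dropped, which is one possible design choice that keeps the 10-min time grid intact.

```python
from typing import Dict, List, Optional, Tuple

# Placeholder valid ranges per parameter (assumed for illustration only;
# these are NOT the YSI or Onset sensor specifications).
SENSOR_RANGES: Dict[str, Tuple[float, float]] = {
    "water_temp_c": (-5.0, 50.0),
    "do_mg_l": (0.0, 50.0),
    "ph": (0.0, 14.0),
}


def range_filter(
    values: List[Optional[float]], low: float, high: float
) -> List[Optional[float]]:
    """Replace readings outside [low, high] with None, preserving positions."""
    return [v if (v is not None and low <= v <= high) else None for v in values]


def quality_control(
    record: Dict[str, List[Optional[float]]]
) -> Dict[str, List[Optional[float]]]:
    """Apply the range check to every parameter that has a known range."""
    cleaned = {}
    for name, series in record.items():
        low, high = SENSOR_RANGES[name]
        cleaned[name] = range_filter(series, low, high)
    return cleaned
```

For example, `range_filter([7.0, 15.2, -1.0], 0.0, 14.0)` keeps the first pH reading and masks the other two.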

3.3 User requirements

Although the VA system was primarily designed to address scientific questions and to be used by researchers and government agencies, an important goal is also to disseminate lake monitoring data to the general community. Therefore, a Web-based visualization system was considered to be the best means of disseminating these data. Given both professional and general community users, design requirements include: providing temporal and spatial comparisons; intuitive interaction; allowing several properties to be seen simultaneously; providing interactivity in selecting time periods; allowing both long- and short-term trends to be seen; regular data update; allowing multiple views of the same data; and providing summary statistics.

3.4 Interactivity

Interactivity plays a crucial role in time series visualization by guiding users in understanding the relationship between different properties, the effects of user-specified parameters, and in distinguishing interesting and potentially hidden patterns [2]. When designing the interactivity of the system, special consideration was given to exploration (allowing focus on a specific time range or combination of variables), reconfiguration (allowing the time scale to be re-arranged), encoding (by providing different ways of looking at the same data through different visualizations), abstraction (showing overall trends), elaboration (allowing zooming in to different levels of detail), and connectivity (the ability to discover, compare, and evaluate similarities and relationships for different time periods, spatial locations, and choice of variables) [2].

More thorough data exploration is achieved by relating multiple graphs, visualizations, and maps, thereby allowing related data to be linked together so that an operation performed on one graph or map will be reflected on the others, creating a dynamic connection between the information and corresponding visualizations [35]. This linked approach was taken for examining the Lake Nipissing monitoring dataset, where users choose to display a variety of properties based on spatial locations (Fig. 2). The VA system employs line plots, small multiples, horizon plots, multivariate parallel coordinate plots, and three matrix visualizations: depth profiles, windowed cross-correlation, and multiresolution views. In addition to the visualizations, basic statistical measures—mean, standard deviation, median, minimum, maximum, and interquartile range—are shown below the visualizations for user-selected time periods and for specific sites. A global diagram showing the data flow and interactions among the system components and user controls is shown in Fig. 3.
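The summary statistics listed above can be computed for a user-selected period along the following lines. This is a sketch in Python using only the standard library; the function name `summarize`, the `(timestamp, value)` sample layout, and the choice of the "inclusive" quartile method are assumptions made for the example and are not taken from the system itself.

```python
import statistics
from typing import Dict, List, Optional, Tuple


def summarize(
    samples: List[Tuple[float, Optional[float]]], start: float, end: float
) -> Dict[str, float]:
    """Summary statistics for samples whose timestamp lies in [start, end].

    Timestamps may be any mutually comparable values (e.g. epoch seconds).
    Missing readings (None) are skipped.
    """
    values = [v for t, v in samples if start <= t <= end and v is not None]
    if len(values) < 2:
        raise ValueError("need at least two readings in the selected period")
    # Quartiles via linear interpolation between data points (assumed method).
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.fmean(values),
        "std": statistics.stdev(values),  # sample standard deviation
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        "iqr": q3 - q1,
    }
```

Restricting the computation to the selected window mirrors the behavior described in the text, where the statistics follow the user's choice of time period and site.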
Fig. 3

Global view of visualization tools and flow of data

3.5 Line plots

Line plots are the most common and easily understood form of representing time series, and indicate the overall shape of the data over time [2]. The Lake Nipissing bays VA system supports standard line plots with multiple axes (see Fig. 4). A time selector control below the plots allows users to zoom to a time period of interest. Hovering over a series at a specific date/time highlights the data, displaying information about weather and water conditions at the selected date and time.
Fig. 4

Line plots displaying meteorological drivers [air temperature (\(^{\circ }\)C), rainfall amount (mm), wind speed (m/s)] and water quality conditions (dissolved oxygen at 8 m below surface) during the period August 3–23. Air temperature and rainfall amount are shown on the left graph, while wind speed and dissolved oxygen are shown on the right. Hovering over a date/time displays the data for that time in the legend. The time selector for zooming (with a graph of the aggregate of all time series displayed on the plot) is shown below the plot. This time selector aggregate plot is only for display purposes

Fig. 5

Interactive small multiples plot displaying the three meteorological drivers (air temperature, rainfall amount, wind speed), and additional water quality variables (DO, temperature, specific conductivity, pH, turbidity) measured 1 m below surface for the period of 12–26 July

Fig. 6

Interactive horizon plots displaying the meteorological and water quality conditions during the period August 3–23

3.5.1 Small multiples

In small multiples plots (Fig. 5), the graph space is split into individual plots, one for each variable.

The lines representing variables are filled to delineate interesting features in the data [15, 37]. A key advantage of small multiples is the reduction of clutter. However, because time series having a large dynamic range are compressed, inter-series comparisons can be difficult. In the Lake Nipissing bays VA system, individual graphs can be moved through a “drag-and-drop” mechanism, allowing the user to freely juxtapose graphs.

3.5.2 Horizon plots

Like small multiples, horizon plots (Fig. 6) are a space-splitting technique where many time series variables are displayed simultaneously in a small display space, thereby minimizing clutter. They are constructed from a simple line plot and a baseline (zero, the mean, or some user-defined value). Areas above this baseline are colored differently than those below the baseline. Those areas are further split into bands based on their value. In this way, higher absolute values are colored with deeper shades of the primary color. They have been found to be more useful for “drill down” tasks than for at-a-glance overviews, and for “dispersed visual span” tasks, where a long duration of data, represented horizontally, is to be analyzed [15]. The horizon plots option presented here allows the user to choose the baseline value, as well as the number of bands. Changing the number of bands makes identifying sharp features (e.g., minima and maxima, anomalies) easier than is the case with small multiples, which are better suited to identifying coarse trends.
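The banding step of a horizon plot can be illustrated with a short sketch. The Python function below is a hypothetical helper, not code from the Lake Nipissing system: for one sample it returns the side of the baseline on which the value falls and how much of each band is filled, which a renderer could then draw as overlaid layers of increasing color depth. The parameter names `baseline`, `band_height`, and `n_bands` are assumptions standing in for the user-selected baseline and number of bands.

```python
from typing import List, Tuple


def horizon_bands(
    value: float, baseline: float, band_height: float, n_bands: int
) -> Tuple[int, List[float]]:
    """Split one sample into horizon-plot bands.

    Returns (sign, fills): sign is +1 at or above the baseline and -1 below;
    fills[k] in [0, 1] is the filled fraction of band k (band 0 is nearest
    the baseline). Values beyond the last band saturate at 1.0.
    """
    if band_height <= 0 or n_bands < 1:
        raise ValueError("band_height must be positive and n_bands >= 1")
    offset = value - baseline
    sign = 1 if offset >= 0 else -1
    magnitude = abs(offset)
    fills = []
    for k in range(n_bands):
        fill = (magnitude - k * band_height) / band_height
        fills.append(min(1.0, max(0.0, fill)))
    return sign, fills
```

With a baseline of 5.0 and three bands of height 1.0, a value of 7.5 fills the first two bands completely and half of the third, so it would be drawn in the deepest shade of the "above baseline" color.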

3.6 3D parallel coordinate plots

3D parallel coordinates plots (PCPs) are also part of the linked visualizations in the Web-based system (Fig. 7). PCPs are popular for analyzing multivariate data, and allow the user to visually identify mechanistic relationships among several variables simultaneously. In PCPs, multivariate values are represented as piecewise connected polylines (continuous lines consisting of more than one segment) intersecting with parallel axes, which denote variables [17]. PCPs are particularly useful for identifying outliers and trends, and for classification, especially when combined with clustering techniques [13].

In a PCP, d parallel axes, each corresponding to a dimension of the data, are drawn. Additionally, the ordering of the parallel axes can be changed to better observe correlation between two axes (dimensions) that are placed directly beside one another [4].

Many enhancements have been made to standard PCPs [12, 17]. Although they are especially useful for multidimensional discrete data, they can be adapted for time series, in which time is treated as a dimension with a fixed ordering [8, 12].
Fig. 7

3D parallel coordinate plots with date, time, and six properties, colored by five clusters (left). Temporal 3D PCPs rotated to show temporal evolution of variables (right)

A 3D PCP, where buoy locations (e.g., Callander and West bays) comprise the third dimension (as each buoy is uniquely referenced geospatially), is implemented so that every parallel axis has been extended along the z-axis into a plane, as shown in Fig. 7 [18]. An additional option allows time to be displayed in the third dimension, allowing users to examine the temporal evolution of multivariate relationships. These temporal 3D PCPs are particularly useful for studying long-term data over several seasons. The interface allows the user to arbitrarily rotate the 3D PCP.

Opacity controls are provided to reveal the density distribution of the graph, as polylines in dense areas have greater visibility. The polylines can also be grouped into categories [19], with each polyline subsequently colored based on the cluster to which it belongs. The clustering can be based on any combination of d properties, using some distance measure between the d-D data points, further highlighting correlations between the properties. In the current study, the standard k-means clustering approach was used to group values based upon user-specified properties.
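The clustering step can be sketched with a plain implementation of Lloyd's k-means algorithm in Python. This is an illustration only, not the system's code: the helper names are hypothetical, Euclidean distance is assumed as the distance measure, and the centroids are initialized from the first k points purely to keep the example reproducible (random restarts or k-means++ seeding would normally be preferable). Because the properties have different units, standardizing each one before clustering would usually be advisable; that step is omitted here.

```python
import math
from typing import Dict, List, Sequence, Tuple


def select_properties(
    rows: List[Dict[str, float]], names: Sequence[str]
) -> List[Tuple[float, ...]]:
    """Build d-dimensional points from the user-specified properties."""
    return [tuple(row[n] for n in names) for row in rows]


def kmeans(
    points: List[Tuple[float, ...]], k: int, max_iter: int = 100
) -> Tuple[List[int], List[List[float]]]:
    """Lloyd's algorithm; returns (cluster label per point, centroids)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Simplified deterministic initialization: the first k points.
    centroids = [list(p) for p in points[:k]]
    labels = [-1] * len(points)
    for _ in range(max_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged: no point changed cluster
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

The returned labels could then be mapped to a categorical palette so that each polyline takes the color of its cluster.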

To follow sets of polylines across data dimensions, groups of polylines can be selected directly on the plot by “brushing” [6], wherein users draw a “brush line” over the plot to highlight any polylines that the brush line intersects.
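One way to realize such brushing is a segment intersection test between the brush line and each segment of each polyline. The Python sketch below assumes, for illustration, that axis i is drawn at x = i and that property values are already normalized to the plotting range; the function names are hypothetical and the actual WebGL implementation may work quite differently (for example, in screen space after 3D projection).

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def _orient(p: Point, q: Point, r: Point) -> int:
    """Sign of the cross product (q - p) x (r - p): -1, 0, or +1."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)


def _in_box(p: Point, q: Point, r: Point) -> bool:
    """True if r lies in the bounding box of segment pq (for collinear r)."""
    return (min(p[0], q[0]) <= r[0] <= max(p[0], q[0])
            and min(p[1], q[1]) <= r[1] <= max(p[1], q[1]))


def segments_intersect(a: Point, b: Point, c: Point, d: Point) -> bool:
    """True if segment ab and segment cd share at least one point."""
    o1, o2 = _orient(a, b, c), _orient(a, b, d)
    o3, o4 = _orient(c, d, a), _orient(c, d, b)
    if o1 != o2 and o3 != o4:
        return True  # general case: each segment straddles the other
    # Collinear or touching cases.
    return ((o1 == 0 and _in_box(a, b, c)) or (o2 == 0 and _in_box(a, b, d))
            or (o3 == 0 and _in_box(c, d, a)) or (o4 == 0 and _in_box(c, d, b)))


def brush(polylines: List[Sequence[float]], start: Point, end: Point) -> List[int]:
    """Indices of polylines crossed by the brush line from start to end."""
    selected = []
    for idx, values in enumerate(polylines):
        vertices = [(float(axis), v) for axis, v in enumerate(values)]
        if any(segments_intersect(p, q, start, end)
               for p, q in zip(vertices, vertices[1:])):
            selected.append(idx)
    return selected
```

The selected indices can then be highlighted while the remaining polylines are dimmed.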
Fig. 8

Profile view of water column temperature in West Bay (\(^{\circ }\)C) for the period 13 July to 18 October, 2014. Dissolved oxygen concentrations (mg/l) and meteorological drivers are displayed below the plot

The polylines in the PCP can also be colored based on the property chosen for the map, with each polyline’s color matching that of its corresponding point on the map.

3.7 Matrix visualizations

Matrix visualizations provide an intuitive way of visualizing 2D data, usually with time as one of the dimensions. Matrices are represented as “heat maps,” where matrix elements are colored according to some function of their numeric value. The specific coloring scheme greatly affects interpretation. The Lake Nipissing VA system provides user-selected color maps, including grayscale, the standard rainbow (red = high, blue = low) color map, and diverging color maps, which, by providing a transition between colors for low and high values through an unsaturated color (e.g., white), address some of the interpretive shortcomings of rainbow maps [25].
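A diverging color map of the kind described can be sketched as a piecewise linear blend through an unsaturated midpoint. The Python function below is a hypothetical example, not the JavaScript color library used in the system; the blue-white-red endpoints are assumed defaults and the midpoint is taken to be the center of the value range.

```python
from typing import Tuple

RGB = Tuple[int, int, int]


def _lerp(a: RGB, b: RGB, t: float) -> RGB:
    """Linear blend between two RGB colors for t in [0, 1]."""
    return tuple(round(x + (y - x) * t) for x, y in zip(a, b))


def diverging_color(
    value: float,
    vmin: float,
    vmax: float,
    low: RGB = (0, 0, 255),
    mid: RGB = (255, 255, 255),
    high: RGB = (255, 0, 0),
) -> RGB:
    """Map value in [vmin, vmax] to low -> mid -> high; out-of-range clamps."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    t = (value - vmin) / (vmax - vmin)
    t = min(1.0, max(0.0, t))
    if t < 0.5:
        return _lerp(low, mid, t / 0.5)
    return _lerp(mid, high, (t - 0.5) / 0.5)
```

For correlation matrices, choosing `vmin = -1` and `vmax = 1` places zero correlation at the unsaturated midpoint, so positive and negative values are visually separated.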

3.7.1 Depth profiles

Profiles are images of measurements of lake profile properties collected at various depths, such as water temperature and DO (Fig. 8). The profile is generated with linear interpolation. Profile views are also implemented as Dygraphs plotters, and therefore have all the interactive features of the other plots. To relate the profile to other properties, user-selected properties are shown below the profile (e.g., DO concentration in Fig. 8). The time selector can also be used in this view. Synchronized plots of user-selectable variables were added to aid in the interpretation of the heat maps.
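The interpolation step can be illustrated for a single time step. The sketch below assumes, as one plausible reading, that the interpolation runs along depth between adjacent loggers, and that depths above the shallowest or below the deepest logger take the nearest measured value; the function name and these boundary rules are assumptions rather than details given in the text.

```python
import bisect
from typing import List, Sequence


def interpolate_profile(
    depths: Sequence[float], values: Sequence[float], query_depths: Sequence[float]
) -> List[float]:
    """Linearly interpolate one profile (one heat-map column) along depth.

    depths must be strictly increasing and aligned with values.
    Queries outside the measured range are clamped to the nearest logger.
    """
    if len(depths) != len(values) or len(depths) < 2:
        raise ValueError("need at least two aligned depth/value pairs")
    out = []
    for q in query_depths:
        if q <= depths[0]:
            out.append(values[0])
        elif q >= depths[-1]:
            out.append(values[-1])
        else:
            i = bisect.bisect_right(depths, q)  # depths[i-1] <= q < depths[i]
            d0, d1 = depths[i - 1], depths[i]
            w = (q - d0) / (d1 - d0)
            out.append(values[i - 1] + w * (values[i] - values[i - 1]))
    return out
```

Repeating this for every time step yields the columns of the matrix that is rendered as the profile image.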

3.7.2 Windowed cross-correlation

Windowed cross-correlation (WCC) and multiresolution plots are useful for analyzing long-term trends or multiple-season data, some of which may be discovered through the 3D PCP described above. Correlation between time series can uncover relationships and the dynamics of environmental phenomena. However, especially in long time series, the temporal offset between the correlations or changing correlations over time must be considered, or interesting patterns may remain undetected. The windowed cross-correlation shows dynamically varying correlations of one (autocorrelation) or two series [22] (Fig. 9). The time series is divided into intervals/windows of equal length, and the correlations between two user-selected windows of size n are computed, shifting the windows relative to each other [22]. The result is a 2D matrix, where each window corresponds to a row, and each column contains the correlation with lags from \(-n\) to \(n\), where \(n\) is the user-specified number of lags. The matrix is then displayed as a heat map, using user-selected color maps.
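The construction of the WCC matrix can be sketched directly in Python. This example computes Pearson correlations in the time domain for clarity, whereas the system itself is reported to use a Fourier transform routine; the function names, the separate `window` and `max_lag` parameters, and the use of `None` for lags that would run past the ends of the series are all assumptions made for the illustration.

```python
import math
from typing import List, Optional, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> Optional[float]:
    """Pearson correlation of two equal-length windows (None if constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    if va == 0 or vb == 0:
        return None
    return cov / math.sqrt(va * vb)


def windowed_cross_correlation(
    x: Sequence[float],
    y: Sequence[float],
    window: int,
    max_lag: int,
    step: Optional[int] = None,
) -> List[List[Optional[float]]]:
    """Rows are windows of x; columns are lags from -max_lag to +max_lag.

    Entry [r][c] correlates x[start:start+window] with the window of y
    shifted by the lag. step < window gives overlapping windows.
    """
    if len(x) != len(y):
        raise ValueError("series must have equal length")
    step = step or window
    rows = []
    for start in range(0, len(x) - window + 1, step):
        xs = x[start:start + window]
        row = []
        for lag in range(-max_lag, max_lag + 1):
            lo = start + lag
            if lo < 0 or lo + window > len(y):
                row.append(None)  # shifted window falls outside the series
            else:
                row.append(pearson(xs, y[lo:lo + window]))
        rows.append(row)
    return rows
```

Passing the same series as both `x` and `y` gives the windowed autocorrelation case mentioned above.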
Fig. 9

Windowed cross-correlation of air temperature (Sturgeon Falls weather station) and water temperature in West Bay (1 m below the surface). A diverging color map is used

Fig. 10

Multiresolution view of mean, standard deviation, and kurtosis of dissolved oxygen concentration (mg/l). An interactive cursor control (here, at 12 days) displays the time series and standard deviation at the selected resolution below the plot

3.7.3 Multiresolution plots

The multiresolution plot is a matrix visualization in which each row of the matrix represents the time series at a specific time scale [33]. In Fig. 10, the standard color map (red = high, blue = low) demonstrates the utility of these plots to exhibit contrasts. The first row is the original series, and each subsequent row aggregates the data for that time scale. The top row represents the aggregate value of the entire series. Mean, variance, and complexity measures (e.g., entropy) are the most common aggregate measures. In these plots, visually detectable temporal trends and patterns emerge from different resolutions displayed simultaneously. Implementation details are found in [33].
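As a rough illustration of the idea (the actual construction is given in [33] and may differ), a multiresolution matrix can be built by aggregating the series over sliding windows of increasing length. In the Python sketch below, the function name, the sliding-window scheme, and the `scales` parameter are assumptions; with a scale of 1 the row reproduces the original series, and with a scale equal to the series length the row reduces to a single aggregate of the entire series.

```python
import statistics
from typing import Callable, List, Sequence


def multiresolution_matrix(
    series: Sequence[float],
    scales: Sequence[int],
    aggregate: Callable[[Sequence[float]], float] = statistics.fmean,
) -> List[List[float]]:
    """One row per time scale; each entry aggregates a window of that length.

    Row for scale s has len(series) - s + 1 entries, so rows shorten as the
    scale grows and the matrix is triangular rather than rectangular.
    """
    matrix = []
    for s in scales:
        if not 1 <= s <= len(series):
            raise ValueError("each scale must be between 1 and len(series)")
        row = [aggregate(series[i:i + s]) for i in range(len(series) - s + 1)]
        matrix.append(row)
    return matrix
```

Passing `statistics.pstdev` as `aggregate` gives a standard deviation view instead of the mean view; each row can then be colored with the selected color map.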

Multiresolution views can show the seasonal trend and cycles of environmental properties, as well as dynamic trends. Variance (or standard deviation) exhibits the range and distribution of the time series, and is a useful metric for detecting changes due to disturbance [7]. Complexity measures show deviation from normality, which is important in assessing anomalies and dynamic environmental conditions over long time periods. These views are implemented through a custom Dygraphs plotter [40]. Two views are juxtaposed for comparing series. Users can zoom in to different time scales using the horizontal cursor on the view, which then displays a line plot below that particular time slice. Views can be displayed for the mean, standard deviation/variance, and kurtosis, which is a proxy measure of time series complexity.
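For reference, kurtosis over a window of values can be computed from the second and fourth central moments. The Python sketch below returns excess kurtosis (zero for a normal distribution) using population moments; whether the system reports raw or excess kurtosis, or applies a bias correction, is not stated in the text, so this particular definition is an assumption.

```python
from typing import Sequence


def excess_kurtosis(values: Sequence[float]) -> float:
    """Population excess kurtosis: m4 / m2**2 - 3.

    m2 and m4 are the second and fourth central moments. A normal
    distribution gives 0; heavier tails give positive values.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("kurtosis is undefined for a constant series")
    return m4 / (m2 * m2) - 3.0
```

Applied to each aggregation window, this yields a kurtosis view in which windows containing sharp spikes or dropouts stand out from windows with near-normal variation.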

3.8 Implementation

The VA system consists of HTML pages, JavaScript, and PHP for the PostgreSQL database queries. The graphs and matrix visualizations were developed at Nipissing University using custom Dygraphs plotters [40], which provide interactivity (such as the time selector and zooming) and basic functionality. The parallel coordinate plots were implemented directly with WebGL. The color scales were implemented with a JavaScript library, which also provides the functionality to customize the color maps. The calculations for the correlation were performed using a modified JavaScript program for the Fourier Transform. Although the graphs have user-controlled parameters to enhance interpretation, all graphs are supplied with intuitive default values.

4 Results

This section provides an evaluation and a case study demonstrating how the tool can be used by domain experts to gain new insights into lake processes. The various line plots and matrix visualizations are primarily used as examples for short-term findings in a specific study site (West Bay). The 3D parallel coordinate plots, especially the temporal PCPs (as well as the matrix visualizations) will prove valuable for long-term time series, and in analyzing decadal data.

4.1 Line plots, small multiples, and horizon plots

Figure 4 illustrates climate drivers (air temperature, rainfall amount, and wind speed) that can influence lake conditions, as well as an example of water quality conditions measured in West Bay (dissolved oxygen at 8 m below the surface) during the period of 3–23 August, 2014. These variables were selected through the main interface (Fig. 2). Alignment of line plots (e.g., drivers and response variables) illustrates relationships between meteorological conditions and episodic mixing events within West Bay [31], similar to the findings of Jennings et al. [16]. High winds and rainfall starting on Aug 12 led to mixing of the water column, isothermal conditions, and an increase in dissolved oxygen at 8 m depth. These water quality conditions followed a period of lower DO concentrations that developed in early August in a stratified water profile. A similar pattern is seen in Callander Bay, but with DO levels below the 2 mg/l threshold.

For multiple scales of interest (sub-hourly to multiyear high-frequency data), small multiples allow a quick overview of the recurrence of key phenomena (e.g., occurrence of low oxygen conditions) and of the times and monitoring locations where more detailed examinations will be useful. In Fig. 5, preliminary inspection of short-term water quality patterns in West Bay during the period of 12–26 July suggests diel variation in some variables, such as DO and pH (both shown at 1 m depth), that can result from influences of the daily cycle of solar radiation, which in turn generates cycles in air and water temperature and biological processes such as phytoplankton photosynthesis/blooms [20]. Small multiples plots also allow cursory searches for periods where diel patterns may be masked by episodic events (e.g., meteorological events or upstream inflows, either natural or anthropogenic).

In Fig. 6, the baseline for each variable is defined by the mean seasonal value. As a result, the darkest red and blue colors indicate the highest and lowest values, respectively. In complement to Fig. 4, horizon plots clearly illustrate the high wind speeds (dark red) that occur with the meteorological event starting on August 12 and leading to the mixing of the water column. Rainfall amounts are less noticeable on this plot (better illustrated in Fig. 4). The horizon plot also shows the high daily temperatures preceding this event and the lower DO concentrations.

4.2 Matrix visualizations

The depth profile of water temperature in West Bay (Fig. 8) provides a comprehensive view of periods of mixing and thermal stratification extending from 13 July to 18 October, 2014 [31]. During periods of stratification (e.g., early August, preceding the Aug 12 event), the top of the water column warms and colder, denser water at depth in the hypolimnion is prone to decreasing oxygen levels due to isolation from the atmosphere and biological uptake. Under these conditions, phosphorus can be released from sediment back into the water column, as one mechanism of internal loading. During the 2014 ice-off season, Callander Bay showed longer periods of stratification, and Cache Bay (not shown) showed no development of stratification, documenting important differences in conditions that set up across the three bays on Lake Nipissing [31].

The WCC is displayed in varying time frames and resolutions through user-controlled time window length and percent overlap. Zooming capabilities controlled by the time selector in the main plot enable users to visually detect interesting correlations in time series. In Fig. 9, air temperature from the Sturgeon Falls weather station and water temperature at 1 m below surface in Callander Bay show periods of high positive (red) and negative (blue) correlation for the period of 11–27 July, due in part to daytime warming and night-time cooling where water retains heat and cools at different rates than air. Longer periods of positive correlation (e.g., 16–17 July) result from weather patterns, in this instance a cold front generating rain.

The multiresolution view of water quality in West Bay (1 m below surface) can be used to further examine both diel patterns and episodic events, shown here for the period of 11–26 July (Fig. 10). Multiday periods of lower (e.g., 16–17 July) and higher DO (e.g., 21–23 July) suggest an influence, either directly or indirectly, of meteorological forcing. Coarser resolution time scales of DO appear to result in an average behavior that masks diel patterns and emphasizes longer-term (e.g., multiday) influences. These aggregate temporal patterns are otherwise very difficult to extract in more traditional data plots, and provide additional insight into the aggregate properties embedded in high-frequency time series.

5 Discussion

The current prototype Lake Nipissing monitoring system satisfies many of the goals of VA [36], including: representing a large amount of data in a small space (small multiples and horizon plots); enhancing pattern recognition by organizing data in spatial and temporal relationships (linked maps, clustering in parallel coordinate plots); facilitating inference of patterns, relationships, and anomalies that would normally be undetected (matrix visualizations); and, very importantly, presenting an interactive interface that enables exploration in all visualizations. Furthermore, it is expected that near real-time database updates will be implemented in the near future, thereby increasing interest by the general public.

As shown in the Results section, lake phenomena can be analyzed and explored in greater depth. For example, the Lake Nipissing VA can easily identify evidence of anoxic conditions at depth, conditions which generate the potential for the internal loading of phosphorus that can trigger algal blooms. When coupled with additional insights from relationships with air temperature and wind speed, one can also begin to predict the type and severity of these blooms [30, 32]. The historical records provide multiyear evidence of oxygen concentrations below 2 mg/L, considered to be a threshold value. The observations from time series analytics can be complemented with water chemistry variables, as direct evidence is obtained through grab samples.
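Episodes below the 2 mg/L threshold can also be flagged computationally to complement the visual inspection. The following is a minimal sketch under the assumption of a regularly sampled DO series; the function name `low_oxygen_episodes` and the sample values are hypothetical.

```python
import numpy as np


def low_oxygen_episodes(do_mg_per_l, threshold=2.0):
    """Return (start, end) index pairs, end exclusive, of runs below threshold."""
    below = np.asarray(do_mg_per_l, dtype=float) < threshold
    # Pad with False so that runs touching either end still produce edges.
    padded = np.concatenate(([False], below, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return [(int(s), int(e)) for s, e in zip(edges[::2], edges[1::2])]


# Illustrative values only (mg/L), not measurements from the lake.
sample = [5.0, 1.5, 1.0, 3.0, 0.5]
episodes = low_oxygen_episodes(sample)
```

Each returned pair marks one contiguous run of samples below the threshold, which could then be highlighted in the time series plots or cross-checked against grab-sample chemistry.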

Although the VA system enables immediate access to lake data at different scales, data warehousing will become an important component as more data are collected, and will be especially useful for offline data mining and computational analysis at these varying scales.

6 Conclusion

The visual analytics described in this paper can be used to explore the large quantities of data acquired from high-frequency environmental monitoring systems, and employ the innate pattern recognition capabilities of human users. The dynamic and interactive nature of many of these plots provides the flexibility that is directly responsible for insights into data structure that may not be otherwise possible. These features, along with the diversity of visualizations presented here, have been highly valued by researchers interested in exploring both single and multiple variable relationships in high-frequency datasets.

These capabilities of information visualization, combined with computational data analysis, both facilitate analytic reasoning to support lake research and provide a mechanism to disseminate the results to a variety of stakeholders and to the larger community. Therefore, future work includes engaging with scientific users and community stakeholders to enhance the usefulness and efficiency of the system, and further integration of VA with statistical measures and data mining, machine learning, and time-frequency analysis to provide a deeper understanding of process-level lake phenomena.

Acknowledgements

The authors are grateful to the anonymous reviewers for their constructive criticisms and helpful suggestions. MPW is supported by NSERC Discovery Grant 386586-2011. ALJ is supported by the Canada Research Chairs program, Nipissing University, the Canada Foundation for Innovation, and NSERC. The authors thank M. Prescott for QA/QC of the 2014 data, C. McConnell for buoy design, and B. Dobbs, T. Singhe, M. Timson, D. DuVal, and J. Moggridge for programming assistance. The authors also thank GLEON for providing a mechanism and platform for stimulating interdisciplinary collaborations using high-frequency data.

Compliance with ethical standards

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

References

  1. Accorsi, P., Lalande, N., Fabrègue, M., Braud, A., Poncelet, P., Sallaberry, A., Bringay, S., Teisseire, M., Cernesson, F., Le Ber, F.: Hydroqual: visual analysis of river water quality. In: Proceedings of the IEEE Conference on Visual Analytics Science and Technology (VAST), pp. 123–132 (2014)
  2. Aigner, W., Miksch, S., Schumann, H., Tominski, C.: Visualization of Time-Oriented Data. Springer, Berlin (2011)
  3. Andrienko, G., Andrienko, N., Keim, D., MacEachren, A.M., Wrobel, S.: Challenging problems of geospatial visual analytics. J. Vis. Lang. Comput. 22(4), 251–256 (2011)
  4. Blaas, J., Botha, C., Post, F.: Extensions of parallel coordinates for interactive exploration of large multi-timepoint data sets. IEEE Trans. Vis. Comput. Graph. 14(6), 1436–1451 (2008)
  5. Diehl, A., Pelorosso, L., Delrieux, C., Saulo, C., Ruiz, J., Gröller, M., Bruckner, S.: Visual analysis of spatio-temporal data: applications in weather forecasting. Comput. Graph. Forum 34(3), 381–390 (2015)
  6. Edsall, R.M.: The parallel coordinate plot in action: design and use for geographic visualization. Comput. Stat. Data Anal. 43(4), 605–619 (2003)
  7. Fraterrigo, J.M., Rusak, J.A.: Disturbance-driven changes in the variability of ecological patterns and processes. Ecol. Lett. 11(7), 756–770 (2008)
  8. Gruendl, H., Riehmann, P., Pausch, Y., Froehlich, B.: Time-series plots integrated in parallel-coordinates displays. Comput. Graph. Forum 35(3), 321–330 (2016)
  9. Hamilton, D.P., Carey, C.C., Arvola, L., Arzberger, P., Brewer, C., Cole, J.J., Gaiser, E., Hanson, P.C., Ibelings, B.W., Jennings, E., et al.: A global lake ecological observatory network (GLEON) for synthesising high-frequency sensor data for validation of deterministic ecological models. Inland Waters 5(1), 49–56 (2015)
  10. Hanson, P.C., Weathers, K.C., Kratz, T.K.: Networked lake science: how the global lake ecological observatory network (GLEON) works to understand, predict and communicate lake ecosystem response to global change. Inland Waters 6(4), 543–554 (2016)
  11. Heathwaite, A.: Multiple stressors on water availability at global to catchment scales: understanding human impact on nutrient cycles to protect water quality and water availability in the long term. Freshw. Biol. 55(s1), 241–257 (2010)
  12. Heinrich, J., Weiskopf, D.: State of the art of parallel coordinates. In: STAR Proceedings of Eurographics, pp. 95–116 (2013)
  13. Heinrich, J., Weiskopf, D.: Parallel coordinates for multidimensional data visualization: basic concepts. Comput. Sci. Eng. 17(3), 70–76 (2015). doi: 10.1109/MCSE.2015.55
  14. Hipsey, M.R., Hamilton, D.P., Hanson, P.C., Carey, C.C., Coletti, J.Z., Read, J.S., Ibelings, B.W., Valesini, F.J., Brookes, J.D.: Predicting the resilience and recovery of aquatic systems: a framework for model evolution within environmental observatories. Water Resour. Res. 51(9), 7023–7043 (2015)
  15. Javed, W., McDonnel, B., Elmqvist, N.: Graphical perception of multiple time series. IEEE Trans. Vis. Comput. Graph. 16(6), 927–934 (2010)
  16. Jennings, E., Jones, S., Arvola, L., Staehr, P.A., Gaiser, E., Jones, I.D., Weathers, K.C., Weyhenmeyer, G.A., Chiu, C.Y., De Eyto, E.: Effects of weather-related episodic events in lakes: an analysis based on high-frequency data. Freshw. Biol. 57(3), 589–601 (2012)
  17. Johansson, J., Forsell, C.: Evaluation of parallel coordinates: overview, categorization and guidelines for future research. IEEE Trans. Vis. Comput. Graph. 22(1), 579–588 (2016). doi: 10.1109/TVCG.2015.2466992
  18. Johansson, J., Forsell, C., Cooper, M.: On the usability of three-dimensional display in parallel coordinates: evaluating the efficiency of identifying two-dimensional relationships. Inf. Vis. 13(1), 29–41 (2014)
  19. Johansson, J., Ljung, P., Jern, M., Cooper, M.: Revealing structure within clustered parallel coordinates displays. In: Proceedings of the IEEE Symposium on Information Visualization (INFOVIS), pp. 125–132 (2005)
  20. Jones, R.C., Graziano, A.P.: Diel and seasonal patterns in water quality continuously monitored at a fixed site on the tidal freshwater Potomac River. Inland Waters 3, 421–436 (2013)
  21. Kelly-Hooper, F.: The water quality of Lake Nipissing and the contributing watershed. The Wilderness Preservation Committee of Ontario, Toronto (2001)
  22. Köthur, P., Witt, C., Sips, M., Marwan, N., Schinkel, S., Dransch, D.: Visual analytics for correlation-based comparison of time series ensembles. Comput. Graph. Forum 34(3), 411–420 (2015)
  23. Mansmann, F., Fischer, F., Keim, D.A.: Dynamic visual analytics—facing the real-time challenge. In: Dill, J., Earnshaw, R., Kasik, D., Vince, J., Chung Wong, P. (eds.) Expanding the Frontiers of Visual Analytics and Visualization, pp. 69–80. Springer, London (2012)
  24. Meyer, M., Sedlmair, M., Quinan, P.S., Munzner, T.: The nested blocks and guidelines model. Inf. Vis. 14(3), 234–249 (2015)
  25. Moreland, K.: Diverging color maps for scientific visualization expanded. Adv. Vis. Comput. 5876, 92–103 (2009)
  26. Morgan, G.E., Bay, N.: Lake Nipissing Data Review 1967 to 2011, pp. 1–46. Ontario Ministry of Natural Resources, North Bay (2013)
  27. Munzner, T.: A nested model for visualization design and validation. IEEE Trans. Vis. Comput. Graph. 15(6), 921–928 (2009)
  28. Munzner, T.: Visualization Analysis and Design. CRC Press, Boca Raton (2014)
  29. Nürnberg, G.K., Molot, L.A., O’Connor, E., Jarjanazi, H., Winter, J., Young, J.: Evidence for internal phosphorus loading, hypoxia and effects on phytoplankton in partially polymictic Lake Simcoe, Ontario. J. Great Lakes Res. 39(2), 259–270 (2013)
  30. Paerl, H.W., Huisman, J., et al.: Blooms like it hot. Science 320(5872), 57 (2008)
  31. Prescott, M.: Characterizing mixing and stratification in Lake Nipissing embayments through employment of the lake analyzer package and an analysis of meteorological controls. Master’s thesis, Nipissing University (2015)
  32. Rigosi, A., Hanson, P., Hamilton, D.P., Hipsey, M., Rusak, J.A., Bois, J., Sparber, K., Chorus, I., Watkinson, A.J., Qin, B., et al.: Determining the probability of cyanobacterial blooms: the application of Bayesian networks in multiple lake systems. Ecol. Appl. 25(1), 186–199 (2015)
  33. Sips, M., Köthur, P., Unger, A., Hege, H.C., Dransch, D.: A visual analytics approach to multiscale exploration of environmental time series. IEEE Trans. Vis. Comput. Graph. 18(12), 2899–2907 (2012)
  34. Smith, J.P., Hunter, T.S., Clites, A.H., Stow, C.A., Slawecki, T., Muhr, G.C., Gronewold, A.D.: An expandable web-based platform for visually analyzing basin-scale hydro-climate time series data. Environ. Model. Softw. 78, 97–105 (2016)
  35. Theus, M.: Statistical data exploration and geographical information visualization. In: Dykes, J., MacEachren, A.M., Kraak, M.-J. (eds.) Exploring Geovisualization, pp. 127–142. Elsevier, Amsterdam (2005)
  36. Thomas, J.J., Cook, K.A.: A visual analytics agenda. IEEE Comput. Graph. Appl. 26(1), 10–13 (2006)
  37. Tufte, E.: Envisioning Information. Graphics Press, Cheshire (1990)
  38. Tuna, G., Arkoc, O., Gulez, K.: Continuous monitoring of water quality using portable and low-cost approaches. Int. J. Distrib. Sens. Netw. 9, 249598 (2013)
  39. Tuna, G., Nefzi, B., Arkoc, O., Potirakis, S.M.: Wireless sensor network-based water quality monitoring system. Key Eng. Mater. 605, 47–50 (2014)
  40. Vanderkam, D.: Dygraphs Javascript charting library. http://dygraphs.com (2006). Accessed 9 Sept 2017
  41. Weathers, K., Hanson, P.C., Arzberger, P., Brentrup, J., Brookes, J.D., Carey, C.C., Gaiser, E., Hamilton, D.P., Hong, G.S., Ibelings, B., et al.: The global lake ecological observatory network (GLEON): the evolution of grassroots network science. Limnol. Oceanogr. Bull. 22(3), 71–73 (2013)

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Department of Computer Science and Mathematics, Nipissing University, North Bay, Canada
  2. Department of Geography, Nipissing University, North Bay, Canada
  3. Department of Geography and Planning, University of Saskatchewan, Saskatoon, Canada
  4. Dorset Environmental Science Centre, Ontario Ministry of the Environment and Climate Change, Dorset, Canada
