Background

The development and manufacture of animal telemetry devices that exploit the Argos satellite data collection and transmission system (CLS-Argos, http://www.argos-system.org/) began at the Sea Mammal Research Unit (SMRU) in the late 1980s [1]. Research and development are on-going within a dedicated group at SMRU (the SMRU Instrumentation Group; SMRU-IG), using a package of sensors produced by Valeport Ltd (UK). A stable version of these tags, routinely deployed since 2003 on wide-ranging polar marine mammals, is the Conductivity-Temperature-Depth Satellite Relay Data Logger, or CTD-SRDL (Fig. 1).

Fig. 1 Photograph of a CTD-SRDL, with visible hardware components labelled (photograph by Lars Boehme, SMRU). The tag is housed in normal solid epoxy rated either to 500 m or 2000 m depth. Standard sensors include pressure, a wet/dry saltwater switch, temperature and conductivity. The tag has a PC interface, is powered by a primary cell (battery) and the standard version includes an antenna.

A precursor of the CTD-SRDL was deployed on beluga whales in 2001 to study the hydrographic conditions in an Arctic fjord [2], in Svalbard, and on Antarctic fur seals at Bird Island, South Georgia [3]. Some of the first studies to use CTD-SRDLs involved the instrumentation of southern elephant seals from South Georgia and Kerguelen [4, 5], illustrating the usefulness of the devices for jointly studying mobile top predator behaviour and ecology and the ocean environment they inhabit [6]. Data from CTD-SRDLs are used not only to learn about the instrumented animals themselves [4, 7, 8], but also to update bathymetric information in poorly surveyed areas [9] and to collect extensive and sometimes otherwise inaccessible oceanographic data [10–15]; they have become substantial contributors to the global ocean observing system [16–19].

The purpose of this instrument and its data collection protocol is to remotely and autonomously collect and transmit simultaneous hydrographic and behavioural data from air-breathing marine vertebrates. Hence, the protocol is designed for the collection of time-series data over long periods, while being constrained by the physical size of the device and by the bandwidth for data transfer; the latter is limited both by the short time animals spend at the surface and by CLS-Argos technology. These constraints necessitate the use of efficient algorithms to produce small parcels of information, such as data summaries and abstracted data, that have a chance of being received by the ground station once they are transmitted [20–22]. Ultimately, the objective is to achieve the most effective representation of animal behaviour and the in-situ environment, given the constraints on data processing, data storage, and data transmission. A thorough appreciation of the way the data are collected and the algorithms by which they are compressed [20, 23] is critical for the meaningful scientific interpretation of the data that are finally received.

CTD-SRDLs are particularly relevant for studies in which long-term datasets are required, and in cases where archival tags can only infrequently be recovered. These instruments have been deployed on a wide range of animals and have been particularly successful in providing oceanographic information from under-sampled locations, such as remote, ice-covered or near-shore regions in polar oceans [19]. The oceanographic data produced by these tags are widely used and are increasingly being made freely available [24]. For example, there is now a web portal [25] for accessing much of the data collected by these tags in a consistent format and to a standardized accuracy [10]. Here, we document the general structure of the protocol for data collection by CTD-SRDLs and, where useful, give example parameter values. We present a condensed overview of the types of data that are collected in Table 1. We also present schematic representations of the relationships between the different behavioural states (Fig. 3), the protocol for collection of CTD profiles (Fig. 4), and an overview of the possible outcomes as one parcel of information passes through the algorithm (Fig. 5).

Table 1 Data collection and transmission summaries (CTD_GEN_07B)

Review

Aim

In this section we provide a detailed description of the order and hierarchy of events that take place on-board a CTD-SRDL when collecting, processing and sending data, so as to highlight the strengths, limitations and potential applications of the data returned by the instruments. The CTD-SRDL software specification CTD_GEN_07B, developed in 2007 for deployments on southern elephant seals (Mirounga leonina), is used here as a working example to illustrate the data collection, abstraction and compression routines carried out on-board a CTD-SRDL. Variants of this basic program have been developed since then, but this represents the most generally used protocol for CTD-SRDLs.

To illustrate the problem the protocol tries to solve, consider this challenge: about 100 kB of information is collected each day at 1–4 sec intervals, but no more than 1 kB is likely to be received by the ground station. Therefore, data on the parameters of interest need to be fitted into several messages that are not interdependent. CLS-Argos requires these messages to be of fixed size, and no acknowledgement of safe receipt is returned to the CTD-SRDL. Although we present here a generalized version of the software protocol, it is one that illustrates the principles, strategies and trade-offs that form the common backbone of the software.
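To put rough numbers on this challenge, the short Python sketch below works through the arithmetic. It assumes that 1 kB means 1000 bytes and uses the 247-bit message capacity quoted later in this review; the function and variable names are illustrative only and are not part of the tag software.

```python
# Illustrative arithmetic only. The daily volumes come from the text above;
# treating 1 kB as 1000 bytes is an assumption.
COLLECTED_BYTES_PER_DAY = 100_000  # roughly 100 kB sampled on board each day
RECEIVED_BYTES_PER_DAY = 1_000     # roughly 1 kB likely to reach the ground station
MESSAGE_BITS = 247                 # capacity of one transmission, as quoted in this review


def daily_budget(collected=COLLECTED_BYTES_PER_DAY,
                 received=RECEIVED_BYTES_PER_DAY,
                 message_bits=MESSAGE_BITS):
    """Return (reduction factor, whole messages received per day)."""
    reduction = collected / received
    messages = (received * 8) // message_bits
    return reduction, messages


reduction, messages = daily_budget()
print(f"data must shrink about {reduction:.0f}-fold, "
      f"into about {messages} received messages per day")
```

Under these assumptions the on-board data must be reduced roughly a hundredfold, and everything received in a day has to fit into a few tens of independent messages.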

Hardware components

The tag consists of a set of sensors, a microcontroller (programmed mainly in C), a real-time clock, and data storage, linked to a transmitter and antenna to send the information to a satellite. A primary lithium cell provides operating energy (Fig. 1). The microcontroller manages the overall operation of the tag, including the sensors’ operation and duty cycle, collection of data, and data compression, selection and analysis. Monitoring all of these is critical to making the best use of the battery. The standard sensors measure pressure, and the temperature and conductivity of seawater. A wet-dry sensor acts as a surface detector to establish when to attempt to transmit [22] (Fig. 1). Variants have included other sensors for measuring variables such as swimming speed, fluorescence, and oxygen.

Software systems

The software design for CTD-SRDLs strives to overcome a set of constraints imposed primarily by three components of the system: 1) the hardware, which is constrained by size and by the limited energy of the battery; 2) the communication system, which is constrained by the limited availability and bandwidth of the satellite system used to relay data to the ground station; and 3) the behaviour of marine animals, which compounds the other constraints through brief and infrequent surfacing, together with extended periods at sea. Details of the interaction of these three factors can be found in Table 2 (also Fig. 2). Aspects of the collection and compression routines for the behavioural and hydrographic data have already been described in [21, 23] and [22], respectively. These are presented succinctly in Tables 3 and 4.

Table 2 Data relay

Fig. 2 A diagram of the relationship and trade-offs between the rate of data collection (sampling rate) and transmission rate over the lifetime of a CTD-SRDL. The role of the pre-deployment configuration stage of the protocol is to bring this balance as close to optimal as possible (for details see Additional file 1: Supplementary material S1).

Table 3 Data abstraction with the broken-stick model
Table 4 Data collection by hydrographic sensors

Protocol overview and rationale

The tag can be activated prior to deployment by simulating submergence (allowing a current to flow between the two copper lugs in the front of the tag). Otherwise, the tag is activated automatically after deployment as soon as the wet-dry sensor is submerged, and does not switch off until it runs out of energy. The basic sampling regime generally occurs at a 4 sec interval and consumes little energy, but some sensors, such as conductivity, and depth to a certain extent, cannot be operated continuously and are switched on when needed according to specific parameters (Fig. 4, Additional file 1: Supplementary material S4).

The protocol overview is as follows. First, temperature, pressure and wet/dry sensors are sampled at maximum temporal resolution (i.e., every 4 sec, or every 0.5 sec if within 6 m of the surface), and at maximum precision. Functional models of behaviour are used to structure the data in a biologically intuitive way. Standard behavioural categories are “diving”, “hauled out” and “at the surface” (Table 1 and Fig. 3 for overview, Additional file 1: Supplementary material S2 for details). Data selection and compaction techniques are applied to each data type to reduce the physical memory required to store the data [23] (Table 3). A memory store of appropriate size is created, according to the deployment’s data collection priorities, to hold the data for later transmission. The tag continually monitors the timing (when) and frequency (how many attempts) of data transmission to control energy costs and reach its targeted lifetime. Due to unpredictability in the behaviour of the tagged animals, transmission is to some degree opportunistic. A number of transmission targets can be set during pre-deployment configuration. The tag monitors progress towards the next target and adjusts an hourly transmission quota accordingly (Table 2).
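The quota adjustment described at the end of this overview can be pictured with a minimal Python sketch. The actual rule is specified in Table 2 and the supplementary material; the version below simply spreads the transmissions still owed to the next target evenly over the hours that remain. The function name, its arguments and the hourly cap of 90 are hypothetical and chosen only for illustration.

```python
import math


def hourly_quota(sent_so_far, target_total, hours_to_target, max_per_hour=90):
    """Transmissions allowed in the coming hour (illustrative rule only).

    sent_so_far     -- transmissions already made
    target_total    -- cumulative transmissions wanted by the next target
    hours_to_target -- hours left until that target
    max_per_hour    -- hypothetical upper limit on transmissions per hour
    """
    if hours_to_target <= 0:
        return 0
    remaining = max(target_total - sent_so_far, 0)
    return min(math.ceil(remaining / hours_to_target), max_per_hour)


# A tag that is behind schedule is given a larger quota than one that is ahead.
print(hourly_quota(sent_so_far=1000, target_total=5000, hours_to_target=200))
print(hourly_quota(sent_so_far=4500, target_total=5000, hours_to_target=200))
```

A rule of this kind lets the tag catch up after periods in which the animal's behaviour prevented transmission, while still pacing energy use towards the targeted lifetime.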

Fig. 3 A flow diagram representing the relationship between the behavioural states defined by the protocol and the actions associated with the collection of each data type (for details see Table 1, Additional file 1: Supplementary material S2).

Data are organised into separate data templates called “pages”, each of which must fit into the data space allowed for a single transmission. The way the information is organised on each page is flexible and is tailored to user requirements. Specific parameter values are discussed in the Additional file 1: Supplementary material. Throughout the operation of the instrument, calculations are performed on the information received from the sensors to summarize data using a predefined structure for each type (Table 1 and Additional file 1: Supplementary material). The lifetime of the tag is largely determined by the use of these sensors and by the number of attempted transmissions.

Pre-deployment software configuration

Careful consideration of the data collection priorities prior to tag deployment is vital for getting the most useful information out of each deployment. Different studies may address a wide variety of questions from a broad range of species, so the controlling software must be easily adapted to change the balance of data sent and the rate of transmission. This need for flexibility demands complex data collection software. Sensor activation, sampling rates and duty cycles are controlled by parameters that can be modified by fixed or dynamic schedules. Program specifications for each deployment are adjusted specifically in accordance with user requirements based on the data collection priorities, the behaviour of the target species, the questions of interest, and the range of hydrographic conditions the animal is expected to encounter. Only one data type can be maximised in each deployment. The amount of all other information likely to be received is then adjusted according to the constraints put in place by having to maximise the chosen data type.

Several steps are involved in balancing the data collection priorities with what is feasible, given the species and the likely conditions the animal platform will encounter. First, the number of events of each type (CTD record, dive record, haulout record, cruise record, behavioural data summary record; Table 1, Figs. 3, 4 and 5) that will occur over the deployment is estimated according to the biology of the study species. The probability of receiving any one transmission varies geographically, decreasing with proximity to competing transmitters and, because CLS-Argos satellites are polar orbiting, with distance from the poles. Reception probability is estimated for each deployment based on its expected geographical location and extent. To configure the software to collect the desired data, the user needs to specify the minimum proportion of events (e.g. 10 % of dives, 90 % of summaries) that it is acceptable to receive during the deployment. This sets a lower bound on the resolution at which different data types will be collected. It is then possible to decide which quantity to maximise (e.g. number of CTD profiles expected to be received). Next, template pages are created for the dive, summary, and CTD data types, specifying the number of bits needed to achieve the required resolution. The required resolution of each data type is based on the study’s priorities, which dictate the proportion of all transmissions that should be devoted to each data type. Each page may contain several data items within the maximum of 247 bits available for a single transmission. The proportion of transmissions assigned to each template and the CTD sampling rate are modified such that the target quantity is maximised and other data types each achieve the minimum number of receptions. Note that changing the CTD sampling rate affects the energy available for transmissions and hence the expected number of data items received. The sampling rate is tuned until an optimal balance between sampling and transmission is achieved.
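The tuning step can be illustrated with a toy model in Python. This is a sketch under strong simplifying assumptions, not the configuration procedure used by SMRU-IG: every number (battery capacity, energy costs, reserved transmissions, reception probability) is hypothetical and in arbitrary units, each CTD profile is assumed to need two pages, and receptions are treated as independent.

```python
def expected_profiles(profiles_per_day, days=200, battery=100_000.0,
                      cost_per_profile=20.0, cost_per_tx=1.0,
                      reserved_tx=30_000, p_receive=0.1):
    """Expected number of complete CTD profiles received (toy model).

    All defaults are hypothetical. Energy spent on sampling is no longer
    available for transmitting, so more profiles means fewer repeats of each.
    reserved_tx stands for transmissions set aside so that the other data
    types still reach their minimum number of receptions.
    """
    n_profiles = profiles_per_day * days
    energy_left = battery - n_profiles * cost_per_profile
    ctd_tx = energy_left / cost_per_tx - reserved_tx
    if n_profiles == 0 or ctd_tx <= 0:
        return 0.0
    sends_per_page = ctd_tx / (2 * n_profiles)           # two pages per profile
    p_page = 1.0 - (1.0 - p_receive) ** sends_per_page   # page received at least once
    return n_profiles * p_page ** 2                      # both pages are needed


def best_sampling_rate(candidates=range(0, 13)):
    """Candidate daily sampling rate that maximises expected receptions."""
    return max(candidates, key=expected_profiles)


print(best_sampling_rate())
```

With these made-up values the optimum lies at an intermediate sampling rate: sampling too rarely collects few profiles, while sampling too often leaves too little energy to transmit each one often enough to be received.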

Fig. 4 A flow diagram representing the collection of CTD data (for details see Table 4, Additional file 1: Supplementary material S4).

Fig. 5 A flow diagram representing the passage of a given data parcel through the protocol. If the criteria for the collection of a CTD profile are not met, no actions are carried out.

Sometimes there is no satisfactory solution to the trade-off between the CTD sensors’ sampling rate and the expected number of data items received, e.g. if battery capacity is inadequate (for long deployments), or if the combined effect of the animal's surfacing behaviour and CLS-Argos transmission frequency restrictions does not allow enough transmissions (for short deployments). In these cases, the constraint on the number of receptions can be relaxed to reduce the required battery power, or two CLS-Argos transmitter IDs can be used to increase transmission frequency.

An important assumption is made with regard to data relay by CTD-SRDLs; namely that all pages are equally likely to be transmitted and received and that there is no spatial or temporal bias in the type of data that are transmitted and subsequently received. Spatial and temporal bias can arise when one type of behaviour is carried out intensively (for example, if a long bout of diving is common prior to a long surface period), and when satellite coverage is particularly good or bad, in one area or for a time. Data are held in memory “buffers” to try to make this assumption approximately valid (Tables 2 and 5, Additional file 1: Supplementary material S1). The use of memory buffers makes the received data more representative of behaviour across the whole range of spatial and temporal locations visited during a deployment.

Table 5 Dive records

Software procedures during deployment

During operation, the information received from the tag’s sensors triggers processes that extract and summarize data appropriately for each data type (Table 1, Fig. 3 and Additional file 1: Supplementary material S2, S3 and S4), and regulate the transmission scheduling of the tag, so as to take account of remaining battery power (Table 2). The behavioural model, parameterised appropriately for the study species (for parameter values routinely used for elephant seals see Table 1, Fig. 3; also Additional file 1: Supplementary material S2), is used to partition behaviour into three mutually exclusive and exhaustive states of haulout, dive, and surfacing behaviour. Behaviour is summarised over blocks of time, typically 4 or 6 hrs, to allow diurnal patterns to emerge. The summary statistics for each time block represent the entirety of the data recorded during that period, not a statistical sample. A subset of dive records is transmitted with the summaries, providing a more detailed representation of behaviour in relation to environmental information, within the context of the summary information. The received dive records are a subset because not all records that are collected and transmitted are received. It is assumed that the dives in the transmission buffer provide an unbiased sample of dives, from which dives are missing at random (Table 5, Additional file 1: Supplementary material S3).
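As an illustration of how block summaries can cover every sample rather than a subset, the Python sketch below tallies the proportion of samples spent in each behavioural state per time block. It assumes regularly spaced samples that have already been assigned to one of the three states; the input format, function name and state labels are hypothetical and do not reproduce the on-board code.

```python
from collections import Counter, defaultdict

STATES = ("dive", "surface", "haulout")


def summarise(samples, block_hours=6):
    """Proportion of samples in each state, per time block.

    samples -- iterable of (seconds_since_start, state) pairs, assumed to be
               regularly spaced so that counts are proportional to time.
    """
    block_len = block_hours * 3600
    counts = defaultdict(Counter)
    for t, state in samples:
        counts[t // block_len][state] += 1
    summary = {}
    for block, tally in counts.items():
        total = sum(tally.values())
        summary[block] = {s: tally[s] / total for s in STATES}
    return summary


samples = [(0, "dive"), (4, "dive"), (8, "surface"), (12, "haulout"),
           (21600, "haulout")]
print(summarise(samples))
```

Because every sample contributes to its block, a received summary describes the whole period even when none of the individual dive records from that period is received.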

When dives and hydrographic profiles are collected, the raw data are submitted for further processing using a broken-stick model (BSM) [23, 26] (Table 3, Fig. 3). For dives, four iterations of the algorithm are carried out, resulting in six times and six depths, including two surface points at the beginning and the end of the dive (Table 5, Additional file 1: Supplementary material S3). For hydrographic data, oceanographic requirements can, in some cases, determine specific depths at which temperature and salinity points must be collected; otherwise, data are processed by BSM in the same way as XBT casts [22, 26] (Additional file 1: Supplementary material S4). All numeric samples are coded as a proportion of each variable’s range and stored together with the minimum and maximum values. Digitising information in this way reduces the memory required to store the data records and allows them to be fitted as closely as possible into the prescribed message size for transmission.
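The broken-stick idea can be sketched in a few lines of Python. This is a generic illustration of the technique, assuming the usual formulation in which each iteration adds the sample that lies furthest (in depth) from the current piecewise-linear approximation; the on-board implementation is described in [23] and may differ in detail. The function name and the example dive are hypothetical.

```python
def broken_stick(times, depths, iterations=4):
    """Return the indices of the points kept by a broken-stick abstraction.

    Starts from the first and last samples and, at each iteration, adds the
    sample with the largest absolute depth difference from the straight
    lines joining the points selected so far.
    """
    selected = [0, len(times) - 1]
    for _ in range(iterations):
        best_idx, best_err = None, 0.0
        for left, right in zip(selected, selected[1:]):
            t0, d0 = times[left], depths[left]
            t1, d1 = times[right], depths[right]
            for i in range(left + 1, right):
                interp = d0 + (d1 - d0) * (times[i] - t0) / (t1 - t0)
                err = abs(depths[i] - interp)
                if err > best_err:
                    best_idx, best_err = i, err
        if best_idx is None:  # the profile is already reproduced exactly
            break
        selected.append(best_idx)
        selected.sort()
    return selected


# Hypothetical dive sampled every 4 sec, starting and ending at the surface.
times = list(range(0, 44, 4))
depths = [0, 40, 90, 120, 100, 130, 150, 110, 60, 20, 0]
kept = broken_stick(times, depths)
print([(times[i], depths[i]) for i in kept])
```

Four iterations on this example keep four interior points plus the two surface points, giving the six times and six depths mentioned above.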

After compression, related data types are grouped into pages according to the page templates (e.g., dives are stored in groups of three) and stored in the transmission buffer (Additional file 1: Supplementary material S1). In addition, nine bits of error checking code are interleaved in each page to allow most errors in received transmissions to be detected. Each page must fit into the bit limits (239 + 9 = 248 bits) of a single CLS-Argos message. Pages of data are drawn from the buffer for transmission in proportion to the desired likelihood of reception specified in the pre-deployment configuration. This is necessary because there is no two-way communication to indicate whether a given page of data has been successfully received by the satellite.
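Coding values as a proportion of their range, and checking that a page template fits the space available, can be sketched as follows in Python. The function names, bit widths and the payload size used in the example are hypothetical; the real page templates and the error-checking code are not reproduced here.

```python
def quantise(values, bits):
    """Code each value as an integer proportion of the range [min, max]."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    span = hi - lo
    codes = [round((v - lo) / span * levels) if span else 0 for v in values]
    return lo, hi, codes


def dequantise(lo, hi, codes, bits):
    """Recover approximate values from the codes and the stored range."""
    levels = (1 << bits) - 1
    return [lo + (hi - lo) * c / levels for c in codes]


def page_fits(field_bits, payload_bits):
    """True if the fields of a page template fit into the available payload."""
    return sum(field_bits) <= payload_bits


lo, hi, codes = quantise([10.0, 12.5, 20.0], bits=8)
print(codes, dequantise(lo, hi, codes, bits=8))
print(page_fits([8] * 12, payload_bits=100))  # hypothetical template and payload
```

The number of bits given to each field sets the resolution of the recovered value: the largest rounding error is half of one step, i.e. the range divided by twice the number of levels.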

Avoiding transmission bias

An additional technique is used to avoid transmissions being biased towards periods of better satellite coverage or changes in the animal’s behaviour that favour transmission. Typically, five different pages are produced and stored in the transmission buffer, ready for transmission by CLS-Argos. Two types of page (labelled here: 1a and 1b) contain detailed information about dives, one page contains the summary data (2) including behavioural information regarding any haulouts and cruises for one summary period, and the last two pages each contain part of a CTD profile (3 and 4) because the hydrographic data are spread over two separate messages (fixed points and broken-stick points, when applicable). Pages are selected for transmission in a prescribed sequence according to their type (for example: 1a 1b 2 3 4 3 4) in order to achieve the correct proportion of each type of information as described above. The sequence of selection repeats indefinitely, cycling through the information in the transmission buffer.
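The effect of a repeating page sequence on the share of transmissions given to each page type can be checked with a short Python sketch. The sequence is the example quoted above; the function names are illustrative.

```python
from collections import Counter
from itertools import cycle, islice

SEQUENCE = ["1a", "1b", "2", "3", "4", "3", "4"]  # example order from the text


def page_schedule(n, sequence=SEQUENCE):
    """Page types used for the next n transmissions."""
    return list(islice(cycle(sequence), n))


def page_shares(sequence=SEQUENCE):
    """Long-run share of transmissions devoted to each page type."""
    counts = Counter(sequence)
    return {page: counts[page] / len(sequence) for page in counts}


print(page_schedule(9))
print(page_shares())
```

In this example each CTD page type appears twice per cycle, so CTD data take four of every seven transmissions; changing the sequence is one way to shift that balance.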

When it is time for a transmission, the page template is applied to the next unexpired item in the appropriate circular buffer. Sending pages more often increases their probability of reception. An item may remain in the buffer for many of these cycles before being discarded when it reaches its “sell-by” date (e.g. each summary message is transmitted 40 times on average to achieve a 95 % probability of being received, for more detail see Additional file 1: Supplementary material S1). The independence of the insertion process (as the data are collected and processed) and the extraction process (as the data are selected from the buffer and transmitted) ensures that the likelihood of data being received is not biased by the animal’s behaviour at the time of transmissions. The effect is that of randomizing the information in the buffer so that each data item is equally likely to be received.
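The relationship between repeated sending and reception probability, and the decoupling of insertion from extraction, can be sketched in Python as follows. The probability functions assume independent transmissions with a constant per-transmission reception probability, which is a simplification; the buffer class, its method names and the shelf-life value are hypothetical and are not the on-board implementation.

```python
import math
from collections import deque


def p_received(p_single, n_sends):
    """Probability that at least one of n_sends independent sends is received."""
    return 1.0 - (1.0 - p_single) ** n_sends


def sends_needed(p_single, p_target):
    """Smallest number of sends that reaches the target reception probability."""
    return math.ceil(math.log(1.0 - p_target) / math.log(1.0 - p_single))


class TransmissionBuffer:
    """Circular buffer whose items expire after a fixed shelf life (sketch)."""

    def __init__(self, shelf_life):
        self.shelf_life = shelf_life
        self._items = deque()  # (time_inserted, payload)

    def insert(self, now, payload):
        self._items.append((now, payload))

    def next_item(self, now):
        """Discard expired items, then return the next one and requeue it."""
        self._items = deque((t, p) for t, p in self._items
                            if now - t < self.shelf_life)
        if not self._items:
            return None
        item = self._items.popleft()
        self._items.append(item)
        return item[1]


# Under the independence assumption, 40 sends reach about 95 % when each
# send has roughly a 7 % chance of being received.
print(round(p_received(0.0722, 40), 3))
```

Because `insert` and `next_item` are called independently, what is sent at any moment depends on the buffer contents and not on what the animal is doing at the time of transmission.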

Conclusions

The CTD-SRDL is an animal telemetry device whose hardware and software have been developed over the course of more than two decades. Its soft- and hardware systems are purpose-built for the remote collection of data on the biology of marine animals and their environment, and this is the first time they have been holistically reviewed. The evolution of the instrument is on-going, driven both by biological and oceanographic questions. The field of animal telemetry has expanded dramatically since the first prototype predecessor of a CTD-SRDL was deployed and transmission technologies have diversified substantially. Nonetheless, the CTD-SRDL is still the most commonly used instrument for collecting information about widely ranging species in high latitude areas. Almost 500 000 CTD profiles and 5 million dive profiles have been collected by CTD-SRDLs since 2003 (SMRU-IG unpublished data, June 2015). These CTD profiles are not only numerous but also of global importance. They have contributed substantially to the World Ocean Database [29] by providing hydrographic information from areas never directly surveyed [16, 24] and now comprise 70 % of all oceanographic profiles ever collected south of 60 °S, with increasing contribution in the Arctic [19]. At the poles, the CLS-Argos system continues to be the most effective satellite transmission platform for animal data, particularly when the animals a) do not spend long enough periods at the surface or hauled out to allow the use of the Iridium communications system, or b) do not haul out in areas that allow the use of other transmission technologies such as GSM. The species that have been instrumented with CTD-SRDLs most often include southern and northern elephant seals (M. leonina and M. angustirostris), Weddell seals (Leptonychotes weddellii), grey seals (Halichoerus grypus), hooded seals (Cystophora cristata) and several species of sea turtle.

The outlook for future development of CTD-SRDLs, as for other animal telemetry devices, includes reducing the size and increasing the battery life of the device. Neither of these is a trivial challenge. Reduction in size is severely limited by the size of the conductivity and temperature sensors, while making the tag more energy efficient depends on how much energy the sensors use as well as on the overall energy provided by the battery. The addition of novel sensors, such as oxygen and pH, which would deliver highly desirable and previously unobserved datasets, is most often at odds with reducing the size and energy requirements of the device. These sorts of developments are constantly being researched but involve implicit challenges imposed by the existing framework of limited memory and energy as well as stringent bandwidth constraints.

The trade-off between precision and bias, highlighted above, is as relevant here as it is in other applications of optimisation. Maximising timing and frequency of data transmission is important, but not if the most informative bits of data (i.e., precision) are lost in the process. The key objective for this protocol is to collect the most relevant data while dealing with the sometimes strict constraints of technology and animal behaviour. The dynamic modification of the transmission rates on-board telemetry devices, due to either extrinsic or intrinsic factors affecting transmission, could result in bias. The use of memory buffers, in the case of CTD-SRDLs, is intended to ameliorate this but may, as a consequence, mask the degree of intensity of behaviours, which is in turn important information regarding the animal platform. Due to multifaceted objectives and trade-offs, the usefulness of any device for the remote collection of animal observations will only be as good as the overlap between the capabilities of the device, the specific behaviour of the animals and the scientific questions to be answered using the resulting data. We hope that this review will promote appropriate usage of both historical CTD-SRDL data and data that continue to be collected, as well as contribute to the on-going development of animal telemetry devices.