SwarmSight: Measuring the temporal progression of animal group activity levels from natural-scene and laboratory videos
We describe SwarmSight (available at https://github.com/justasb/SwarmSight), a novel, open-source, Microsoft Windows software tool for quantitative assessment of the temporal progression of animal group activity levels from recorded videos. The tool utilizes a background subtraction machine vision algorithm and provides an activity metric that can be used to quantitatively assess and compare animal group behavior. Here we demonstrate the tool’s utility by analyzing defensive bee behavior as modulated by alarm pheromones, wild-bird feeding onset and interruption, and cockroach nest-finding activity. Although more sophisticated, commercial software packages are available, SwarmSight provides a low-cost, open-source, and easy-to-use alternative that is suitable for a wide range of users, including minimally trained research technicians and behavioral science undergraduate students in classroom laboratory settings.
Keywords: Insects · Birds · Bees · Cockroaches · Group activity · Motion detection · Activity detection · Activity levels · Video motion analysis · Natural scenes · Activity change · Automated behavior assessment · Software · SwarmSight
Measuring the complex behavior of animals that often act in groups has been a challenge. In classic work, Altmann (1974) discussed the strategies of managing the complexity of assessing social animal behavior by, among other things, using randomly timed sampling sessions and limiting the number of continuously tracked (focal) animals. Since then, tracking individual animals and humans using video from digital cameras has become relatively straightforward. However, the automated assessment of the activity of large groups of animals in their natural environments has yet to see the same advances as individual tracking. Although manually assessing group movement is usually possible, automating this process with computer vision software can greatly increase the speed of data collection and expose the details of group movement structure to an extent that was not available before. Here, we briefly cover the progress of group activity detection software and how a new tool—SwarmSight—advances the state of the art.
Many animals are social and exhibit complex group behavior. For example, some fish gather together into shoals while migrating, searching for food, and avoiding predators (Cushing & Jones, 1968). Similarly, many birds accomplish similar goals and optimize group aerodynamic efficiency through flocking (Higdon & Corrsin, 1978). Insects like locusts form swarms as a response to overcrowding (Collett, Despland, Simpson, & Krakauer, 1998), and bees, wasps, and termites swarm when searching for a new colony (McGlynn, 2012). Some insect colonies, like those of stingless bees, release many individuals to form a swarm to defend against raiding species (David, 2006). These examples demonstrate that group dynamics play an important role when studying animal feeding, migratory, and defensive behaviors.
Group behavior is also present in humans, who interact with each other in a group setting as pedestrians, as drivers in car traffic, and as shoppers in retail establishments. The latter two categories play important economic roles in modern human society. For example, in 2010 traffic congestion in the U.S. resulted in economic losses of approximately US$101 billion (Lindsey, 2012), whereas in 2015, U.S. non-online sales totaled approximately US$4.4 trillion (U.S. Department of Commerce, 2015). Humans also form complex social relationships and behave differently when they are members or leaders of larger groups (Dyer, Johansson, Helbing, Couzin, & Krause, 2009). Insights into some aspects of human leadership can be gained from studying animal and agent-based computer models (Quera, Beltran, & Dolado, 2010), which are easier to manipulate experimentally. Thus, the tools that are used to assess animal group behavior can be applied to study specific human group behaviors.
With minimal use of technology, entomologists, ornithologists, and psychologists can visually observe movement or flight behaviors, use handheld counters, and record their scores and observations in computer worksheets (Altmann, 1974; Boch, Shearer, & Petrasovits, 1970; Wyatt, 1997). However, when the number of individuals in the group is too large to be tracked reliably by a human observer, only more extreme manifestations of the behavior of interest may be feasible to track. For example, when assessing defensive bee behavior, sting attacks may be recorded, whereas changes in fairly stereotypical, erratic flight behavior might be scored only qualitatively (Jones et al., 2012; Pickett, Williams, & Martin, 1982). However, this method does not provide for a detailed analysis of the temporal progression of erratic flight behavior. To obtain more precision, an individual animal’s behavior can be tracked and assessed by means of watching videos of the recorded behavior and manually scoring behavior on a frame-by-frame basis (Dyer, Johansson, Helbing, Couzin, & Krause, 2009; Grüter, Kärcher, & Ratnieks, 2011). However, that type of scoring is tedious, time consuming, and error prone.
Computer systems have been used to automate some of these tasks and do not suffer from cognitive load, attention, subjectivity, and fatigue limitations (B. R. Martin, Prescott, & Zhu, 1992; P. Martin & Bateson, 1993; Noldus, Spink, & Tegelenbosch, 2001; Olivo & Thompson, 1988; Spruijt & Gispen, 1983). Despite the advantages, existing motion detection software packages continue to have significant drawbacks that limit their usefulness in studying animal group behavior.
Some software is not designed for scientific research applications or requires programming knowledge. One way to assess group activity automatically is to extract the motion component from a prerecorded or live video. Software like iVMD (IntelliVision, 2015) can be used in this way, but it is designed for efficiently reviewing surveillance footage. Video players like VLC media player (VideoLAN, 2015) can also show regions that change from frame to frame. However, without programming against the tool’s application programming interface, the frame-by-frame motion data is difficult to access. Similarly, general-purpose scientific computing packages like MATLAB (MathWorks, 2015) or Python (Python Software Foundation, 2015) have been used to extract motion data (Hashimoto, Izawa, Yokoyama, Kato, & Moriizumi, 1999; Ramazani, Krishnan, Bergeson, & Atkinson, 2007; Togasaki et al., 2005), but they also require programming knowledge.
Other software would be impractical to use with natural-scene videos. OpenControl (Aguiar, Mendonça, & Galhardo, 2007) is open-source software that uses a background subtraction algorithm to detect movement and can be used to track the locations of single animals and control maze actuators. However, the software requires setting a motionless reference frame, against which all other frames are compared. This makes the software difficult to use in natural environments like forests or open fields, where ambient lighting conditions can change due to wind or clouds, and thus shift the reference frame.
Individual-tracking software can become computationally expensive when extended to large groups of individuals. MCMC (Khan, Balch, & Dellaert, 2006), another software package implementing accurate algorithms for tracking multiple targets, has been tested with groups of up to 20 individuals, but it may not be computationally practical for more numerous groups. If many individuals enter and leave the video scene, tracking them with such software may not be more useful than simple motion data extraction.
Finally, available commercial software can be expensive and is generally not open-source. EthoVision (Noldus et al., 2001) is a sophisticated software package with an activity detection feature (Noldus Information Technology, 2015), which has been used to assess the locomotor effects of insecticide on carabid beetles (Tooming, Merivee, Must, Sibul, & Williams, 2014) and the effect of antiviral drugs on the locomotor activity of ferrets (Oh, Barr, & Hurt, 2015). Though successfully applied, the EthoVision software is expensive, and its source code is not available for inspection and modification.
To address the problems of cost, platform availability, and customization, and to provide a user-friendly interface tailored to assessing the temporal progression of animal group activity in natural environments, we created SwarmSight, an open-source tool that runs on Microsoft Windows and works with a wide range of video formats.
In the following sections, we describe the algorithm that SwarmSight implements, and demonstrate its validity for detecting motion in a battery of synthetic motion videos. We then demonstrate a wide range of possible scientific applications by applying SwarmSight to the assessment of the group activity of stingless bees, wild birds, and hissing cockroaches.
Assessing movement activity with SwarmSight
SwarmSight algorithm and implementation
To compute the activity metric of a video, the software uses the ffmpeg (Bellard, 2015) library to extract frames from video files. One major advantage of SwarmSight is that this underlying, open-source library supports over 50 video codecs (Bellard, 2015) and enables our software to read a wide range of video file formats, including the common .mov, .avi, and .mp4.
For each frame, the software counts the pixels at which the color change from the previous frame exceeds a user-selected threshold. This count serves as the activity metric, which is highly correlated with scene motion. The optimal value of the threshold t depends strongly on the environment depicted in the video: a low threshold will amplify background movements or even video-compression artifacts, whereas a high threshold may diminish the desired signal. We demonstrate how to find an optimal threshold in Experiment 2 below.
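The per-frame computation can be sketched in a few lines of Python. This is an illustrative sketch only: SwarmSight itself is written in C#, and the choice here of summing per-channel differences before thresholding is an assumption, not necessarily the tool's exact implementation.

```python
import numpy as np

def activity_metric(prev_frame, curr_frame, threshold):
    """Count the pixels whose color change from the previous frame
    exceeds the threshold. A sketch of the metric described in the
    text; combining color channels by summation is an illustrative
    assumption."""
    diff = np.abs(curr_frame.astype(int) - prev_frame.astype(int))
    if diff.ndim == 3:                     # color video: combine channels
        diff = diff.sum(axis=-1)
    return int((diff > threshold).sum())

# A single pixel turning from black to white registers as one changed pixel
prev = np.zeros((4, 4, 3), dtype=np.uint8)
curr = prev.copy()
curr[0, 0] = 255
assert activity_metric(prev, curr, 50) == 1
```

Because the comparison is always against the immediately preceding frame rather than a fixed reference frame, slow ambient changes (e.g., clouds gradually dimming a scene) contribute little to the count.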
The limitations of the software stem from the fact that it measures the aggregate movement of all objects in the video scene. This may be undesirable if the movement of some objects cannot be excluded with a different recording angle or the use of the “region-of-interest” feature (see below). It should be noted that, unless the video scene contains only one individual, the software is not designed to measure individual animal activity levels, but aggregate, group activity levels instead.
When analyzing the activity metric produced for each video, researchers should treat it as a relative measure. Its absolute value is informative only in comparison to values obtained from other parts of the same video or from videos captured under the same conditions. For example, a close-up video of a scene will result in more pixels changing per frame, whereas a zoomed-out video of the same scene will register fewer pixels changed per frame. However, if the camera perspective and motion threshold parameter remain unchanged over the course of the video, the progression of pixels changed per frame will reflect only the progression of movement in the scene.
Low wind conditions
If the video depicts flying insects, the background wind should be minimal. Wind can cause the flying insects to move involuntarily, which may affect the activity metric. Additionally, if the scene contains undesirable objects that are moved by wind and are confined to a part of the scene (e.g., leaves in a corner), the “Region of Interest” tool can be used to exclude this undesirable motion. In Experiments 2 and 3 below, we used the tool to exclude peripheral moving leaves, branches, and sponges. Note that if the target and undesirable objects overlap, the tool cannot exclude the undesirable movement. Depending on demand from the scientific community, we could add other types of region inclusion/exclusion features in future versions of the software.
Stationary, stable video
The metric will register any movement, including camera movements, changes in perspective, or zoom. To ensure that the metric reflects only the desired movement, the videos should be shot on a tripod or other stable surface, so that only the objects whose motion is being assessed are moving.
After the video starts playing, the user can change the activity settings (Fig. 2) and also select the Highlight Motion checkbox to see which pixels are changing. This option makes it easy to spot fast-moving objects like flying insects and can serve as an aid to counting them manually.
Minimally compressed preference
Ideally, the videos should be minimally compressed. Higher compression tends to introduce compression artifacts, which under low-threshold conditions appear as changed pixels, confounding the activity metric. Native compression by most digital cameras is acceptable.
Setting up and using the software
Installation and system requirements
The software can be freely downloaded and installed from https://github.com/justasb/SwarmSight. It is a Microsoft .NET application written in C#. The code is open-source and available at https://github.com/justasb/SwarmSight/tree/master/Source. The software runs on Microsoft Windows with the .NET 4.5 framework installed.
The software uses the ffmpeg library (Bellard, 2015), which supports a very wide variety of video file formats. The video formats produced by most digital cameras are supported.
Once the software starts, the user can select the video file to analyze, set the sensitivity threshold, and draw a rectangular region of interest to exclude unnecessary or spurious movements.
As the video plays, a chart below the video screen displays the frame-by-frame logarithm of the activity metric. Using the logarithm reduces the effect of extreme movement events (the linear metric is preserved in statistics calculations). To the right of the chart, the raw frame activity data can be exported into a comma-separated value (CSV) file for further processing. The CSV file is saved in the same directory as the video file, with the same filename as the video file, but ending with the .csv extension. CSV files can be opened with Excel, R, MATLAB, and most other statistical software packages.
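Downstream analysis of the export is straightforward in any scripting environment. A minimal Python sketch is shown below; the column names `Frame` and `ChangedPixels` are assumptions for illustration, so check the header row of an actual export before use.

```python
import csv
import math
import os
import tempfile

def load_activity(csv_path):
    """Read per-frame activity from a SwarmSight CSV export.
    The column names 'Frame' and 'ChangedPixels' are assumed here;
    inspect the header of the actual export before use."""
    frames, activity = [], []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            frames.append(int(row["Frame"]))
            activity.append(int(row["ChangedPixels"]))
    return frames, activity

def log_activity(activity):
    # Mirror the on-screen chart: a log scale dampens extreme movement
    # events; log1p keeps frames with zero changed pixels finite
    return [math.log1p(a) for a in activity]

# Demo with a tiny mock export
tmp = tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="")
tmp.write("Frame,ChangedPixels\n1,10\n2,0\n")
tmp.close()
frames, activity = load_activity(tmp.name)
os.unlink(tmp.name)
assert frames == [1, 2] and activity == [10, 0]
```

The same file loads directly into R (`read.csv`) or MATLAB (`readtable`) for statistical analysis.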
To verify that the tool’s activity metric is correlated with object motion, we assessed motion using both synthetic and live field videos. In addition, we demonstrate that the tool can be used to detect and quantitatively compare insect swarm activity differences, track the progression of swarm activity over time, and reveal interesting temporal dynamics in the feeding behavior of birds and the nest-finding behavior of hissing cockroaches.
Experiment 1: Synthetic motion videos
The first test performed as expected. At a threshold of 1 (Fig. 3, center), when a 5 × 5 box appears, a change of 25 pixels was registered. When the box moved 1 pixel, a 10-pixel change was detected, which consisted of a 5-pixel change of the leading edge of the box and a 5-pixel change of the trailing edge that the box had previously occupied. As the threshold increased to 64, the software registered 15 pixels when the box appeared, reflecting the top two rows of the box: 100 % black and 50 % black. At a threshold of 128, only the 5-pixel top black row registered. At threshold 255, no pixels registered. The changes produced by the moving box were all registered correctly.
The second test with multiple moving objects also demonstrated the expected behavior. As the first 5 × 5 box started to move, the software registered a 10-pixel/frame change (Fig. 4). As each additional object started to move, the activity metric increased by 10 pixels/frame, showing a change of 30 pixels/frame when all three objects were moving.
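The threshold-1 portion of these synthetic tests is easy to reproduce with a small script. The sketch below uses a solid black box on a white background (the 50 %-black gradient rows of the actual test video are omitted for simplicity), and the pixel-counting function is an illustrative stand-in for SwarmSight's metric, not its actual code.

```python
import numpy as np

def changed_pixels(prev, curr, threshold):
    # Count pixels whose intensity change exceeds the threshold
    return int((np.abs(curr.astype(int) - prev.astype(int)) > threshold).sum())

# Frame 1: empty white scene; Frame 2: a 5 x 5 black box appears
prev = np.full((20, 20), 255, dtype=np.uint8)
curr = prev.copy()
curr[5:10, 5:10] = 0           # box appears: 25 pixels change
moved = prev.copy()
moved[5:10, 6:11] = 0          # box shifted right by 1 pixel

assert changed_pixels(prev, curr, 1) == 25
# Leading edge (5 px) plus vacated trailing edge (5 px) = 10 px
assert changed_pixels(curr, moved, 1) == 10
# At the maximum threshold, even a full black-to-white change
# (a difference of exactly 255) does not register
assert changed_pixels(prev, curr, 255) == 0
```

The 25-, 10-, and 0-pixel counts match the appearance, movement, and maximum-threshold cases reported above for the solid portions of the box.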
Experiment 2: Threshold selection in stingless bee video
The results demonstrated that the software can be used to distinguish flying insects from the background. In Fig. 5, the control, unprocessed video frames can be seen on the far right, with the motion-detected frames arrayed to the left at various thresholds, showing detected motion in yellow.
Specifically, note that when using low thresholds (Fig. 5, left; threshold 5), the software registers noise that is not indicative of insect motion, and when using high thresholds (Fig. 5, right; threshold 100), it does not emphasize motion enough. At medium-range thresholds, the bees are easily visible. For example, in Fig. 5, at threshold 20 in the top row, showing the zoomed-in view, the bee wings and bodies are outlined in yellow. In contrast, at threshold 50 in the bottom row, the small bees are visible as yellow dots. As we mentioned in the previous section, because the threshold can be chosen by the user, a researcher can play the video in question and adjust the threshold via the slider to discover the optimal value for each video. Once discovered, the same threshold can be used to compare the activity present in several videos shot under the same conditions.
Experiment 3: Detection of change in activity and its progression
SwarmSight software was used to assess how flying-insect activity changes over time in response to a treatment. Here we examined a video of an entrance to a T. angustula hive before and after a treatment with a chemical mixture of the stingless bee’s alarm pheromone: Citral (Sigma Aldrich, B1334), 6-methyl-5-hepten-2-one (Sigma Aldrich, M48805), and benzaldehyde (Sigma Aldrich, B1334) (Wittmann, Radtke, Zeil, Lübke, & Francke, 1990). We used the SwarmSight software to assess bee swarm activity during the 30-s video and observe its progression over time. To confirm that the activity metric was proportional to the actual flying bees, we compared the SwarmSight results to the number of visible flying bees in the video frames sampled at 1-s intervals.
To test whether the software can detect significant changes in behavior from nest entrance videos, we used the software to analyze 107 T. angustula nest entrance videos, consisting of control and treatment groups. The control groups were administered behaviorally inert mineral oil (Breed, Guzmán-Novoa, & Hunt, 2004), whereas the treatment groups were administered the alarm pheromone mixture. The average activity metrics of the two groups were compared.
To demonstrate that the software can be used for measuring the progression of group activity of birds and of other insects, we used videos of wild birds and hissing cockroaches. The wild-bird video showed rock doves (Columba livia), mourning doves (Zenaida macroura), and Gambel’s quail (Callipepla gambelii) discovering and consuming newly placed wild-bird food (Global Harvest Foods, Ltd., 2015), which was spread evenly across a 6-foot × 6-foot area on concrete pavement. This video was chosen because it contained a period without any birds as well as three phases showing waves of arriving birds. To demonstrate that SwarmSight can detect such arrival waves, the activity metric was expected to contain distinct, corresponding increases in activity during each arrival wave.
Finally, to demonstrate that the software can be used with nonflying animals, we selected a video of a group of Madagascar hissing cockroaches (Gromphadorhina portentosa) locating an artificial nest. The video scene displayed a round dish containing four insects with a covered center area: the artificial nest. The insects were placed outside of the nest and allowed to crawl freely. The video contained two phases of insect activity separated by a long phase of relative dormancy. We expected the SwarmSight results to indicate the three phases.
Experiment 4: Performance assessment
To assess the performance of SwarmSight, we measured the frames per second processed by the program. The threshold was set to 17 and the quality to 100 %, and the testing system was an Apple MacBook Air laptop with a 2-GHz Intel Core i7 processor and 8 GB of RAM, running Windows 7 Professional in a Parallels Desktop virtual machine.
Numbers of frames per second processed by SwarmSight in response to videos with different resolutions (rows include “Hissing cockroaches (scaled down)” and “Bee zoomed out”)
SwarmSight measures movement in a video scene and was demonstrated to automate quantitative assessment of the temporal progression of the activity levels of flying-insect swarms, bird flocks, and other animal groups. It was also shown to be useful for detecting hard-to-see flying insects and for assisting traditional methods of insect tracking and counting. Generally, the software is most useful in situations in which changes of aggregate movement over time in a region of a video provide information that is valuable to investigators.
In addition to assessing insect and bird group behavior, the software should be useful in assessing human group behavior. For example, a video of pedestrians walking on a street or sidewalk, filmed from an upper floor of a building, could be analyzed using the software. If the pedestrian traffic is low, individual pedestrians could be distinguished by sudden increases and subsequent decreases of pixel change activity. During high traffic flow, the changes in relative pedestrian activity levels over time could be assessed and compared. The same method for measuring street pedestrian traffic could be used to measure walking shopper behavior in retail stores or malls. For example, a section of a hallway or an aisle could be video-recorded and baseline shopper movement activity assessed. Various interventions, such as the placement or alteration of signage, could then be assessed by comparing their effects on the video activity levels. A similar technique could be used to assess and track the changes in activity of automobile traffic.
Furthermore, SwarmSight could be used to test theoretical models of swarm or large-group behavior. For example, a group behavior model described by Couzin, Krause, Franks, and Levin (2005) predicts that only a small fraction of individuals in a group need to possess information in order to influence the behavior of the larger group. Indeed, in Dyer et al. (2009), just 5 % of the individuals in a large, 200-person group needed to be informed for the larger group to converge onto the correct target. The videos recording the movement of the individuals could be analyzed using SwarmSight. Analysis of a video of the whole group would be expected to show the group movement activity over time. This would reflect the aggregate speed of the group and show when the group started and stopped moving. Additionally, any transient increases or decreases of the group speed would be reflected in the pixel change signal. Such analysis could complement manually performed measures of the group movement and its change over time.
The leadership behavior in simulated agent-based flocks, such as were seen in Quera et al. (2010), may also be assessed with SwarmSight. Though computer simulations benefit from the ability to observe any of the simulated state variables (Kelton & Law, 2000), some very large or complex simulations may not be feasible or cost-effective to resimulate. In such cases, if a video of the completed simulation is available and the progression of movement in a region of the video contains information that is useful to investigators, then our software could be used to analyze it. In this way, an aggregate-movement metric could be extracted from a video of a complex simulation without rerunning it.
One advantage of SwarmSight is the simple, easy-to-use interface that makes it possible to learn to use the tool very quickly and enables its use as a teaching tool. For example, graduate and undergraduate students in an Animal Behavior Methods class at Arizona State University received a 10-min presentation about bees and the software. Following the presentation, the students were able to use the software effectively to analyze a stingless bee nest raid video. The software presentation and the example video can be downloaded from https://github.com/justasb/SwarmSight.
Overall, SwarmSight provides a powerful, yet easy-to-use, tool for assessing motion from natural and laboratory video scenes that has applications in behavioral experiments, as well as serving as a teaching tool for classroom-based behavioral experiments.
In the future, the authors plan to add additional filters and to incorporate machine-learning classifiers for more advanced insect tracking and behavior classification, and to utilize onboard graphics processing units (GPUs) to increase performance. For example, supplying a support vector machine (Cortes & Vapnik, 1995) trained on previously labeled examples of target insects, birds, humans, or other animals with pixel color and motion data could enable the software to count the number of target individuals per frame. In some cases, this metric would be even more useful than the aggregate-movement metric of the current version. Furthermore, a GPU implementation of the above algorithm might execute one or two orders of magnitude faster than the CPU implementation (Asano, Maruyama, & Yamaguchi, 2009).
J.B. was supported by an Arizona State University Interdisciplinary Graduate Program in Neuroscience Fellowship. C.M.J. was supported by Arizona State University and by a Smithsonian Tropical Research Institute Short Term Fellowship Award. B.S. was supported by NIH Grant No. NIH-NIDCD (DC007997). We thank David W. Roubik for providing the Tetragonisca angustula bee hives, which C.M.J. filmed for part of this experiment. We also thank Steven Pratt for the Madagascar hissing cockroach videos, and Karen Hastings for the wild-bird videos. Finally, we thank the anonymous reviewers, who helped to improve the earlier version of this article. We declare no known conflicts of interest.
- Asano, S., Maruyama, T., & Yamaguchi, Y. (2009). Performance comparison of FPGA, GPU and CPU in image processing. In M. Daněk, J. Kadlec, & B. Nelson (Eds.), Proceedings of the 19th International Conference on Field Programmable Logic and Applications (pp. 126–131). Piscataway, NJ: IEEE Press.
- Bellard, F. (2015). Ffmpeg documentation. Retrieved 28 April, 2015, from https://www.ffmpeg.org/documentation.html
- Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20, 273–297.
- Global Harvest Foods, Ltd. (2015). Audubon Park wild bird food. Retrieved 28 April, 2015, from www.ghfoods.com/product.asp?pc=2120
- Higdon, J., & Corrsin, S. (1978). Induced drag of a bird flock. American Naturalist, 727–744.
- IntelliVision. (2015). Intelligent video motion detector. Retrieved 28 April, 2015, from www.intelli-vision.com/products/intelligent-video-analytics/intelligent-video-motion-detector
- Jones, S. M., van Zweden, J. S., Grüter, C., Menezes, C., Alves, D. A., Nunes-Silva, P., . . . Ratnieks, F. L. (2012). The role of wax and resin in the nestmate recognition system of a stingless bee, Tetragonisca angustula. Behavioral Ecology and Sociobiology, 66, 1–12.
- Kelton, W. D., & Law, A. M. (2000). Simulation modeling and analysis. Boston, MA: McGraw Hill.
- MathWorks. (2015). MATLAB—The language of technical computing. Retrieved 28 April, 2015, from www.mathworks.com/products/matlab/
- Noldus Information Technology. (2015). Gathering data. Retrieved 28 April, 2015, from www.noldus.com/EthoVision-XT/Gathering-data
- Python Software Foundation. (2015). Welcome to python.org. Retrieved 28 April, 2015, from https://www.python.org/
- Togasaki, D. M., Hsu, A., Samant, M., Farzan, B., DeLanney, L. E., Langston, J. W., . . . Quik, M. (2005). The webcam system: A simple, automated, computer-based video system for quantitative measurement of movement in nonhuman primates. Journal of Neuroscience Methods, 145, 159–166.
- Tooming, E., Merivee, E., Must, A., Sibul, I., & Williams, I. (2014). Sub-lethal effects of the neurotoxic pyrethroid insecticide Fastac® 50EC on the general motor and locomotor activities of the non-targeted beneficial carabid beetle Platynus assimilis (Coleoptera: Carabidae). Pest Management Science, 70, 959–966.
- U.S. Department of Commerce. (2015). Quarterly retail e-commerce sales 3rd quarter 2015. Retrieved 28 November, 2015, from https://www.census.gov/retail/mrts/www/data/pdf/ec_current.pdf
- VideoLAN. (2015). VideoLAN—Official page for VLC media player, the open source video framework! Retrieved 28 April, 2015, from www.videolan.org/vlc/
- Wittmann, D., Radtke, R., Zeil, J., Lübke, G., & Francke, W. (1990). Robber bees (Lestrimelitta limao) and their host chemical and visual cues in nest defense by Trigona (Tetragonisca) angustula (Apidae: Meliponinae). Journal of Chemical Ecology, 16, 631–641. doi:10.1007/BF01021793
- Wyatt, T. D. (1997). Methods in studying insect behaviour. In D. R. Dent & M. P. Walton (Eds.), Methods in ecological and agricultural entomology (pp. 27–56). Wallingford, UK: CAB International.