Abstract
Tracking individuals in a fish school with video cameras is one of the most effective ways to investigate their behavior quantitatively, which is of great value for biological research. However, tracking large numbers of fish with complex non-rigid deformation, similar appearance and frequent mutual occlusions is a challenging task. In this paper we propose an effective tracking method that can reliably track a large number of fish throughout the entire duration of a video. The first step of the proposed method is to detect fish heads using a scale-space method. The head image pattern of each individual fish is then identified in each frame by a convolutional neural network (CNN) specially tailored to this task. Finally, the prediction of the motion state and the recognition result of the CNN are combined to associate detections across frames. The proposed method was tested on five video clips, each containing a different number of fish. Experimental results show that the correctness of the fish identities is not affected by frequent occlusions. The proposed method outperforms two state-of-the-art fish tracking methods on seven performance metrics.
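The abstract does not specify how the motion prediction and the CNN recognition result are combined, so the following is only a minimal sketch of one plausible way to do it, not the authors' implementation. It assumes a constant-velocity motion model, a weighted sum of a motion term and an appearance term, and a brute-force search over assignments that is practical only for a handful of fish. The names `predict_position`, `association_cost`, `associate`, and the parameters `alpha` and `scale` are all hypothetical.

```python
import math
from itertools import permutations


def predict_position(prev, curr):
    """Constant-velocity prediction (an assumption): extrapolate one frame."""
    return (2 * curr[0] - prev[0], 2 * curr[1] - prev[1])


def association_cost(predicted, detection, id_prob, alpha=0.5, scale=20.0):
    """Blend a motion term with an appearance term.

    alpha and scale are hypothetical tuning parameters, not from the paper.
    id_prob stands in for the CNN's probability that this detection is
    the head of the tracked fish.
    """
    dist = math.hypot(predicted[0] - detection[0], predicted[1] - detection[1])
    motion_term = min(dist / scale, 1.0)  # clamp to [0, 1]
    appearance_term = 1.0 - id_prob
    return alpha * motion_term + (1.0 - alpha) * appearance_term


def associate(tracks, detections, id_probs):
    """Return a list where entry i is the detection index matched to track i.

    Brute force over all assignments, so suitable for tiny examples only.
    """
    preds = [predict_position(prev, curr) for prev, curr in tracks]
    best, best_cost = None, float("inf")
    for perm in permutations(range(len(detections)), len(tracks)):
        cost = sum(
            association_cost(preds[i], detections[j], id_probs[i][j])
            for i, j in enumerate(perm)
        )
        if cost < best_cost:
            best, best_cost = list(perm), cost
    return best


# Two fish crossing: both tracks predict the same position (20, 0),
# so motion alone cannot tell the two detections apart.
tracks = [((0, 0), (10, 0)), ((40, 0), (30, 0))]
detections = [(21, 1), (19, -1)]
# Made-up identity probabilities: row i is track i, column j is detection j.
id_probs = [[0.1, 0.9], [0.8, 0.2]]

assignment = associate(tracks, detections, id_probs)
print(assignment)  # [1, 0]
```

In this toy case both detections are equally far from both predicted positions, so the appearance term alone decides the match. This mirrors the situation described above, where identities must survive an occlusion that motion cues cannot resolve.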
Acknowledgments
The authors would like to thank Ye Liu for the valuable discussions and insightful comments.
Additional information
This work was funded by the National Natural Science Foundation of China (Grant No. 61175036).
About this article
Cite this article
Wang, S.H., Zhao, J.W. & Chen, Y.Q. Robust tracking of fish schools using CNN for head identification. Multimed Tools Appl 76, 23679–23697 (2017). https://doi.org/10.1007/s11042-016-4045-3
DOI: https://doi.org/10.1007/s11042-016-4045-3