Abstract
This work systematically analyses several off-the-shelf machine learning classification algorithms and assesses their ability to classify Search and Rescue (SAR) patterns in noisy Automatic Identification System (AIS) data. Specifically, we evaluate Decision Trees, Random Forests and Gradient Boosted Trees on a large volume of historical AIS data in order to detect SAR activity from vessel trajectories in a scalable, data-driven, supervised way, without relying on external sources of information (e.g. coast guard reports). Our analysis confirms that SAR patterns can be identified, and the results show that, although all three algorithms achieve high accuracy, Random Forests marginally outperform the others in both classification performance and speed of execution.
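As an illustration of what detecting SAR activity from vessel trajectories can involve, the following minimal sketch turns a time-ordered list of AIS position reports into a few per-trajectory features that a tree-based classifier could consume. The feature set (mean speed, total course change, number of sharp turns), the 45-degree threshold and all function names are assumptions chosen for illustration; they are not taken from the paper and should not be read as the authors' method.

```python
import math

EARTH_RADIUS_NM = 3440.065  # mean Earth radius in nautical miles


def distance_nm(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two positions, in nautical miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from the first position to the second, in degrees [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    y = math.sin(dlam) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlam)
    return math.degrees(math.atan2(y, x)) % 360.0


def trajectory_features(points, sharp_turn_deg=45.0):
    """Compute hypothetical features for one trajectory.

    points: time-ordered list of (unix_seconds, latitude, longitude) tuples.
    sharp_turn_deg: assumed threshold above which a course change counts as sharp.
    """
    if len(points) < 2:
        raise ValueError("a trajectory needs at least two position reports")

    total_nm = 0.0
    bearings = []
    for (_, la1, lo1), (_, la2, lo2) in zip(points, points[1:]):
        total_nm += distance_nm(la1, lo1, la2, lo2)
        bearings.append(bearing_deg(la1, lo1, la2, lo2))

    hours = (points[-1][0] - points[0][0]) / 3600.0

    # Smallest absolute angle between consecutive segment bearings (0..180 degrees).
    turns = [abs((b2 - b1 + 180.0) % 360.0 - 180.0) for b1, b2 in zip(bearings, bearings[1:])]

    return {
        "mean_speed_kn": total_nm / hours if hours > 0 else 0.0,
        "total_turn_deg": sum(turns),
        "sharp_turns": sum(1 for t in turns if t >= sharp_turn_deg),
    }


# Usage: a synthetic square track, loosely resembling one leg pattern of a search.
square_track = [
    (0, 0.0, 0.0),
    (600, 0.1, 0.0),
    (1200, 0.1, 0.1),
    (1800, 0.0, 0.1),
    (2400, 0.0, 0.0),
]
print(trajectory_features(square_track))
```

Feature dictionaries of this kind, one per labelled trajectory, could then serve as training rows for any of the three classifiers named above. Real AIS data would first need cleaning, for example removing duplicate or out-of-order reports.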
Acknowledgements
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 732310 and was supported by AWS Cloud Credits for Research.
About this article
Cite this article
Chatzikokolakis, K., Zissis, D., Spiliopoulos, G. et al. A comparison of supervised learning schemes for the detection of search and rescue (SAR) vessel patterns. Geoinformatica 25, 601–622 (2021). https://doi.org/10.1007/s10707-019-00365-y
DOI: https://doi.org/10.1007/s10707-019-00365-y