Citizens often use social networks to report or complain about traffic incidents that affect their daily mobility. Automatically finding traffic-related reports and extracting useful information from them is not a trivial task because of the informal language used in social networks, the lack of geographic metadata, and the large number of non-traffic-related publications. In this article, we address this problem by combining Machine Learning and Natural Language Processing techniques. Our approach (a) filters publications that report traffic incidents in social networks, (b) extracts geographic information from the textual content of the publications, and (c) provides a broadcasting service that clusters all the reports of the same incident. We compared the performance of our approach with state-of-the-art approaches and with a popular traffic-specific social network, obtaining promising results.
We decided to keep at most two consecutive repetitions of a letter because some words and names legitimately contain doubled letters (e.g., ‘Saavedra’ and ‘calle’ in Spanish or ‘street’ and ‘traffic’ in English).
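The rule in this note can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `collapse_repeats` and the choice of a regular expression are assumptions made only for illustration. The sketch shortens any run of three or more identical letters to two and leaves legitimate double letters alone.

```python
import re

# Matches one letter followed by two or more copies of itself, i.e. a run
# of three or more identical letters. [^\W\d_] matches any Unicode letter,
# so digits such as "1000" are left untouched.
_REPEATED_LETTERS = re.compile(r"([^\W\d_])\1{2,}")


def collapse_repeats(text: str) -> str:
    """Reduce every run of 3+ identical letters to exactly two letters.

    Hypothetical helper for illustration; the original work may implement
    this normalization differently.
    """
    return _REPEATED_LETTERS.sub(r"\1\1", text)


if __name__ == "__main__":
    for sample in ["traaaaffic", "calle", "Saavedra", "street"]:
        print(sample, "->", collapse_repeats(sample))
```

Note that the result is not always a dictionary word (for example, an elongated "a" in "traffic" is shortened to "aa"), which is the trade-off the note accepts in order to preserve words that really do contain doubled letters.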
Stopwords are irrelevant words such as articles or prepositions that have a high probability of occurrence in the publications, regardless of the topic discussed.
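As a rough sketch of stopword removal, the snippet below lowercases a publication, splits it into word tokens, and drops tokens found in a stopword set. The set `STOPWORDS` and the function `remove_stopwords` are hypothetical and deliberately tiny; the list actually used in the article is not given here, and a real system would rely on a complete list for each language.

```python
import re
from typing import List

# Tiny illustrative stopword set (an assumption, not the article's list):
# a few English and Spanish articles and prepositions.
STOPWORDS = frozenset(
    {"a", "an", "the", "in", "on", "at", "of", "to", "el", "la", "en", "de"}
)


def remove_stopwords(text: str) -> List[str]:
    """Tokenize text into lowercase words and drop the stopwords."""
    tokens = re.findall(r"\w+", text.lower())
    return [token for token in tokens if token not in STOPWORDS]


if __name__ == "__main__":
    print(remove_stopwords("Accident on the corner of Main Street"))
    print(remove_stopwords("Choque en la calle Saavedra"))
```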
Vallejos, S., Alonso, D.G., Caimmi, B. et al. Mining Social Networks to Detect Traffic Incidents. Inf Syst Front 23, 115–134 (2021). https://doi.org/10.1007/s10796-020-09994-3
- Social networks
- Natural language processing
- Machine learning
- Traffic incident detection