Abstract
This article reviews methodologies for conducting political analysis using information sources available on the Internet. In some societies, the use of social networks has a significant impact on how the political field interacts with society, and a range of methodologies has been used to analyze political issues and the strategies to be followed. The purpose of this paper is to understand these methodologies in order to provide potential voters with the information they need to make informed decisions. First, the necessary terminology on web scraping is reviewed; then, some examples of political analysis projects that have used web scraping are presented. Finally, conclusions are drawn.
Copyright information
© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Varela, N., Lezama, O.B.P., Charris, M. (2021). Web Scraping and Naïve Bayes Classification for Political Analysis. In: Pandian, A.P., Palanisamy, R., Ntalianis, K. (eds) Proceedings of International Conference on Intelligent Computing, Information and Control Systems. Advances in Intelligent Systems and Computing, vol 1272. Springer, Singapore. https://doi.org/10.1007/978-981-15-8443-5_1
DOI: https://doi.org/10.1007/978-981-15-8443-5_1
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-8442-8
Online ISBN: 978-981-15-8443-5
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)