Abstract
Because the digital humanities lack a canonical heuristic, social science research using Text Mining tools requires a blended reading approach. Blended reading integrates quantitative and qualitative analyses of complex textual data step by step, and this imposes various requirements on the implementation of Text Mining infrastructures. The article presents the Leipzig Corpus Miner (LCM), developed in the joint research project ePol (Post-Democracy and Neoliberalism) in response to social science research requirements. The functionalities offered by the LCM may serve as a best-practice example of processing data in accordance with blended reading.
Notes
http://www.epol-projekt.de; for the heuristic interest articulated by the Political Science branch of the project, see Lemke and Schaal [16: 3–19].
Wiedemann et al. [27: pp. 101 ff].
We are currently optimizing classification results before running the final classifications for the different subcollections. At present we achieve F1 = 0.613 and accuracy = 0.867 for our category of neoliberal argumentation (inter-rater reliability during the manual annotation phase: Krippendorff's alpha = 0.76).
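To make the relation between the two reported measures concrete, the following minimal sketch computes F1 and accuracy from the four cells of a binary confusion matrix. The counts are invented for illustration and are not the project's evaluation data; the function name `scores` is likewise hypothetical. The sketch shows how a comparatively rare target category can yield a high accuracy alongside a moderate F1, because accuracy also rewards the many correctly rejected negatives.

```python
def scores(tp, fp, fn, tn):
    """Return (precision, recall, f1, accuracy) for a binary classifier.

    tp/fp/fn/tn are the counts of true positives, false positives,
    false negatives and true negatives.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    # Accuracy counts every correct decision, including true negatives.
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy


# Hypothetical counts (NOT taken from the article): 50 of 300 text
# units belong to the target category, so the classes are imbalanced.
precision, recall, f1, accuracy = scores(tp=30, fp=20, fn=20, tn=230)
print(f"precision={precision:.3f} recall={recall:.3f} "
      f"f1={f1:.3f} accuracy={accuracy:.3f}")
```

With these assumed counts, accuracy is dominated by the 230 true negatives, while F1 depends only on how the positive category is handled.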
References
Baßler M (1995) Einleitung. In: Baßler M (ed) New Historicism. Literaturgeschichte als Poetik der Kultur. Fischer, Frankfurt a. M., pp 7–28
Drucker J (2011) Humanities approaches to graphical display. Digital Humanities Quarterly 5, http://digitalhumanities.org/dhq/vol/5/1/000091/000091.html. Accessed 17 April 2014
Dumm S, Lemke M (2013) Argumentmarker. Definition, Generierung und Anwendung im Rahmen eines semi-automatischen Dokument-Retrieval-Verfahrens, Hamburg/Leipzig (= ePol Discussion Paper 3). http://www.epol-projekt.de/wp-content/uploads/2014/10/Discussion-Paper-epol-3_dumm_lemke_CC.pdf. Accessed 1 Dec 2014
Evangelopoulos N, Zhang X, Prybutok VR (2012) Latent semantic analysis: five methodological recommendations. Eur J Inf Syst 21:70–86
Ferrucci D, Lally A (2004) UIMA. An architectural approach to unstructured information processing in the corporate research environment. Nat Lang Eng 10(3–4):327–348
Früh W (2009) Inhaltsanalyse. Theorie und Praxis. UVK, Konstanz
Gadamer HG (1968) Klassische und philosophische Hermeneutik. In: Grondin J (ed) Gadamer-Lesebuch. Mohr Siebeck, Tübingen, pp 32–57
Heyer G, Quasthoff U, Wittig T (2008) Text Mining: Wissensrohstoff Text. IT lernen. W3L GmbH, Herdecke
Husserl E (1976) Die Krisis der europäischen Wissenschaften und die transzendentale Phänomenologie. Eine Einleitung in die phänomenologische Philosophie. In: Biemel W (ed) Husserliana, vol 6. Nijhoff, The Hague
Husserl E (1980) Phantasie, Bildbewusstsein, Erinnerung. Zur Phänomenologie der anschaulichen Vergegenwärtigungen. In: Marbach E (ed) Husserliana, vol 23. Springer, The Hague
Ihde D (1998) Expanding hermeneutics. Visualism in science. Northwestern University Press, Evanston
Ihde D (2012) Experimental Phenomenology. Multistabilities. State University of New York Press, Albany
Kath R, Schaal GS, Dumm S (2015, forthcoming) New visual hermeneutics. In: Scharloth J, Bubenhofer N (eds) ZGL-Sonderheft Automatisierte Textanalyse. http://www.degruyter.com/view/j/zfgl
Keim D, Kohlhammer J, Ellis G, Mansmann F (eds) (2010) Mastering the information age. Solving problems with visual analytics. http://www.diglib.eg.org. Accessed 14 May 2014
Lemke M (2014) Frequenzanalyse und Diktionäransatz, Hamburg/Leipzig (eTMV 1/5). http://www.epol-projekt.de/wp-content/uploads/2014/10/eTMV_1.pdf
Lemke M, Schaal GS (2014) Ökonomisierung und Politikfeldanalyse. Eine ideengeschichtliche und theoretische Rekonstruktion des Neoliberalismus in der Postdemokratie. In: Schaal GS, Lemke M, Ritzi C (eds) Die Ökonomisierung der Politik in Deutschland. Eine vergleichende Politikfeldanalyse. Springer VS, Wiesbaden, pp 3–19
Lemke M, Stulpe A (2015, forthcoming) Text und soziale Wirklichkeit. Theoretische Grundlagen und empirische Anwendung von Text-Mining-Verfahren in sozialwissenschaftlicher Perspektive. In: Scharloth J, Bubenhofer N (eds) ZGL-Sonderheft Automatisierte Textanalyse. http://www.degruyter.com/view/j/zfgl
Mayring P (2010) Qualitative Inhaltsanalyse. Grundlagen und Techniken, 11th edn. Beltz, Weinheim
Montrose L (1995) Die Renaissance behaupten. Die Poetik und Politik der Kultur. In: Baßler M (ed) New Historicism. Literaturgeschichte als Poetik der Kultur. Fischer, Frankfurt a. M., pp 60–93
Moretti F (2000) Conjectures on world literature. New Left Rev 1(1):54–68
Moretti F (2007) Graphs, maps, trees. Abstract models for literary history. Verso, London
Niehr T (1999) Halbautomatische Erforschung des öffentlichen Sprachgebrauchs oder Vom Nutzen computerlesbarer Textkorpora. ZGL 27(2):205–214
Niekler A, Wiedemann G, Heyer G (2014) Leipzig Corpus Miner. A Text Mining Infrastructure for Qualitative Data Analysis. http://hal.archives-ouvertes.fr/hal-01005878/. Accessed 30 Sept 2014
Niekler A, Wiedemann G, Dumm S, Heyer G (2014) Creating dictionaries for argument identification by reference data. Poster presented at DHd2014, Passau. http://asv.informatik.uni-leipzig.de/publication/file/254/Poster_A0_dhd2014_final.pdf. Accessed 1 Dec 2014
Stone PJ (1966) The general inquirer: A computer approach to content analysis. MIT Press, Cambridge
Wiedemann G (2013) Opening up to Big Data: Computer-Assisted Analysis of Textual Data in Social Sciences. FQS, 14(2). http://www.qualitative-research.net/index.php/fqs/article/view/1949. Accessed 30 Sept 2014
Wiedemann G, Lemke M, Niekler A (2013) Postdemokratie und Neoliberalismus – Zur Nutzung neoliberaler Argumentation in der Bundesrepublik Deutschland 1949–2011. Ein Werkstattbericht. ZPTh 4(1):99–115
Wiedemann G, Niekler A (2014) Document Retrieval for Large Scale Content Analysis using Contextualized Dictionaries. http://hal.archives-ouvertes.fr/hal-01005879/. Accessed 30 Sept 2014
Acknowledgements
ePol is a joint research project of the Institute for Political Science (Political Theory) at Helmut-Schmidt-University Hamburg (Prof. Dr. Gary Schaal) and the Natural Language Processing Group, Department of Computer Science, University of Leipzig (Prof. Dr. Gerhard Heyer). The project is funded by the Federal Ministry of Education and Research (BMBF; FKZ 01UG1231A and B).
About this article
Cite this article
Lemke, M., Niekler, A., Schaal, G. et al. Content Analysis between Quality and Quantity. Datenbank Spektrum 15, 7–14 (2015). https://doi.org/10.1007/s13222-014-0174-x