Analyse qualitativer Daten mit dem „Leipzig Corpus Miner“ [Analysis of Qualitative Data with the "Leipzig Corpus Miner"]

Wiedemann, Gregor; Niekler, Andreas

doi:10.1007/978-3-658-07224-7_3

Chapter
First Online: 29 October 2015

Summary

The Leipzig Corpus Miner (LCM) is a web application that bundles various text mining methods for analyzing large volumes of qualitative data. Through an easy-to-use interface, the LCM provides full-text access to 3.5 million newspaper texts, which can be filtered by search terms and metadata into sub-collections. Various computer-assisted analysis methods can be applied to the full data set as well as to the sub-collections, and combined into analysis workflows. The LCM thus enables the empirical analysis of social science research questions on the basis of large document collections, with qualitative and quantitative analysis steps interlinked. This article gives an overview of the analysis capabilities and possible workflows for using the LCM.

Abstract

The Leipzig Corpus Miner (LCM) is a text mining infrastructure that integrates a wide range of computer-assisted text analysis technologies. Via a user-friendly web interface, the LCM provides full-text access to more than 3.5 million German newspaper documents, which can be retrieved by key terms or filtered by metadata to generate thematically coherent sub-collections. A variety of analysis algorithms can be applied to such sub-collections, either as a single process or integrated into a more complex workflow of multiple analysis steps. By providing such workflows, the LCM enables qualitatively oriented social scientists to approach their research questions through new ways of analyzing text. Applying corpus-linguistic and lexicometric measures and machine learning algorithms to large document sets enables them to combine qualitative and quantitative perspectives on their data. The article provides an overview of linguistic preprocessing, text analysis capabilities and possible workflows of the LCM.
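The retrieval step described in the abstract (select documents by key term, narrow them by metadata, then run an analysis on the resulting sub-collection) can be illustrated with a minimal sketch. This is not the LCM's implementation or API: the `Document` record, the function names, the metadata fields and the toy corpus are all hypothetical, and plain term counting stands in for the lexicometric measures the chapter covers.

```python
from collections import Counter
from dataclasses import dataclass
import re


@dataclass
class Document:
    """Hypothetical record: a text plus two illustrative metadata fields."""
    text: str
    source: str
    year: int


def tokenize(text):
    # Lowercased word tokens; a crude stand-in for real linguistic preprocessing.
    return re.findall(r"\w+", text.lower())


def build_subcollection(corpus, key_term, source=None, years=None):
    """Keep documents that contain key_term and pass the optional metadata filters."""
    term = key_term.lower()
    selected = []
    for doc in corpus:
        if term not in tokenize(doc.text):
            continue
        if source is not None and doc.source != source:
            continue
        if years is not None and not (years[0] <= doc.year <= years[1]):
            continue
        selected.append(doc)
    return selected


def term_frequencies(docs):
    """Count token occurrences across all documents of a (sub-)collection."""
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc.text))
    return counts


# Toy corpus, invented purely for illustration.
corpus = [
    Document("Die soziale Marktwirtschaft in der Krise", "Paper A", 1975),
    Document("Marktwirtschaft und Demokratie", "Paper B", 1990),
    Document("Fussball am Wochenende", "Paper A", 1990),
]

# Step 1: key term plus metadata filter yields a sub-collection.
sub = build_subcollection(corpus, "Marktwirtschaft", years=(1980, 2000))
# Step 2: an analysis step is applied to the sub-collection only.
freqs = term_frequencies(sub)
```

Chaining the two calls mirrors the idea of a workflow in miniature: the output of the filtering step is the input of the analysis step, and the same analysis could equally be run on the full corpus.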

Author information

Authors and Affiliations

Leipzig, Germany
Gregor Wiedemann & Andreas Niekler

Corresponding author

Correspondence to Gregor Wiedemann.

Editor information

Editors and Affiliations

Helmut-Schmidt-Universität / Universität der Bundeswehr Hamburg, Hamburg, Germany
Matthias Lemke
Institut für Informatik, Universität Leipzig, Leipzig, Germany
Gregor Wiedemann

About this chapter

Cite this chapter

Wiedemann, G., Niekler, A. (2016). Analyse qualitativer Daten mit dem „Leipzig Corpus Miner“. In: Lemke, M., Wiedemann, G. (eds) Text Mining in den Sozialwissenschaften. Springer VS, Wiesbaden. https://doi.org/10.1007/978-3-658-07224-7_3

DOI: https://doi.org/10.1007/978-3-658-07224-7_3
Published: 29 October 2015
Publisher Name: Springer VS, Wiesbaden
Print ISBN: 978-3-658-07223-0
Online ISBN: 978-3-658-07224-7
eBook Packages: Social Science and Law (German Language)
