Text Mining and Data Visualization: Exploring Cultural Formations and Structural Changes in Fifty Years of Eighteenth-Century Poetry Criticism (1967–2018)

Hall, Billy

doi:10.1007/978-3-030-54913-8_5

Abstract

This chapter uses quantitative methods to identify thematic shifts in the past fifty years of criticism focused on eighteenth-century poetry. The author uses computational tools to examine trends in poetry criticism reflected by essays published in two flagship journals in the field: Eighteenth-Century Studies and The Eighteenth Century: Theory and Interpretation. His goal is to identify patterns in the dataset that have shaped our understanding of eighteenth-century poetry. By using algorithmic manipulation, k-means clustering, Latent Dirichlet Allocation (LDA) topic modeling, and data visualization, the author analyses patterns in attention to various texts and/or poets and makes inferences about disciplinary focus, direction of disciplinary practice, and the impact of gender on the poetic canon.

Models return us to the process—the tools, techniques and practices—through which we construct our knowledge of phenomena that exceed our direct observation.

—Andrew Piper, “Think Small: On Literary Modeling” (2017)

Notes

1.
Jennifer Keith, “Why Poetry?,” The Eighteenth Century 48, no. 1 (2007): 87, https://doi.org/10.1353/ecy.2007.0002
2.
Ibid., 91.
3.
John Sitter, The Cambridge Introduction to Eighteenth-Century Poetry (Cambridge: Cambridge University Press, 2011), 2.
4.
Andrew Piper, “Think Small: On Literary Modelling,” PMLA 132, no. 3 (2017): 651, https://doi.org/10.1632/pmla.2017.132.3.651
5.
Willard McCarty, Humanities Computing (New York: Palgrave Macmillan, 2005), 27.
6.
I have not included in this study monographs or essays in collections partly for practical reasons and partly because I wanted to map poetry criticism as broadly as possible.
7.
All computational work was done with R, a statistical programming language commonly used in data science.
8.
This is a bit uneven in part because TEC, which was originally Studies in Burke, did not begin its run until 1978, while ECS began its run in 1967.
9.
Such errors include non-ASCII characters, collapsed words, and encoding problems introduced when the OCR software read page images and converted the PDF to plain text.
10.
To rank terms, I used term frequency-inverse document frequency (tf-idf) statistics. Term frequency (tf) is simply how many times a term appears in a document. Common terms will have a higher tf while more specialized words will have a lower tf. Inverse document frequency (idf) is computed by dividing the total number of documents in a corpus by the number of documents that contain a specific term, a ratio that is usually log-scaled. “Poet,” for example, might have a relatively high term frequency in an essay about poetry. However, that high frequency is offset by the idf, which adjusts to account for how many documents there are in an entire corpus with the word “poet” in them. Consequently, terms in a document with a high tf-idf score are likely indicators of a document’s content, while terms in a document with a low tf-idf score are not likely indicators of a document’s content. In this case, I combined the tf-idf score for each term into one aggregate tf-idf score and then selected only those essays with an aggregate tf-idf score of 0.002 or higher.
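The tf-idf ranking and threshold filter described in note 10 can be sketched as follows. This is a minimal illustration in Python, not the author's R code. It assumes a length-normalized tf and a natural-log idf, which is one common convention. The toy corpus and the names `tf_idf`, `aggregate_score`, `POETRY_TERMS`, and `THRESHOLD` are hypothetical; the term list and the 0.002 cut-off are taken from notes 10 and 31.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: tf-idf} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            # Assumed convention: tf = count / document length, idf = ln(N / df).
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores

def aggregate_score(doc_scores, terms):
    """Sum the tf-idf scores of the listed terms for one document."""
    return sum(doc_scores.get(term, 0.0) for term in terms)

# Invented three-document corpus, for illustration only.
docs = [
    "the poet wrote a poem about the sea".split(),
    "the novel follows a merchant at sea".split(),
    "the essay surveys trade and empire".split(),
]
POETRY_TERMS = ["poet", "poets", "poem", "poems", "poetry", "poetics"]
THRESHOLD = 0.002

scores = tf_idf(docs)
aggregates = [aggregate_score(s, POETRY_TERMS) for s in scores]
selected = [i for i, a in enumerate(aggregates) if a >= THRESHOLD]
print(selected)
```

A word such as "the", which occurs in every document, receives an idf of zero under this convention and therefore contributes nothing to any score.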
11.
Gerald Graff, Professing Literature: An Institutional History (Chicago: University of Chicago Press, 1987); Richard Ohmann, Politics of Letters (Middletown: Wesleyan University Press, 1987).
12.
Piper, “Think Small,” 651.
13.
Andrew Goldstone and William E. Underwood, “The Quiet Transformations of Literary Studies: What Thirteen Thousand Scholars Could Tell Us,” New Literary History 45, no. 3 (2014): 379, https://doi.org/10.1353/nlh.2014.0025
14.
Michael Gavin’s “Historical Text Networks: The Sociology of Early English Criticism” takes a quantitative approach to what he calls “historical text networks” derived from metadata of works contained in the Early English Books Online (EEBO) database. For details, see Michael Gavin, “Historical Text Networks: The Sociology of Early English Criticism,” Eighteenth-Century Studies 50, no. 1 (2016): 53–80, https://doi.org/10.1353/ecs.2016.0041
15.
See Stephen Ramsay, “Special Section: Reconceiving Text Analysis: Toward an Algorithmic Criticism,” Literary and Linguistic Computing 18, no. 2 (2003): 167–74, https://doi.org/10.1093/llc/18.2.167; Goldstone and Underwood, “The Quiet Transformations of Literary Studies,” 359–84; and Dan Edelstein, “Enlightenment Scholarship by the Numbers: dfr.jstor.org, Dirty Quantification, and the Future of the Lit Review,” Republics of Letters 4, no. 1 (2014): 1–26, https://arcade.stanford.edu/sites/default/files/article_pdfs/ROFL_v5_Edelstein_final.pdf
16.
Matthew Jockers, Macroanalysis: Digital Methods & Literary History (Urbana: University of Illinois Press, 2013); Andrew Piper, Enumerations: Data and Literary Study (Chicago: University of Chicago Press, 2018); and McCarty, Humanities Computing.
17.
McCarty, Humanities Computing, 27.
18.
See McCarty, Humanities Computing; the essays collected in Julia Flanders and Fotis Jannidis, eds., The Shape of Data in Digital Humanities: Modeling Texts and Text-Based Resources (London: Routledge, 2018); Piper, “Think Small;” Arianna Ciula and Oyvind Eide, “Modelling in the Digital Humanities: Signs in Context,” Digital Scholarship in the Humanities 32, no. 1 (2017): i33-i46, https://doi.org/10.1093/llc/fqw045; and Graeme Simsion, Data Modeling: Theory and Practice (New Jersey: Technics Publications, 2007).
19.
Julia Flanders and Fotis Jannidis, “Data Modeling,” in A New Companion to Digital Humanities, ed. Susan Schreibman, Raymond G. Siemens, and John Unsworth (Malden: Wiley/Blackwell, 2016), 230.
20.
Piper, “Think Small,” 652.
21.
Ibid.
22.
For a concise introduction to text mining, see Matthew L. Jockers and Ted Underwood, “Text-Mining the Humanities,” in Schreibman, Siemens, and Unsworth, A New Companion to Digital Humanities, 291–306. The field-specific literature on text mining is vast and often complicated for humanists not trained in statistics and computer science. Ashok N. Srivastava and Mehran Sahami, eds., Text Mining: Classification, Clustering, and Applications (Boca Raton, FL: CRC Press, 2009) and Michael W. Berry, ed., Survey of Text Mining: Clustering, Classification, and Retrieval (New York: Springer-Verlag, 2004) are both excellent introductions to methods for text mining and knowledge discovery. More practical and hands-on approaches to text mining include Julia Silge and David Robinson, Text Mining with R: A Tidy Approach (Sebastopol, CA: O’Reilly, 2017); Matthew L. Jockers, Text Analysis with R for Students of Literature (New York: Springer, 2014); Kasper Welbers, Wouter Van Atteveldt, and Kenneth Benoit, “Text Analysis in R,” Communication Methods and Measures 11, no. 4 (2017): 245–65, https://doi.org/10.1080/19312458.2017.1387238; Ted Kwartler, Text Mining in Practice with R (Hoboken: John Wiley & Sons, 2017); and Hadley Wickham and Garrett Grolemund, R for Data Science: Import, Tidy, Transform, Visualize, and Model Data (Sebastopol, CA: O’Reilly, 2017).
23.
David M. Blei, “Topic Modeling and Digital Humanities,” Journal of Digital Humanities 2, no. 1 (2012): n.p., http://journalofdigitalhumanities.org/2-1/topic-modeling-and-digital-humanities-by-david-m-blei/. The literature on topic modeling is large and ranges from technical essays on statistics and algorithm performance to applications in the humanities to tutorials. For useful overviews of topic modeling in the humanities, see Scott Weingart’s post, “Topic Modeling for Humanists: A Guided Tour,” The Scottbot Irregular (blog), July 25, 2012, http://scottbot.net/2012/07/; Ted Underwood’s various posts on The Stone and the Shell; and Andrew Piper’s posts, “Topic Modelling Literary Studies: Topic Stability, Part 1,” Txtlab (blog), May 28, 2018, https://txtlab.org/2018/05/topic-modelling-literary-studies-part-1-topic-stability/, and “Topic Stability, Part 2,” Txtlab (blog), June 7, 2018, https://txtlab.org/2018/06/topic-stability-part-2/. The essays in The Journal of the Digital Humanities 1, no. 1 (2012) also provide an excellent introduction to topic modeling. David M. Blei has also written several essays both technical and introductory; among them, “Probabilistic Topic Models,” Communications of the ACM 55, no. 4 (2012): 77–84, https://doi.org/10.1145/2133806.2133826 is one of the clearest introductions. For reading and interpreting topic models, see Jonathan Chang, Sean Gerrish, Chong Wang, Jordan L. Boyd-Graber, and David M. Blei, “Reading Tea Leaves: How Humans Interpret Topic Models,” in Advances in Neural Information Processing Systems 22, ed. Y. Bengio, D. Schuurmans, J. D. Lafferty, C.K.I. Williams, and A. Culotta (Vancouver: Curran Associates, 2009), 288–96.
24.
The “bag-of-words” (BoW) approach does not include word order or other semantic markers. Instead, a BoW algorithm generates a unique vocabulary for a text or text collection and some unit of measurement—simple word frequency or tf-idf are common measures. Because the BoW method is common in natural language processing and text mining, there are a number of tutorials available. A good introduction can be found in Yin Zhang, Rong Jin, and Zhi-Hua Zhou, “Understanding Bag-of-Words model: A Statistical Framework,” International Journal of Machine Learning and Cybernetics 1, no. 1–4 (2010): 43–52, https://doi.org/10.1007/s13042-010-0001-0
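The bag-of-words representation described in note 24 can be illustrated with a short Python sketch. It is not the author's R code. The function name `bag_of_words` and the three toy sentences are invented, and raw counts stand in for whichever measure (word frequency or tf-idf) a given study uses.

```python
from collections import Counter

def bag_of_words(docs):
    """Return (vocabulary, count vectors) for a list of tokenized documents."""
    # One shared, sorted vocabulary for the whole collection.
    vocabulary = sorted({token for doc in docs for token in doc})
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        # One count per vocabulary entry; word order is discarded here.
        vectors.append([counts[token] for token in vocabulary])
    return vocabulary, vectors

# Invented sentences: the first two differ only in word order.
docs = [
    "the poet praises the critic".split(),
    "the critic praises the poet".split(),
    "the critic ignores the poem".split(),
]
vocabulary, vectors = bag_of_words(docs)
print(vocabulary)
for vector in vectors:
    print(vector)
```

The first two sentences mean different things but receive identical vectors, which is exactly the loss of word order that the note describes.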
25.
Ted Underwood, “Algorithmic Modeling: Or, Modeling Data We Do Not Yet Understand,” in Flanders and Jannidis, The Shape of Data in Digital Humanities, 261. I discuss topic modeling in detail below.
26.
The k-means algorithm is normally used for document clustering when browsing a collection for similar documents or other search engine implementations. For details, see E. Laxmi Lydia, P. Govindasamy, S. K. Lakshmanaprabu, and D. Ramya, “Document Clustering Based on Text Mining K-Means Algorithm Using Euclidean Distance Similarity,” Journal of Advanced Research in Dynamical and Control Systems 10, no. 2 (2018): 208–14; and Mehdi Allahyari, Seyedamin Pouriyeh, Mehdi Assefi, Saied Safaei, Elizabeth D. Trippe, Juan B. Gutierrez, and Krys Kochut, “A Brief Survey of Text Mining: Classification, Clustering and Extraction Techniques,” arXiv e-print (2017), arXiv:1707.02919.
27.
Here I used the standard Gibbs version of LDA in the “topicmodels” package for R, with minimal tweaking of the default settings. However, I arrived at the number fifty after cross-validating with the algorithms from the “ldatuning” package.
28.
Underwood points out this problem in his article, “Algorithmic Modeling.” As of now, I have only begun to experiment with spherical k-means and other versions of the k-means algorithm that try to overcome this limitation.
29.
The essays were selected based on the highest tf-idf score for each genre term. I then handchecked the three genre lists to ensure that the included essays did, in fact, deal primarily with genre-based material.
30.
To minimize the impact of dimensionality, I varied the sparsity of the genre group document matrix to limit the number of words used by the algorithm. This reduction is, of course, not trivial because it means that document clustering will be based on a set of words that appear within a range of very common to very uncommon. A very sparse matrix, for instance, will contain all the words, many of which will only appear in one or a handful of documents. On the other hand, a matrix with very low sparsity will only contain those words that appear in all or nearly all documents. The idea of multiple models was to get closer to a set of words that will be representative of an essay by eliminating idiosyncratic words and, at the same time, not becoming so generic that the words fail to distinguish between documents and become, therefore, useless for clustering.
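The effect of varying sparsity, as described in note 30, can be sketched as a filter on document frequency. The Python below is only an illustration. The function `remove_sparse_terms`, the four toy documents, and the three cut-offs are assumptions, and the filter treats a term's sparsity as the share of documents in which it does not occur.

```python
def remove_sparse_terms(docs, max_sparsity):
    """Keep terms that are absent from at most `max_sparsity` of the documents."""
    n_docs = len(docs)
    doc_sets = [set(doc) for doc in docs]
    vocabulary = set().union(*doc_sets)
    kept = set()
    for term in vocabulary:
        absent = sum(1 for tokens in doc_sets if term not in tokens)
        if absent / n_docs <= max_sparsity:
            kept.add(term)
    return kept

# Invented documents, for illustration only.
docs = [
    "pope satire couplet verse".split(),
    "pope pastoral verse".split(),
    "gray elegy verse".split(),
    "swift satire verse".split(),
]

loose = remove_sparse_terms(docs, 0.90)   # keeps every term, even one-off words
medium = remove_sparse_terms(docs, 0.50)  # keeps terms found in at least half the documents
tight = remove_sparse_terms(docs, 0.25)   # keeps only near-universal terms
print(len(loose), sorted(medium), sorted(tight))
```

The loose setting retains idiosyncratic words, while the tight setting leaves only a word shared by every document, which cannot distinguish one essay from another. The middle setting is the kind of compromise the note is after.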
31.
I used the tf-idf scores to separate essays that had a high tf-idf score on a list of poetry terms (i.e., poet, poets, poem, poems, poetry, poetics), and to obtain an aggregate score for all poetry terms. Using those numbers, I removed articles with an aggregate tf-idf poetry terms score below 0.002, which resulted in a PCC of 310 essays.
32.
The term was coined in Felicity Nussbaum and Laura Brown, eds., The New Eighteenth Century: Theory, Politics, English Literature (North Yorkshire: Methuen, 1987).
33.
As an unsupervised method, k-means, like topic modeling, requires the modeler to choose a value for k, which sets the number of clusters the algorithm will generate. For the k-means analysis of the poetry corpus, I used the elbow method and the Hopkins statistic for estimating clusterability and the number of clusters.
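The elbow method mentioned in note 33 compares the within-cluster sum of squares (WCSS) across candidate values of k and looks for the point where further clusters stop paying off. The sketch below is a bare-bones Python version of Lloyd's k-means on invented two-dimensional points. The deterministic initialization and the names `k_means` and `wcss_by_k` are simplifying assumptions, and the Hopkins statistic is not shown.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def k_means(points, k, iterations=20):
    """Plain Lloyd's algorithm; returns (centroids, within-cluster sum of squares)."""
    # Assumed simplification: start from k evenly spaced input points
    # instead of a random draw, so the result is reproducible.
    centroids = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its members (keep it if the cluster is empty).
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else centroids[c]
            for c, members in enumerate(clusters)
        ]
    wcss = sum(
        squared_distance(p, centroids[c])
        for c, members in enumerate(clusters)
        for p in members
    )
    return centroids, wcss

# Invented data: three tight groups of three points each.
points = [
    (0, 0), (0, 1), (1, 0),
    (10, 10), (10, 11), (11, 10),
    (20, 0), (20, 1), (21, 0),
]
wcss_by_k = {k: k_means(points, k)[1] for k in range(1, 5)}
for k, wcss in wcss_by_k.items():
    print(k, round(wcss, 2))
```

On this toy data the WCSS falls steeply up to k = 3 and has little room to fall afterwards, so the bend in the curve sits at three clusters. Real document vectors rarely give so clean an elbow.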
34.
Each iteration set k to 3, 6, and 9 and included terms ranging from 25,521 to 1318 to 318 (sparsity of 93%, 51%, and 25%, respectively), with the key feature being the tf-idf score for each word in the matrix.
35.
I also ran k-means on two more versions of the corpus (one with author names and one with author names removed), using raw term frequency as the main feature. In each of the eighteen iterations, the algorithm performed poorly at clustering these essays.
36.
Latent Dirichlet Allocation (LDA) is a statistical way of organizing data according to similarity. Topic modeling that uses LDA allows for creating a nuanced model of a text corpus because it moves beyond mere word frequency to generate lists of words likely to occur together in a document. The assumption is that every document is a collection of topics, so LDA assigns a probability for each topic to occur in each individual document.
37.
A structural topic model (stm) is a version of the LDA algorithm that allows for document-level metadata to be included in the modeling process. For technical information and an overview of the stm features, see Margaret E. Roberts, Brandon M. Stewart, and Dustin Tingley, “stm: An R Package for Structural Topic Models,” Journal of Statistical Software 91, no. 2 (2019): 1–40, https://doi.org/10.18637/jss.v091.i02
38.
Ibid., 1–2.
39.
The main issue with network graphs representing a model in its entirety is that, in practice, every topic is connected to every node, which does not translate into very readable and meaningful visualizations. Using multidimensional scaling allows for presenting the entire model, without eliminating data, in a legible way. Ted Underwood has an excellent illustration of these issues in “Visualizing Topic Models,” The Stone and the Shell (blog), November 11, 2012, https://tedunderwood.com/2012/11/11/visualizing-topic-models/. See also the conversation in comments.
40.
LDAvis offers a two-dimensional alternative to the network graph for capturing and interpreting the model when zoomed out to visualize the entire model. It is browser-based and is best viewed in its dynamic form.
41.
As far as I can tell, the multidimensional scaling (MDS) done in LDAvis uses Jensen-Shannon divergence as the distance measure between topics and the cmdscale function for the scaling itself, with the result plotted in a two-dimensional space. See Carson Sievert and Kenneth E. Shirley, “LDAvis: A Method for Visualizing and Interpreting Topics,” in Proceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces, ed. Jason Chang, Spence Green, Marti Hearst, Jeffrey Heer, and Philipp Koehn (Baltimore, MD: Association for Computational Linguistics, 2014), 63–70, https://nlp.stanford.edu/events/illvi2014/papers/sievert-illvi2014.pdf
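The distance step described in note 41 can be sketched in Python. The example computes the Jensen-Shannon divergence between topic-word distributions and builds the pairwise matrix that a scaling routine would then project into two dimensions. The three toy topics are invented, base-2 logarithms are an assumption, and the scaling step itself (cmdscale in R) is not reproduced.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jensen_shannon(p, q):
    """Symmetric divergence between two probability distributions."""
    # The midpoint is positive wherever p or q is, so the ratios above are safe.
    midpoint = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, midpoint) + 0.5 * kl_divergence(q, midpoint)

# Invented topic-word distributions over a four-word vocabulary.
topics = [
    [0.7, 0.2, 0.1, 0.0],
    [0.6, 0.3, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.9],
]
distances = [[jensen_shannon(a, b) for b in topics] for a in topics]
for row in distances:
    print([round(value, 3) for value in row])
```

With base-2 logarithms the divergence runs from 0 for identical distributions to 1 for distributions that share no vocabulary, so the first two topics would be plotted close together and the third far from both.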
42.
See Benjamin M. Schmidt, “Modeling Time,” in Flanders and Jannidis, The Shape of Data in Digital Humanities, 150–66.
43.
The early prevalence of these topics in ECS is partly explained by the fact that TEC does not begin its print run until 1979.
44.
Roger Lonsdale, Eighteenth-Century Women Poets (Oxford: Oxford University Press, 1989).
45.
Paula Backscheider and Catherine Ingrassia, eds. British Women Poets of the Long Eighteenth Century (Baltimore: Johns Hopkins University Press, 2009).
46.
Janet M. Todd, Feminist Literary History (Cambridge: Polity Press, 1988); Margaret J. M. Ezell, Writing Women’s Literary History (Baltimore: Johns Hopkins University Press, 1993); Carol Barash, English Women’s Poetry, 1649–1714 (Oxford: Clarendon Press, 1996); Isobel Armstrong, Victorian Poetry: Poetry, Poetics, and Politics (London: Routledge, 1993); Isobel Armstrong and Virginia Blain, eds., Women’s Poetry in the Enlightenment: The Making of a Canon, 1730–1820 (New York: St. Martin’s Press, 1999).
47.
David Shuttleton, “Poetry,” in The Cambridge Companion to Women’s Writing in Britain, 1660–1789, ed. Catherine Ingrassia (Cambridge: Cambridge University Press, 2015), 103.
48.
Ibid.
49.
Ibid.
50.
David Fairer and Christine Gerrard, eds., Eighteenth-Century Poetry: An Annotated Anthology, 3rd ed. (West Sussex: Wiley Blackwell, 2015). The Blackwell anthology is currently the only major non-gender-specific volume dedicated to eighteenth-century poetry, which makes it an important locus for a general picture of what poets and poems matter at the moment. Moreover, it cuts a middle ground between anthologies dedicated to eighteenth-century poetry written by women and earlier anthologies that were by default dedicated to male eighteenth-century poets. See, for instance, Louis I. Bredvold, Robert K. Root, and George Sherburn, eds., Eighteenth-Century Poetry and Prose (New York: Ronald Press, 1932) and Geoffrey Tillotson, Paul Fussell, and Marshall Waingrow, eds., Eighteenth-Century English Literature (New York: Harcourt, Brace, & World, 1969), which combined have only three female poets versus over one hundred male poets.
51.
A bigram is a sequence of two adjacent elements from a string of tokens; in our case, the authors’ first and second names. Because poets are often referred to only by their last name, I also tested the last names of poets as a unigram, that is, an n-gram consisting of a single item from a sequence. While the relative frequencies went up, the overall map of the poets across time did not.
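The bigram and unigram counts described in note 51 can be sketched in Python. The sample sentence and the helper name `ngrams` are invented for illustration, and the author's own tokenization may well differ.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return every run of n adjacent tokens as a tuple."""
    return list(zip(*(tokens[i:] for i in range(n))))

# Invented sample text, lower-cased and split on whitespace.
text = "alexander pope and thomas gray admired pope while gray read alexander pope"
tokens = text.split()

bigram_counts = Counter(ngrams(tokens, 2))
unigram_counts = Counter(tokens)

print(bigram_counts[("alexander", "pope")])
print(unigram_counts["pope"])
```

Counting the surname alone also catches mentions that omit the first name, which is why unigram frequencies come out higher than bigram frequencies for the same poet.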
52.
McCarty, Humanities Computing, 25.

Bibliography

Allahyari, Mehdi, Seyedamin Pouriyeh, Mehdi Assefi, Saied Safaei, Elizabeth D. Trippe, Juan B. Gutierrez, and Krys Kochut. 2017. A Brief Survey of Text Mining: Classification, Clustering and Extraction Techniques. arXiv e-print. arXiv:1707.02919.
Armstrong, Isobel. 1993. Victorian Poetry: Poetry, Poetics, and Politics. London: Routledge.
Armstrong, Isobel, and Virginia Blain, eds. 1999. Women’s Poetry in the Enlightenment: The Making of a Canon, 1730–1820. New York: St. Martin’s Press.
Backscheider, Paula, and Catherine Ingrassia, eds. 2009. British Women Poets of the Long Eighteenth Century. Baltimore: Johns Hopkins University Press.
Barash, Carol. 1996. English Women’s Poetry, 1649–1714. Oxford: Clarendon Press.
Berry, Michael W. 2004. Survey of Text Mining: Clustering, Classification, and Retrieval. New York: Springer.
Blei, David M. 2012a. Probabilistic Topic Models. Communications of the ACM 55 (4): 77–84. https://doi.org/10.1145/2133806.2133826.
———. 2012b. Topic Modeling and Digital Humanities. Journal of Digital Humanities 2 (1): n.p. http://journalofdigitalhumanities.org/2-1/topic-modeling-and-digital-humanities-by-david-m-blei/
Bredvold, Louis I., Robert K. Root, and George Sherburn, eds. 1932. Eighteenth-Century Poetry and Prose. New York: Ronald Press.
Chang, Jonathan, Sean Gerrish, Chong Wang, Jordan L. Boyd-Graber, and David M. Blei. 2009. Reading Tea Leaves: How Humans Interpret Topic Models. In Advances in Neural Information Processing Systems, ed. Y. Bengio, D. Schuurmans, J.D. Lafferty, C.K.I. Williams, and A. Culotta, vol. 22, 288–296. Vancouver: Curran Associates.
Ciula, Arianna, and Oyvind Eide. 2017. Modelling in the Digital Humanities: Signs in Context. Digital Scholarship in the Humanities 32 (1): i33–i46. https://doi.org/10.1093/llc/fqw045.
Edelstein, Dan. 2014. Enlightenment Scholarship by the Numbers: dfr.jstor.org, Dirty Quantification, and the Future of the Lit Review. Republics of Letters 4 (1): 1–26. https://arcade.stanford.edu/sites/default/files/article_pdfs/ROFL_v5_Edelstein_final.pdf
Ezell, Margaret J.M. 1993. Writing Women’s Literary History. Baltimore: Johns Hopkins University Press.
Fairer, David, and Christine Gerrard, eds. 2015. Eighteenth-Century Poetry: An Annotated Anthology. 3rd ed. West Sussex: Wiley Blackwell.
Flanders, Julia, and Fotis Jannidis, eds. 2018. The Shape of Data in Digital Humanities: Modeling Texts and Text-Based Resources. London: Routledge.
———. Data Modeling. In Schreibman, Siemens, and Unsworth, A New Companion to Digital Humanities, 229–237.
Gavin, Michael. 2016. Historical Text Networks: The Sociology of Early English Criticism. Eighteenth-Century Studies 50 (1): 53–80. https://doi.org/10.1353/ecs.2016.0041.
Goldstone, Andrew, and William E. Underwood. 2014. The Quiet Transformations of Literary Studies: What Thirteen Thousand Scholars Could Tell Us. New Literary History 45 (3): 359–384. https://doi.org/10.1353/nlh.2014.0025.
Graff, Gerald. 1987. Professing Literature: An Institutional History. Chicago: University of Chicago Press.
Jockers, Matthew L. 2013. Macroanalysis: Digital Methods & Literary History. Urbana: University of Illinois Press.
———. 2014. Text Analysis with R for Students of Literature. New York: Springer.
Jockers, Matthew L., and Ted Underwood. Text-Mining the Humanities. In Schreibman, Siemens, and Unsworth, A New Companion to Digital Humanities, 291–306.
Keith, Jennifer. 2007. Why Poetry? The Eighteenth Century 48 (1): 87–91. https://doi.org/10.1353/ecy.2007.0002.
Kwartler, Ted. 2017. Text Mining in Practice with R. Hoboken: Wiley.
Lonsdale, Roger. 1989. Eighteenth-Century Women Poets. Oxford: Oxford University Press.
Lydia, E. Laxmi, P. Govindasamy, S.K. Lakshmanaprabu, and D. Ramya. 2018. Document Clustering Based on Text Mining K-Means Algorithm Using Euclidean Distance Similarity. Journal of Advanced Research in Dynamical and Control Systems 10 (2): 208–214.
McCarty, Willard. 2005. Humanities Computing. New York: Palgrave Macmillan.
Nussbaum, Felicity, and Laura Brown, eds. 1987. The New Eighteenth Century: Theory, Politics, English Literature. North Yorkshire: Methuen.
Ohmann, Richard. 1987. Politics of Letters. Middletown: Wesleyan University Press.
Piper, Andrew. 2017. Think Small: On Literary Modelling. PMLA 132 (3): 651–658. https://doi.org/10.1632/pmla.2017.132.3.651.
———. 2018. Enumerations: Data and Literary Study. Chicago: University of Chicago Press.
Ramsay, Stephen. 2003. Special Section: Reconceiving Text Analysis: Toward an Algorithmic Criticism. Literary and Linguistic Computing 18 (2): 167–174. https://doi.org/10.1093/llc/18.2.167.
Roberts, Margaret E., Brandon M. Stewart, and Dustin Tingley. 2019. stm: An R Package for Structural Topic Models. Journal of Statistical Software 91 (2): 1–40. https://doi.org/10.18637/jss.v091.i02.
Schmidt, Benjamin M. Modeling Time. In Flanders and Jannidis, The Shape of Data in Digital Humanities, 150–166.
Schreibman, Susan, Raymond G. Siemens, and John Unsworth, eds. 2016. A New Companion to Digital Humanities. Malden: Wiley/Blackwell.
Shuttleton, David. 2015. Poetry. In The Cambridge Companion to Women’s Writing in Britain, 1660–1789, ed. Catherine Ingrassia, 103–117. Cambridge: Cambridge University Press.
Sievert, Carson, and Kenneth E. Shirley. 2014. LDAvis: A Method for Visualizing and Interpreting Topics. In Proceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces, ed. Jason Chang, Spence Green, Marti Hearst, Jeffrey Heer, and Philipp Koehn, 63–70. Baltimore: Association for Computational Linguistics. https://nlp.stanford.edu/events/illvi2014/papers/sievert-illvi2014.pdf.
Silge, Julia, and David Robinson. 2017. Text Mining with R: A Tidy Approach. Sebastopol: O’Reilly.
Simsion, Graeme. 2007. Data Modeling: Theory and Practice. Bradley Beach: Technics Publications.
Sitter, John. 2011. The Cambridge Introduction to Eighteenth-Century Poetry. Cambridge: Cambridge University Press.
Srivastava, Ashok N., and Mehran Sahami, eds. 2009. Text Mining: Classification, Clustering, and Applications. Boca Raton: CRC Press.
Tillotson, Geoffrey, Paul Fussell, and Marshall Waingrow, eds. 1969. Eighteenth-Century English Literature. New York: Harcourt, Brace, & World.
Todd, Janet M. 1988. Feminist Literary History. Cambridge: Polity Press.
Underwood, Ted. Algorithmic Modeling: Or, Modeling Data We Do Not Yet Understand. In Flanders and Jannidis, The Shape of Data in Digital Humanities, 250–263.
Welbers, Kasper, Wouter Van Atteveldt, and Kenneth Benoit. 2017. Text Analysis in R. Communication Methods and Measures 11 (4): 245–265. https://doi.org/10.1080/19312458.2017.1387238.
Wickham, Hadley, and Garrett Grolemund. 2017. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. Sebastopol: O’Reilly.
Zhang, Yin, Rong Jin, and Zhi-Hua Zhou. 2010. Understanding Bag-of-Words Model: A Statistical Framework. International Journal of Machine Learning and Cybernetics 1 (1–4): 43–52. https://doi.org/10.1007/s13042-010-0001-0.

Author information

Authors and Affiliations

Brigham Young University, Provo, UT, USA
Billy Hall

Corresponding author

Correspondence to Billy Hall.

Editor information

Editors and Affiliations

Zayed University, Abu Dhabi, United Arab Emirates
Ileana Baird

About this chapter

Cite this chapter

Hall, B. (2021). Text Mining and Data Visualization: Exploring Cultural Formations and Structural Changes in Fifty Years of Eighteenth-Century Poetry Criticism (1967–2018). In: Baird, I. (ed.) Data Visualization in Enlightenment Literature and Culture. Palgrave Macmillan, Cham. https://doi.org/10.1007/978-3-030-54913-8_5

DOI: https://doi.org/10.1007/978-3-030-54913-8_5
Published: 24 March 2021
Publisher Name: Palgrave Macmillan, Cham
Print ISBN: 978-3-030-54912-1
Online ISBN: 978-3-030-54913-8
eBook Packages: Literature, Cultural and Media Studies (R0)