Text Mining and Data Visualization: Exploring Cultural Formations and Structural Changes in Fifty Years of Eighteenth-Century Poetry Criticism (1967–2018)

Chapter in Data Visualization in Enlightenment Literature and Culture

Abstract

This chapter uses quantitative methods to identify thematic shifts in the past fifty years of criticism focused on eighteenth-century poetry. The author uses computational tools to examine trends in poetry criticism reflected by essays published in two flagship journals in the field: Eighteenth-Century Studies and The Eighteenth Century: Theory and Interpretation. His goal is to identify patterns in the dataset that have shaped our understanding of eighteenth-century poetry. By using algorithmic manipulation, k-means clustering, Latent Dirichlet Allocation (LDA) topic modeling, and data visualization, the author analyses patterns in attention to various texts and/or poets and makes inferences about disciplinary focus, direction of disciplinary practice, and the impact of gender on the poetic canon.

Models return us to the process—the tools, techniques and practices—through which we construct our knowledge of phenomena that exceed our direct observation.

—Andrew Piper, “Think Small: On Literary Modeling” (2017)

Notes

  1.

    Jennifer Keith, “Why Poetry?,” The Eighteenth Century 48, no. 1 (2007): 87, https://doi.org/10.1353/ecy.2007.0002

  2.

    Ibid., 91.

  3.

    John Sitter, The Cambridge Introduction to Eighteenth-Century Poetry (Cambridge: Cambridge University Press, 2011), 2.

  4.

    Andrew Piper, “Think Small: On Literary Modeling,” PMLA 132, no. 3 (2017): 651, https://doi.org/10.1632/pmla.2017.132.3.651

  5.

    Willard McCarty, Humanities Computing (New York: Palgrave Macmillan, 2005), 27.

  6.

    I have not included in this study monographs or essays in collections partly for practical reasons and partly because I wanted to map poetry criticism as broadly as possible.

  7.

    All computational work was done with R, a statistical programming language commonly used in data science.

  8.

    This is a bit uneven in part because TEC, which was originally Studies in Burke, did not begin its run until 1978, while ECS began its run in 1967.

  9.

    Such errors include non-ASCII characters, collapsed words, and encoding problems introduced when page images were read by the OCR software and converted from PDF to plain text.

  10.

    To rank terms, I used term frequency-inverse document frequency (tf-idf) statistics. Term frequency (tf) is simply how many times a term appears in a document. Common terms will have a higher tf while more specialized words will have a lower tf. Inverse document frequency (idf) is computed by dividing the total number of documents in a corpus by the number of documents that contain a specific term; most implementations then take the logarithm of that ratio. “Poet,” for example, might have a relatively high term frequency in an essay about poetry. However, that high frequency is offset by the idf, which shrinks as more documents in the corpus contain the word “poet.” Consequently, terms in a document with a high tf-idf score are likely indicators of a document’s content, while terms in a document with a low tf-idf score are not likely indicators of a document’s content. In this case, I combined the tf-idf score for each term into one aggregate tf-idf score and then selected only those essays with an aggregate tf-idf score of 0.002 or higher.
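
The tf-idf ranking and aggregate cutoff described in this note can be sketched roughly as follows. This is an illustration only, written in Python rather than the R used for the chapter; the toy corpus, the function names, the length-normalized tf, and the log-scaled idf are all assumptions, while the term list and the 0.002 cutoff are taken from the notes.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: tf-idf score} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            # Assumed variant: relative term frequency times log-scaled idf.
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


def aggregate_score(doc_scores, terms):
    """Sum the tf-idf scores of the listed terms for one document."""
    return sum(doc_scores.get(term, 0.0) for term in terms)


# Invented toy corpus; the real corpus consists of journal essays.
docs = [
    "the poet wrote a poem about the sea".split(),
    "the novel describes trade and empire".split(),
    "the essay surveys trade policy".split(),
]
POETRY_TERMS = ["poet", "poets", "poem", "poems", "poetry", "poetics"]
THRESHOLD = 0.002  # cutoff reported in the notes

scores = tf_idf(docs)
aggregates = [aggregate_score(s, POETRY_TERMS) for s in scores]
selected = [i for i, total in enumerate(aggregates) if total >= THRESHOLD]
print(selected)
```

In this toy case only the first document clears the cutoff, and "the" scores zero everywhere because it appears in every document.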

  11.

    Gerald Graff, Professing Literature: An Institutional History (Chicago: University of Chicago Press, 1987); Richard Ohmann, Politics of Letters (Middletown: Wesleyan University Press, 1987).

  12.

    Piper, “Think Small,” 651.

  13.

    Andrew Goldstone and William E. Underwood, “The Quiet Transformations of Literary Studies: What Thirteen Thousand Scholars Could Tell Us,” New Literary History 45, no. 3 (2014): 379, https://doi.org/10.1353/nlh.2014.0025

  14.

    Michael Gavin’s “Historical Text Networks: The Sociology of Early English Criticism” takes a quantitative approach to what he calls “historical text networks” derived from metadata of works contained in the Early English Books Online (EEBO) database. For details, see Michael Gavin, “Historical Text Networks: The Sociology of Early English Criticism,” Eighteenth-Century Studies 50, no. 1 (2016): 53–80, https://doi.org/10.1353/ecs.2016.0041

  15.

    See Stephen Ramsay, “Special Section: Reconceiving Text Analysis: Toward an Algorithmic Criticism,” Literary and Linguistic Computing 18, no. 2 (2003): 167–74, https://doi.org/10.1093/llc/18.2.167; Goldstone and Underwood, “The Quiet Transformations of Literary Studies,” 359–84; and Dan Edelstein, “Enlightenment Scholarship by the Numbers: dfr.jstor.org, Dirty Quantification, and the Future of the Lit Review,” Republics of Letters 4, no. 1 (2014): 1–26, https://arcade.stanford.edu/sites/default/files/article_pdfs/ROFL_v5_Edelstein_final.pdf

  16.

    Matthew Jockers, Macroanalysis: Digital Methods & Literary History (Urbana: University of Illinois Press, 2013); Andrew Piper, Enumerations: Data and Literary Study (Chicago: University of Chicago Press, 2018); and McCarty, Humanities Computing.

  17.

    McCarty, Humanities Computing, 27.

  18.

    See McCarty, Humanities Computing; the essays collected in Julia Flanders and Fotis Jannidis, eds., The Shape of Data in Digital Humanities: Modeling Texts and Text-Based Resources (London: Routledge, 2018); Piper, “Think Small”; Arianna Ciula and Oyvind Eide, “Modelling in the Digital Humanities: Signs in Context,” Digital Scholarship in the Humanities 32, no. 1 (2017): i33–i46, https://doi.org/10.1093/llc/fqw045; and Graeme Simsion, Data Modeling: Theory and Practice (Bradley Beach, NJ: Technics Publications, 2007).

  19.

    Julia Flanders and Fotis Jannidis, “Data Modeling,” in A New Companion to Digital Humanities, ed. Susan Schreibman, Raymond G. Siemens, and John Unsworth (Malden: Wiley/Blackwell, 2016), 230.

  20.

    Piper, “Think Small,” 652.

  21.

    Ibid.

  22.

    For a concise introduction to text mining, see Matthew L. Jockers and Ted Underwood, “Text-Mining the Humanities,” in Schreibman, Siemens, and Unsworth, A New Companion to Digital Humanities, 291–306. The field-specific literature on text mining is vast and often complicated for humanists not trained in statistics and computer science. Ashok N. Srivastava and Mehran Sahami, eds., Text Mining: Classification, Clustering, and Applications (Boca Raton, FL: CRC Press, 2009) and Michael W. Berry, ed., Survey of Text Mining: Clustering, Classification, and Retrieval (New York: Springer-Verlag, 2004) are both excellent introductions to methods for text mining and knowledge discovery. More practical, hands-on approaches to text mining include Julia Silge and David Robinson, Text Mining with R: A Tidy Approach (Sebastopol, CA: O’Reilly, 2017); Matthew L. Jockers, Text Analysis with R for Students of Literature (New York: Springer, 2014); Kasper Welbers, Wouter Van Atteveldt, and Kenneth Benoit, “Text Analysis in R,” Communication Methods and Measures 11, no. 4 (2017): 245–65, https://doi.org/10.1080/19312458.2017.1387238; Ted Kwartler, Text Mining in Practice with R (Hoboken: John Wiley & Sons, 2017); and Hadley Wickham and Garrett Grolemund, R for Data Science: Import, Tidy, Transform, Visualize, and Model Data (Sebastopol, CA: O’Reilly, 2017).

  23.

    David M. Blei, “Topic Modeling and Digital Humanities,” Journal of Digital Humanities 2, no. 1 (2012): n.p., http://journalofdigitalhumanities.org/2-1/topic-modeling-and-digital-humanities-by-david-m-blei/. The literature on topic modeling is large and ranges from technical essays on statistics and algorithm performance to applications in the humanities to tutorials. For useful overviews of topic modeling in the humanities, see Scott Weingart’s post, “Topic Modeling for Humanists: A Guided Tour,” The Scottbot Irregular (blog), July 25, 2012, http://scottbot.net/2012/07/; Ted Underwood’s various posts on The Stone and the Shell; and Andrew Piper’s posts, “Topic Modelling Literary Studies: Topic Stability, Part 1,” Txtlab (blog), May 28, 2018, https://txtlab.org/2018/05/topic-modelling-literary-studies-part-1-topic-stability/, and “Topic Stability, Part 2,” Txtlab (blog), June 7, 2018, https://txtlab.org/2018/06/topic-stability-part-2/. The essays in The Journal of the Digital Humanities 1, no. 1 (2012) also provide an excellent introduction to topic modeling. David M. Blei has also written several essays both technical and introductory; among them, “Probabilistic Topic Models,” Communications of the ACM 55, no. 4 (2012): 77–84, https://doi.org/10.1145/2133806.2133826 is one of the clearest introductions. For reading and interpreting topic models, see Jonathan Chang, Sean Gerrish, Chong Wang, Jordan L. Boyd-Graber, and David M. Blei, “Reading Tea Leaves: How Humans Interpret Topic Models,” in Advances in Neural Information Processing Systems 22, ed. Y. Bengio, D. Schuurmans, J. D. Lafferty, C.K.I. Williams, and A. Culotta (Vancouver: Curran Associates, 2009), 288–96.

  24.

    The “bag-of-words” (BoW) approach does not include word order or other semantic markers. Instead, a BoW algorithm generates a unique vocabulary for a text or text collection and attaches some unit of measurement to each word; simple word frequency and tf-idf are common measures. Because the BoW method is common in natural language processing and text mining, there are a number of tutorials available. A good introduction can be found in Yin Zhang, Rong Jin, and Zhi-Hua Zhou, “Understanding Bag-of-Words Model: A Statistical Framework,” International Journal of Machine Learning and Cybernetics 1, no. 1–4 (2010): 43–52, https://doi.org/10.1007/s13042-010-0001-0
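
To make the loss of word order concrete, here is a minimal bag-of-words sketch in Python. The two sentences and the function name are invented for illustration, and raw word counts are used as the unit of measurement; it is not the chapter's own code.

```python
from collections import Counter


def bag_of_words(texts):
    """Build a shared vocabulary and one row of raw word counts per text."""
    tokenized = [text.lower().split() for text in texts]
    vocab = sorted({token for doc in tokenized for token in doc})
    index = {term: i for i, term in enumerate(vocab)}
    matrix = []
    for doc in tokenized:
        row = [0] * len(vocab)
        for term, count in Counter(doc).items():
            row[index[term]] = count
        matrix.append(row)
    return vocab, matrix


# Two invented sentences with the same words in a different order.
texts = [
    "the poet praised the critic",
    "the critic praised the poet",
]
vocab, matrix = bag_of_words(texts)
print(vocab)
print(matrix)
```

Both sentences produce the same row of counts, which is exactly the information a bag-of-words model discards: who praised whom.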

  25.

    Ted Underwood, “Algorithmic Modeling: Or, Modeling Data We Do Not Yet Understand,” in Flanders and Jannidis, The Shape of Data in Digital Humanities, 261. I discuss topic modeling in detail below.

  26.

    The k-means algorithm is normally used for document clustering when browsing a collection for similar documents or other search engine implementations. For details, see E. Laxmi Lydia, P. Govindasamy, S. K. Lakshmanaprabu, and D. Ramya, “Document Clustering Based on Text Mining K-Means Algorithm Using Euclidean Distance Similarity,” Journal of Advanced Research in Dynamical and Control Systems 10, no. 2 (2018): 208–14; and Mehdi Allahyari, Seyedamin Pouriyeh, Mehdi Assefi, Saied Safaei, Elizabeth D. Trippe, Juan B. Gutierrez, and Krys Kochut, “A Brief Survey of Text Mining: Classification, Clustering and Extraction Techniques,” arXiv e-print (2017), arXiv:1707.02919.
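
For readers new to the method, the following Python sketch shows the basic k-means loop with Euclidean distance: assign each document vector to its nearest centroid, then move each centroid to the mean of its members. The six two-dimensional vectors are invented stand-ins for document-term rows, and seeding the centroids with the first k points is a simplification assumed here; practical implementations use random restarts or smarter seeding because the result depends on the starting centroids.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_means(points, k, max_iter=100):
    """Cluster points into k groups; returns (assignments, centroids)."""
    # Simplifying assumption: the first k points serve as initial centroids.
    centroids = [list(p) for p in points[:k]]
    assignments = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: euclidean(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so the loop has converged
        assignments = new_assignments
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments, centroids


# Invented toy vectors: three weighted toward one term, three toward another.
points = [
    [1.0, 0.0], [0.9, 0.1], [1.0, 0.2],
    [0.0, 1.0], [0.1, 0.9], [0.2, 1.0],
]
assignments, centroids = k_means(points, k=2)
print(assignments)
```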

  27.

    Here I used the standard Gibbs version of LDA in the “topicmodels” package for R, with minimal tweaking of the default settings. However, I arrived at the number fifty after cross-validating with the algorithms from the “ldatuning” package.

  28.

    Underwood points out this problem in his article, “Algorithmic Modeling.” As of now, I have only begun to experiment with spherical k-means and other versions of the k-means algorithm that try to overcome this limitation.

  29.

    The essays were selected based on the highest tf-idf score for each genre term. I then hand-checked the three genre lists to ensure that the included essays did, in fact, deal primarily with genre-based material.

  30.

    To minimize the impact of dimensionality, I varied the sparsity of the genre group document matrix to limit the number of words used by the algorithm. This reduction is, of course, not trivial because it means that document clustering will be based on a set of words that appear within a range of very common to very uncommon. A very sparse matrix, for instance, will contain all the words, many of which will only appear in one or a handful of documents. On the other hand, a matrix with very low sparsity will only contain those words that appear in all or nearly all documents. The idea of multiple models was to get closer to a set of words that will be representative of an essay by eliminating idiosyncratic words and, at the same time, not becoming so generic that the words fail to distinguish between documents and become, therefore, useless for clustering.
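
The effect of varying sparsity can be sketched as a filter on document frequency. In the Python illustration below, a term's sparsity is taken to be the share of documents in which it does not appear, and terms above a chosen maximum are dropped. The function name, the four toy documents, and the inclusive threshold are assumptions made for this sketch; it is not the R routine used for the chapter.

```python
def remove_sparse_terms(docs, max_sparsity):
    """Keep terms whose share of documents lacking them is at most max_sparsity."""
    n_docs = len(docs)
    doc_freq = {}
    for doc in docs:
        for term in set(doc):
            doc_freq[term] = doc_freq.get(term, 0) + 1
    return sorted(
        term for term, df in doc_freq.items()
        if 1 - df / n_docs <= max_sparsity
    )


# Invented toy documents, already tokenized.
docs = [
    "the ode praises the muse".split(),
    "the satire mocks the court".split(),
    "the ode mocks the poet".split(),
    "the georgic praises labour".split(),
]

print(remove_sparse_terms(docs, 0.9))   # keeps every term, including one-offs
print(remove_sparse_terms(docs, 0.5))   # keeps terms found in at least half the documents
print(remove_sparse_terms(docs, 0.25))  # keeps only terms found in nearly all documents
```

The middle setting is the kind of compromise the note describes: idiosyncratic words are gone, but the surviving terms still distinguish one document from another.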

  31.

    I used the tf-idf scores to separate essays that had a high tf-idf score on a list of poetry terms (i.e., poet, poets, poem, poems, poetry, poetics), and to obtain an aggregate score for all poetry terms. Using those numbers, I removed articles with an aggregate tf-idf poetry terms score below 0.002, which resulted in a PCC of 310 essays.

  32.

    The term was coined in Felicity Nussbaum and Laura Brown, eds., The New Eighteenth Century: Theory, Politics, English Literature (North Yorkshire: Methuen, 1987).

  33.

    As an unsupervised method, k-means, like topic modeling, requires the modeler to choose a value for k, the number of clusters the algorithm will generate. For the k-means analysis of the poetry corpus, I used the Hopkins statistic to estimate clusterability and the elbow method to estimate the number of clusters.

  34.

    Each iteration set k to 3, 6, and 9 and included terms ranging from 25,521 to 1,318 to 318 (sparsity of 93%, 51%, and 25%, respectively), with the key feature being the tf-idf score for each word in the matrix.

  35.

    I also ran k-means on two more versions of the corpus (one with author names and one with author names removed), using raw term frequency as the main feature. In each of the eighteen iterations, the algorithm performed poorly at clustering these essays.

  36.

    Latent Dirichlet Allocation (LDA) is a statistical way of organizing data according to similarity. Topic modeling that uses LDA allows for creating a nuanced model of a text corpus because it moves beyond mere word frequency to generate lists of words likely to occur together in a document. The assumption is that every document is a collection of topics, so LDA assigns a probability for each topic to occur in each individual document.
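
The idea that LDA assigns each document a probability for every topic can be illustrated with a bare-bones collapsed Gibbs sampler. The Python sketch below is a teaching toy under stated assumptions (invented four-document corpus, two topics, arbitrary hyperparameters alpha and beta, fixed random seed); it is not the R implementation used for the chapter and omits everything a production sampler would add.

```python
import random


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Return per-document topic probabilities from a collapsed Gibbs sampler."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]    # words per topic in each doc
    topic_word = [{} for _ in range(n_topics)]    # count of each word per topic
    topic_total = [0] * n_topics                  # total words per topic
    assignments = []

    # Start from a random topic assignment for every word token.
    for d, doc in enumerate(docs):
        row = []
        for w in doc:
            t = rng.randrange(n_topics)
            row.append(t)
            doc_topic[d][t] += 1
            topic_word[t][w] = topic_word[t].get(w, 0) + 1
            topic_total[t] += 1
        assignments.append(row)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                old = assignments[d][i]
                doc_topic[d][old] -= 1
                topic_word[old][w] -= 1
                topic_total[old] -= 1
                # Weight each topic by how common it is in this document
                # and how strongly it is associated with this word.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t].get(w, 0) + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                new = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = new
                doc_topic[d][new] += 1
                topic_word[new][w] = topic_word[new].get(w, 0) + 1
                topic_total[new] += 1

    # Smoothed share of each topic in each document.
    return [
        [(count + alpha) / (len(doc) + n_topics * alpha) for count in counts]
        for counts, doc in zip(doc_topic, docs)
    ]


# Invented toy corpus.
docs = [
    "poem verse ode poem muse".split(),
    "verse ode muse poem verse".split(),
    "trade empire market trade colony".split(),
    "market colony empire trade market".split(),
]
theta = lda_gibbs(docs, n_topics=2)
for row in theta:
    print([round(p, 3) for p in row])
```

Each printed row is one document's topic mixture and sums to one; which topic ends up labeled 0 or 1 is arbitrary and depends on the random seed.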

  37.

    A structural topic model (stm) is a version of the LDA algorithm that allows for document-level metadata to be included in the modeling process. For technical information and an overview of the stm features, see Margaret E. Roberts, Brandon M. Stewart, and Dustin Tingley, “stm: An R Package for Structural Topic Models,” Journal of Statistical Software 91, no. 2 (2019): 1–40, https://doi.org/10.18637/jss.v091.i02

  38.

    Ibid., 1–2.

  39.

    The main issue with network graphs representing a model in its entirety is that, in practice, every topic is connected to every node, which does not translate into very readable and meaningful visualizations. Using multidimensional scaling allows for presenting the entire model, without eliminating data, in a legible way. Ted Underwood has an excellent illustration of these issues in “Visualizing Topic Models,” The Stone and the Shell (blog), November 11, 2012, https://tedunderwood.com/2012/11/11/visualizing-topic-models/. See also the conversation in comments.

  40.

    LDAvis offers a two-dimensional alternative to the network graph for capturing and interpreting the model when zoomed out to visualize the entire model. It is browser-based and is best viewed in its dynamic form.

  41.

    As far as I can tell, the multidimensional scaling (mds) done in LDAvis uses Jensen-Shannon divergence as the distance measure between topics and the cmdscale function for the scaling itself, with the result plotted in a two-dimensional space. See Carson Sievert and Kenneth E. Shirley, “LDAvis: A Method for Visualizing and Interpreting Topics,” in Proceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces, ed. Jason Chang, Spence Green, Marti Hearst, Jeffrey Heer, and Philipp Koehn (Baltimore, MD: Association for Computational Linguistics, 2014), 63–70, https://nlp.stanford.edu/events/illvi2014/papers/sievert-illvi2014.pdf
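
The distance step mentioned in this note can be sketched as follows: Jensen-Shannon divergence compares two topics' word distributions, and the resulting matrix of pairwise distances is what a scaling routine would then project into two dimensions. The Python code below covers only the distance matrix; the three topic distributions are invented, and the projection step is left out.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence of q from p, skipping zero-probability terms."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon(p, q):
    """Symmetric divergence: average KL of p and q from their midpoint."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Invented word distributions for three topics over a three-word vocabulary.
topics = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.1, 0.8],
]
distances = [[jensen_shannon(p, q) for q in topics] for p in topics]
for row in distances:
    print([round(d, 4) for d in row])
```

The first two topics, which weight the same word most heavily, come out much closer to each other than either is to the third, so a two-dimensional plot would place them near one another.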

  42.

    See Benjamin M. Schmidt, “Modeling Time,” in Flanders and Jannidis, The Shape of Data in Digital Humanities, 150–66.

  43.

    The early prevalence of these topics in ECS is partly explained by the fact that TEC does not begin its print run until 1979.

  44.

    Roger Lonsdale, Eighteenth-Century Women Poets (Oxford: Oxford University Press, 1989).

  45.

    Paula Backscheider and Catherine Ingrassia, eds., British Women Poets of the Long Eighteenth Century (Baltimore: Johns Hopkins University Press, 2009).

  46.

    Janet M. Todd, Feminist Literary History (Cambridge: Polity Press, 1988); Margaret J. M. Ezell, Writing Women’s Literary History (Baltimore: Johns Hopkins University Press, 1993); Carol Barash, English Women’s Poetry, 1649–1714 (Oxford: Clarendon Press, 1996); Isobel Armstrong, Victorian Poetry: Poetry, Poetics, and Politics (London: Routledge, 1993); Isobel Armstrong and Virginia Blain, eds., Women’s Poetry in the Enlightenment: The Making of a Canon, 1730–1820 (New York: St. Martin’s Press, 1999).

  47.

    David Shuttleton, “Poetry,” in The Cambridge Companion to Women’s Writing in Britain, 1660–1789, ed. Catherine Ingrassia (Cambridge: Cambridge University Press, 2015), 103.

  48.

    Ibid.

  49.

    Ibid.

  50.

    David Fairer and Christine Gerrard, eds., Eighteenth-Century Poetry: An Annotated Anthology, 3rd ed. (West Sussex: Wiley Blackwell, 2015). The Blackwell anthology is currently the only major non-gender-specific volume dedicated to eighteenth-century poetry, which makes it an important locus for a general picture of which poets and poems matter at the moment. Moreover, it cuts a middle ground between anthologies dedicated to eighteenth-century poetry written by women and earlier anthologies that were by default dedicated to male eighteenth-century poets. See, for instance, Louis I. Bredvold, Robert K. Root, and George Sherburn, eds., Eighteenth-Century Poetry and Prose (New York: Ronald Press, 1932) and Geoffrey Tillotson, Paul Fussell, and Marshall Waingrow, eds., Eighteenth-Century English Literature (New York: Harcourt, Brace, & World, 1969), which combined have only three female poets versus over one hundred male poets.

  51.

    A bigram is a sequence of two adjacent elements from a string of tokens; in our case, the authors’ first and second names. Because poets are often referred to only by their last name, I also tested the last names of poets as a unigram, that is, an n-gram consisting of a single item from a sequence. While the relative frequencies went up, the overall map of the poets across time did not.
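
A short Python sketch shows the difference between counting a poet's name as a bigram and counting the last name as a unigram. The sentence is invented and the helper function is hypothetical; the point is only that the unigram count is at least as high as the bigram count, which matches the rise in relative frequencies reported in this note.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n adjacent tokens as a tuple."""
    return list(zip(*(tokens[i:] for i in range(n))))


# Invented sample text, lowercased and split on whitespace.
text = "alexander pope and anne finch answered pope while alexander pope revised"
tokens = text.split()

bigram_counts = Counter(ngrams(tokens, 2))
unigram_counts = Counter(ngrams(tokens, 1))

print(bigram_counts[("alexander", "pope")])  # full name as a bigram
print(unigram_counts[("pope",)])             # last name alone as a unigram
```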

  52.

    McCarty, Humanities Computing, 25.

Bibliography

  • Allahyari, Mehdi, Seyedamin Pouriyeh, Mehdi Assefi, Saied Safaei, Elizabeth D. Trippe, Juan B. Gutierrez, and Krys Kochut. 2017. A Brief Survey of Text Mining: Classification, Clustering and Extraction Techniques. arXiv e-print. arXiv:1707.02919.

  • Armstrong, Isobel. 1993. Victorian Poetry: Poetry, Poetics, and Politics. London: Routledge.

  • Armstrong, Isobel, and Virginia Blain, eds. 1999. Women’s Poetry in the Enlightenment: The Making of a Canon, 1730–1820. New York: St. Martin’s Press.

  • Backscheider, Paula, and Catherine Ingrassia, eds. 2009. British Women Poets of the Long Eighteenth Century. Baltimore: Johns Hopkins University Press.

  • Barash, Carol. 1996. English Women’s Poetry, 1649–1714. Oxford: Clarendon Press.

  • Berry, Michael W. 2004. Survey of Text Mining: Clustering, Classification, and Retrieval. New York: Springer.

  • Blei, David M. 2012a. Probabilistic Topic Models. Communications of the ACM 55 (4): 77–84. https://doi.org/10.1145/2133806.2133826.

  • ———. 2012b. Topic Modeling and Digital Humanities. Journal of Digital Humanities 2 (1): n.p. http://journalofdigitalhumanities.org/2-1/topic-modeling-and-digital-humanities-by-david-m-blei/

  • Bredvold, Louis I., Robert K. Root, and George Sherburn, eds. 1932. Eighteenth-Century Poetry and Prose. New York: Ronald Press.

  • Chang, Jonathan, Sean Gerrish, Chong Wang, Jordan L. Boyd-Graber, and David M. Blei. 2009. Reading Tea Leaves: How Humans Interpret Topic Models. In Advances in Neural Information Processing Systems, ed. Y. Bengio, D. Schuurmans, J.D. Lafferty, C.K.I. Williams, and A. Culotta, vol. 22, 288–296. Vancouver: Curran Associates.

  • Ciula, Arianna, and Oyvind Eide. 2017. Modelling in the Digital Humanities: Signs in Context. Digital Scholarship in the Humanities 32 (1): i33–i46. https://doi.org/10.1093/llc/fqw045.

  • Edelstein, Dan. 2014. Enlightenment Scholarship by the Numbers: dfr.jstor.org, Dirty Quantification, and the Future of the Lit Review. Republics of Letters 4 (1): 1–26. https://arcade.stanford.edu/sites/default/files/article_pdfs/ROFL_v5_Edelstein_final.pdf

  • Ezell, Margaret J.M. 1993. Writing Women’s Literary History. Baltimore: Johns Hopkins University Press.

  • Fairer, David, and Christine Gerrard, eds. 2015. Eighteenth-Century Poetry: An Annotated Anthology. 3rd ed. West Sussex: Wiley Blackwell.

  • Flanders, Julia, and Fotis Jannidis, eds. 2018. The Shape of Data in Digital Humanities: Modeling Texts and Text-Based Resources. London: Routledge.

  • ———. Data Modeling. In Schreibman, Siemens, and Unsworth, A New Companion to Digital Humanities, 229–237.

  • Gavin, Michael. 2016. Historical Text Networks: The Sociology of Early English Criticism. Eighteenth-Century Studies 50 (1): 53–80. https://doi.org/10.1353/ecs.2016.0041.

  • Goldstone, Andrew, and William E. Underwood. 2014. The Quiet Transformations of Literary Studies: What Thirteen Thousand Scholars Could Tell Us. New Literary History 45 (3): 359–384. https://doi.org/10.1353/nlh.2014.0025.

  • Graff, Gerald. 1987. Professing Literature: An Institutional History. Chicago: University of Chicago Press.

  • Jockers, Matthew L. 2013. Macroanalysis: Digital Methods & Literary History. Urbana: University of Illinois Press.

  • ———. 2014. Text Analysis with R for Students of Literature. New York: Springer.

  • Jockers, Matthew L., and Ted Underwood. Text-Mining the Humanities. In Schreibman, Siemens, and Unsworth, A New Companion to Digital Humanities, 291–306.

  • Keith, Jennifer. 2007. Why Poetry? The Eighteenth Century 48 (1): 87–91. https://doi.org/10.1353/ecy.2007.0002.

  • Kwartler, Ted. 2017. Text Mining in Practice with R. Hoboken: Wiley.

  • Lonsdale, Roger. 1989. Eighteenth-Century Women Poets. Oxford: Oxford University Press.

  • Lydia, E. Laxmi, P. Govindasamy, S.K. Lakshmanaprabu, and D. Ramya. 2018. Document Clustering Based on Text Mining K-Means Algorithm Using Euclidean Distance Similarity. Journal of Advanced Research in Dynamical and Control Systems 10 (2): 208–214.

  • McCarty, Willard. 2005. Humanities Computing. New York: Palgrave Macmillan.

  • Nussbaum, Felicity, and Laura Brown, eds. 1987. The New Eighteenth Century: Theory, Politics, English Literature. North Yorkshire: Methuen.

  • Ohmann, Richard. 1987. Politics of Letters. Middletown: Wesleyan University Press.

  • Piper, Andrew. 2017. Think Small: On Literary Modeling. PMLA 132 (3): 651–658. https://doi.org/10.1632/pmla.2017.132.3.651.

  • ———. 2018. Enumerations: Data and Literary Study. Chicago: University of Chicago Press.

  • Ramsay, Stephen. 2003. Special Section: Reconceiving Text Analysis: Toward an Algorithmic Criticism. Literary and Linguistic Computing 18 (2): 167–174. https://doi.org/10.1093/llc/18.2.167.

  • Roberts, Margaret E., Brandon M. Stewart, and Dustin Tingley. 2019. stm: An R Package for Structural Topic Models. Journal of Statistical Software 91 (2): 1–40. https://doi.org/10.18637/jss.v091.i02.

  • Schmidt, Benjamin M. Modeling Time. In Flanders and Jannidis, The Shape of Data in Digital Humanities, 150–166.

  • Schreibman, Susan, Raymond G. Siemens, and John Unsworth, eds. 2016. A New Companion to Digital Humanities. Malden: Wiley/Blackwell.

  • Shuttleton, David. 2015. Poetry. In The Cambridge Companion to Women’s Writing in Britain, 1660–1789, ed. Catherine Ingrassia, 103–117. Cambridge: Cambridge University Press.

  • Sievert, Carson, and Kenneth E. Shirley. 2014. LDAvis: A Method for Visualizing and Interpreting Topics. In Proceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces, ed. Jason Chang, Spence Green, Marti Hearst, Jeffrey Heer, and Philipp Koehn, 63–70. Baltimore: Association for Computational Linguistics. https://nlp.stanford.edu/events/illvi2014/papers/sievert-illvi2014.pdf.

  • Silge, Julia, and David Robinson. 2017. Text Mining with R: A Tidy Approach. Sebastopol: O’Reilly.

  • Simsion, Graeme. 2007. Data Modeling: Theory and Practice. Bradley Beach: Technics Publications.

  • Sitter, John. 2011. The Cambridge Introduction to Eighteenth-Century Poetry. Cambridge: Cambridge University Press.

  • Srivastava, Ashok N., and Mehran Sahami, eds. 2009. Text Mining: Classification, Clustering, and Applications. Boca Raton: CRC Press.

  • Tillotson, Geoffrey, Paul Fussell, and Marshall Waingrow, eds. 1969. Eighteenth-Century English Literature. New York: Harcourt, Brace, & World.

  • Todd, Janet M. 1988. Feminist Literary History. Cambridge: Polity Press.

  • Underwood, Ted. Algorithmic Modeling: Or, Modeling Data We Do Not Yet Understand. In Flanders and Jannidis, The Shape of Data in Digital Humanities, 250–263.

  • Welbers, Kasper, Wouter Van Atteveldt, and Kenneth Benoit. 2017. Text Analysis in R. Communication Methods and Measures 11 (4): 245–265. https://doi.org/10.1080/19312458.2017.1387238.

  • Wickham, Hadley, and Garrett Grolemund. 2017. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. Sebastopol: O’Reilly.

  • Zhang, Yin, Rong Jin, and Zhi-Hua Zhou. 2010. Understanding Bag-of-Words Model: A Statistical Framework. International Journal of Machine Learning and Cybernetics 1 (1–4): 43–52. https://doi.org/10.1007/s13042-010-0001-0.

Corresponding author

Correspondence to Billy Hall.

Copyright information

© 2021 The Author(s)

About this chapter

Cite this chapter

Hall, B. (2021). Text Mining and Data Visualization: Exploring Cultural Formations and Structural Changes in Fifty Years of Eighteenth-Century Poetry Criticism (1967–2018). In: Baird, I. (eds) Data Visualization in Enlightenment Literature and Culture. Palgrave Macmillan, Cham. https://doi.org/10.1007/978-3-030-54913-8_5
