Planning for Text Analytics

Anandarajan, Murugan; Hill, Chelsey; Nolan, Thomas

doi:10.1007/978-3-319-95663-3_3

Part of the book series: Advances in Analytics and Data Science (AADS, volume 2)

Abstract

This chapter encourages readers to consider the reason for their analysis so that they can chart the correct path for conducting it. It outlines how to plan a text analytics project, starting by asking the analyst to consider the objective, data availability, cost, and desired outcome. Analysis paths are then presented as possible ways to achieve the goal.

Notes

1. In Microsoft Excel, random numbers can be generated with the function =RANDBETWEEN, which takes a minimum and a maximum value as inputs. In the example, the function would be =RANDBETWEEN(1,20), copied to four cells to produce four random integers between 1 and 20.
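
For readers working outside Excel, the draw described in the note can be approximated in Python. This is a minimal sketch and not part of the chapter: the function names `randbetween` and `sample_without_repeats` are hypothetical, and the optional `seed` parameter is an assumption added only so that a draw can be repeated. Like RANDBETWEEN, `random.randint` includes both endpoints and may return the same value more than once.

```python
import random
from typing import List, Optional


def randbetween(low: int, high: int, count: int, seed: Optional[int] = None) -> List[int]:
    """Draw `count` integers uniformly from low..high, both ends inclusive.

    Mirrors copying =RANDBETWEEN(low, high) into `count` cells, so
    repeated values are possible.
    """
    rng = random.Random(seed)  # separate generator; seed is optional
    return [rng.randint(low, high) for _ in range(count)]


def sample_without_repeats(low: int, high: int, count: int, seed: Optional[int] = None) -> List[int]:
    """Draw `count` distinct integers from low..high, both ends inclusive.

    Assumption: this variant is useful when each number identifies one
    document to select and duplicates are unwanted.
    """
    rng = random.Random(seed)
    return rng.sample(range(low, high + 1), count)


# Four random integers between 1 and 20, as in the note.
draws = randbetween(1, 20, 4)
print(draws)

# Four distinct integers between 1 and 20.
distinct_draws = sample_without_repeats(1, 20, 4)
print(distinct_draws)
```

If the four numbers are meant to pick four different items, the second function may be the closer fit, since independent draws can repeat a value.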

Author information

Authors and Affiliations

Murugan Anandarajan: LeBow College of Business, Drexel University, Philadelphia, PA, USA
Chelsey Hill: Feliciano School of Business, Montclair State University, Montclair, NJ, USA
Thomas Nolan: Mercury Data Science, Houston, TX, USA

About this chapter

Cite this chapter

Anandarajan, M., Hill, C., Nolan, T. (2019). Planning for Text Analytics. In: Practical Text Analytics. Advances in Analytics and Data Science, vol 2. Springer, Cham. https://doi.org/10.1007/978-3-319-95663-3_3

DOI: https://doi.org/10.1007/978-3-319-95663-3_3
Published: 20 October 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-95662-6
Online ISBN: 978-3-319-95663-3
eBook Packages: Business and Management (R0)
