Abstract
Textual data has become increasingly common in business analytic data sets. While concept-based text mining offers a method of extracting meaningful information from text data, methods for monitoring customer perceptions of business processes and products, as discussed in customer-generated documents, are not readily available. We explore the results of two text-mining algorithms and review issues observed in the data that affect uploading the results onto a newly proposed monitoring platform analogous to statistical process control charts. Finally, we discuss several topics for future research in text mining.
Cite this article
Ashton, T., Evangelopoulos, N. & Prybutok, V. Extending monitoring methods to textual data: a research agenda. Qual Quant 48, 2277–2294 (2014). https://doi.org/10.1007/s11135-013-9891-8
DOI: https://doi.org/10.1007/s11135-013-9891-8