Extending monitoring methods to textual data: a research agenda

Quality & Quantity

Abstract

Textual data has become increasingly common in business analytics data sets. Concept-based text mining offers a way to extract meaningful information from text, but methods for monitoring customer perceptions of business processes and products, as expressed in customer-generated documents, are not readily available. We examine the results of two text-mining algorithms and review issues observed in the data that affect loading those results onto a newly proposed monitoring platform analogous to statistical process control charts. Finally, we discuss several topics for future research in text mining.
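As a rough illustration of what monitoring text-mining output with a control chart could look like, the sketch below applies a standard Shewhart p-chart to the share of customer comments assigned to one concept in each period. It is not the method proposed in the article: the function names, the sample counts, and the assumption that a text-mining step has already assigned each comment to a concept are all hypothetical.

```python
import math

def p_chart_limits(counts, sizes):
    """Center line and per-period 3-sigma limits for a Shewhart p-chart.

    counts[i]: comments in period i assigned to the concept (assumed input)
    sizes[i]:  total comments received in period i
    """
    p_bar = sum(counts) / sum(sizes)  # pooled proportion = center line
    limits = []
    for n in sizes:
        sigma = math.sqrt(p_bar * (1 - p_bar) / n)
        lcl = max(0.0, p_bar - 3 * sigma)  # a proportion cannot go below 0
        ucl = min(1.0, p_bar + 3 * sigma)  # or above 1
        limits.append((lcl, ucl))
    return p_bar, limits

def out_of_control(counts, sizes):
    """Indices of periods whose concept proportion falls outside the limits."""
    _, limits = p_chart_limits(counts, sizes)
    flagged = []
    for i, (c, n) in enumerate(zip(counts, sizes)):
        lcl, ucl = limits[i]
        if not (lcl <= c / n <= ucl):
            flagged.append(i)
    return flagged

# Hypothetical weekly data: comments mentioning one concept, out of 100 per week.
topic_counts = [12, 15, 11, 14, 13, 40]
comment_totals = [100, 100, 100, 100, 100, 100]

p_bar, limits = p_chart_limits(topic_counts, comment_totals)
flagged = out_of_control(topic_counts, comment_totals)
print(f"center line: {p_bar:.3f}, flagged periods: {flagged}")
```

With these made-up counts, only the final week exceeds the upper limit. A real application would also have to handle the data issues the abstract alludes to, which this sketch ignores.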



Author information

Corresponding author

Correspondence to Triss Ashton.

About this article

Cite this article

Ashton, T., Evangelopoulos, N. & Prybutok, V. Extending monitoring methods to textual data: a research agenda. Qual Quant 48, 2277–2294 (2014). https://doi.org/10.1007/s11135-013-9891-8

  • DOI: https://doi.org/10.1007/s11135-013-9891-8
