Abstract
This paper presents and explores the idea of deriving numerical indicators from texts, that is, converting text data to numerical data that has predictive or diagnostic value. One application of such a general capability is to the provisional identification of networks, or rather, of associations within networks. Conversely, given a network structure among entities that are associated with various texts, the network structure can itself contribute usefully to construction of indicators derived from texts. The focus of the paper is on basic concepts and methods for deriving indicators from texts. Much research remains to be done.
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kimbrough, S.O., Lee, T.Y., Oktem, U. (2012). On Deriving Indicators from Texts. In: Dolk, D., Granat, J. (eds) Modeling for Decision Support in Network-Based Services. Lecture Notes in Business Information Processing, vol 42. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27612-5_9
DOI: https://doi.org/10.1007/978-3-642-27612-5_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27611-8
Online ISBN: 978-3-642-27612-5
eBook Packages: Computer Science, Computer Science (R0)