Abstract
This chapter covers the process of encoding texts into numerical vectors that serve as their representations; an overview is given in Sect. 3.1.
Copyright information
© 2019 Springer International Publishing AG, part of Springer Nature
About this chapter
Cite this chapter
Jo, T. (2019). Text Encoding. In: Text Mining. Studies in Big Data, vol 45. Springer, Cham. https://doi.org/10.1007/978-3-319-91815-0_3
DOI: https://doi.org/10.1007/978-3-319-91815-0_3
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91814-3
Online ISBN: 978-3-319-91815-0
eBook Packages: Engineering, Engineering (R0)