ICMWT 2017: Mobile and Wireless Technologies 2017 pp 465-469 | Cite as
Large Scale Text Classification with Efficient Word Embedding
Abstract
This article offers an empirical exploration on the efficient use of word-level convolutional neural networks (word-CNN) for large-scale text classification. Generally, the word-CNNs are difficult to train on large-scale datasets as the size of word embedding dramatically increases as the size of vocabulary increases. In order to handle this issue, this paper presents a de-noise approach to word embedding. We compare our model with several recently proposed CNN models on publicly available dataset. The experimental results show that proposed method improves the usefulness of word-CNN and increases the accuracy of text classification.
Keywords
Text Classification Sentiment Analysis Convolutional Neural Network Vocabulary Size Distinct WordNotes
Acknowledgments
This research was supported by the MISP(Ministry of Science, ICT & Future Planning), Korea, under the National Program for Excellence in SW) supervised by the IITP(Institute for Information & communications Technology Promotion).
References
- 1.LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324CrossRefGoogle Scholar
- 2.Lai S, Xu L, Liu K, Zhao J (2015) Recurrent convolutional neural networks for text classification. In: Proceedings of AAAI, vol 333, pp 2267–2273Google Scholar
- 3.Yih W, He X, Meek C (2014) Semantic parsing for single-relation question answering. In: Proceedings of the 52th annual meeting of the association for computational linguistics, pp 643–648Google Scholar
- 4.Kim Y (2014) Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882
- 5.dos Santos CN, Gatti M (2014) Deep convolutional neural networks for sentiment analysis of short texts. In: Proceedings of 24th international conference on computational linguistics, pp 69–78Google Scholar
- 6.Zhang X, Zhao J, LeCun Y (2015) Character-level convolutional networks for text classification. In: Advances in neural information processing systems, pp 649–657Google Scholar
- 7.Mikolov T, et al. (2013) Distributed representations of words and phrases and their compositionality. In: Advances in neural information processing systems, pp 3111–3119Google Scholar
- 8.Kalchbrenner N, Grefenstette E, Blunsom P (2014) A convolutional neural network for modelling sentences. In: Proceedings of the 52nd annual meeting of the association for computational linguistics, pp 655–665Google Scholar
- 9.Lehmann J, Isele R, Jakob M, Jentzsch A, Kontokostas D, Mendes PN, Hellmann S et al (2015) DBPedia - a large-scale, multilingual knowledge base extracted from Wikipedia. Semant Web 6(2):167–195Google Scholar
- 10.