An Approach to Tweets Categorization by Using Machine Learning Classifiers in Oil Business

Aldahawi, Hanaa; Allen, Stuart

doi:10.1007/978-3-319-18117-2_40

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9042)

Included in the following conference series:

International Conference on Intelligent Text Processing and Computational Linguistics

Abstract

The rapid growth of social media data has motivated the development of a real-time framework for understanding and extracting the meaning of that data. Text categorization is a well-known method for understanding text. It can be applied in many forms, such as authorship detection and text mining, in which useful information is extracted from documents in order to sort them automatically into predefined categories. Here, we propose a method for assigning the authors of tweets to categories. The task is performed by extracting key features from tweets and passing them to a machine learning classifier. The research shows that this multi-class classification task is very difficult, particularly when it comes to building a domain-independent machine learning classifier. Our problem specifically concerned tweets about oil companies, most of which were noisy enough to affect classification accuracy. The analytical technique used here provided structured and valuable information for oil companies.
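The pipeline described in the abstract (extract features from each tweet, then pass them to a classifier) can be illustrated with a minimal sketch. The abstract does not state which features or which classifier were used, so everything below is an assumption made for illustration: the bag-of-words features, the multinomial Naive Bayes model with add-one smoothing, the three author categories (`company`, `news`, `individual`), and the example tweets are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical labelled tweets; the author categories are illustrative only
# and are not taken from the paper.
TRAIN = [
    ("Our Q3 results are out, read the full report here", "company"),
    ("We are hiring engineers for our new offshore project", "company"),
    ("Oil giant reports record profit amid rising prices", "news"),
    ("Breaking: pipeline leak reported near the coast", "news"),
    ("Why is petrol so expensive?? so annoyed right now", "individual"),
    ("lol just filled my tank and my wallet is crying", "individual"),
]

TOKEN_RE = re.compile(r"[a-z0-9#@']+")


def extract_features(tweet):
    """Lowercase the tweet, replace URLs with a placeholder, return tokens."""
    text = re.sub(r"https?://\S+", " urltoken ", tweet.lower())
    return TOKEN_RE.findall(text)


def train(examples):
    """Count classes and per-class token frequencies for Naive Bayes."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tweet, label in examples:
        class_counts[label] += 1
        for tok in extract_features(tweet):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def predict(model, tweet):
    """Return the label with the highest smoothed log-probability."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(class_counts):
        # Log prior of the class.
        score = math.log(class_counts[label] / total)
        # Add-one (Laplace) smoothing over the training vocabulary.
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in extract_features(tweet):
            if tok in vocab:  # tokens never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAIN)
print(predict(model, "Breaking: pipeline leak reported"))
```

With a training set this small the result only demonstrates the mechanics. The noise the abstract mentions (misspellings, slang, links) is one reason real tweets would need far more labelled data and more careful preprocessing than this sketch provides.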

Author information

Authors and Affiliations

School of Computer Science and Informatics, Cardiff University, Cardiff, UK
Hanaa Aldahawi & Stuart Allen

Corresponding author

Correspondence to Hanaa Aldahawi.

Editor information

Editors and Affiliations

Centro de Investigación en Computación, Instituto Politécnico Nacional, Mexico DF, Mexico
Alexander Gelbukh

About this paper

Cite this paper

Aldahawi, H., Allen, S. (2015). An Approach to Tweets Categorization by Using Machine Learning Classifiers in Oil Business. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2015. Lecture Notes in Computer Science, vol 9042. Springer, Cham. https://doi.org/10.1007/978-3-319-18117-2_40

DOI: https://doi.org/10.1007/978-3-319-18117-2_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18116-5
Online ISBN: 978-3-319-18117-2
eBook Packages: Computer Science, Computer Science (R0)
