Abstract
Automated argument stance (pro/contra) detection is a challenging text categorization problem, especially when the arguments concern new topics. In previous research, we designed and evaluated an explainable machine-learning-based classifier. It achieved 96% F1 for argument stance recognition within the same topic and 60% F1 for previously unseen topics, which informed our hypothesis that argument stance recognition relies on two sets of features: general features and topic-specific features. An advantage of the described system is its quick transferability to new problems. Besides providing further details about the developed C3 TFIDF-SVM classifier, we investigate the classifier's effectiveness for different text categorization problems spanning two natural languages. In addition to quick transferability, a key feature of the described approach is the generation of human-readable explanations of why specific results were achieved. We further investigate the understandability of the generated explanations and conduct a survey to assess it.
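To make the combination of TF-IDF features, a linear SVM, and term-level explanations more concrete, the following is a minimal sketch under stated assumptions. It is not the C3 TFIDF-SVM implementation described in this chapter: the toy sentences, the smoothed IDF variant, the plain subgradient training loop, and all function names (`fit_idf`, `tfidf`, `train_linear_svm`, `explain`, `predict`) are hypothetical choices made for illustration only.

```python
import math
from collections import Counter

# Toy training data invented for illustration: +1 = pro, -1 = contra.
TRAIN = [
    ("nuclear power is clean and safe", +1),
    ("nuclear power is dangerous and risky", -1),
    ("clean energy is good", +1),
    ("dangerous waste is bad", -1),
]

def tokenize(text):
    return text.lower().split()

def fit_idf(docs):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}

def tfidf(text, idf):
    """Sparse TF-IDF vector; terms unseen during fitting are ignored."""
    tokens = [t for t in tokenize(text) if t in idf]
    counts = Counter(tokens)
    return {t: (c / len(tokens)) * idf[t] for t, c in counts.items()}

def train_linear_svm(samples, epochs=200, lr=0.5, lam=0.01):
    """Hinge-loss subgradient descent on an L2-regularised linear model (no bias term)."""
    w = {}
    for _ in range(epochs):
        for x, y in samples:
            margin = y * sum(w.get(t, 0.0) * v for t, v in x.items())
            for t in w:                  # regularisation: shrink all weights
                w[t] *= 1.0 - lr * lam
            if margin < 1.0:             # margin violated: move toward label y
                for t, v in x.items():
                    w[t] = w.get(t, 0.0) + lr * y * v
    return w

def explain(text, w, idf):
    """Per-term contributions to the decision score, largest magnitude first."""
    x = tfidf(text, idf)
    contributions = {t: w.get(t, 0.0) * v for t, v in x.items()}
    return sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)

idf = fit_idf([doc for doc, _ in TRAIN])
weights = train_linear_svm([(tfidf(doc, idf), label) for doc, label in TRAIN])

def predict(text):
    """Stance label from the sign of the summed term contributions."""
    score = sum(c for _, c in explain(text, weights, idf))
    return "pro" if score > 0 else "contra"
```

In this sketch, calling `explain("clean safe energy", weights, idf)` returns the terms that pushed the decision toward "pro" together with their signed contributions, which is one simple way a linear model can justify a result in human-readable form. A term such as "nuclear" is tied to one topic, while a term such as "dangerous" may carry over to other topics, which loosely mirrors the distinction between topic-specific and general features.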
Acknowledgements
This work has been funded by the Deutsche Forschungsgemeinschaft (DFG) within the project Empfehlungsrationalisierung, Grant Number 376059226, as part of the Priority Program “Robust Argumentation Machines (RATIO)” (SPP-1999).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Eljasik-Swoboda, T., Engel, F., Hemmje, M. (2020). Explainable and Transferrable Text Categorization. In: Hammoudi, S., Quix, C., Bernardino, J. (eds) Data Management Technologies and Applications. DATA 2019. Communications in Computer and Information Science, vol 1255. Springer, Cham. https://doi.org/10.1007/978-3-030-54595-6_1
DOI: https://doi.org/10.1007/978-3-030-54595-6_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-54594-9
Online ISBN: 978-3-030-54595-6
eBook Packages: Computer Science (R0)