Abstract
In this paper, we address the task of cross-cultural deception detection. Using crowdsourcing, we collect four deception datasets covering three predetermined topics: two in English (one originating from the United States and one from India), one from Romanian speakers, and one in Spanish obtained from speakers from Mexico. We also collect two additional datasets, one in English from the United States and one in Romanian, where the topic is not pre-specified. We run comparative experiments to evaluate the accuracies of deception classifiers built for each culture, and to analyze classification differences within and across cultures. Our results show that we can leverage cross-cultural information, either through translation or through equivalent semantic categories, and build deception classifiers with a performance ranging between 60% and 70%.
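As a rough illustration of the "equivalent semantic categories" idea, the minimal sketch below maps words from two languages onto a shared set of categories, so that texts in different languages yield comparable feature vectors. It is only a sketch under stated assumptions: the lexicons, the category names, and the helper `category_features` are hypothetical toy stand-ins, not the resources or features used in the chapter.

```python
from collections import Counter

# Hypothetical toy lexicons (assumption): each maps a surface word to a
# shared semantic category. They are illustrative, not a real lexicon.
LEXICONS = {
    "en": {"i": "SELF", "my": "SELF", "never": "NEGATION",
           "not": "NEGATION", "friend": "SOCIAL"},
    "es": {"yo": "SELF", "mi": "SELF", "nunca": "NEGATION",
           "no": "NEGATION", "amigo": "SOCIAL"},
}
CATEGORIES = ["SELF", "NEGATION", "SOCIAL"]


def category_features(text, lang):
    """Return, for each shared category, the fraction of tokens in it."""
    tokens = text.lower().split()
    lexicon = LEXICONS[lang]
    counts = Counter(lexicon[t] for t in tokens if t in lexicon)
    total = len(tokens) or 1  # avoid division by zero on empty text
    return [counts[c] / total for c in CATEGORIES]


# Two parallel sentences land on the same language-independent vector.
en_vec = category_features("I never lie to my friend", "en")
es_vec = category_features("yo nunca miento a mi amigo", "es")
```

Because both vectors live in the same category space, a classifier trained on features from one language could in principle be applied to features from another without translating the text itself.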
Notes
- 1. We use the SVM classifier implemented in the Weka toolkit, with its default settings.
- 2.
- 3.
Acknowledgements
This material is based in part upon work supported by National Science Foundation awards #1344257 and #1355633 and by DARPA-BAA-12-47 DEFT grant #12475008. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation or the Defense Advanced Research Projects Agency.
Copyright information
© 2014 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Perez-Rosas, V., Bologa, C., Burzo, M., Mihalcea, R. (2014). Deception Detection Within and Across Cultures. In: Biemann, C., Mehler, A. (eds) Text Mining. Theory and Applications of Natural Language Processing. Springer, Cham. https://doi.org/10.1007/978-3-319-12655-5_8
DOI: https://doi.org/10.1007/978-3-319-12655-5_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12654-8
Online ISBN: 978-3-319-12655-5
eBook Packages: Computer Science, Computer Science (R0)