Text Mining, pp 157–175

Deception Detection Within and Across Cultures

Abstract

In this paper, we address the task of cross-cultural deception detection. Using crowdsourcing, we collect four deception datasets covering three predetermined topics: two in English (one from the United States and one from India), one in Romanian, and one in Spanish obtained from speakers from Mexico. We also collect two additional datasets, one in English from the United States and one in Romanian, where the topic is not pre-specified. We run comparative experiments to evaluate the accuracies of deception classifiers built for each culture, and to analyze classification differences within and across cultures. Our results show that we can leverage cross-cultural information, either through translation or through equivalent semantic categories, and build deception classifiers with a performance ranging between 60% and 70%.
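The "equivalent semantic categories" route mentioned in the abstract could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the two lexicons, the category names, the example sentences, and their labels are all invented for the sketch, and a simple nearest-centroid classifier stands in for the SVM named in the chapter's notes. The idea shown is only that texts in different languages can be mapped to the same category-frequency features, so a model trained on one language can be applied to another.

```python
from collections import Counter

# Toy lexicons invented for illustration (NOT the real LIWC dictionaries).
# Each maps a word to a language-independent semantic category.
LEXICON_EN = {"i": "SELF", "my": "SELF", "never": "NEGATE", "not": "NEGATE",
              "friend": "SOCIAL", "family": "SOCIAL"}
LEXICON_ES = {"yo": "SELF", "mi": "SELF", "nunca": "NEGATE", "no": "NEGATE",
              "amigo": "SOCIAL", "familia": "SOCIAL"}
CATEGORIES = ["SELF", "NEGATE", "SOCIAL"]


def category_vector(text, lexicon):
    """Return the relative frequency of each shared category in `text`."""
    tokens = text.lower().split()
    counts = Counter(lexicon[t] for t in tokens if t in lexicon)
    total = max(len(tokens), 1)
    return [counts[c] / total for c in CATEGORIES]


def train_centroids(vectors, labels):
    """Average the feature vectors of each class (nearest-centroid training)."""
    sums, sizes = {}, Counter(labels)
    for vec, label in zip(vectors, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
    return {label: [s / sizes[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, vec):
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(vec, centroid))
    return min(centroids, key=lambda label: dist(centroids[label]))


# Invented examples; the link between categories and labels is arbitrary
# and says nothing about real cues to deception.
TRAIN_EN = [("i visited my friend", "truthful"),
            ("my family is my life", "truthful"),
            ("it was never not fine", "deceptive"),
            ("that is not true never", "deceptive")]
TEST_ES = [("yo visito a mi amigo", "truthful"),
           ("eso no es verdad nunca", "deceptive")]

# Train on English category vectors, then classify Spanish texts
# through the same category space.
train_vecs = [category_vector(text, LEXICON_EN) for text, _ in TRAIN_EN]
train_labels = [label for _, label in TRAIN_EN]
centroids = train_centroids(train_vecs, train_labels)

predictions = [predict(centroids, category_vector(text, LEXICON_ES))
               for text, _ in TEST_ES]
print(predictions)
```

Because both languages are projected onto the same three categories, the classifier never sees language-specific words. The translation route from the abstract would instead map the test text into the training language before feature extraction.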

Notes

  1. We use the SVM classifier implemented in the Weka toolkit, with its default settings.

  2. http://www.liwc.net/descriptiontable1.php.

  3. http://www.bing.com/dev/en-us/dev-center.

Acknowledgements

This material is based in part upon work supported by National Science Foundation awards #1344257 and #1355633 and by DARPA-BAA-12-47 DEFT grant #12475008. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation or the Defense Advanced Research Projects Agency.

Author information

Corresponding author

Correspondence to Veronica Perez-Rosas.

Copyright information

© 2014 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Perez-Rosas, V., Bologa, C., Burzo, M., Mihalcea, R. (2014). Deception Detection Within and Across Cultures. In: Biemann, C., Mehler, A. (eds) Text Mining. Theory and Applications of Natural Language Processing. Springer, Cham. https://doi.org/10.1007/978-3-319-12655-5_8

  • DOI: https://doi.org/10.1007/978-3-319-12655-5_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-12654-8

  • Online ISBN: 978-3-319-12655-5

  • eBook Packages: Computer Science, Computer Science (R0)