Overview

Authors:

Jacques Savoy
Department of Computer Science, University of Neuchatel, Neuchâtel, Switzerland

Presents machine-learning models used to solve stylometric questions such as authorship attribution, author profiling, and fake news detection
Illustrates the approaches discussed using three real case studies: Elena Ferrante, tweet bots, and political speeches by US presidents over the last 230 years
Complemented by a GitHub website with additional examples and datasets in R

Table of contents (11 chapters)

Fundamental Concepts and Models
1. Introduction to Stylistic Models and Applications
2. Basic Lexical Concepts and Measurements
3. Distance-Based Approaches

Advanced Models and Evaluation
4. Evaluation Methodology and Test Corpora
5. Features Identification and Selection
6. Machine Learning Models
7. Advanced Models for Stylometric Applications

Cases Studies
8. Elena Ferrante: A Case Study in Authorship Attribution
9. Author Profiling of Tweets
10. Applications to Political Speeches
11. Conclusion

About this book

This book presents methods and approaches used to identify the true author of a doubtful document or text excerpt. It provides a broad introduction to text categorization problems grounded in stylistic features, such as authorship attribution, inferring psychological traits of the author, and detecting fake news. In particular, it presents in detail machine learning models as valuable tools for verifying hypotheses or revealing significant patterns hidden in datasets. Stylometry is a multidisciplinary field combining linguistics with both statistics and computer science.

The content is divided into three parts. The first part, comprising the first three chapters, offers a general introduction to stylometry, its potential applications, and its limitations. It also introduces the running example used to illustrate the concepts discussed throughout the remainder of the book. The four chapters of the second part focus on computer science, and in particular on machine learning models for solving stylometric problems. Several general strategies used to identify, extract, select, and represent stylistic markers are explained. As deep learning is an active field of research, the book also covers neural network models and word embeddings applied to stylometry, along with a general introduction to the deep learning approach to stylometric questions. The third part applies the previously discussed approaches to real cases: an authorship attribution problem, seeking to discover the secret hand behind the nom de plume Elena Ferrante, an Italian writer known worldwide for her My Brilliant Friend saga; author profiling to determine whether a set of tweets was generated by a bot or a human being and, in the latter case, whether the author is a man or a woman; and an exploration of stylistic variations over time using US political speeches covering a period of ca. 230 years.

A solutions-based approach is adopted throughout the book, and explanations are supported by examples written in R. To complement the main content and discussions on stylometric models and techniques, examples and datasets are freely available at the author’s GitHub website.

Authors and Affiliations

Department of Computer Science, University of Neuchatel, Neuchâtel, Switzerland

Jacques Savoy

About the author

Jacques Savoy is a Full Professor of Computer Science at the University of Neuchatel (Switzerland). His research interests mainly include natural language processing, particularly information retrieval for languages other than English (European, Asian, and Indian), as well as multilingual and cross-lingual information retrieval. For many years he has participated in various evaluation campaigns (TREC, CLEF, NTCIR, FIRE) dealing with these questions. His current research focuses on statistical modeling and evaluation in natural language processing, including text clustering and categorization as well as authorship attribution.

Bibliographic Information

Book Title: Machine Learning Methods for Stylometry
Book Subtitle: Authorship Attribution and Author Profiling
Authors: Jacques Savoy
DOI: https://doi.org/10.1007/978-3-030-53360-1
Publisher: Springer Cham
eBook Packages: Computer Science, Computer Science (R0)
Copyright Information: Springer Nature Switzerland AG 2020
Hardcover ISBN: 978-3-030-53359-5 (published 29 September 2020)
eBook ISBN: 978-3-030-53360-1 (published 28 September 2020)
Edition Number: 1
Number of Pages: XIX, 286
Number of Illustrations: 10 b/w illustrations, 101 illustrations in colour
Topics: Natural Language Processing (NLP), Information Storage and Retrieval, Computational Linguistics, Machine Learning, Library Science
