
Machine Learning Methods for Stylometry [electronic resource] : Authorship Attribution and Author Profiling /

Material type: Text
Publisher: Cham : Springer International Publishing : Imprint: Springer, 2020
Edition: 1st ed. 2020
Description: XIX, 286 p. 111 illus., 101 illus. in color. online resource
Content type:
  • text
Media type:
  • computer
Carrier type:
  • online resource
ISBN:
  • 9783030533601
Additional physical formats: Printed edition: No title; Printed edition: No title
DDC classification:
  • 006.35 23
LOC classification:
  • QA76.9.N38
Contents:
Part I: Fundamental Concepts and Models -- 1. Introduction to Stylistic Models and Applications -- 2. Basic Lexical Concepts and Measurements -- 3. Distance-Based Approaches -- Part II: Advanced Models and Evaluation -- 4. Evaluation Methodology and Test Corpora -- 5. Features Identification and Selection -- 6. Machine Learning Models -- 7. Advanced Models for Stylometric Applications -- Part III: Case Studies -- 8. Elena Ferrante: A Case Study in Authorship Attribution -- 9. Author Profiling of Tweets -- 10. Applications to Political Speeches -- 11. Conclusion.
In: Springer Nature eBook
Summary: This book presents methods and approaches used to identify the true author of a doubtful document or text excerpt. It provides a broad introduction to text categorization problems grounded in stylistic features, such as authorship attribution, identifying psychological traits of the author, and detecting fake news. Specifically, it presents in detail machine learning models as valuable tools for verifying hypotheses or revealing significant patterns hidden in datasets. Stylometry is a multi-disciplinary field combining linguistics with both statistics and computer science. The content is divided into three parts. The first, which consists of the first three chapters, offers a general introduction to stylometry, its potential applications, and its limitations. It also introduces the running example used to illustrate the concepts discussed throughout the remainder of the book. The four chapters of the second part are oriented more toward computer science, with a focus on machine learning models for solving stylometric problems. Several general strategies used to identify, extract, select, and represent stylistic markers are explained. As deep learning represents an active field of research, information on neural network models and word embeddings applied to stylometry is provided, as well as a general introduction to the deep learning approach to solving stylometric questions.
The third part illustrates the application of the previously discussed approaches to real cases: an authorship attribution problem, seeking to discover the secret hand behind the nom de plume Elena Ferrante, an Italian writer known worldwide for her My Brilliant Friend saga; author profiling to identify whether a set of tweets was generated by a bot or a human being and, in the latter case, whether the author is a man or a woman; and an exploration of stylistic variations over time using US political speeches covering a period of ca. 230 years. A solutions-based approach is adopted throughout the book, and explanations are supported by examples written in R. To complement the main content and discussions on stylometric models and techniques, examples and datasets are freely available at the author’s GitHub website.
No physical items for this record


© 2024 IIIT-Delhi, library@iiitd.ac.in