Temporal predictive regression models for linguistic style analysis

Authors

  • Carmen Klaussner, Trinity College Dublin
  • Carl Vogel, Trinity College Dublin

Keywords:

Natural Language Processing, text analysis, statistics

Abstract

This paper presents work on modelling language change over time. In particular, we use different feature types, i.e. character, word stem, part-of-speech and word n-grams, to predict the publication year of texts. We do this for two different corpora: one containing texts by two individual authors, published over a period of approximately fifty years, and a larger reference set containing a variety of text types and authors, which approximates an average language style over the same temporal span. Our linear regression models achieve good accuracy in the two-author case and very good results in the case of the reference set.
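
As a rough illustration of the general idea only (not the authors' implementation), the sketch below regresses publication year on the relative frequency of a single character bigram using closed-form least squares. The helper names, the choice of the bigram "th", and the three-text toy corpus are all hypothetical and invented for this example; the paper itself works with several feature types and real corpora.

```python
from collections import Counter


def ngram_relative_freqs(text, n=2):
    """Return the relative frequency of each character n-gram in `text`."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    total = len(grams)
    if total == 0:
        return {}
    return {gram: count / total for gram, count in Counter(grams).items()}


def fit_simple_ols(xs, ys):
    """Closed-form least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # assumes the feature is not constant across texts
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_year(text, gram, intercept, slope, n=2):
    """Predict a publication year from the frequency of one n-gram."""
    freq = ngram_relative_freqs(text, n).get(gram, 0.0)
    return intercept + slope * freq


# Hypothetical toy corpus (year, text); invented purely for illustration.
TOY_CORPUS = [
    (1860, "thither the thane"),
    (1885, "the other day"),
    (1910, "a day at sea"),
]

FEATURE = "th"  # assumed feature; any n-gram with varying frequency would do
years = [year for year, _ in TOY_CORPUS]
freqs = [ngram_relative_freqs(text).get(FEATURE, 0.0) for _, text in TOY_CORPUS]
intercept, slope = fit_simple_ols(freqs, years)
print(predict_year("then they went there", FEATURE, intercept, slope))
```

A realistic setup would use many n-gram features at once in a multiple regression, and would evaluate predictions on held-out texts rather than on the training data.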

DOI:

https://doi.org/10.15398/jlm.v6i1.177

Published

2018-08-31

How to Cite

Klaussner, C., & Vogel, C. (2018). Temporal predictive regression models for linguistic style analysis. Journal of Language Modelling, 6(1), 175–222. https://doi.org/10.15398/jlm.v6i1.177

Section

Articles