Inferring inflection classes with description length

Authors

  • Sacha Beniamine Université Paris Diderot, Laboratoire de linguistique formelle
  • Olivier Bonami Université Paris Diderot, Laboratoire de linguistique formelle
  • Benoît Sagot Inria

Keywords:

morphology, MDL, inflection classes

Abstract

We discuss the notion of an inflection class system, a traditional ingredient of the description of inflection systems of nontrivial complexity. We distinguish systems of microclasses, which partition a set of lexemes into classes with identical behavior, from systems of macroclasses, which group lexemes that are similar enough into a few larger classes. On the basis of the intuition that macroclasses should contribute to a concise description of the system, we propose one algorithmic method for inferring macroclasses from raw inflectional paradigms, based on minimisation of the description length of the system under a given strategy for identifying morphological alternations in paradigms. We then exhibit classifications produced by our implementation on French and European Portuguese conjugation data, and argue that they constitute an appropriate systematisation of traditional classifications. To arrive at such a convincing systematisation, however, it is crucial that we use a local approach to class similarity (based on pairwise comparisons of paradigm cells) rather than a global approach (based on simultaneous comparison of all cells). We conclude that it is indeed possible to infer inflectional macroclasses objectively.

DOI:

https://doi.org/10.15398/jlm.v5i3.184

Published

2018-02-27

How to Cite

Beniamine, S., Bonami, O., & Sagot, B. (2018). Inferring inflection classes with description length. Journal of Language Modelling, 5(3), 465–525. https://doi.org/10.15398/jlm.v5i3.184

Section

Articles