Journal of Language Modelling https://jlm.ipipan.waw.pl/index.php/JLM <p>Journal of Language Modelling is a free (for readers and authors alike) open-access peer-reviewed journal aiming to bridge the gap between theoretical linguistics and natural language processing. Although typical articles are concerned with linguistic generalisations – either with their application in natural language processing, or with their discovery in language corpora – possible topics range from linguistic analyses which are sufficiently precise to be implementable to mathematical models of aspects of language, and further to computational systems making non-trivial use of linguistic insights.</p> <p><br />Papers are reviewed in under three months from receipt, and they appear as soon as they have been accepted, without the delays typical of traditional paper journals. Accepted articles are then collected in half-yearly numbers and yearly volumes, with continuous page numbering, and are made available as hard copies via print on demand, at a nominal fee. At the same time, Journal of Language Modelling has a fully traditional view of quality: all papers are carefully refereed by at least three reviewers (usually including at least one member of the Editorial Board) and they are only accepted if they adhere to the highest scientific, typographic and stylistic standards.<br /><br />Apart from full-length articles, the journal also accepts squibs and polemics with other papers. All journal content is published under the <a href="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 <span class="cc-license-title">International</span> Licence</a>. 
JLM is indexed by <a href="https://www.scopus.com/results/results.uri?src=s&amp;sot=b&amp;sdt=b&amp;origin=searchbasic&amp;rr=&amp;sl=39&amp;s=SRCTITLE(Journal%20of%20Language%20Modelling)&amp;searchterm1=Journal%20of%20Language%20Modelling&amp;searchTerms=&amp;connectors=&amp;field1=SRCTITLE&amp;fields=">SCOPUS</a>, <a href="https://dbh.nsd.uib.no/publiseringskanaler/erihplus/periodical/info?id=480322">ERIH PLUS</a>, <a href="http://dblp.uni-trier.de/db/journals/jlm/">DBLP</a>, <a href="https://doaj.org/toc/a339b4740e97425ea7e5a2a32655eba5">DOAJ</a>, <a href="https://www.ebsco.com/">EBSCO</a>, <a href="http://www.linguisticsabstracts.com/">Linguistics Abstracts Online</a>, <a href="https://www.mla.org/Publications/MLA-International-Bibliography/MLA-Directory-of-Periodicals/About-the-Directory-of-Periodicals">MLA Directory of Periodicals</a>, and the Polish <a href="http://www.nauka.gov.pl/aktualnosci-ministerstwo/juz-sa-nowe-listy-punktowanych-czasopism-na-2015-rok.html">Ministry of Science and Education</a> (list B). JLM is also a member of <a href="http://oaspa.org/member/journal-of-language-modelling/">OASPA</a>.<br /><br />In order to submit an article, you have to be registered. 
Further <strong><a href="https://jlm.ipipan.waw.pl/index.php/JLM/about/submissions">submission instructions for Authors are available here</a></strong>.</p> <p>All content is licensed under the <a href="http://creativecommons.org/licenses/by/4.0/" target="_blank" rel="license noopener">Creative Commons Attribution 4.0 International Licence</a>.</p> jlm@ipipan.waw.pl (JLM Editors) Tue, 10 Dec 2024 21:55:51 +0100 Computational approaches to morphological typology https://jlm.ipipan.waw.pl/index.php/JLM/article/view/431 <p>Introduction to the Special Issue.</p> Micha Elsner, Sacha Beniamine Copyright (c) 2024 Micha Elsner, Sacha Beniamine https://creativecommons.org/licenses/by/4.0 Tue, 10 Dec 2024 00:00:00 +0100 Alignment everywhere all at once https://jlm.ipipan.waw.pl/index.php/JLM/article/view/360 <p class="p1">This article presents the structure of the ATLAs Alignment Module, a typological database designed to exhaustively capture language-internal variation in argument marking (indexing and flagging). The flexible design of our database can be extended to cover further aspects of morphosyntactic alignment. We demonstrate with a small diversity sample how the database can be queried and the data aggregated at different levels of structure (e.g. for a language as a whole or for individual referential types in the form of alignment statements) for the purposes of cross-linguistic comparison. 
The database is made available in the Cross-Linguistic Data Formats (CLDF), and we provide code that generates an array of aggregations.</p> David Inman, Alena Witzlack-Makarevich, Natalia Chousou-Polydouri, Melvin Steiger Copyright (c) 2024 David Inman, Alena Witzlack-Makarevich, Natalia Chousou-Polydouri, Melvin Steiger https://creativecommons.org/licenses/by/4.0 Tue, 10 Dec 2024 00:00:00 +0100 Zero marking in inflection https://jlm.ipipan.waw.pl/index.php/JLM/article/view/361 <p>This study examines zero marking, i.e. the absence of an overt exponent, in adjectival, nominal, and verbal inflectional morphology across languages. The first part of the study provides an overview of the distribution of zero markers in inflection paradigms using the UniMorph dataset. The results show that there is a general preference against zero marking. The distribution of zero markers varies to a great extent across languages and lemmas, the only robust trend being that they are avoided in cells that express a high number of grammatical values. The second part of this study examines the association between marker frequencies and phonological length, using the Universal Dependencies treebanks. While token frequency is a good predictor for the length of overt markers, it does not account for the occurrence of zero markers. This is taken as evidence to support a differential non-development scenario of zero marking rather than a phonetic reduction scenario.</p> Laura Becker Copyright (c) 2024 Laura Becker https://creativecommons.org/licenses/by/4.0 Tue, 10 Dec 2024 00:00:00 +0100 An analogical approach to the typology of inflectional complexity https://jlm.ipipan.waw.pl/index.php/JLM/article/view/352 <p>This paper studies the inflectional complexity of nouns, verbs and adjectives in 137 datasets, across 71 languages. 
I follow Ackerman and Malouf (2013) in distinguishing between E(numerative) complexity and I(ntegrative) complexity. The first one encompasses aspects of inflection, like the number of principal parts, paradigm size, and number of exponents, while the second one captures the implicative relations between paradigm cells (how difficult it is to predict one cell of a paradigm knowing a different cell). I provide a formalism and computational implementation to estimate both I- and E-complexity expressed through Word and Paradigm morphology (Blevins 2006, 2016), which is flexible and powerful enough for typological research. The results show that, as suggested by Ackerman and Malouf (2013), I-complexity is relatively low across the languages in the sample, with only two clear exceptions (Navajo and Yaitepec-Chatino). The results also show that E-complexity can vary considerably crosslinguistically. Finally, I show there is a clear correlation between I- and E-complexity.</p> Matías Guzmán Naranjo Copyright (c) 2024 Matías Guzmán Naranjo https://creativecommons.org/licenses/by/4.0 Tue, 10 Dec 2024 00:00:00 +0100 Corpus-based measures discriminate inflection and derivation cross-linguistically https://jlm.ipipan.waw.pl/index.php/JLM/article/view/351 <p>In morphology, a distinction is commonly drawn between inflection and derivation. However, a precise definition of this distinction which reflects the way it manifests across languages remains elusive within linguistic theory, typically being based on subjective tests. In this study, we present four quantitative measures which use the statistics of a raw text corpus in a language to estimate to what extent a given morphological construction changes the form and distribution of lexemes. In particular, we measure both the average and the variance of this change across lexemes. 
Crucially, distributional information captures syntactic and semantic properties and can be operationalised by word embeddings. Based on a sample of 26 languages, we find that we can reconstruct 89±1% of the classification of constructions into inflection and derivation in UniMorph using our four measures, providing large-scale cross-linguistic evidence that the concepts of inflection and derivation are associated with measurable signatures in terms of form and distribution that behave consistently across a variety of languages. We also use our measures to identify in a quantitative way whether categories of inflection which have been considered noncanonical in the linguistic literature, such as inherent inflection or transpositions, appear so in terms of properties of their form and distribution. We find that while combining multiple measures reduces the amount of overlap between inflectional and derivational constructions, there are still many constructions near the model’s decision boundary between the two categories. This indicates a gradient, rather than categorical, distinction.</p> Coleman Haley, Edoardo M. Ponti, Sharon Goldwater Copyright (c) 2024 Coleman Haley, Edoardo M. Ponti, Sharon Goldwater https://creativecommons.org/licenses/by/4.0 Tue, 10 Dec 2024 00:00:00 +0100