TY  - JOUR
AU  - Kennedy, Alistair
AU  - Szpakowicz, Stan
PY  - 2014/03/17
Y2  - 2024/03/28
TI  - Evaluation of automatic updates of Roget’s Thesaurus
JF  - Journal of Language Modelling
JA  - JLM
VL  - 2
IS  - 1
SE  - Articles
DO  - 10.15398/jlm.v2i1.78
UR  - https://jlm.ipipan.waw.pl/index.php/JLM/article/view/78
SP  - 1–49
AB  - Thesauri and similarly organised resources attract increasing interest of Natural Language Processing researchers. Thesauri age fast, so there is a constant need to update their vocabulary. Since a manual update cycle takes considerable time, automated methods are required. This work presents a tuneable method of measuring semantic relatedness, trained on Roget’s Thesaurus, which generates lists of terms related to words not yet in the Thesaurus. Using these lists of terms, we experiment with three methods of adding words to the Thesaurus. We add, with high confidence, over 5500 and 9600 new words and word senses to versions of Roget’s Thesaurus from 1911 and 1987 respectively. We evaluate our work both manually, and by applying the updated thesauri in three NLP tasks: selection of the best synonym from a set of candidates, pseudo-word-sense disambiguation, and SAT-style analogy problems. We find that the newly added words are of high quality. The additions significantly improve the performance of Roget’s-based methods in these NLP tasks. It compares favourably to the performance of WordNet-based methods. Our methods are general enough to work with different versions of Roget’s Thesaurus.
ER  - 