Text: now in 2D! A framework for lexical expansion with contextual similarity


Chris Biemann, TU Darmstadt, Germany
Martin Riedl, TU Darmstadt, Germany

Abstract


A new metaphor of two-dimensional text for data-driven semantic modeling of natural language is proposed, which provides an entirely new angle on the representation of text: not only are syntagmatic relations annotated in the text, but paradigmatic relations are also made explicit by generating lexical expansions. We operationalize distributional similarity in a general framework for large corpora, and describe a new method to generate similar terms in context. Our evaluation shows that distributional similarity is able to produce high-quality lexical resources in an unsupervised and knowledge-free way, and that our highly scalable similarity measure yields better scores in a WordNet-based evaluation than previous measures for very large corpora. Evaluating on a lexical substitution task, we find that our contextualization method improves over a non-contextualized baseline across all parts of speech, and we show how the metaphor can be applied successfully to part-of-speech tagging. A number of ways to extend and improve the contextualization method within our framework are discussed. Unlike comparable approaches, our framework defines a model of lexical expansions in context that can generate the expansions rather than rank a given list, and thus does not require existing lexical-semantic resources.

Keywords


distributional semantics; lexical expansion; contextual similarity; computational semantics



DOI: http://dx.doi.org/10.15398/jlm.v1i1.60

ISSN of the paper edition: 2299-856X