On an Apparent Limit to Verb Idiosyncrasy, Given a Mapping between Argument Realization and Polysemy (or Argument Optionality)

Thomas, Scott
Perlis, Don
Oates, Tim
Full-scale natural language processing systems require a great deal of information about thousands of words. This is especially true for systems that handle the meanings of words and phrases, and it seems especially true for the verbs of a language. At first glance at least, and when viewed as argument-taking functions, verbs seem to have highly individual requirements along at least two dimensions: 1) they vary in the range of arguments they take (further complicated by polysemy, i.e. the proliferation of their senses), and 2) to a significant extent, they vary in the way those arguments are realized in syntax. Since arbitrary information must be stored anyway, such as the particular concept paired with the sound and/or spelling of a word, it seems reasonable to expect to store other potentially idiosyncratic information as well, including whatever might be needed for polysemy and argument realization. But once the meanings of words are stored, it is not clear how much else needs to be stored, in principle. Because polysemy and argument realization both show a significant degree of patterning, real speakers extrapolate from known senses and realizations. To fully model the processing of natural language, there must therefore be at least some automatic production, and/or verification, of polysemy and argument realization from the semantics. Since two phenomena are involved (polysemy and argument realization), the interaction between them could be crucial; indeed, particular instances of this interaction appear again and again in theoretical studies of syntax and meaning. Yet the real extent of the interaction has not itself been properly investigated. To investigate it, we supply, for the argument-taking configurations of 3000 English verbs, the typical kind of semantic specification (on the roles of their arguments), but then carry out a high-level analysis of the resulting patterns.
The results suggest a rule of co-occurrences: divergences in argument realization are in fact rigorously accompanied by divergences in polysemy or argument optionality. We argue that this implies the existence of highly productive mechanisms for polysemy and argument realization, thus setting some crucial groundwork for their eventual production by automated means.