Natural Language Interpretation, Continued

The motivation for developing the natural logic system discussed in this paper was the inadequacy of existing inference mechanisms. Those relying on pattern matching and semantic overlap could not make logical deductions such as those involving quantifiers. An example:
Premise: Every polar bear enjoyed the Arctic weather, despite the cloudiness.
Hypothesis: Every ambitious polar bear enjoyed the Arctic weather.
This inference is clearly valid: the quantifier "every" is downward monotone in its restrictor, so inserting a modifier there preserves truth. However, the semantics-oriented engines miss such deductions. The other existing type of inference engine translates natural language sentences into first-order logic. These tend to have high precision (few false positives) but poor recall (they fail to identify many positive cases).
Natural logic tries to parse the sentence for lexical and semantic information, like the first type of engine above, while at the same time computing logical relationships between lexical elements, after the fashion of the second type.
The four concepts used to construct the natural logic engine are: entailment relations, projectivity, inference, and implicatives.
In the case of entailment relations and implicatives, a concept called "monotonicity" plays an important role in determining the relationships between two tokens (lexically distinct pieces of a sentence). Words, phrases, and sentences can be upward monotone, downward monotone, or non-monotone. An upward monotone case seems to be one in which deleting modifiers preserves truth: "Some of the ancient wizards grew beards." -> "Some of the wizards grew beards." Thus, "some" is upward monotone. As mentioned before, downward monotonicity allows modifiers to be inserted while preserving truth: "None of the wizards grew beards." -> "None of the ancient wizards grew beards." Thus, "none" is downward monotone (its scope can only decrease?). A non-monotone example is the phrase "think playing chess is fun", as it seems that no modification after "think" preserves truth.
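To make the set-theoretic picture concrete, here is a toy sketch of my own (not from the paper) that treats quantifiers as functions over finite sets and checks the two monotonicity claims above; all the sets and names are invented for illustration.

```python
def some(restrictor, body):
    # "some" is true iff restrictor and body overlap; upward monotone in its restrictor.
    return bool(restrictor & body)

def none_of(restrictor, body):
    # "none" is true iff restrictor and body are disjoint; downward monotone in its restrictor.
    return not (restrictor & body)

wizards = {"allanon", "bremen", "cogline"}
ancient_wizards = {"allanon"}        # "ancient" narrows the restrictor to a subset
grew_beards = {"allanon"}
grew_horns = set()                   # nobody grew horns

# Upward monotone: deleting the modifier (widening the restrictor) preserves truth.
assert some(ancient_wizards, grew_beards)  # "Some of the ancient wizards grew beards."
assert some(wizards, grew_beards)          # => "Some of the wizards grew beards."

# Downward monotone: inserting the modifier (narrowing the restrictor) preserves truth.
assert none_of(wizards, grew_horns)          # "None of the wizards grew horns."
assert none_of(ancient_wizards, grew_horns)  # => "None of the ancient wizards grew horns."
```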
Entailment relations describe the relations between the semantic classes of two words. Natural logic employs seven operators: equality (PC = personal computer), forward entailment (pandas [ bears), backward entailment (Asians ] Mongols), alternation (bears | cats), negation (necromancers ^ non-necromancers), cover (animals u non-humans), and independence (hippopotamus # sagacious). One of the most apparent applications of these relations is in the comparison of quantifiers. For example, "at least four" u "at most six": both five and seven qualify as "at least four", but only five fits "at most six"; both six and two fit "at most six", but only six is in "at least four".
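The cover relation between those two quantifier phrases can be verified set-theoretically over a small domain; this is a quick sketch of my own, not part of NatLog.

```python
# Extensions of the two quantifier phrases over a toy domain of counts.
domain = set(range(11))
at_least_four = {n for n in domain if n >= 4}
at_most_six = {n for n in domain if n <= 6}

# Cover (u): jointly exhaustive over the domain, but not mutually exclusive.
assert at_least_four | at_most_six == domain     # every count satisfies at least one
assert at_least_four & at_most_six == {4, 5, 6}  # ...and some counts satisfy both
```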
Projectivity describes how one word may affect the entailment relations of its neighboring words. One example is negation: if x and y are related by [, then neg(x) ] neg(y), swapping the forward and backward entailment operators. For example, move ] walk, but "didn't move" [ "didn't walk". Other projections of negation include maintaining ^ as-is and swapping | and u.
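The projections of negation just listed can be written down as a small lookup table. This is my own encoding, using the ASCII glyphs from above, and it covers only negation; the full system has such a table for every projective operator.

```python
# How negation projects each entailment relation.
NEGATION_PROJECTION = {
    "=": "=",  # equality is unaffected
    "[": "]",  # forward and backward entailment swap...
    "]": "[",  # ...move ] walk  =>  "didn't move" [ "didn't walk"
    "^": "^",  # negation is maintained as-is
    "|": "u",  # alternation and cover swap
    "u": "|",
    "#": "#",  # independence is unaffected
}

def project_through_negation(relation):
    """Given the relation between x and y, return the relation between neg(x) and neg(y)."""
    return NEGATION_PROJECTION[relation]

assert project_through_negation("]") == "["  # move ] walk; "didn't move" [ "didn't walk"
```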
Inference involves determining the entailment relations resulting from a series of edits. Based on the type of edit (insert, delete, substitute, or match), and in some cases the words involved, the entailment relation at the point of the edit is computed. For example, most deletions generate [ (e.g. bold pandas [ pandas). A word-specific example is the deletion of "not" ("not a panda" ^ panda). This relation is propagated upward through the semantic composition tree of the new sentence, e.g. "He was a bold panda." [ "He was a panda." If multiple edits occur, they are composed into a single new entailment relation or a union of such relations.
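Composition can be pictured as folding the per-edit relations through a join table. The fragment below is a sketch of my own with only a few entries; the full table covers all pairs of relations, and some joins yield a union of relations rather than a single one.

```python
# A fragment of the relation-join table (invented encoding; far from complete).
PARTIAL_JOIN = {
    ("=", "="): "=",
    ("=", "["): "[",
    ("[", "="): "[",
    ("[", "["): "[",  # forward entailment is transitive
    ("]", "]"): "]",  # so is backward entailment
}

def compose(relations):
    """Fold a sequence of per-edit entailment relations into one overall relation."""
    result = "="  # identity: no edits applied yet
    for r in relations:
        result = PARTIAL_JOIN[(result, r)]
    return result

# Two deletions, each generating [ (e.g. "big bold pandas" -> "pandas"), compose to [.
assert compose(["[", "["]) == "["
```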
Lastly, implicatives seem to be words which carry some implicit meaning. For example, "refused to traverse the desert" implies "did not traverse the desert". In natural logic, implicatives are given a signature of two signs (each +, -, or o) indicating their implication in positive and negative contexts. For example, "refuse" has signature -/o, as it is negative in a positive context and neutral in a negative context. Implicatives may generate entailment relations upon deletion: the signature o/- generates ] on deletion, as in "Allanon attempted to use magic." ] "Allanon used magic." In addition, implicatives have a projectivity property indicating their monotonicity, which can be correlated with the signature: o/- is upward monotone, so [ is preserved (attempted to walk [ attempted to move).
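The signature-to-relation behavior can be sketched as a small table. This is again my own toy encoding, covering only the cases mentioned above; the | entry for "refuse" is my reading of the -/o signature ("refused to X" and plain "X-ed" cannot both be true), not a claim from the paper.

```python
# Implicative signatures: (implication in a positive context, in a negative context).
IMPLICATIVE_SIGNATURES = {
    "refuse": ("-", "o"),   # "refused to X" implies "did not X"; neutral under negation
    "attempt": ("o", "-"),  # "attempted to X" implies nothing about X by itself
}

def relation_on_deletion(signature):
    """Entailment relation generated when the implicative is deleted (cases from above only)."""
    if signature == ("o", "-"):
        return "]"  # "Allanon attempted to use magic." ] "Allanon used magic."
    if signature == ("-", "o"):
        return "|"  # "refused to X" is incompatible with plain "X-ed"
    return None     # other signatures are not covered in this sketch

assert relation_on_deletion(IMPLICATIVE_SIGNATURES["attempt"]) == "]"
```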
The NatLog system has five stages of processing: linguistic analysis, alignment, entailment classification, entailment projection, and entailment composition.
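As a rough picture of how the five stages might chain together, here is a stub pipeline. Every function name and body here is a placeholder invented for this sketch, not the actual NatLog code.

```python
# All stage functions below are placeholders standing in for the real stages.
def linguistic_analysis(sentence):
    return sentence.split()  # stand-in for tokenizing, tagging, and parsing

def align(premise_parse, hypothesis_parse):
    return []  # stand-in for producing a sequence of atomic edits

def classify_entailment(edit):
    return "="  # stand-in for the per-edit lexical entailment relation

def project(relation, edit):
    return relation  # stand-in for projecting through operators like negation

def compose(relations):
    result = "="
    for r in relations:  # stand-in for folding through the join table
        result = r
    return result

def natlog(premise, hypothesis):
    """Chain the five stages: analysis, alignment, classification, projection, composition."""
    p, h = linguistic_analysis(premise), linguistic_analysis(hypothesis)
    edits = align(p, h)
    relations = [classify_entailment(e) for e in edits]
    projected = [project(r, e) for r, e in zip(relations, edits)]
    return compose(projected)
```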
Linguistic analysis involves parsing the sentence, including division into tokens/lemmas, part-of-speech tagging, and phrase-structure parsing. It seems that a tree is generated, as the data structures allude to tree relationships. Another aspect of the linguistic analysis stage is projectivity marking. Projective operators are stored in a list; these are words which may have projectivity (e.g. "without" is downward monotone on its prepositional phrase). I am not sure why the projectivity of "refuse to" is listed as upward, as it was noted as being downward as an implicative; I can only assume that this is because the verb is deleted in the hypothesis. The tree pattern for the operator (where it occurs in the parse tree) and what composition of words it affects (e.g. a noun phrase) are also stored. These operators are referred to in the entailment projection stage. An interesting aspect of the projectivity operators is their pattern, which uses Tregex syntax (based on TGrep2). This provides a grep-like way to pattern match elements in a syntax tree, much as XPath allows reference to nodes in an XML tree. For example, the preposition "without" is given pattern "IN
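From the description, each projectivity operator seems to bundle a Tregex pattern, the constituent it scopes over, and a monotonicity. The record below is my guess at that shape; the Tregex pattern shown is an assumed example of the syntax ("an IN node dominating a token matching the regex"), since the original text is cut off, and is not the paper's actual pattern.

```python
from dataclasses import dataclass

@dataclass
class ProjectivityOperator:
    name: str
    tregex_pattern: str  # Tregex tree pattern locating the operator in the parse tree
    scope: str           # which constituent the operator projects over
    monotonicity: str    # "up", "down", or "non"

# Hypothetical entry for "without"; pattern and field names are assumptions.
without = ProjectivityOperator(
    name="without",
    tregex_pattern="IN < /^[Ww]ithout$/",
    scope="PP",            # downward monotone on its prepositional phrase
    monotonicity="down",
)
```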