
Word experts are small expert system-like modules for processing a particular target word based on clues in the context. Typically, a word expert uses rules to test the identity and relative position of neighbouring words to infer the role of the target word in the passage. A good overview of word experts, which is partly historical, is given in (Berleant 1995).
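To make the idea concrete, here is a minimal hand-written sketch in plain Prolog of a rule-based expert for the word interest. It is only an illustration of "testing the identity and relative position of neighbouring words": the token/2 corpus, the predicate name interest_sense/2 and the sense labels are all hypothetical, and the code does not use the PWE specification syntax or the compilers described on this page.

```prolog
% Toy corpus: token(Position, Word). Hypothetical data for illustration only.
token(1, the).
token(2, bank).
token(3, raised).
token(4, its).
token(5, interest).
token(6, rate).
token(7, he).
token(8, showed).
token(9, great).
token(10, interest).
token(11, in).
token(12, music).

% interest_sense(?P, ?Sense): Sense is the guessed sense of the word
% 'interest' at position P, decided by looking at the word to its right.
% The rules are tried in order and the first one that matches wins.
interest_sense(P, Sense) :-
    token(P, interest),
    Next is P + 1,
    (   token(Next, rate) -> Sense = money
    ;   token(Next, in)   -> Sense = curiosity
    ;   Sense = unknown
    ).
```

Because the expert is an ordinary relation, the same definition can be used to classify one position, as in interest_sense(5, S), or to enumerate every position with a given sense, as in interest_sense(P, curiosity).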
This page is dedicated to Prolog Word Experts (or PWEs for short, to be pronounced as "peewees"). They have been given this name in order to distinguish them from other kinds of word experts, and to emphasize the fact that they are ‘programmed in logic’. Details are given in (Lager 2000), but the main points can be summarized as follows. PWEs are:
From this page, you may download word expert compilers and example PWEs, and read about how to install the compilers and run the examples.
A word expert compiler translates word expert specifications into word expert procedures. Two implementations are available, one very simple and one more sophisticated. Watch this space for even more sophisticated PWE compilers!
The file pwe_vanilla_compiler.pl contains a small and simple compiler for compiling word expert specifications to 'vanilla' word expert procedures. It is identical to the compiler given in the appendix to (Lager 2000). The resulting word expert procedures can be very inefficient, however, since memoing is not implemented.
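For readers unfamiliar with memoing, the following is a generic sketch of the technique in plain Prolog: an answer is computed once per position, stored with assertz/1, and looked up on later calls. It is not taken from either compiler; the predicates slow_feature/2, memo_feature/2, memo/2 and call_count/1 are hypothetical names, and slow_feature/2 merely stands in for an expensive derivation.

```prolog
:- dynamic memo/2, call_count/1.

% call_count(N): how many times slow_feature/2 has actually been run.
call_count(0).

% slow_feature(+P, -V): stand-in for an expensive computation at position P.
% It bumps the counter so that repeated work becomes visible.
slow_feature(P, V) :-
    retract(call_count(N0)),
    N is N0 + 1,
    assertz(call_count(N)),
    V is P * P.

% memo_feature(+P, -V): like slow_feature/2, but each position is computed
% at most once; later calls reuse the stored memo/2 fact.
memo_feature(P, V) :-
    (   memo(P, Stored)
    ->  V = Stored
    ;   slow_feature(P, Computed),
        assertz(memo(P, Computed)),
        V = Computed
    ).
```

Calling memo_feature(3, V) twice runs slow_feature/2 only once, which can be checked with call_count(N) afterwards.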
The file pwe_compiler.pl contains the source code for a compiler which is more useful in that the resulting word expert procedures:
Also, the compiler comes with:
After consulting the source for the compiler, the following predicates become available:
For efficient word expert processing, you want to use compile/1 rather than consult/1. However, if you want to inspect the outcome of compiling a PWE, use consult/1 and listing/0.
After consulting or compiling a word expert specification, additional predicates become available. Given a specification with a header of the form
word_expert F :=
where F is a Prolog atom, the predicate F/2 becomes available, where F(P,A) is true iff the feature F of the word at position P in the corpus has value A.
If the flag pwe_trace is set to on, the predicate F/3 becomes available as well, where F(P,A,T) is true iff F(P,A) is true and T is the list of transformation rules that were applied at P in order to derive F(P,A).
The way in which the compiler carries out its tasks is controlled by flags. They are described below. The flags are set with the command set/2.
Below, I offer some worked-out examples of word experts in action, together with instructions on how to run them. Refer to the µ-TBL system if you would rather train your own word experts.
Word sense disambiguation is the 'classical' application of word experts. (Lager 2000) describes the training of a word expert capable of disambiguating occurrences of the noun interest with 88% correctness. The relevant word expert specification is available in the file sense_pwe.pl, and a small corpus (distinct from the training corpus) on which to test it is stored in the file interest.txt. (You may prefer instead to collect your own corpus containing occurrences of interest. However, note that the PWE can only be expected to perform really well on Wall Street Journal texts, since this is what it was trained on.)
To see this word expert in action, just do the following from SICStus Prolog:
| ?- [pwe_compiler].
{consulting c:/pew/pwe_compiler.pl...}
{consulted c:/pew/pwe_compiler.pl in module user, 380 msec -824 bytes}
yes
| ?- compile(sense_rules).
{compiling c:/pew/sense_rules.pl...}
Compiled PWE sense/2 from 274 rules.
{compiled c:/pew/sense_rules.pl in module user, 14280 msec 410664 bytes}
yes
| ?- load_text('interest.txt').
626 words loaded.
| ?- sense(P,4), conc(P,5), fail.
2: The common [interest] of common ownership will supersede
213: sufficient data to support our [interest] in this opportunity . Whatever
233: old drugs cheaply and their [interest] in seeing new drugs invented
514: any appearance of conflict of [interest] or favoritism . The compromise
no
| ?-
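The query sense(P,4), conc(P,5), fail in the session above is a failure-driven loop: each solution for P is printed as a side effect, fail forces backtracking into the next solution, and the final answer is therefore no. The sketch below reproduces that pattern on a toy corpus in plain Prolog. It does not show how conc/2 is actually implemented in pwe_compiler.pl; word/2, target/1, window/3, show_context/2 and print_words/1 are hypothetical names chosen for this illustration.

```prolog
% Toy corpus: word(Position, Word), numbered from 0. Hypothetical data.
word(0, the).
word(1, common).
word(2, interest).
word(3, of).
word(4, common).
word(5, ownership).
word(6, will).
word(7, supersede).
word(8, private).
word(9, interest).

% target(?P): the word at position P is the one we want to inspect.
target(P) :- word(P, interest).

% window(+P, +N, -Words): Words are the words from P-N to P+N in corpus
% order, with the word at P wrapped in a one-element list so that it
% prints in square brackets.
window(P, N, Words) :-
    Lo is P - N,
    Hi is P + N,
    findall(Shown,
            ( word(I, W), I >= Lo, I =< Hi,
              ( I =:= P -> Shown = [W] ; Shown = W ) ),
            Words).

% show_context(+P, +N): print position P followed by its context window.
show_context(P, N) :-
    window(P, N, Words),
    write(P), write(': '),
    print_words(Words),
    nl.

print_words([]).
print_words([W|Ws]) :- write(W), write(' '), print_words(Ws).
```

With these definitions loaded, the query target(P), show_context(P, 2), fail. prints one line per occurrence of interest and then answers no, just as in the session above.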
Part of speech disambiguation is another task where word experts perform well. The specification for a part of speech expert, as well as a suitable lexicon, is available in the file pos_pwe.pl, and can be tested on the same corpus as above.
The final example of word expert processing in (Lager 2000) concerns NP-chunking, using an approach greatly inspired by (Ramshaw & Marcus 1995). The file np_pwe.pl contains a relevant word expert specification consisting of 100 rules. You also need to compile the part of speech disambiguating word expert described above. Thus, this also serves as an example of how word experts interact. Use the same test corpus as before.
Berleant, Daniel (1995) Engineering “Word Experts” for Word Disambiguation. Natural Language Engineering, 1 (4): 339-362.
Brill, Eric (1995) Transformation-Based Error-Driven Learning and Natural Language Processing: A Case Study in Part of Speech Tagging. Computational Linguistics, December 1995.
Lager, Torbjörn (1999) The µ-TBL System: Logic Programming Tools for Transformation-Based Learning. In Proceedings of the Third International Workshop on Computational Natural Language Learning (CoNLL'99), Bergen, 1999.
Lager, Torbjörn (2000) A Logic Programming Approach to Word Expert Engineering. In Proceedings of ACIDCA 2000: Workshop on Corpora and Natural Language Processing, Monastir, Tunisia, March 22-24, 2000.