Training a Sense Disambiguator for the Noun "Interest"

Lager (2000) shows how the µ-TBL system can also be used for learning rules for word sense disambiguation. In this small experiment, we use a simple set of 7 templates and (part of) a corpus of sentences containing the noun "interest", collected by Rebecca Bruce and Janyce Wiebe.
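To get a feeling for what "learning rules from templates" means here, the following Python sketch may help. It is not µ-TBL and does not use its template syntax. It is a toy illustration of the general transformation-based learning idea, under several assumptions: the two templates ("left" and "right" neighbour word), the sense labels, and the six training examples are all invented for the illustration and are not taken from the Bruce and Wiebe corpus or from the 7 templates used in the experiment.

```python
from collections import Counter

# Toy data, invented for illustration: (left_word, right_word, gold_sense).
DATA = [
    ("the", "rate", "money"),
    ("an", "rate", "money"),
    ("high", "rate", "money"),
    ("great", "in", "attention"),
    ("an", "in", "attention"),
    ("the", "of", "advantage"),
]


def context_word(template, example):
    """Return the neighbouring word that a template looks at."""
    left, right, _ = example
    return left if template == "left" else right


def matches(cond, example):
    """Check whether a concrete condition (template, word) holds."""
    template, word = cond
    return context_word(template, example) == word


def learn(data, templates=("left", "right"), threshold=1):
    # Initial state: tag every occurrence with the most frequent sense.
    baseline = Counter(gold for _, _, gold in data).most_common(1)[0][0]
    current = [baseline] * len(data)
    rules = []
    while True:
        # Instantiate the templates at every error to get candidate rules
        # of the form (from_sense, to_sense, condition).
        good = Counter()
        for ex, cur in zip(data, current):
            if cur != ex[2]:
                for t in templates:
                    good[(cur, ex[2], (t, context_word(t, ex)))] += 1
        # Score = errors corrected minus correct tags that would be damaged.
        best, best_score = None, 0
        for rule, n_good in good.items():
            frm, _, cond = rule
            n_bad = sum(
                1
                for ex, cur in zip(data, current)
                if cur == frm and ex[2] == frm and matches(cond, ex)
            )
            if n_good - n_bad > best_score:
                best, best_score = rule, n_good - n_bad
        if best is None or best_score < threshold:
            break
        # Apply the winning rule to the whole corpus and keep it.
        frm, to, cond = best
        current = [
            to if cur == frm and matches(cond, ex) else cur
            for ex, cur in zip(data, current)
        ]
        rules.append(best)
    return baseline, rules, current


if __name__ == "__main__":
    baseline, rules, tagged = learn(DATA)
    print("baseline sense:", baseline)
    for frm, to, (template, word) in rules:
        print(f"change {frm} -> {to} if {template} word is '{word}'")
```

Note how every learned rule mentions a concrete context word. This is worth keeping in mind when you later look at the rules produced by the real system and consider how much they depend on the corpus they were learned from.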

Train and test, and see what happens! From the OS prompt, run:

> ./mutbl -f examples/wsd.script

Inspect the script for information about where templates and training and test data are located.

For the report: Spend 15 minutes trying to improve on the result of the first run, by changing relevant parameters in the script, and perhaps by extending the set of templates. (Optionally, if you want to use a larger training corpus, you'll find one in the file 'interest.pl' in the data directory.) Look at the rules that have been generated. To what extent do you think they are genre and domain dependent (i.e. to what extent does the fact that the corpus was collected from The Wall Street Journal affect the disambiguator)? What does that entail for the 'portability' of the disambiguator between different genres and domains?