Demo Overview

This is the entry point to a set of web-based demos illustrating the workings of various kinds of taggers and disambiguators that have been developed by means of the µ-TBL system. Currently, there are four demos:

Brill Part of Speech Tagger for Swedish, English and Russian

In a Brill tagger, a lexical lookup module assigns exactly one tag to each occurrence of a word (usually the most frequent tag for that word type), disregarding context. Words not in the lexicon are handled separately, by means of guesser rules. A rule application module then proceeds to replace some of the tags with other tags, on the basis of what appears in the local context.
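The two-stage process just described can be sketched in a few lines of Python. This is only a toy illustration, not the code behind the demo: the lexicon entries, the tag names, the guesser heuristics and the two rules are all made up for the example, and the optional trace list merely imitates the idea of recording which rules actually fired.

```python
# Toy lexicon (hypothetical entries): word -> its most frequent tag.
LEXICON = {"the": "DT", "can": "MD", "rusted": "VBD", "to": "TO", "run": "NN"}


def guess_tag(word):
    """Stand-in for guesser rules: pick a tag for a word not in the lexicon."""
    if word[0].isupper():
        return "NNP"
    if word.endswith("ed"):
        return "VBD"
    return "NN"


# Hypothetical replacement rules: (from_tag, to_tag, required_previous_tag).
RULES = [
    ("MD", "NN", "DT"),  # a modal right after a determiner becomes a noun
    ("NN", "VB", "TO"),  # a noun right after "to" becomes a verb
]


def brill_tag(words, trace=None):
    """Assign one tag per word, then apply the replacement rules in order."""
    # Stage 1: lexical lookup, ignoring context; guess tags for unknown words.
    tags = [LEXICON.get(w.lower()) or guess_tag(w) for w in words]
    # Stage 2: each rule in turn is applied across the whole sentence.
    for from_tag, to_tag, prev_tag in RULES:
        for i in range(1, len(tags)):
            if tags[i] == from_tag and tags[i - 1] == prev_tag:
                tags[i] = to_tag
                if trace is not None:
                    # Record the rule and the position where it fired.
                    trace.append((from_tag, to_tag, prev_tag, i))
    return tags


applied = []
print(brill_tag(["The", "can", "rusted"], trace=applied))
print(applied)
```

Note that a rule only ever swaps one tag for another, so every word carries exactly one tag at every point in the process.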

Constraint Grammar Part of Speech Tagger for Swedish and English

In a Constraint Grammar tagger, a lexical lookup module assigns sets of alternative tags to occurrences of words, disregarding context. A rule application module then removes tags from such sets, on the basis of what appears in the local context. However, in order to guarantee that each word token is left with at least one tag, the rule application module adheres to the following principle: don’t remove the last remaining tag.
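A minimal Python sketch of this scheme follows. It is not the demo's implementation: the lexicon, the tags and the three removal rules are hypothetical, and the rule format (remove a tag when the previous word is unambiguously tagged in a certain way) is a deliberate simplification of what real constraint rules can express.

```python
# Toy lexicon (hypothetical entries): word -> set of alternative tags.
LEXICON = {
    "the": {"DT"},
    "can": {"MD", "NN", "VB"},
    "run": {"NN", "VB"},
    "to": {"TO"},
}

# Hypothetical removal rules: (tag_to_remove, tag_of_previous_word).
# A rule fires only if the previous word is unambiguously tagged prev_tag.
RULES = [
    ("MD", "DT"),  # no modal right after a determiner
    ("VB", "DT"),  # no verb right after a determiner
    ("NN", "TO"),  # no noun right after "to"
]


def cg_tag(words):
    """Assign tag sets by lookup, then prune them with the removal rules."""
    # Lexical lookup, ignoring context; unknown words get {"NN"} (assumption).
    readings = [set(LEXICON.get(w.lower(), {"NN"})) for w in words]
    for remove_tag, prev_tag in RULES:
        for i in range(1, len(readings)):
            if readings[i - 1] == {prev_tag} and remove_tag in readings[i]:
                # The guiding principle: don't remove the last remaining tag.
                if len(readings[i]) > 1:
                    readings[i].discard(remove_tag)
    return readings


print(cg_tag(["the", "can"]))
print(cg_tag(["to", "dog"]))
```

In the second call the unknown word "dog" starts out with the single tag NN, so the rule that would remove NN after "to" is blocked and the word keeps its tag.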

Noun Phrase Chunker

A noun-phrase chunker tries to mark up all the (basic) noun phrases in a text. The idea behind this particular chunker is to view chunking as a tagging problem, and to encode the chunk structure as tags attached to each word. Since the rules for noun phrase chunking are meant to apply to part-of-speech tagged text, the chunker also includes a Brill part-of-speech tagger.
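To make the idea of chunk structure encoded as tags concrete, here is a small Python sketch. It assumes a B/I/O encoding (B for the first word of a noun phrase, I for a word inside one, O for a word outside any noun phrase), which is one common choice and not necessarily the tag set used in this demo. The function names are made up for the example.

```python
def chunks_to_tags(n_tokens, chunks):
    """Encode non-overlapping (start, end) spans, end exclusive, as B/I/O tags."""
    tags = ["O"] * n_tokens
    for start, end in chunks:
        tags[start] = "B"
        for i in range(start + 1, end):
            tags[i] = "I"
    return tags


def tags_to_chunks(tags):
    """Decode a B/I/O tag sequence back into (start, end) spans."""
    chunks = []
    start = None
    for i, tag in enumerate(tags):
        # Anything other than "I" closes the chunk that is currently open.
        if tag != "I" and start is not None:
            chunks.append((start, i))
            start = None
        # "B" opens a chunk; a stray "I" with no open chunk is treated as "B".
        if tag == "B" or (tag == "I" and start is None):
            start = i
    if start is not None:
        chunks.append((start, len(tags)))
    return chunks


words = ["The", "old", "man", "saw", "a", "dog"]
tags = chunks_to_tags(len(words), [(0, 3), (4, 6)])
print(list(zip(words, tags)))
print(tags_to_chunks(tags))
```

Once the structure is expressed this way, a chunker only has to assign one of three tags to each word, and rules that replace one chunk tag with another can refer to the part-of-speech tags in the local context.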

Word Sense Disambiguation Demo

This demonstrates that a sequence of replacement rules works well for disambiguating word senses.

Note: The performance of the above demo taggers could probably be improved a lot (by increasing the quality of the lexica, by training on larger corpora, etc.). The purpose here is mainly pedagogical, and to that end the demos feature something not usually found in other web-based NLP demos, namely a tracing facility. Just tick the check box, and the rules actually applied (always a subset of the available rules) will be displayed. Have fun!

Here's an older demo:

Indefinite Clause Constraint Grammar

This demonstrates the use of first-order predicate logic for part of speech tagging. In order to understand what is happening here, you probably have to spend some time reading the paper. This kind of tagger is interesting, but I doubt that it will ever become efficient enough to be of any practical use.


© Torbjörn Lager 2000