REGI

AN INTERPRETER FOR EXTENDED REGULAR EXPRESSIONS



REGI (pronounced 'reggae') is an attempt to build an interpreter for extended regular expressions in Prolog. That is, expressions are not compiled into finite-state machines, but rather interpreted directly in Prolog. Here's why I think it's worthwhile: The set of REGI operators is a subset of the operators over regular languages and relations defined by Karttunen et al., in the context of the the Xerox Finite-State Tools (XFST). Notation differs slightly from the XFST notation, however (it is just that Prolog operators are less flexible than one would have liked...). For more information about XFST, see http://www.xrce.xerox.com/research/mltt/fst/ .
 

REGI operators

Basic operators over regular languages

A + B Concatenation
A | B (or A ; B) Union
A & B Intersection
A* Kleene star
A - B Difference

Extended operators over regular languages

A+  Kleene plus
~A  Complement
A\B Ignore
opt(A)  Optionality
$A  Containment
A => B .. C  Restriction

Basic operators over regular relations

A x B Crossproduct
A o B  Composition

Extended operators over regular relations

A -> B Replacement
A <- B Inverse replacement 
A1 -> B1, A2 -> B2 Parallel replacement 
A -> B .. C Markup
A -> B # C .. D  Conditional replacement
 

Brackets, (), are used for grouping expressions.

Special symbols

[] The empty-string language
The any symbol
 

Debugging operators

trace(A) Writes A and the pair of strings denoted by A to stdout.
 

REGI is written in SICStus Prolog, and is probably not very portable. The source is here.