Flexible Table-Driven Parsing for Natural Language Processing


Author: Linton Miller
Source: GZipped PostScript (392kb); Adobe PDF (1874kb)

Ambiguity is a major difficulty for natural language processing (NLP) systems. The longer ambiguities in a sentence remain unresolved, the more work an NLP system may have to perform in considering alternative interpretations of that sentence. Thus, for efficiency, an NLP system should resolve ambiguities as early as possible in processing.

This report describes L* parsing, an algorithm for table-driven parsing designed to permit efficient processing of natural language by facilitating the early resolution of ambiguity. The algorithm is a generalisation of GLR parsing that allows grammar rules to be used whenever they may provide useful syntactic information to an NLP system.

L* parsing defines a general framework for specifying a variety of parser control strategies. Different control strategies can be expressed by specifying exactly when grammar rules are to be used. This report presents one possible control strategy, designed to provide syntactic information that enables useful semantic and pragmatic processing, and describes a method of compiling this strategy into a parse table.
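For readers unfamiliar with the term, the following is a minimal sketch of what "table-driven" means for an ordinary deterministic LR parser. It is not the L* algorithm and not taken from the report: the toy grammar, the hand-built `ACTION` and `GOTO` tables, and the `parse` function are all hypothetical and chosen only for illustration. Each table cell here holds at most one action; GLR parsing relaxes this by allowing a cell to hold several actions, all of which are pursued when the input is ambiguous.

```python
# Hypothetical toy grammar (an assumption for illustration, not from the report):
#   rule 1: S -> ( S )
#   rule 2: S -> x
# Rule number -> (left-hand side, length of right-hand side).
RULES = {1: ("S", 3), 2: ("S", 1)}

# Hand-built SLR tables for the grammar above.
# ACTION maps (state, lookahead token) to a parser action.
ACTION = {
    (0, "("): ("shift", 2), (0, "x"): ("shift", 3),
    (1, "$"): ("accept", None),
    (2, "("): ("shift", 2), (2, "x"): ("shift", 3),
    (3, ")"): ("reduce", 2), (3, "$"): ("reduce", 2),
    (4, ")"): ("shift", 5),
    (5, ")"): ("reduce", 1), (5, "$"): ("reduce", 1),
}
# GOTO maps (state, nonterminal) to the state entered after a reduction.
GOTO = {(0, "S"): 1, (2, "S"): 4}


def parse(tokens):
    """Return the rule numbers applied, in order, or None on a syntax error."""
    stack = [0]                      # stack of parser states
    applied = []                     # rules used, in order of reduction
    tokens = list(tokens) + ["$"]    # "$" marks end of input
    pos = 0
    while True:
        entry = ACTION.get((stack[-1], tokens[pos]))
        if entry is None:
            return None              # empty table cell: syntax error
        kind, arg = entry
        if kind == "shift":
            stack.append(arg)        # consume the token, enter the new state
            pos += 1
        elif kind == "reduce":
            lhs, length = RULES[arg]
            del stack[-length:]      # pop one state per right-hand-side symbol
            stack.append(GOTO[(stack[-1], lhs)])
            applied.append(arg)
        else:                        # accept
            return applied


print(parse("(x)"))
```

The driver loop knows nothing about the grammar itself; all grammar-specific behaviour sits in the tables. That separation is what makes it plausible to vary when rules are applied by changing how the table is compiled rather than by changing the parser loop.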
