Next: Terminal symbols Up: Lexical analysis Previous: Lexical analysis

  
Definitions

Definition (Token)
A token is a part of the input that is delimited by word separators. These word separators may themselves be part of the token. A token can be a word form, a multi-token word, or a composed word consisting of one or more word parts.

Definition (Word form)
A word form is a sequence of non-blank characters, starting after or with a word separator, and ending before or with a word separator. Therefore, each non-blank word separator may be part of a word form, or be a word form by itself.

Definition (Multi-token word)
A multi-token word is a sequence of word forms separated by blanks. Since it consists of word forms, it may also include word separators.

Definition (Composed word)
A composed word is a composition of word parts conforming to the following pattern:
$\mbox{\textit{prefix}}^{*}\mbox{\textit{infix}}^{+}\mbox{\textit{suffix}}^{*}$
That is, zero or more prefixes, followed by one or more infixes, followed by zero or more suffixes. A single infix can therefore stand alone as a composed word, but a single prefix or suffix cannot.
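
The pattern above can be checked mechanically. The following is a minimal sketch, not part of the system described here: it assumes each word part has already been classified as "prefix", "infix" or "suffix", and the function name is_composed_word is hypothetical.

```python
import re

# prefix* infix+ suffix*, with each word part encoded as one letter.
_COMPOSED = re.compile(r"P*I+S*")
_CODES = {"prefix": "P", "infix": "I", "suffix": "S"}


def is_composed_word(kinds):
    """Return True if the sequence of word-part kinds fits the pattern.

    `kinds` is a list such as ["prefix", "infix", "suffix"]; how the
    parts were classified is outside the scope of this sketch.
    """
    encoded = "".join(_CODES[kind] for kind in kinds)
    return _COMPOSED.fullmatch(encoded) is not None
```

For instance, is_composed_word(["infix"]) holds, while is_composed_word(["prefix"]) does not, since at least one infix is required.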

Definition (Word part)
A word part is a word form or multi-token word with an additional hyphen at the beginning, at the end, or at both sides. This hyphen has no literal meaning; it only indicates a word part. A word part with a hyphen at the beginning is called a prefix, a word part with a hyphen at the end is called a suffix, and a word part with hyphens at both the beginning and the end is called an infix.

Definition (Blanks)
A blank is one of the characters space, tab, or newline. Any blank or sequence of blanks is always reduced to a single space.
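
As an illustration of this reduction, here is a minimal sketch; the function name normalize_blanks is hypothetical, and the sketch assumes that only space, tab and newline count as blanks, as stated in the definition.

```python
import re

# One or more blanks (space, tab, newline) in a row.
_BLANK_RUN = re.compile(r"[ \t\n]+")


def normalize_blanks(text):
    """Reduce every blank or run of blanks to a single space."""
    return _BLANK_RUN.sub(" ", text)
```

Note that leading and trailing blanks are reduced to one space rather than removed, because the definition speaks only of reduction.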

Definition (Word separator)
A word separator is a character that indicates a possible boundary between word forms. All blanks are considered word separators. In addition, the user can declare non-blank characters that indicate word boundaries to be word separators.
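
Because a non-blank separator may belong to a word form or be a word form by itself, a given input generally admits several candidate word forms. The sketch below enumerates them under the definitions given here; the function names and the representation of user-specified separators as a string are assumptions made for illustration only.

```python
BLANKS = " \t\n"


def boundaries(text, separators=""):
    """Return the sorted positions where a word form may start or end.

    A boundary lies directly before and directly after every word
    separator, and at both ends of the input.
    """
    seps = set(BLANKS) | set(separators)
    cuts = {0, len(text)}
    for i, ch in enumerate(text):
        if ch in seps:
            cuts.update((i, i + 1))
    return sorted(cuts)


def candidate_word_forms(text, separators=""):
    """List every blank-free substring that runs between two boundaries."""
    cuts = boundaries(text, separators)
    forms = []
    for a in range(len(cuts)):
        for b in range(a + 1, len(cuts)):
            piece = text[cuts[a]:cuts[b]]
            if any(ch in BLANKS for ch in piece):
                break  # longer pieces from this start also contain the blank
            forms.append(piece)
    return forms
```

With the apostrophe declared as a separator, candidate_word_forms("don't stop", "'") yields "don", "don'", "don't", "'", "'t", "t" and "stop"; which of these are actually recognized depends on the lexicon and grammar.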

Definition (Invisible character)
An invisible character is a character that is skipped automatically if it cannot be matched. All blanks are considered invisible characters. In addition, the user can declare non-blank characters to be invisible, so that they are skipped when no match is found for them.
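
The effect of invisible characters can be sketched with a toy scanner. This is an illustration under stated assumptions, not the actual lexical analyser: the function scan, the plain set of strings used as a lexicon, and the greedy longest-match strategy are all hypothetical.

```python
def scan(text, lexicon, invisible=" \t\n"):
    """Split `text` into lexicon entries, skipping unmatched invisible characters.

    At each position the longest matching lexicon entry is taken
    (an assumption of this sketch). If nothing matches and the current
    character is invisible, it is skipped; otherwise scanning fails.
    """
    tokens = []
    pos = 0
    while pos < len(text):
        match = max(
            (entry for entry in lexicon if entry and text.startswith(entry, pos)),
            key=len,
            default=None,
        )
        if match is not None:
            tokens.append(match)
            pos += len(match)
        elif text[pos] in invisible:
            pos += 1  # invisible and unmatched: skip it
        else:
            raise ValueError(f"no match at position {pos}: {text[pos]!r}")
    return tokens
```

If the hyphen is declared invisible and the lexicon contains "ice", "cream" and "now", the input "ice-cream now" scans to those three entries. If the lexicon instead contains "ice-cream", the hyphen is matched as part of that entry and is not skipped.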



2000-01-10