6.17.5.3 Define recognizers with existing translators

A recognizer is a Forth word with the stack effect ‘( c-addr u -- ... translator | 0 )’. c-addr u describes the string to be recognized. If the recognizer does not recognize the string, it returns 0. If it does recognize the string, it returns a translator, and a translator-specific amount of additional data (“...”). When performing a translator action, the translator consumes this additional data. E.g., when you perform

"5" rec-number

it pushes ‘5 translate-cell’ on the stack, and when the compilation action of translate-cell is performed, both stack items are removed from the stack. This compilation action also compiles a literal 5 into the current definition.
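
As a quick check of this description, the following line can be typed at the Gforth prompt. It is only a sketch for inspecting the stack: comparing the top item with the value of translate-cell is not one of the translator actions, and the result assumes that rec-number behaves as described above.

"5" rec-number translate-cell = . .

If the description holds, the first number printed is a true flag (the top stack item was the translator of translate-cell) and the second is the additional data, 5.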

You typically write a recognizer as an ordinary colon definition that examines the string in some way; if the string is accepted by this recognizer, the recognizer pushes a translator and (below that) additional data. E.g., a simple variant of rec-tick can be implemented as follows:

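: rec-tick ( addr u -- xt translate-cell | 0 )
    over c@ '`' = if                     \ does the string start with a backquote?
        1 /string find-name dup if       \ look up the rest as a word name
            name>interpret translate-cell then  \ found: leave xt and the translator
        exit then                        \ not found: the 0 from find-name is the result
    rec-none ;                           \ no backquote: report "not recognized"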

The only appropriate use of a translator (plus data) on the stack is to pass it to one of the words for performing translator actions (see Performing translator actions).

A number of translators already exist in Gforth and can be used in a recognizer you write. If none of them is appropriate for your recognizer, read the next section about defining your own translators.

For each translator, additional data is documented; a recognizer that returns a certain translator also has to return the additional data below it.

The text interpreter passes the output of the recognizer to a translator action (see Performing translator actions), which removes the translator and all the additional data from the stack, may perform additional parsing, and then invokes the interpreting run-time of the translator, or the compiling run-time, or the postponing run-time.

For each system-defined translator we specify the interpreting run-time explicitly. Unless otherwise specified, the compiling run-time compiles the interpreting run-time, and the postponing run-time compiles the compiling run-time.

In the rec-tick example above, if the recognizer recognizes, say, `dup, it returns xt translator, where translator is the value returned by translate-cell, and xt is the execution token of dup. So xt is the additional data for this translator. If the text interpreter then performs the compiling action, that action first removes these two stack items, and compiles code that pushes xt.
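
The same pattern works for other kinds of strings. The following is a minimal sketch of a recognizer that reuses translate-cell; the name rec-kilo and the convention that a trailing k means "times 1024" are made up for illustration and are not part of Gforth. It converts the digits with the standard word >number in the current base, and assumes that one character occupies one address unit.

: rec-kilo ( c-addr u -- n translate-cell | 0 )
    dup 2 u< if rec-none exit then             \ need at least one digit plus the suffix
    2dup + 1- c@ 'k' <> if rec-none exit then  \ the last character must be k
    1- 0 0 2swap >number                       \ convert the digits ( ud c-addr2 u2 )
    nip if 2drop 0 exit then                   \ unconverted characters remain: not recognized
    drop 1024 * translate-cell ;               \ keep the low cell, scale it, push the translator

With decimal base, s" 16k" rec-kilo leaves 16384 with the translator above it, i.e., the same kind of result that rec-number produces for a single-cell number; s" dup" rec-kilo leaves 0. The sketch only defines the word; it does not add it to the recognizers used by the text interpreter.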

translate-name ( -- translator ) gforth-experimental

Additional data: ( nt ).
Interpreting run-time: ( ... -- ... )
Perform the interpretation semantics of nt.
Compiling run-time: ( ... -- ... )
Perform the compilation semantics of nt.

translate-cell ( -- translator ) gforth-experimental

Additional data: ( x ).
Interpreting run-time: ( -- x )

translate-dcell ( -- translator ) gforth-experimental

Additional data: ( xd ).
Interpreting run-time: ( -- xd )

translate-float ( -- translator ) gforth-experimental

Additional data: ( r ).
Interpreting run-time: ( -- r )

translate-complex ( -- translator ) gforth-experimental

Additional data: ( r1 r2 ).
Interpreting run-time: ( -- r1 r2 )

translate-string ( -- translator ) gforth-experimental

Additional data: ( c-addr1 u1 ).
Interpreting run-time: ( -- c-addr2 u2 )
c-addr2 u2 is the result of translating the \-escapes in c-addr1 u1.

scan-translate-string ( -- translator ) gforth-experimental

Additional data: ( c-addr1 u1 'ccc"' ).
Every translator action also parses until the first non-escaped ". The string c-addr1 u1 and the parsed input are concatenated, then the \-escapes are translated, giving c-addr2 u2.
Interpreting run-time: ( -- c-addr2 u2 )

translate-env ( -- translator ) gforth-experimental

Additional data: ( c-addr1 u1 ).
Interpreting run-time: ( -- c-addr2 u2 )
c-addr2 u2 is the content of the environment variable with name c-addr1 u1.

translate-to ( -- translator ) gforth-experimental

Additional data: ( n xt ).
xt belongs to a value-flavoured (or defer-flavoured) word, n is the index into the to-table: for xt (see Words with user-defined to etc.).
Interpreting run-time: ( ... -- ... )
Perform the to-action with index n in the to-table: of xt. Additional stack effects depend on n and xt.
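
As an illustration of a translator whose additional data does not live on the data stack, here is a minimal sketch of a recognizer built on translate-float. The name rec-percent and the 50% notation are hypothetical; the conversion uses the standard word >float, which leaves its result on the floating-point stack, so the additional data r is on the floating-point stack and only the translator is on the data stack.

: rec-percent ( c-addr u -- r translate-float | 0 )
    dup 2 u< if rec-none exit then             \ need at least one character plus the suffix
    2dup + 1- c@ '%' <> if rec-none exit then  \ the last character must be %
    1- >float if                               \ convert the part before the suffix
        100e f/ translate-float                \ success: scale r, push the translator
    else 0 then ;                              \ conversion failed: not recognized

For example, s" 50%" rec-percent leaves the float 0.5 and the translator, while s" abc%" rec-percent leaves 0. Note that >float accepts more than plain digits (e.g., exponents), so this sketch accepts those forms as well.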

One way to write a recognizer is to call forth-recognize on a substring, and then look at the result to see whether something was recognized that the whole-string recognizer actually deals with. E.g., rec-tick and rec-dtick do this and then check whether forth-recognize has pushed nt translate-name. The benefit of this approach is that, e.g., `environment:max-n works, because rec-scope recognizes the substring environment:max-n. The specific check for an nt used in rec-tick and rec-dtick is forth-recognize-nt?; it is implemented on top of the more general try-recognize.

try-recognize ( c-addr u xt -- ... translator | 0 ) gforth-experimental

Try to recognize c-addr u with forth-recognize, then execute xt ( ... translator -- ... true | false ). If xt returns false, reset the stacks to the depths at the start of try-recognize, drop three data stack items, and push 0. Otherwise return the results of executing xt.

forth-recognize-nt? ( c-addr u -- nt | 0 ) gforth-experimental “forth-recognize-nt-question”

If forth-recognize produces a result nt translate-name, return nt, otherwise 0.
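
Based on the stack effect given above, the simple rec-tick variant from the beginning of this section can be rewritten on top of forth-recognize-nt?. This is a sketch, not Gforth's actual definition; the name rec-tick2 is hypothetical, and the check for an empty string is an addition of this sketch.

: rec-tick2 ( c-addr u -- xt translate-cell | 0 )
    dup 0= if rec-none exit then          \ empty string: nothing to recognize
    over c@ '`' <> if rec-none exit then  \ no backquote prefix: not recognized
    1 /string forth-recognize-nt?         \ run the other recognizers on the rest ( nt | 0 )
    dup if name>interpret translate-cell then ;  \ nt found: leave xt and the translator

Compared to the variant using find-name, the substring is passed through forth-recognize, so any string that yields nt translate-name is accepted after the backquote, not only names found by a plain dictionary search.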