A recognizer is a Forth word with the stack effect ‘( c-addr u -- ... translator | 0 )’. c-addr u describes the string to be recognized. If the recognizer does not recognize the string, it returns 0. If it does recognize the string, it returns a translator, and a translator-specific amount of additional data (“...”). When performing a translator action, the translator consumes this additional data. E.g., when you perform
"5" rec-num
it pushes ‘5 `translate-num’ on the stack, and when the compilation action of translate-num is performed, both stack items are removed from the stack. This compilation action also compiles a literal 5 into the current definition.
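As a hedged illustration of these two steps, the following sketch first inspects what rec-num leaves on the stack and then performs the compilation action inside a definition. It assumes a Gforth version that provides the word compiling ‘( ... translator -- ... )’ for performing the compilation action (see Performing translator actions); the name five is made up for this example.

```forth
\ Inspect the recognizer result: 5 and the xt of translate-num.
"5" rec-num .s 2drop

\ Assumption: COMPILING performs the compilation action of the
\ translator on top of the stack.  Here it consumes both stack items
\ and compiles the literal 5 into FIVE (a hypothetical word).
: five ( -- n ) [ "5" rec-num compiling ] ;
five .
```

If the assumption about compiling holds, five pushes 5 when executed.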
You typically write a recognizer as an ordinary colon definition that examines the string in some way, and if the string is accepted by this recognizer, produces the translator and additional data. E.g., a simple variant of rec-tick can be implemented as follows:
    : rec-tick ( addr u -- xt translate-num | 0 )
        over c@ '`' = if
            1 /string find-name dup if
                name>interpret ['] translate-num then
            exit then
        2drop 0 ;
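To show the same pattern with a different string test, here is a minimal sketch of a made-up recognizer rec-kilo (not part of Gforth) that accepts a number followed by k, such as 3k, and produces that number times 1000 together with translate-num. It assumes Gforth's s>number? ‘( c-addr u -- d f )’ for the conversion.

```forth
\ Hypothetical recognizer, for illustration only.
: rec-kilo ( c-addr u -- n translate-num | 0 )
    dup 2 u< if 2drop 0 exit then             \ need a digit plus 'k'
    2dup + 1- c@ 'k' <> if 2drop 0 exit then  \ last char must be 'k'
    1- s>number? 0= if 2drop 0 exit then      \ convert the part before 'k'
    d>s 1000 * ['] translate-num ;            \ data first, translator on top

"3k" rec-kilo .s 2drop   \ leaves 3000 and the xt of translate-num
"3m" rec-kilo .          \ not recognized: 0
```

The recognizer only produces the data and the translator; what happens to the number in interpretation, compilation, or postponing is left entirely to translate-num.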
The only appropriate use of a translator is with one of the words for performing translator actions (see Performing translator actions). But someone thinks that it’s a clever idea to implement the translator token as the xt of the translator, so you have to tick translators every time you use one in a recognizer.
It also means that the translator has a translator-specific stack effect. Because this stack effect encompasses three actions (interpretation, compilation, postpone), the stack effect is a mess, but if you pass the translator to one of the action-performing words, the resulting stack effect is sensible (see Performing translator actions). When you write a recognizer, the only part of the stack effect that is relevant is the one before the ‘--’, because it tells you what your recognizer should put on the stack in addition to the xt of the translator.
A number of translators already exist in Gforth and can be used in a recognizer you write. If none of them is appropriate for your recognizer, read the next section about defining your own translators.
Translate nt. The ... are there because the interpretation or compilation semantics of nt might have a stack effect.
Translate a number.
Translate a double number.
A translator for a float number.
This translator parses until the end of the string, concatenates the first part c-addr u with the parsed part, and does string translation for the result.
c-addr u describes the name of the environment variable. The translator actions produce the contents of the environment variable.
xt belongs to a value-flavoured word, n is the index into the to-table: for xt (see Words with user-defined to etc.).
One way to write a recognizer is to call forth-recognize on a substring, and then look at the result to see if something was recognized that the whole-string recognizer actually deals with. E.g., rec-tick and rec-dtick do this and then check whether forth-recognize has pushed nt translate-nt; the benefit of this approach is that, e.g., ‘`environment:max-n’ works, where rec-scope recognizes ‘environment:max-n’.
The specific check for an nt used in rec-tick and rec-dtick is forth-recognize-nt?; it is implemented on top of the more general try-recognize.
Try to recognize c-addr u with forth-recognize, then execute xt ‘( ... translator -- ... true | false )’. If xt returns 0, reset the stacks to the depths at the start of try-recognize, drop three data stack items, and push 0. Otherwise return the results of executing xt.
If forth-recognize produces a result nt translate-nt, return nt, otherwise 0.
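The substring approach can be sketched with forth-recognize-nt? as follows. This is an illustration under the stack effect described above ‘( c-addr u -- nt | 0 )’, not Gforth's actual definition of rec-tick; the name rec-tick-sketch is made up.

```forth
\ Hypothetical variant of rec-tick that recognizes the part after the
\ backquote with the full recognizer instead of a plain find-name.
: rec-tick-sketch ( c-addr u -- xt translate-num | 0 )
    dup 0= if 2drop 0 exit then           \ empty string: not recognized
    over c@ '`' <> if 2drop 0 exit then   \ must start with a backquote
    1 /string forth-recognize-nt?         \ ( nt | 0 )
    dup if name>interpret ['] translate-num then ;
```

Because the substring goes through forth-recognize, anything that is recognized as a named word there, including scoped names handled by rec-scope, can be ticked in this way.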