Type .recognizers to find out which recognizers are currently being used by Gforth. When invoked in a colon definition after defining a local, the output of .recognizers is (at the time of this writing):
rec-nt ( rec-locals search-order ( Forth Forth Root ) )
rec-scope rec-num rec-float rec-complex rec-string rec-to rec-dtick
rec-tick rec-body rec-env rec-meta
Here the notation name ( name1 ... namen ) indicates that name is a recognizer sequence that contains the recognizers name1 ... namen.
The recognizers in this sequence are:
rec-nt
Recognizes locals and words in the search order.
rec-locals
Recognizes locals.
search-order
Recognizes words in the search order. This is shown as a recognizer sequence, because the wordlists (see Word Lists) themselves are also recognizers: they implement the recognizer interface (see Define recognizers with existing translators) in addition to working with find-name-in.
rec-scope
Recognizes ‘voc1:voc2:..vocn:word’, where voc1 is a vocabulary in the search order, voc2 is a vocabulary found in voc1, and so on, until word is found in vocn; the translator of this recognizer performs the semantics of word. Example: ‘environment:max-n’.
rec-num
Single-cell integers (‘#-15’, ‘$-f’), characters (‘'A'’), and double-cell integers (‘#-15.’), with or without number prefixes (see Integer and character literals).
rec-float
Floating-point numbers (‘1e’, see Floating-point number and complex literals).
rec-complex
Complex numbers (‘1e+2ei’, see Floating-point number and complex literals).
rec-string
Strings (‘"abc"’, see String and environment variable literals).
rec-to
Recognizes ->v (equivalent to to v), +>v (equivalent to +to v), and '>v (equivalent to addr v), where v is a value-flavoured word (see Values). Also recognizes @>d (equivalent to action-of d) and =>d (equivalent to is d), where d is a defer-flavoured word (see Deferred Words).
rec-dtick
Recognizes ``word and produces the name token of word (see Literals for tokens and addresses).
rec-tick
Recognizes `word and produces the execution token of word (see Literals for tokens and addresses).
rec-body
Recognizes <word> for the body address of word and <word+num> for an offset num from the body address of word (see Literals for tokens and addresses).
rec-env
Recognizes ${env} for the string contained at run-time in the environment variable env (see String and environment variable literals).
rec-meta
Recognizes rec?string (e.g., float?1.), where rec-rec is a recognizer found in the search order (e.g., rec-float). That recognizer then tries to recognize string (e.g., 1.), and its result becomes the result of rec-meta. This may be useful in cases where you want to use a specific recognizer, e.g., to deal with conflicts.
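Several of the forms listed above can be tried at the Gforth prompt. The following is a minimal sketch, assuming a recent Gforth development version with the default recognizer order shown above; the outputs noted in the comments follow from the meaning of the literals, and lines without a stated output depend on the system.

```forth
decimal
#-15 .                  \ rec-num: single-cell integer, prints -15
$-f .                   \ rec-num: hex prefix, also prints -15
'A' .                   \ rec-num: character literal, prints 65
#-15. d.                \ rec-num: double-cell integer, prints -15
1e f.                   \ rec-float: floating-point number
"abc" type              \ rec-string: prints abc
environment:max-n .     \ rec-scope: max-n found in the environment vocabulary
2 `dup execute . .      \ rec-tick: `dup is the xt of dup, prints 2 2
``dup name>string type  \ rec-dtick: ``dup is the nt of dup, prints dup
```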
The order of the recognizers is significant, because they are tried from left to right, and the first recognizer that recognizes a word is actually used. E.g., if you define a local ‘b’, it will supersede Gforth’s predefined word b.
In most cases, however, recognizers are designed to avoid matching the same strings as other recognizers. E.g., rec-env (the environment variable recognizer) requires braces to avoid a conflict with the number recognizer when recognizing environment variables like ‘ADD’; i.e., rec-env recognizes ${ADD}, while rec-num recognizes $ADD.
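The difference between the two forms can be seen directly. This is a small sketch, assuming the default recognizer order and that the environment variable HOME is set (as on typical Unix systems):

```forth
decimal
$ADD .          \ rec-num: the hex number ADD, prints 2781
${HOME} type    \ rec-env: contents of the environment variable HOME
```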
There are a few cases where Gforth’s recognizers can recognize the same string, however:
A word in the search order whose name also looks like a number (e.g., in hex base) is recognized as the word, because rec-nt is tried before rec-num. There are no conflicts of Gforth-defined words with decimal numbers prefixed with ‘#’ or hex numbers prefixed with ‘$’, so it is good practice to use these prefixes (they also make sure that the right base is used). An older practice (before number prefixes were introduced) was to prefix hex numbers with ‘0’.
In the code bases we have looked at, starting words with ' (quote aka tick) is much more common than starting them with ` (backquote aka backtick), so the recognizers for the xt and the nt use ` to reduce the number of conflicts.
Both rec-num and the floating-point recognizer rec-float recognize, e.g., 1. (a digit followed by a dot). Because rec-num is (by default) first, 1. is recognized as a double-cell integer. If you change the recognizer order to use rec-float first, 1. is recognized as a floating-point number, but loading code written in Standard Forth may behave in a non-standard way.
In any case, it’s a good practice to avoid that conflict in your own code as follows: Always write double-cell integers with a number prefix, e.g., #1. (with the trailing dot); and always write floating-point numbers with an e, e.g., 1e.
Names that start with -> can conflict with the forms that rec-to recognizes. You can avoid a conflict by using to myvalue or to?->myvalue (the latter works with postpone).
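The conflict-avoiding forms described above can be sketched as follows. This assumes the default recognizer order of a recent Gforth development version; myvalue is a hypothetical name defined only for this illustration.

```forth
decimal
#1. d.            \ number prefix: always a double-cell integer, prints 1
1e f.             \ with an e: always a floating-point number
float?1. f.       \ rec-meta selects rec-float explicitly for 1.

0 value myvalue   \ hypothetical value used for illustration
5 to myvalue      \ classic syntax
6 to?->myvalue    \ rec-meta selects rec-to explicitly
myvalue .         \ prints 6
```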
Note that most Forth systems do not support all the recognizers we
describe above, but rec-locals search-order rec-num rec-float
are relatively common (even if a system uses a hard-coded text
interpreter instead of the flexible recognizer system).
You can use locate (see Locating source code definitions) to determine which recognizer recognizes a piece of source code. E.g.:
locate float?1.
will show that rec-meta recognizes the string float?1. in this example. However, if the recognizer recognizes a dictionary word (e.g., the scope recognizer), locate will show that word.
Wordlists are also recognizers, as can be seen by the search order being shown as a recognizer sequence containing the wordlists. A wordlist recognizes the words that it contains. Just execute the wordlist-id, and it will behave as a recognizer:
"dup" forth-wordlist execute
produces the translator token of translate-nt on the top of the stack, and the name token of dup below that.
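To check what the wordlist left on the stack, the name token can be turned back into a string. This is a minimal sketch that relies on the stack layout described above (name token below the translator token):

```forth
"dup" forth-wordlist execute   \ leaves: nt translator
drop                           \ discard the translator token
name>string type               \ prints dup
```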
.recognizers
Prints the system recognizer order, with the first-searched recognizer leftmost.
rec-nt
Recognizes a name token.
rec-locals
Searches the locals wordlist and, if the name is found, replaces the translator with translate-locals.
rec-scope
Recognizes strings of the form (simplified) wordlist:word, where wordlist is found in the search order. The result is the same as for rec-nt (the ordinary word recognizer) applied to word if the search order consists only of wordlist. The general form can have several wordlists preceding word, separated by ‘:’; the first (leftmost) wordlist is found in the search order, the second in the first, etc. word is then looked up in the last (rightmost) wordlist.
rec-num
Converts a number to a single-cell or double-cell integer.
rec-float
Recognizes floating-point numbers.
rec-complex
Complex numbers are always in the format a+bi, where a and b are floating-point numbers including their signs.
rec-string
Converts strings enclosed in double quotes into string literals; escapes are treated as in S\".
rec-to
Words prefixed with -> are treated as if preceded by TO, with +> as +TO, with '> as ADDR, with @> as ACTION-OF, and with => as IS.
rec-dtick
Words prefixed with `` return their nt. Example: ``S" gives the nt of S".
rec-tick
Words prefixed with ` return their xt. Example: `dup gives the xt of dup.
rec-body
Words bracketed with ‘<’ and ‘>’ return their body. Example: <dup> gives the body of dup.
rec-env
Words enclosed by ${ and } are passed to getenv to get the OS environment variable as a string. Example: ${HOME} gives the home directory.
rec-meta
Words prefixed with recognizer? are processed by the rec-recognizer recognizer to disambiguate recognizers.
Example: in hex num?cafe num?add, cafe and add are parsed as numbers only.
Example: float?123. is parsed as a float.
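The prefix forms handled by rec-to and rec-meta can be tried as follows. This is a sketch under the assumption of a recent Gforth development version with the default recognizer order; counter and action are hypothetical names introduced only for this example.

```forth
decimal
5 value counter     \ hypothetical value
7 ->counter         \ same as: 7 to counter
3 +>counter         \ same as: 3 +to counter
counter .           \ prints 10

defer action        \ hypothetical deferred word
`dup =>action       \ same as: `dup is action
1 action . .        \ action now behaves like dup, prints 1 1
@>action `dup = .   \ action-of action is the xt of dup, prints -1

hex num?cafe decimal .   \ forced through rec-num, prints 51966
```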