CHAPTER 11.  DEFINING WORDS

 

The Forth language is a major synthesis of many concepts and techniques that have been used for some time in the computer industry, such as stacks, the dictionary, virtual memory, and the interpreter.  The single most important invention by Charles Moore in developing this language, the one that wrapped all these elements together and rolled them into a small yet powerful operating system, is the code field in the header of a definition.  The code field contains the address of a routine to be executed when the definition is called.  This routine determines the characteristics of the definition and interprets the data stored in the parameter field accordingly.  In the basic Forth system, only a very small set of code field routines is defined, and these are used to create the types of definitions most often used in programming: colon definitions, code definitions, constants, and variables. 

The most interesting feature of the Forth language is that the machinery used to create these definitions is accessible to users, who can create new types of definitions with it.  The mechanism is simply to define new code field routines which will correctly interpret a new class of words.  The freedom to create new types of definitions, or in a mind-boggling phrase, to define defining words, is called the extensibility of the Forth language.  The process of adding a new definition to the dictionary (creating a header, selecting the address of a code routine and putting it in the code field, and compiling data or addresses into the parameter field) is termed 'to define a word'.  Words like ':', CODE , CONSTANT , VARIABLE , etc., which cause a new word to be defined or compiled into the dictionary, are thus called defining words.  The process of generating a word of this kind, the defining word, is 'to define a defining word'.  Our subject in this chapter is how to define a word which defines a class of words. 

To create a definition, two things must be done properly: the first is to specify how the definition is to be compiled and constructed in the dictionary, and the second is to specify how the definition is to be executed when it is called by the text interpreter.  Consequently, a defining word consists of two parts: one to be used by the compiler to generate a definition in the dictionary, and the other to be executed when the definition is called.  All words generated by this defining word will have code fields containing the same address, pointing to the same run-time routine. 

There are two ways to define new defining words.  If the run-time routine pointed to by the code field is to be written in assembly code, the format is:

    : cccc --- ;CODE assembly mnemonics

If the run-time routine is coded in high level words as in a colon definition, the format is:

    : cccc <BUILDS --- DOES> --- ;

In the above formats, cccc is the name of the new defining word,  --- denotes a series of predefined words, and 'assembly mnemonics' stands for assembly code, which can be used if an assembler has been defined in the dictionary.  If there is no assembler in the Forth system, machine code in numeric form can be compiled into the dictionary to construct the run-time code routine. 

Executing the new defining word cccc in the form:

 

    cccc nnnn

 

will create a new definition nnnn in the dictionary and the words denoted by --- up to ;CODE or DOES> are executed to complete the process of building the definition in the dictionary.  The code field of this new definition will contain the address of the routine immediately following ;CODE or DOES> .  Consequently, when the newly defined word is called by the interpreter, the run-time routine will be executed. 

The above discussion might be somewhat confusing because of the context of defining a defining word.  It is.  The best way of explaining how the concept works is probably with a lot of examples.  Here we shall start with the figForth definitions of ;CODE , <BUILDS , and DOES> , followed by the two simple defining words CONSTANT and VARIABLE .  The most useful defining word ':' was discussed previously in Chapter 5 on the compiler.  It should be reviewed carefully. 

 

: ;CODE             --

 

Stop compilation and terminate a new defining word cccc  by compiling the run-time routine (;CODE) .  Assemble the  assembly mnemonics following.  Used in the form:

    : cccc -- ;CODE assembly mnemonics

 

?CSP                     Check the stack pointer.  Issue an error message if not equal to what was saved in CSP by ':' . 

COMPILE             When ;CODE is executed, the address of the next word, (;CODE) , will be compiled into the dictionary. 

(;CODE)               Run-time procedure which completes the definition of a new defining word. 

[COMPILE]          Compile the next immediate word instead of executing it.

[                           Return to executing state to assemble the following assembly mnemonics. 

SMUDGE            Toggle the smudge bit in the length byte, and complete the new definition. 

;

IMMEDIATE


A class of definitions can then be created by using cccc in the form:


    cccc nnnn

 

The code field of nnnn points to the code routine assembled from the mnemonics following ;CODE in the definition of cccc .  When nnnn is called, it will first jump to this code routine and execute it at run-time.  What happens afterwards depends entirely on this code routine.  The presence of the code field, and hence the execution of the code routine whenever the word is called, makes figForth an indirect threaded code system.  The code field allows users to extend the Forth language with new data structures and new control structures, which is difficult or impossible in most other high level languages.  This property is called the extensibility of the Forth language. 

 

: (;CODE)           --

 

The run-time procedure compiled by ;CODE .   Rewrite the code field of the most recently defined word to  point to the following machine code sequence. 

 

R>                         Pop the address of the next instruction off the return stack, which is the starting address of the run-time code routine. 

LATEST                Get the name field address of the word under construction.

PFA CFA !             Find the code field address and store in it the address of the code routine to be executed at run-time. 

;

 

The pair of words <BUILDS -- DOES> is used to define new defining words in the form:

 

    : cccc <BUILDS -- DOES> -- ;

 

The difference from the ;CODE construct is that <BUILDS -- DOES> gives users the convenience of defining the code field routine in terms of other high level definitions, saving them the trouble of coding these routines in assembly mnemonics.  Defining words written in high level words are also portable to other types of computers that speak Forth.  The price to be paid is slower execution of the words defined by these defining words.  This is the tradeoff users must weigh to their own satisfaction. 

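As a minimal sketch of this format, the hypothetical defining word CONST below reproduces the behavior of a constant using only high level words.  The names CONST and FIVE are illustrative and do not belong to the figForth model; the code assumes a figForth-style system in which <BUILDS and DOES> are available.

    : CONST   ( n -- )
        <BUILDS ,       ( create the header and compile n into the parameter field )
        DOES> @ ;       ( at run-time: fetch n from the address left on the stack )

    5 CONST FIVE        ( define FIVE with CONST )
    FIVE .              ( FIVE leaves 5 on the stack and . prints it )

The words before DOES> run once, when FIVE is defined; the words after DOES> run every time FIVE is called.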
 

: <BUILDS           --

 

When cccc is executed, <BUILDS will create a new header  for a definition with the name taken from the next text  in the input stream. 

 

0 CONSTANT      Create a new entry in the dictionary with a zero in its parameter field.  The zero will be replaced by the address of the high level run-time routine following DOES> when DOES> is executed. 

;

 

: DOES>             --

 

Define the run-time action within a high level defining word.  DOES> alters the code field and the first cell in the parameter field of the word being defined, so that when a new word created by this defining word is called, the sequence of words compiled after DOES> will be executed. 

 

R>                       Get the address of the first word after DOES> .

LATEST               Get the name field address of the new definition under construction. 

PFA !                    Store the address of the run-time routine as the first parameter. 

;CODE                  When DOES> is executed, (;CODE) puts the address of the following code routine into the code field of the new word, so a word created by the defining word will execute this routine first when it is called. 

DODOE:                -- pfa

MOV IP,-(RP)        Push the address of the next instruction on the return stack. 

MOV (W)+,IP        Put the address of the run-time routine in IP .

MOV W,-(S)          W was incremented by the last instruction and now points to the cell after the run-time routine address in the parameter field.  Push this address on the data stack. 

NEXT

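The effect of DODOE can be seen in a defining word whose run-time part uses the address left on the stack.  The sketch below defines a hypothetical ARRAY of 2-byte cells; ARRAY and DATA are illustrative names, and the cell size of 2 bytes assumes a 16-bit figForth system.

    : ARRAY   ( n -- )
        <BUILDS 2 * ALLOT       ( reserve n cells after the header )
        DOES> SWAP 2 * + ;      ( i addr -- addr+2i : address of element i )

    10 ARRAY DATA               ( an array of 10 cells )
    1234 3 DATA !               ( store 1234 in element 3 )
    3 DATA @ .                  ( fetch element 3 and print it )

When DATA is called, DODOE pushes the address of its storage area, and the words after DOES> then turn the index into an element address.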
 

In the figForth model, there are three often used defining words besides ':' and CODE : CONSTANT , VARIABLE , and USER .  They are defined as follows:

 

: CONSTANT      n --

 

Create a new word with the next text string as its name and  with n inserted into its parameter field. 

CREATE              Create a new dictionary header with the next text string. 

SMUDGE             Toggle the smudge bit in the length byte in the name field.

,                            Compile n into the parameter field.

;CODE                  The code field of all constants defined by CONSTANT will have the address of the following code routine:

DOCON:               The constant interpreter.

MOV (W),-(S)       Push the contents of parameter field to the stack. 

NEXT                   Return to execute the next word.


It is used in the following form:

 

    n CONSTANT cccc

 

to define cccc as a new constant.  When cccc is later called, the value n will be pushed on the data stack.  This is the best way to store a constant in the dictionary for later use, if the constant is used often.  When a number is compiled as an in-line literal in a colon definition, 4 bytes are used because the word LIT must be compiled before the literal so that the address interpreter will not mistakenly interpret it as a word address.  A reference to a constant compiles only its 2-byte address, saving 2 bytes per use.  The overhead of defining a constant is 6 bytes plus the bytes needed for the name field, averaging about 10 bytes per definition.  If the constant is used more than five times, the savings in memory space justify defining a constant. 

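A brief usage sketch follows; the names DOZEN and DOZENS are illustrative only.

    12 CONSTANT DOZEN                   ( define the constant DOZEN )
    : DOZENS   ( n -- 12n )  DOZEN * ;  ( use it inside a colon definition )
    3 DOZENS .                          ( prints 36 )

Inside DOZENS the reference to DOZEN compiles as a single word address, with no LIT in front of it.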
 

: VARIABLE          n --

 

Define a new word with the following text as its name and  its parameter field initialized to n.  When the new word is  executed, the parameter field address instead of its content  is pushed on the stack. 

 

CONSTANT          Create a dictionary header with n in the parameter field.  The compiling action in defining a variable is identical to that of defining a constant, but the run-time behavior is different. 

;CODE                  Code field in a variable points to following code routine.


DOVAR:               Variable interpreter.

MOV W,-(S)          Push the parameter field address on data stack.

NEXT

 

Variables are defined by the following commands:

 

    n VARIABLE cccc

 

When cccc is later executed, the address of the variable is pushed on the data stack.  To get the current value of this variable, one should use the @ command:

 

    cccc @

 

and to change the value to a new one n1,

 

    n1 cccc !

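These commands can be combined in a short sketch.  COUNTER and BUMP are illustrative names.  The initial value placed before VARIABLE follows the figForth convention described above; other Forth dialects may define VARIABLE without an initial value.

    0 VARIABLE COUNTER                  ( define COUNTER with initial value 0 )
    : BUMP   ( -- )  1 COUNTER +! ;     ( add 1 to the value stored in COUNTER )
    BUMP BUMP
    COUNTER @ .                         ( prints 2 )

COUNTER itself only supplies an address; @ , ! , and +! do the actual reading and writing.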
 

: USER          n --

 

Create a user variable with n in the parameter field.  n  is a fixed offset relative to the user area pointer UP for  this user variable. 

 

CONSTANT          n is compiled as a constant.

;CODE                   The run-time code routine is labelled as DOUSE :


DOUSE:                 User variable interpreter.

MOV (W),-(S)        Push n on data stack. 

ADD UP,(S)          Add the base address of the user area.

NEXT                    Return.  Now the top of data stack has the address pointing to the user variable. 

 

After a user variable is defined as:

 

    n USER cccc

 

the word cccc can be called.  When cccc is executed, UP+n will be pushed on the data stack, and the contents of that location can be examined by @ or modified by ! .  In figForth, user variables are used like other variables.  Their significance is not apparent because figForth generally does not support multitasking.  When Forth is used in a multitasking environment, each task owns a copy of all the user variables, which define the context of the task and allow tasks to be switched conveniently.  This topic is too advanced to be discussed here.

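The addressing scheme behind USER , a base address plus a fixed offset, can be imitated in high level words.  In the sketch below TASK-AREA is a hypothetical block of memory standing in for the user area, OFFSET-VAR is a hypothetical stand-in for USER , and SPEED is an illustrative name.  None of these are figForth words, the sizes assume a 16-bit system, and the real USER relies on the pointer UP maintained by the system.

    0 VARIABLE TASK-AREA  30 ALLOT      ( a 32-byte stand-in for the user area )
    : OFFSET-VAR   ( n -- )
        <BUILDS ,                       ( compile the offset n )
        DOES> @ TASK-AREA + ;           ( -- addr : base address plus offset )

    4 OFFSET-VAR SPEED                  ( a variable at offset 4 in the area )
    77 SPEED !                          ( store 77 in it )
    SPEED @ .                           ( prints 77 )

If the base address were a pointer that changed whenever tasks were switched, the same word SPEED would reach a different copy of the variable in each task.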
 

 
