CHAPTER V.  COMPILER

The Forth computer spends most of its time waiting for the user to type commands at the terminal.  When it is actually doing something useful, it is doing one of two things: executing or interpreting words with the address interpreter, or parsing and compiling the input text from the terminal or disk.  These are the two 'states' of the Forth computer.  Internally, the Forth system uses a user variable STATE to remind itself what kind of job it is supposed to be doing.  If the content of STATE is zero, the system is in the executing state; if it is not zero, the system is in the compiling state.  Two instructions are provided for the user to switch explicitly between the executing state and the compiling state.  They are '[', left-bracket, and ']', right-bracket.

 

: [                 --

 

Suspend compilation and execute the words following [ up to ].  This allows calculation or compilation exceptions before resuming compilation with ].  Used in a colon definition in the form:

 

    : nnnn -- [ -- ] -- ;

 

0 STATE !             Write 0 into the user variable STATE and switch to executing state.

; IMMEDIATE        [ must be executed, not compiled.

 

: ]                 --

 

Resume compilation till the end of a colon definition.

 

C0H STATE !       The text interpreter compares the value stored in STATE with the value in the length byte of the definition found in the dictionary.  If the definition is an immediate word, its length byte is greater than C0H because the precedence and the sign bits are both set.  Setting STATE to C0H will force non-immediate words to be compiled and immediate words to be executed, thus entering the 'compiling state'.

;
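The comparison between STATE and the length byte can be sketched in Python.  This is only a model under stated assumptions: the names action and length_byte are hypothetical, the name length is assumed to be between 1 and 31, and the rule 'compile when the length byte is less than STATE' is our reading of the description above, not a listing of the figForth code.

```python
START_BIT = 0x80       # eighth (sign) bit, always set in a length byte
PRECEDENCE_BIT = 0x40  # set only for immediate words
EXECUTING = 0x00       # value written to STATE by [
COMPILING = 0xC0       # value written to STATE by ]

def length_byte(name, immediate=False):
    """Build a model length byte for a finished (unsmudged) header."""
    byte = START_BIT | len(name)   # name length assumed to be 1..31
    if immediate:
        byte |= PRECEDENCE_BIT
    return byte

def action(state, byte):
    """Compile when the length byte is below STATE, otherwise execute."""
    return "compile" if byte < state else "execute"

print(action(EXECUTING, length_byte("DUP")))                 # execute
print(action(COMPILING, length_byte("DUP")))                 # compile
print(action(COMPILING, length_byte("IF", immediate=True)))  # execute
```

With STATE at zero no length byte can be smaller than STATE, so every word is executed.  With STATE at C0H only length bytes with the precedence bit set reach C0H or more, so only immediate words are executed.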

 

In either state, the text interpreter parses a text string out of the input stream and searches the dictionary for a matching name.  If an entry with the same name is found, its code field address will be pushed to the data stack.  Now, if STATE is zero, the address interpreter is called in to execute this word.  If STATE is not zero and the word is not an immediate word, the text interpreter itself will push this code field address to the top of the dictionary, thus compiling this word into the body of the new definition the text interpreter is working on.  Therefore, the text interpreter is also the compiler in the figForth system, and it is optimized to do compilation just as efficiently as interpretation.

There are numerous instances when the compiler cannot build complicated program structures by itself.  The compiler itself can only compile a linear list of addresses, one word after another.  If program structures require branching and looping, as in the BEGIN--UNTIL, IF--ELSE--ENDIF, and DO--LOOP constructs, the compiler needs lots of help from the address interpreter.  The help is provided through words of the IMMEDIATE nature, which are immediately executed even when the system is in the compiling state.  These immediate words are therefore compiler directives which direct the compiling process so that at run-time the execution sequences may be altered.
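How an immediate word can direct the compiling process may be sketched in Python.  The sketch is a simplified model, not the figForth definitions: the body of a definition is a Python list, the words A, B and C are hypothetical, and the flag tested by 0BRANCH is passed as a parameter instead of being taken from the data stack.

```python
def compile_source(tokens):
    """Compile a token list into a linear list of cells."""
    body = []    # the definition being built; len(body) plays the role of HERE
    stack = []   # compile-time data stack used by the immediate words

    def if_():                       # immediate: runs during compilation
        body.append("0BRANCH")
        stack.append(len(body))      # remember where the offset cell is
        body.append(0)               # placeholder offset, patched by ENDIF

    def endif():                     # immediate: resolves the forward branch
        where = stack.pop()
        body[where] = len(body) - where

    immediate = {"IF": if_, "ENDIF": endif}
    for token in tokens:
        if token in immediate:
            immediate[token]()       # executed now, not compiled as a cell
        else:
            body.append(token)       # ordinary word: compiled in sequence
    return body

def run(body, flag):
    """Execute a compiled body; flag is the value tested by 0BRANCH."""
    trace, ip = [], 0
    while ip < len(body):
        cell = body[ip]
        ip += 1
        if cell == "0BRANCH":
            if flag == 0:
                ip += body[ip]       # take the branch: add the offset to IP
            else:
                ip += 1              # skip over the offset cell
        else:
            trace.append(cell)
    return trace

body = compile_source(["A", "IF", "B", "ENDIF", "C"])
print(body)          # ['A', '0BRANCH', 2, 'B', 'C']
print(run(body, 1))  # ['A', 'B', 'C']
print(run(body, 0))  # ['A', 'C']
```

The compiled list is still linear.  The branch exists only because IF and ENDIF were executed during compilation and left a run-time word and an offset behind.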

In this chapter, we shall first discuss the words which create a header for a new definition in the dictionary.  These are the words which start the compiling process.  In Chapter 12 we shall discuss the immediate words which construct conditional or unconditional branches to take care of special compilation conditions.

A dictionary entry, or a word, must have a header which consists of a name field, a link field, and a code field.  The body of the word is contained in the parameter field right after the code field.  The header is created by the word CREATE and its derivatives, which are called defining words because they are used to create or define different classes of words.  All words in the same class hold the same address in their code fields.  This code field address points to a code routine which will interpret the word when it is to be executed.  The structure of a definition as compiled in the dictionary is shown in Fig. 4.

 

: CREATE            --

 

Create a dictionary header for a new definition with name cccc .  The new word is linked to the CURRENT vocabulary.  The code field points to the parameter field, ready to compile a code definition.  Used in the form: 

 

    CREATE  cccc

 

BL WORD             Bring the next string delimited by blanks to the top of the dictionary.

HERE                     Save dictionary pointer as name field address to be linked.

DUP C@                Get the length byte of the string

WIDTH @               WIDTH has the maximum number of characters allowed in the name field.

MIN                        Use the smaller of the two, and

1+ ALLOT              allocate space for name field, and advance DP to link field.

DUP 0A0H TOGGLE     Toggle the eighth (start) and the sixth (smudge) bits in the length byte of the name field.  Make a 'smudged' head so that dictionary search will not find this name .

HERE 1- 80H TOGGLE  Toggle the eighth bit in the last character of the name as a delimiter to the name field.

LATEST ,              Compile the name field address of the last word in the link field, extending the linking chain.

CURRENT @ !      Store the new name field address in the current vocabulary, so that LATEST now returns the new word.

HERE 2+ ,             Compile the parameter field address into code field, for the convenience of a new code definition.  For other types of definitions, proper code routine address will be compiled here.

;
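The header laid down by CREATE can be sketched in Python with a bytearray standing in for the dictionary.  This is a model under stated assumptions: addresses are offsets into the bytearray, cells are two bytes stored low byte first, the name is assumed to be non-empty and at most 31 characters long, and the function name create and the word SQUARE are hypothetical.

```python
def create(memory, latest, name, width=31):
    """Append a header for name to memory; return (nfa, pfa)."""
    raw = name.encode("ascii")
    nfa = len(memory)                 # HERE: the name field address
    memory.append(len(raw))           # length byte, flag bits still clear
    memory.extend(raw[:width])        # keep at most WIDTH characters
    memory[nfa] ^= 0xA0               # toggle start (80H) and smudge (20H) bits
    memory[-1] ^= 0x80                # mark the last character of the name
    memory.extend(latest.to_bytes(2, "little"))  # link field: previous NFA
    pfa = len(memory) + 2             # parameter field follows the code field
    memory.extend(pfa.to_bytes(2, "little"))     # code field points at the PFA
    return nfa, pfa

memory = bytearray(16)                # stand-in for earlier dictionary contents
nfa, pfa = create(memory, latest=0, name="SQUARE")
print(hex(memory[nfa]), nfa, pfa)     # 0xa6 16 27
```

The caller would keep the returned nfa as the new LATEST, which is the job done by CURRENT @ ! in the definition above.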

 

: CODE          --

 

Create a dictionary header for a code definition.  The code field contains its parameter field address.  Assembly codes are to be compiled (assembled) into the parameter field.

 

CREATE          Create the header, nothing more to be done on the header.

[COMPILE]           Force the following immediate word ASSEMBLER to be compiled rather than executed.

ASSEMBLER       Select ASSEMBLER vocabulary as the CONTEXT vocabulary, which has all the assembly mnemonics and words pertaining to assembly processes.

;

 

 

 

It is important to remember that the text interpreter itself is doing the job for the assembler.  Thus all the words defined in the FORTH vocabulary are available to assist the assembling of machine code words.  In fact, assembling code definitions is much more complicated than compiling colon definitions.  Many specialized utility routines have to be defined in the assembler vocabulary before the simplest of code definitions can be assembled.  This part of the assembler vocabulary is generally called the pre-assembler, which is not in the figForth model because it is machine dependent.  In Chapter 14 we shall discuss the details involved in the assemblers, based on the PDP-11 and 8080 instruction sets.

 

: :                 --

 

Start a colon definition, used in the form:

 

        : cccc --- ;

 

Create a dictionary header with name cccc , making cccc equivalent to the following sequence of words --- up to the next ';' or ;CODE .  The compiling process is done by the text interpreter as long as STATE is non-zero.  The CONTEXT vocabulary is set to the CURRENT vocabulary, and words with the precedence (P) bit set are executed rather than compiled.

 

?EXEC                  Issue an error message if not executing.

!CSP                      Save the stack pointer in CSP to be checked by ';' or ;CODE .

CURRENT @ CONTEXT ! Make CONTEXT vocabulary the same as the CURRENT vocabulary.

CREATE               Now create the header and establish linkage with the current vocabulary.

]                            Change STATE to non-zero.  Enter compiling state and compile the words following till ';' or ;CODE .

;CODE                  End of the compiling process for ':'.  The following codes are to be executed when the word cccc is called.  The address here is to be compiled into the code field of cccc .

DOCOL:               Here comes the inner interpreter for colon definitions.

MOV IP,-(RP)       Push IP on the return stack

MOV W,IP            Move the parameter field address from W into IP , so that IP points to the next word to be executed.

NEXT                   Go execute the next word.

 

Execution of DOCOL adds one more level of nesting.  Unnesting is done by ';' (semi-colon), which must be the last word in a colon definition. 
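The nesting and unnesting can be sketched in Python.  This is a model only: colon definitions are lists of names in a hypothetical dictionary, primitives are merely recorded by name rather than executed, and IP is modeled as a pair of a body and an index into it.

```python
def execute(word, dictionary):
    """Run word, returning the names of the primitives in execution order."""
    trace = []
    return_stack = []                   # RP: saved interpretive pointers
    ip = (dictionary[word], 0)          # IP: (body, index of next cell)
    while True:
        body, i = ip
        cell = body[i]
        ip = (body, i + 1)              # NEXT: advance IP past the fetched cell
        if cell == ";S":
            if not return_stack:        # returned from the outermost word
                return trace
            ip = return_stack.pop()     # unnest: restore the caller's IP
        elif cell in dictionary:
            return_stack.append(ip)     # DOCOL: push IP on the return stack
            ip = (dictionary[cell], 0)  # and point IP at the new body
        else:
            trace.append(cell)          # primitive: modeled as just a name

dictionary = {
    "SQUARE": ["DUP", "*", ";S"],
    "CUBE": ["DUP", "SQUARE", "*", ";S"],
}
print(execute("CUBE", dictionary))      # ['DUP', 'DUP', '*', '*']
```

Each colon word met inside a body costs one entry on the return stack, and each ;S gives it back, so the caller always resumes at the cell after the call.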

 

: ;                 --

 

Terminate a colon definition and stop further compilation.  Return execution to the calling definition at run-time.

 

?CSP                     Check the stack pointer with that saved in CSP .  If they differ, issue an error message.

COMPILE ;S         Compile the code field address of the word ;S into the dictionary, at run-time.  ;S will return execution to the calling definition.

SMUDGE              Toggle the smudge bit back to zero.  Restore the length byte in the name field, thus completing the compilation of a new word.

[                             Set STATE to zero and return to the executing state.

; IMMEDIATE
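The effect of the smudge bit between CREATE and ';' can be sketched in Python.  This is a model under an assumption which should be checked against the particular implementation: the dictionary search is taken to compare the length byte, masked with 3FH, against the length of the string being searched for.  Only the length test is modeled here, and the character comparison is left out.  The names matches and smudge and the word SQUARE are hypothetical.

```python
SMUDGE_BIT = 0x20

def matches(length_byte, name):
    """Model of the length test made by the dictionary search."""
    return (length_byte & 0x3F) == len(name)  # smudge bit takes part in the test

def smudge(length_byte):
    """Model of SMUDGE: toggle the smudge bit in the length byte."""
    return length_byte ^ SMUDGE_BIT

header = len("SQUARE") ^ 0xA0      # as left by CREATE: start and smudge bits set
print(matches(header, "SQUARE"))   # False
header = smudge(header)            # as done by ';'
print(matches(header, "SQUARE"))   # True
```

While the definition is incomplete the word cannot be found.  Once ';' has toggled the smudge bit back to zero, the length byte matches again and the new word becomes visible to the text interpreter.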

Ending a colon definition with ;CODE , as seen in the definition of ':', involves the advanced concept of defining a defining word.  This concept will be the theme of Chapter 11 on the defining words.  The words which manipulate information in the dictionary will be discussed in detail in Chapter 9.  The immediate words used in constructing branching structures are treated in Chapter 12 on control structures.