CHAPTER 12.  CONTROL STRUCTURES

 

Most definitions in the Forth dictionary are defined by the colon ':' word.  They are called colon definitions, Forth definitions, or high level definitions.  When the text interpreter sees the word ':', it creates a header using the text string following colon as the name and then enters the compiling state.  In the compiling state, the text interpreter reads in a text line from the input stream, parses out strings delimited by blanks, and tries to match them with dictionary entries.  If a string matches with a dictionary entry, the code field address of the matching word is added to the parameter field of the new definition under construction.  This is what we call the compiling process.  The compiling process ends when the terminating word ; or ;CODE is encountered. 

When a colon definition is later executed, the word addresses in its parameter field are executed by the address interpreter in the order as compiled.  If it is necessary to alter the sequential execution process at run-time, special word has to be used in the compiling process to set up the mechanism of branching and looping, to build the control structures and the program constructs in the colon definition.  These special words are equivalent to compiler directives or assembly directives in conventional computer languages.  These words do not become part of the compiled definition, but cause specific actions during compilation to build the control structure into the definition and to ensure its correct execution at run-time.  These special words in Forth are characterized by the fact that they all have a precedence bit in the length byte of the name field set to one.  Words with precedence bit set are called immediate words because the text interpreter turns these words over to the address interpreter for execution even during compilation. 

In this Chapter, we shall concern ourselves with the means by which the following control structures are built in a colon definition:

 

    IF -- ELSE -- ENDIF

    BEGIN -- UNTIL

    BEGIN -- WHILE -- REPEAT

and     DO -- I -- LEAVE -- LOOP

 

However, before discussing the detailed definitions of these words, a few utility words should be presented to make the discussions more intelligible.  The word COMPILE and [COMPILE] are used to handle special compiling situations.  The words BRANCH and 0BRANCH are the actual words which get compiled into the definition to do the branching and looping. 

Words in a colon definition are normally compiled into dictionary and their code field address are compiled into the parameter field of the colon definition under compilation.  Sometimes the compilation should be delayed to the run-time, i.  e., the word is to be compiled not when the colon definition is being compiled, but when the colon definition is later executed.  To defer compilation until run-time, the instruction COMPILE must precede the word. 

 

: COMPILE           --

 

Defer compilation until run-time.  When the word containing  COMPILE is executed, the code field address of the word  following COMPILE is compiled into the dictionary at run-time. 

 

?COMP                 Error if not compiling.

R>                         Top of return stack is pointing to the next word following COMPILE . 

DUP 2+ >R            Increment this pointer by 2 to point to the second word ollowing COMPILE , which will be the next word to be executed.  The word immediately following COMPILE should be compiled, not executed. 

@ ,                        Do the compilation at run-time.

;

 

Immediate words, because of their precedence bits, are executed during compilation.  However, if one wanted to use the word sequence in an immediate word as a regular colon definition, i.  e.  to compile it in-line, the word [COMPILE] can be used to force the following immediate word to be compiled into a definition.  The word [COMPILE] is used in the form

     : xxxx --- [COMPILE] cccc --- ;

in which cccc is the name of an immediate word. 

¡@

: [COMPILE]         --

¡@

Force the compilation of the following immediate word.

¡@

-FIND                    Accept next text string and search dictionary for a match.

0= 0 ?ERROR        No matching entry was found.  Issue an error message.

DROP                    Discard the length byte of the found name.

CFA ,                    Convert the name field address to code field address and compile it into the dictionary. 

;

IMMEDIATE

 

The two words changing execution sequence in a colon definition are BRANCH and 0BRANCH, both are primitive code definitions.  They are of such importance that I feel they should be treated fully.  The codes are from PDP-11 fig-Forth. 

 

CODE BRANCH         --

 

The run-time procedure to branch unconditionally.  An  in-line offset is added to the interpretive pointer  IP to branch forward or backward.  BRANCH is  compiled by ELSE, AGAIN, and REPEAT. 

 

 ADD (IP),IP         Add the contents of the next cell pointed to by IP to IP itself.  The result is put back to IP which points to the next word to be executed.  The next word can be out of the regular execution order. 

 NEXT           Return to the word pointed to by IP , completing the unconditional branching. 

¡@

CODE 0BRANCH        f --

 

The run-time procedure to branch conditionally.  If  f on stack is false (zero), the following in-line  offset is added to IP to branch forward or  backward.  Compiled by IF, UNTIL, and WHILE. 

 

TST (S)+                 Test the flag f on stack.

BNE ZBRA1           Not zero, continue executing next word by skipping the offset.

ADD (IP),IP             f is zero, do the branching.

NEXT

 

ZBRA1:                A common routine shared with LOOP.

ADD #2,IP            f is true, skip the in-line offset.  Pick up the word following the offset and continue execution.

NEXT

¡@

Conditional branching in a colon definition uses the forms:

 

    IF (true part) --- ENDIF

or IF (true part) --- ELSE (false part) --- ENDIF

 

At run-time, IF selects to execute the true part of words immediately following it, if the top item on data stack is true (non-zero).  If the flag is false (zero), the true part will be skipped to after ELSE to execute the false part.  After executing either part, execution resumes after ENDIF .  ELSE and the false part are optional.  If ELSE part is missing, execution skips to just after ENDIF . 

 

: IF                At run-time         f --

                Compile time        -- addr n

 

It compiles 0BRANCH and reserves one more cell for an offset value at addr .  addr will be used later to resolve  the offset value for branching.  n is set to 2 for error  checking when ELSE or ENDIF is later compiled. 

 

COMPILE 0BRANCH     Compile the code field address of the run-time routine  0BRANCH into the dictionary when IF is executed. 

HERE                    Push dictionary address on stack to be used by ELSE or ENDIF to calculate branching offset. 

0 ,                          Compile a dummy zero here, later it is to be replaced by an offset value used by 0BRANCH to compute the next word  address. 

2                            Error checking number.

;

IMMEDIATE       IF in a colon definition must be executed, not compiled.

 

: ENDIF             Compile time        addr n --

 

Compute the forward branching offset from addr to HERE  and store it at addr .  Test n to match the previous  IF or ELSE in the definition. 

 

?COMP                  Issue an error message if not compiling.

2 ?PAIRS ENDIF   must be paired with IF or ELSE .  If n is not 2, the structure was disturbed or improperly nested.  Issue an error message. 

HERE                    Push the current dictionary address to stack.

OVER -                 HERE-addr is the forward branching offset.

SWAP !                 Store the offset in addr , thus completing the IF-ENDIF or IF-ELSE-ENDIF construct.

;

IMMEDIATE

 

: ELSE          Compile time        addr1 n1 -- addr2 n2

 

Compile BRANCH and reserve a cell for forward branching  offset.  Resolve the pending forward branching from IF  by computing the offset from addr1 to HERE and storing  it at addr1 . 

 

2 ?PAIRS            Error checking for proper nesting.

COMPILE BRANCH Compile BRANCH at run-time when ELSE is executed.

HERE                   Push HERE on stack as addr2 .

0 ,                         Dummy zero reserving a cell for branching to ENDIF .

SWAP                   Move addr1 to top of stack.

[COMPILE] ENDIF         Call ENDIF to work on the offset for forward branching.  ENDIF is an immediate word.  To compile it the word [COMPILE] must be used. 

2                           Leave n2 on stack for error checking.

;

IMMEDIATE

 

Indefinite loops are to be constructed using the following forms:

 

    BEGIN --- UNTIL

or BEGIN --- WHILE --- REPEAT

 

BEGIN simply leaves the current dictionary address on stack for UNTIL or REPEAT to pickup and to compute a backward branching offset at the end of the loop.  WHILE is similar to IF in that it skips to just after REPEAT if the flag on stack at that point isfalse, thus terminating the indefinite loop from inside the loop.  UNTIL terminates the loop only at the end of the loop. 

 

: BEGIN             Compile time        -- addr n

 

At compile time BEGIN leaves the dictionary address on  stack with an error checking number n.  It does not compile  anything to the dictionary. 

 

?COMP                 Issue an error message if not compiling.

HERE                    Push dictionary pointer on stack to be used to compute backward branching offset. 

1                           Error checking number.

;

IMMEDIATE

¡@

: BACK          addr --

 

A run-time procedure computing the backward branching offset from HERE to addr on stack, and compile this offset  value in the next in-line cell in the dictionary

 

HERE - ,            addr-HERE, the backward branching offset.

;

¡@

: UNTIL             Compile time        addr n --

 

Compile 0BRANCH and an in-line offset from HERE to  addr.  Test the error checking code n.  If not equal to 1, there is an error  in the nesting structure. 

 

1 ?PAIRS            If n is not 1, issue an error message.

COMPILE 0BRANCH     Compile 0BRANCH at run-time.

BACK            Compute backward branching offset and compile the offset.

;

IMMEDIATE

 

When the colon definition containing the BEGIN-UNTIL structure is executed, the word 0BRANCH compiled by UNTIL at the end of a loop will test the flag on stack at that instant.  If the flag is false, 0BRANCH will branch back to the word following BEGIN.  The words between BEGIN and UNTIL will be repeatedly executed until the flag is true at UNTIL; at this instant, the interpreter will abort this loop and continue executing the words following UNTIL. 

 

: AGAIN             compile time        addr n --

 

Similar to UNTIL but compile BRANCH instead of  0BRANCH in the dictionary to construct an infinite loop.   Execution cannot leave this loop unless the words R> DROP  are executed in a word inside this loop. 

 

1 ?PAIRS            Error checking.

COMPILE BRANCH Compile BRANCH and an offset to BEGIN .

BACK

;

IMMEDIATE

 

The construct BEGIN-WHILE-REPEAT uses WHILE to abort a loop in the middle of the loop.  WHILE will test the flag left on stack at that point.  If the flag is true, WHILE continues the execution of following words until REPEAT, which then branches unconditionally back to BEGIN.  If the flag is false, WHILE causes execution to skip the words up to REPEAT, thus exiting the loop structure. 

 

: WHILE             Compile time        addr1 n1 -- addr1 n1 addr2 n2

 

Compile 0BRANCH and a dummy offset for REPEAT to resolve.   addr1 and n1 as left by BEGIN are also passed on to  be processed by REPEAT. 

 

[COMPILE] IF        Call IF to compile 0BRANCH and the offset.

2+              Leave 4 as n2 to be checked by REPEAT .

;

IMMEDIATE

 

: REPEAT            Compile time        addr1 n1 addr2 n2 --

 Compile BRANCH to jump back to BEGIN.  Resolve also  the branching offset required by WHILE. 

 

>R >R                              Get addr2 and n2 out of the way.

[COMPILE] AGAIN         Let AGAIN do the dirty work of compiling unconditional branch back to BEGIN . 

R> R>                              Restore addr2 and n2 .

[COMPILE] ENDIF         Use ENDIF to resolve the forward branching needed by WHILE . 

;

IMMEDIATE

 

The IF-ELSE-ENDIF and the BEGIN-UNTIL types of constructs simply redirect the execution sequence inside of a colon definition.  As discussed previously, the definitions of these compiler directives are quite short and simple, involving only branching and conditional branching.  The DO-LOOP type of construct is more complicated because additional mechanisms other than branching are needed to keep track of the loop limits and loop counts.  The run-time functions of DO are to take the lower and upper loop limits off the data stack, push them on the return stack, and setup the address for LOOP to jump back.  LOOP at run-time will then increment the loop count on top of the return stack and compare its value to that of the loop limit just under it on the return stack.  If the loop count equals or exceeds the loop limit, the loop is completed and execution goes to the next word after LOOP.  Otherwise, LOOP will branch back to DO and continue the looping.  +LOOP behaves similarly to LOOP except that it increment the loop count by a number supplied on the data stack. 

The words DO, LOOP, and +LOOP call on their respective run-time routines to do the work.  The detailed codes in these run-time routines will be discussed also. 

DO-LOOP's are set up in a colon definition in the following forms:

 

    DO --- I --- LOOP

or DO --- I --- +LOOP

 

At run-time, DO begins a sequence of repetitive executionscontrolled by a loop count and a loop limit.  The starting value of the loop count and the loop limit are taken off the data stack at run time.  Upon reaching the word LOOP, the loop count is incremented by one.  Until the new loop count equals or exceeds the loop limit, execution loops back to the word just after DO.  Otherwise, the two loop parameters are removed from the return stack and the execution continues ahead at the word after LOOP.  Within a loop, the word I will copy the loop count to data stack to be used in computations. 

 

: DO                Run-time        n1 n2 --

                Compile time        -- addr n

¡@

COMPILE (DO)     Compile the run-time routine address of (DO) into dictionary.

HERE                    Address addr for backward branching from LOOP or +LOOP.

3                            Number for error checking.

;

IMMEDIATE

¡@

CODE (DO)           n1 n2 --

 

The run-time routine starting a DO-LOOP.  n1 and n2 are pushed on the return stack as loop limit and loop index, respectively.

 

MOV 2(S),-(RP)     Push the loop limit n1 on return stack.

MOV (S),-(RP)       Push the initial loop count n2 on return stack above n1 . 

ADD #4,S              Adjust the stack pointer to drop n1 and n2 off the data stack. 

NEXT                    Return.

¡@

CODE I          -- n

 

Return the current loop index inside a DO-LOOP.

 

MOV (RP),-(S)       Copy the loop count on return stack and push it to data stack. 

NEXT

¡@

CODE LEAVE      --

 

Make the loop limit equal to the loop count and force the loop to terminate at LOOP or +LOOP . 

 

MOV (RP),2(RP)      Copy loop count to loop limit on the return stack. 

 NEXT

 

: LOOP          addr n --

 

Terminate a DO-LOOP structure in a colon definition.

 

3 ?PAIRS              Check the number left by DO .  If it is not 3, issue an error message.  The loop is not properly nested. 

COMPLIE (LOOP)      Compile (LOOP) at run-time when LOOP is executed.

BACK                    Compute and compile the backward branch offset.

;

IMMEDIATE

¡@

CODE (LOOP)         --

¡@

Run-time routine of LOOP .

¡@

INC (RP)               Increment the loop count on return stack.

CMP (RP),2(RP)    Compare loop count with the loop limit.

BGE LOOP1          Jump to LOOP1 if the loop count is equal or greater than the loop limit. 

ADD (IP),IP           Add backward branch offset to IP and

NEXT                     branch back to repeat the DO-LOOP.

¡@

LOOP1:         

ADD #4,RP           Exit the loop.  Discard the loop parameters off the return stack. 

ADD #2,IP            Advance IP over the in-line offset number and

NEXT                    continue executing the next word after LOOP .

 

When the loop count must be incremented by an amount other than one, +LOOP should be used to close a DO-LOOP .  It is used in the form:

 

    DO --- I --- +LOOP

¡@

: +LOOP             Run-time        n1 --

                Compile time        addr n1 --

¡@

Increment the loop index by n1 on the stack and test for loop completion.  Branch back to DO if not yet done. 

 

3 ?PAIRS               Check n.  If it is not 3 as left by DO , issue an error message. 

COMPILE (+LOOP)         Compile the address of (+LOOP) at run-time when the colon definition is being built. 

BACK                    Compile back branch offset.

;

IMMEDIATE

¡@

CODE (+LOOP)        n --

¡@

Run-time routine at the end of a DO--+LOOP loop.

¡@

ADD (S),(RP)        Add n to the loop count on return stack.

TST (S)+                 Test and pop data stack

BLT LOOP3           If n is negative, jump to LOOP3 for special processing. 

CMP 2(RP),(RP)      n is positive.  Compare loop count with loop limit. 

BLE LOOP2            If the loop is done, jump to LOOP2 to exit.

ADD (IP),IP             Not yet done, return to DO .

NEXT

¡@

LOOP2:

ADD #4,RP           Clear return stack.

ADD #2,IP           Advance IP to the next word after +LOOP .

NEXT

¡@

LOOP3:

CMP (RP),2(RP)      Negative increment n .  Reverse comparison.

BLE LOOP2

ADD (IP),IP             Not yet done with the loop.  Return to the word after DO . 

NEXT

 

 

¡@