|
CHAPTER 9. DICTIONARY
In a Forth computer, the dictionary is a linked list of named entries or words which are executed when called by name. The dictionary consists of procedures defined either in assembly codes (code definitions) or in high level codes (colon definitions). It also contains system information as constants and variables used by the system. Inside the computer, the dictionary is maintained as a stack, growing from low memory towards high memory as new definitions are compiled or assembled into the dictionary. When the text interpreter parses out a text string form the input stream, the text is moved to the top of dictionary. If the text is the name of a new definition, it will be left there for the compiling process to continue. If it is not a new definition, the text interpreter will try to find a word in the dictionary with a name matching the string. The word found in the dictionary will be executed or compiled depending on the state of the text interpreter. The dictionary is thus the bulk of a Forth system, containing all the necessary information necessary to make the whole system work. The dictionary as a stack is maintained by a user variable named DP, the dictionary pointer, which points to the first empty memory location above the dictionary. A few utility words move DP around to effect various functions involving the dictionary.
: HERE -- addr ¡@ DP @ Fetch the address of the next available memory location above the dictionary. ; ¡@ : ALLOT n -- ¡@ DP +! Increment dictionary pointer DP by n, reserving n bytes of dictionary memory for whatever purposes intended. ; ¡@ : , n --
Store n into the next available cell above dictionary and advance DP by 2, i. e., compile n into the dictionary.
HERE ! Store n into dictionary 2 ALLOT Point DP above n, the number just compiled. ;
In fact, ',' (comma) is the most primitive kind of a compiler. With it alone, theoretically we can build the complete dictionary, or compile anything and everything into the dictionary. All the compiler words and assembler words are simple or complicated derivatives of ','. This feature is clearly reflected in the nomenclature of assembly mnemonics in the Forth assembler in which all mnemonics end with a comma.
For byte oriented processors, C, is defined to compile a byte value into the dictionary:
: C, b --
Enter a byte b on dictionary and increment DP by 1.
HERE C! 1 ALLOT ; ¡@ : -FIND -- pfa b tf , or ff
Accept the next word delimited by blanks in the input stream to HERE, and search the CONTEXT and then the CURRENT vocabularies for a matching name. If found, the entry's parameter field address, a length byte, and a true flag are left on stack. Otherwise only a boolean false flag is left.
BL WORD Move text string delimited by blanks from input string to the top of dictionary HERE . HERE The address of text to be matched. CONTEXT @ @ Fetch the name field address of the last word defined in the CONTEXT vocabulary and begin the dictionary search. (FIND) A primitive. Search the dictionary starting at the address on stack for a name matching the text at the address second on stack. Return the parameter field address of the matching name, its length byte, and a boolean true flag on stack for a match. If no match is possible, only a boolean false flag is left on stack. DUP 0= Look at the flag on stack IF No match in CONTEXT vocabulary DROP Discard the false flag HERE Get the address of text again LATEST The name field address of the last word defined in the CURRENT vocabulary (FIND) Search again through the CURRENT vocabulary. ENDIF ;
Please note the order of the two dictionary searches in -FIND .The first search is through the CONTEXT vocabulary. Only after no matching word is found there, is the CURRENT vocabulary then searched. This searching policy allows words of the same name to be defined in different vocabularies. Which word gets executed or compiled by the text interpreter will depend upon the 'context' in which the word was defined. A sophisticated Forth system usually has three vocabularies: the trunk FORTH vocabulary which contains all the system words, an EDITOR vocabulary which allows a programmer to edit his source codes in screens, an an ASSEMBLER vocabulary which has all the appropriate assembly mnemonics and control structure words. The user can create his own vocabulary and put all his applications words in it to avoid conflicts with words defined in the system. A good example is the definition of the trunk vocabulary of all the Forth system words:
VOCABULARY FORTH IMMEDIATE All vocabularies have to be declared IMMEDIATE, so that context can be switched during compilation. After FORTH is defined as above, whenever FORTH is encountered by the text interpreter, the interpreter will set the user variable CONTEXT to point to the second cell of the parameter field in the FORTH definition, which maintains the name field address of the last word defined in the FORTH vocabulary as the starting word to be searched. Using the phrase
FORTH DEFINITIONS
will set both the CONTEXT and the CURRENT to point to FORTH vocabulary so that new definitions will be added to the FORTH vocabulary. The words VOCABULARY and DEFINITIONS are defined as:
: VOCABULARY --
A defining word used in the form VOCABULARY cccc to create a new vocabulary with name cccc . Invoking cccc will make it the context vocabulary which will be searched by the text interpreter.
<BUILDS Create a dictionary entry with following text string as its name, and the code field pointing to the word after DOES> . 0A081H , A dummy header at vocabulary intersection. CURRENT @ Fetch the parameter field address pointing to the last word defined in the current vocabulary. CFA , Store its code field address in the second cell in parameter field. HERE Address of vocabulary link. VOC-LINK @ , Fetch the user variable VOC-LINK and insert it in the dictionary. VOC-LINK ! Update VOC-LINK with the link in this vocabulary. DOES> This is the end in defining cccc vocabulary. The next words are to be executed when the name cccc is invoked. 2 + CONTEXT ! When cccc is invoked, the second cell in its parameter field will be stored into the variable CONTEXT . The next dictionary search will begin with the cccc vocabulary. ; ¡@ : DEFINITIONS --
Used in the form: cccc DEFINITIONS Make cccc vocabulary the current vocabulary. Hence new definitions will be added to the cccc vocabulary.
CONTEXT @ CURRENT ! ;
The header of an dictionary entry is composed of a name field, a link field, and a code field. The parameter field coming after the header is the body of the entry. The name field is of variable length from 2 to 32 bytes, depending on the length of the name from 1 to 31 characters in the figForth model. The first byte in the name field is the length byte. The first and the last bytes in the name field have their most significant bits set as delimiting indicators. Therefore, knowing the address of any of the fields in the header, one can calculate the addresses of all other fields. Different field addresses are used for different purposes. The name field address is used to print out the name, the link field address is used in dictionary searches, the code field address is used by the address interpreter, and the parameter field address is used to access data stored in the parameter field. To facilitate the conversions between the addresses, a few words are defined as follows: : TRAVERSE addr1 n -- addr2 Move across the name field of a variable length name field. addr1 is the address of either the length byte or the last character. If n=1, the motion is towards high memory; if n=-1, the motion is towards low memory. addr2 is the address of the other end of the name field.
SWAP Get addr1 to top of stack. BEGIN OVER + Copy n and add to addr, pointing to the next character. 7FH Test number for the eighth bit of a character OVER C@ Fetch the character < If it is greater than 127, the end is reached. UNTIL Loop back if not the end. SWAP DROP Discard n. ; ¡@ : LFA pfa -- lfa ¡@ Convert the parameter field address to link field address. ¡@ 4 - ; ¡@ : CFA pfa -- cfa ¡@ 2 - ; Convert the parameter field address to code field address. ¡@ : NFA pfa -- nfa ¡@ Convert the parameter field address to name field address. ¡@ 5 - The end of name field -1 TRAVERSE Move to the beginning of the name field. ; ¡@ : PFA nfa -- pfa ¡@ Convert the name field address to parameter field address. ¡@ 1 TRAVERSE Move to the end of name field. 5 + Parameter field. ; ¡@ : LATEST -- addr
Leave the name field address of the last word defined in the current vocabulary.
CURRENT @ @ ;
To locate a word in the dictionary, a special word ' (tick) is defined to be used in the form: ' cccc to search for the name cccc in the dictionary.
: ' -- pfa
Leave the parameter field address of a dictionary entry with a name cccc . Used in a colon definition as a compiler directive, it compiles the parameter field address of the word into dictionary as a literal. Issue an error message if no matching name is found.
-FIND Get cccc and search the dictionary, first the context and then current vocabularies. 0= 0 ?ERROR Not found. Issue error message. DROP Matched. Drop the length byte. [COMPILE] Compile the next immediate word LITERAL to compile the parameter field address at run-time. LITERAL ; IMMEDIATE ' must be immediate to be useful in a colon definition. All the previous discussions are on words which add or compile data to the dictionary. In program development, one will come to a point that he has to clear the dictionary of some words no longer needed. The word FORGET allows him to discard some part of the dictionary to reclaim the dictionary space for other uses.
: FORGET -- ¡@ Used in the form: FORGET cccc Delete definitions defined after and including the word cccc . The current and context vocabulary must be the same.
CURRENT @ CONTEXT @ - 18 ?ERROR Compare current with context, if not the same, issue an error [COMPILE] ' Locate cccc in the dictionary. DUP Copy the parameter field address FENCE @ Compare with the contents in the user variable FENCE , < 15 ?ERROR If cccc is less than FENCE , do not forget. FENCE guards the trunk FORTH vocabulary from being accidentally forgotten. DUP NFA Fetch the name field address of cccc, and DP ! store in the dictionary pointer DP . Now the top of dictionary is redefined to be the first byte of cccc , in effect deleting all definitions above cccc . LFA @ Get the link field address of cccc pointing to the word just below it. CURRENT @ ! Store it in the current vocabulary, adjusting the current vocabulary to the fact that all definitions above (including) cccc no longer exist. ;
A powerful word VLIST prints of the names of all entries defined in the context vocabulary to allow the programmer to peek at the definitions in the dictionary.
: VLIST --
List the names of all entries in the context vocabulary. The 'break' key on terminal will terminate the listing.
80H OUT ! Initialize the output character counter OUT to print 128 characters. CONTEXT @ @ Fetch the name field address of the last word in the context vocabulary. BEGIN OUT @ Get the output character count C/L > If it is larger than characters/line of the output device, IF CR 0 OUT ! output a CR/LF and reset OUT . ENDIF DUP ID. Type out the name and SPACE SPACE add two spaces. PFA LFA @ Get the link pointing to previous word. DUP 0= See if it is zero, the end of the link, ?TERMINAL OR or if the break key on terminal was pressed. UNTIL Exit at the end of link or after break key was pressed; otherwise continue the listing of names. DROP Discard the parameter field address on stack and return. ; ¡@
¡@ |