CHAPTER 14.  FORTH ASSEMBLERS

 

An assembler, which translates assembly mnemonics into machine codes, is at least as complex as a compiler.  One might expect the assembler to be simpler because it works at a lower level of construct.  However, the large number of mnemonic names and the many different addressing modes make the assembling task much more difficult.  In a Forth language system the assembling process cannot be accomplished by the text interpreter alone.  All the resources in the Forth system are needed.  For this reason the assembler in Forth is often defined as an independent vocabulary, and the assembling process is controlled by the address interpreter: the assembly mnemonics used by the assembler are not just names representing machine codes, but are Forth instructions executed by the address interpreter.  These instructions, when executed, cause machine codes to be assembled to the dictionary as literals.  The data stack and the return stack are often used to assemble proper codes and to resolve branching addresses. 

 

 

14.1.   Three Levels of Forth Assembler

 

 

Before discussing codes in the Forth assemblers, I would like to present assemblers in three levels of complexity:

 

Level 0:    The programmer looks up the machine codes and assembles them to the dictionary;

 

Level 1:    The computer translates the assembly mnemonics to codes with a lookup-table, but the programmer must fill in addresses and literals when needed; and

 

Level 2:    The computer does all the work, with mnemonics and operands supplied by the programmer. 

 

The Level 0 Assembler in Forth uses only three definitions already defined in the Forth Compiler:

 

CREATE          Generate the header for a new code definition,

,               Assemble a 16 bit literal into the dictionary, and

C,              Assemble a byte literal into the dictionary, used in byte

                oriented processors. 

 

These definitions were described as the most primitive compiler in Chapter 9.  They might just as well be the most primitive assembler if the new definition were a code definition.  The programmer would write down the machine codes first with the help of those small code cards supplied freely by CPU vendors.  The machine codes are entered on the top of the data stack and then assembled to the parameter field of the new definition on top of the dictionary. 
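As a minimal sketch of Level 0 assembly, the fragment below hand-assembles a PDP-11 code definition that does nothing but return to the address interpreter.  The name NOOP-CODE is hypothetical.  The two octal codes assume the register assignment given later in this chapter (IP is register 4 and W is register 2), so that they encode MOV (IP)+,W and JMP @(W)+ .  SMUDGE is included on the assumption that CREATE leaves the new header smudged, as it does in the figForth model.

```forth
OCTAL
CREATE NOOP-CODE    ( build the header for the new code definition )
   012402 ,         ( MOV [IP]+,W   fetch the next word address, advance IP )
   000132 ,         ( JMP @[W]+     jump through the code field of that word )
SMUDGE              ( make the finished definition findable )
DECIMAL             ( back to decimal, assumed to be the working base )
```

Every code had to be looked up and typed in by hand, which is exactly the labor the higher levels remove.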

The Level 1 Assembler would use the defining word CONSTANT to define assembly mnemonics relating them to their respective machine code.  The text interpreter when confronted with a mnemonic name would push the corresponding machine code on the stack.  The code will then be assembled to the dictionary by  , or C, .  An example is:

 

    0 CONSTANT HALT

 

which defines HALT as a constant of 0.  During assembly, the phrase

 

    ...   HALT ,   ...

 

would assemble a HALT instruction into the dictionary.  To make it easier for himself, the programmer might want a new definition:

 

 : HALT,    HALT , ;

 

Executing HALT, would then assemble the HALT instruction to the dictionary. 

Historically all assembler definitions end their names with a comma for the reason just described, indicating that the definition causes a machine instruction to be assembled to the dictionary.  This convention serves very well to distinguish assembler definitions from regular Forth definitions. 

This Level 1 scheme is quite adequate when there is a one to one mapping from mnemonics to machine codes.  However, in cases where many codes share the same mnemonic and differ only in operands or addressing mode, the basic code must be augmented to accommodate operands or address fields.  It is not difficult to modify definitions such as HALT, to make the necessary changes in the code, which has to pass through the data stack anyway.  But defining each assembly mnemonic individually is messy and inelegant.  A much more appealing method is to use the <BUILDS-DOES> construct in the Forth language to define whole classes of mnemonics with the same characteristics, which brings us to the Level 2 Assembler. 

In the last example of the HALT instruction, instead of using CONSTANT to relate the mnemonic name with the code, a defining word is created as:

 

: OP        <BUILDS , DOES> @ , ;

 

The instruction HALT, is then defined by the defining word OP as:

 

0 OP HALT,      1 OP WAIT,      5 OP RESET, .  .  . 

 

Now, when HALT, is later processed by the text interpreter, the code 0 is automatically assembled into the dictionary by the runtime routine @ , . 

The <BUILDS-DOES> construct can be applied to all other types of assembly mnemonics to assemble different classes of instructions, providing some of the finest examples of extensibility in the Forth language.   Few other languages offer such a powerful tool to their programmers. 

A syntactic problem in using the Forth assembler is that before a mnemonic can be executed to assemble a machine code, all the addressing information and operands must be provided on the data stack.  Therefore, operands must precede the instruction mnemonics, resulting in the postfix notation.  The source listing of a Forth code definition is therefore very different from the conventional assembly source listing, where the operands follow the assembly mnemonic.  Using the data stack and the postfix notation greatly facilitates the assembling process in the Forth assembler.  This is a very small price to pay for the capability to access the host CPU and to make the fullest use of the resources in a computer system. 
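The difference in notation can be seen by writing the same instructions both ways.  In a conventional PDP-11 assembler the three instructions below would read INC R0 , then MOV R0,R1 , then MOV (R4)+,R2 .  The postfix forms are a sketch using the register names and addressing-mode words defined in Section 14.2, and are meant as a fragment from the body of a code definition.

```forth
R0 INC,            ( one operand: the register precedes the mnemonic )
R0 R1 MOV,         ( two operands: source first, then destination )
R4 )+ R2 MOV,      ( the mode word follows the register it modifies )
```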

Two assemblers will be discussed in this Chapter in an effort to cover the widest range of microprocessors.  One is for the homely Intel 8080A which is a byte oriented machine with a rather primitive instruction set.  On the other end is the PDP-11 instruction set, which is extensively micro-coded in a 16 bit wide code field.  I feel that these two examples should be sufficient to illustrate how Forth assemblers are constructed for most other microprocessors. 

 

 

14.2.   PDP-11 ASSEMBLER

 

 

The PDP-11 instruction set is typical of that for minicomputers.  With a 16 bit instruction field, very flexible and versatile addressing schemes are possible compared with those used in the 8 bit instructions of most common microprocessors.  In addition, the PDP-11 is a stack oriented machine in which all registers can be used as stack pointers in addition to normal accumulator and addressing functions.  There are 8 registers in the PDP-11 CPU: registers 0 to 5 are general purpose registers, register 6 is a dedicated stack pointer, and register 7 is the program counter.  Registers can be used in many different addressing modes, making it very convenient to host a Forth virtual machine in the PDP-11 computer.  This assembler was programmed by John James and was included in his PDP-11 figForth Model.

The following command sequence must be given first to initiate the ASSEMBLER vocabulary and to prepare the Forth system to build the assembler. 

 

OCTAL           PDP-11 instructions are best presented in octal base because 

                address fields are 6 bits wide. 


0 VARIABLE OLDBASE

 

To ease switching base to and from octal, the currently used base will be stored away in OLDBASE, to be restored when the assembly process is completed.

VOCABULARY ASSEMBLER IMMEDIATE

Create the assembler vocabulary to house all the assembly mnemonics and other necessary definitions. 

 

: ENTERCODE         -- addr

 

Invoke ASSEMBLER vocabulary to start the assembly process. 

 

[COMPILE] ASSEMBLER Set CONTEXT to ASSEMBLER to search for the mnemonics. 

BASE @ OLDBASE ! OCTAL  Switch base to octal.  Save old base to be restored after

                assembly. 

SP@                 Push stack pointer on stack for error checking at end.

;

 

: CODE          -- addr

 

A more refined defining word to start a code definition. 

 

CREATE          Create a header with the name following CODE .

ENTERCODE       Invoke ASSEMBLER .

;

 

ASSEMBLER DEFINITIONS

 

Set both CONTEXT and CURRENT vocabularies to ASSEMBLER .  New definitions hereafter will be placed in the assembler  vocabulary. 

Before discussing the assembler definitions, the PDP-11 CPU registers and their addressing modes should be clarified.  An address field uses 6 bits in an instruction.  The lower 3 bits specify a register to be referenced for addressing, and the upper 3 bits specify the addressing mode.  The register and the addressing mode are combined to form an address field which is used to specify either a source operand or a destination operand in the assembly instruction as required.  Registers and modes are defined as follows:

 

: IS                n --


CONSTANT ;      Short hand for CONSTANT . 

 

0 IS R0     1 IS R1     2 IS R2     3 IS R3     4 IS R4     5 IS R5    

6 IS SP     7 IS PC     2 IS W 3 IS U 4 IS IP     5 IS S

6 IS RP

 

: RTST          r mode -- addr-field -1

 

Test register r for range between 0 and 7.  Add r and mode to form address field addr-field .  Also leave a flag -1 on  stack to indicate that an address field is underneath. 

 

OVER            Get r to top for tests.

DUP 7 >             Larger than 7 ?

SWAP 0 <            Smaller than 0 ?

OR IF           In either case, issue an error message,

    ." NOT A REGISTER:"

    OVER .  ENDIF   with the offending number appended.

+               addr-field = r + mode

-1              The flag.

;

 

The addressing modes are defined as executable definitions using names similar to the operand notation used in PDP assembly language with some twists.  The stack effects are:

    r -- addr-field -1

 

: )+    20 RTST ;       Post-increment register mode. 

 

: -)    40 RTST ;       Pre-decrement register mode.

 

: I)    60 RTST ;       Indexed register mode.

 

: @)+   30 RTST ;       Deferred post-increment mode.

 

: @-)   50 RTST ;       Deferred pre-decrement mode.

 

: @I)   70 RTST ;       Deferred index mode.

 

The addressing mode using the program counter is somewhat different from the modes using other general purpose registers. 

 

: #     27 -1 ;         Immediate addressing mode. 


: @#    37 -1 ;         Absolute addressing mode.


: ()                r -- addr-field -1      for register deferred mode. 

                n -- n 77 -1            for relative deferred mode.


DUP 10 U<           Top of stack is between 0 and 7, a register.

IF 10 + -1          Make the address field.

ELSE 77 -1 ENDIF        Otherwise, top of stack is an address offset.  Make

                it the relative deferred mode. 

;
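The mode packets left on the stack by these words can be inspected directly.  The lines below are a sketch traced by hand from the definitions above; each line is meant to be tried on its own with an empty stack, and the comment shows what it leaves, in octal.

```forth
ASSEMBLER OCTAL   ( search the assembler vocabulary, octal base )
R1 )+             ( stack: 21 -1         post-increment on register 1 )
SP -)             ( stack: 46 -1         pre-decrement on register 6 )
2 R4 I)           ( stack: 2 64 -1       indexed, the offset 2 stays underneath )
R3 ()             ( stack: 13 -1         register deferred )
1000 ()           ( stack: 1000 77 -1    relative deferred, 1000 is not a register )
12 #              ( stack: 12 27 -1      immediate literal 12 )
1000 @#           ( stack: 1000 37 -1    absolute address 1000 )
```

A bare register name such as R2 leaves only its number, with no -1 flag; the instruction words treat that as plain register mode.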

 

The simplest instruction requires no operand.  These instructions can be defined by a simple defining word:

 

: OP                n --

 

A defining word to define instructions without operands. 

 

<BUILDS             Create a header for a mnemonic definition with the mnemonic

                name following OP . 

,               Compile the instruction code on the stack to the parameter

                field in the new definition. 

DOES>           --

                When the defined mnemonic definition is executed during

                assembly, execute the following words:

@ ,                 Fetch the instruction code stored in parameter field and

                assemble it to the code definition under construction on

                top of the dictionary. 

;

 

0 OP HALT,      1 OP WAIT,      2 OP RTI,       3 OP BPT,  

4 OP IOT,       5 OP RESET,     6 OP RTT,  

   

241 OP CLC,     242 OP CLV,     244 OP CLZ,     250 OP CLN,    

261 OP SEC,     262 OP SEV,     264 OP SEZ,         270 OP SEN,    

277 OP SCC,     257 OP CCC,     240 OP NOP,     6400 OP MARK,

 

Instructions with operands are of course more involved.  Those with only one operand are defined by a defining word 1OP .  This word uses many other utility definitions.  However, we shall first present the high level 1OP before getting into the nitty gritty details of the other low level definitions. 

 

: 1OP           n --

 

A defining word to define instructions with one operand. 

 

<BUILDS , DOES>         The same defining word format.

@ ,                 When the defined word is executed during assembly, the basic

                instruction code is fetched and assembled to the dictionary. 

FIXMODE             Take the mode packet on stack to resolve the address field.

DUP                 Copy the address field.

HERE 2 - ORMODE         Insert the address field into the lower 6 bit destination field. 

,OPERAND            If the instruction needs a 16 bit value either as a literal

                or as an address, assemble it after the instruction. 

;

 

: FIXMODE           addr-field -1 -- addr-field

                r -- r

                n -- n 67

 

Fix the mode packet on the data stack for ORMODE and  ,OPERAND to assemble the instruction correctly.

 

DUP -1 =            Top of stack = -1 ?

IF DROP             Yes, drop -1 and leave addr-field on top.

ELSE                The top of the stack might be a register or a literal.

    DUP 7 SWAP U<  If top of stack is larger than 7 , PC relative mode.

    IF 67 ENDIF         Push 67 on top of n , indicating PC mode.

                Otherwise, leave the register number on the stack. 

ENDIF

;

 

: ORMODE            addr-field addr --

 

Take the address field value addr-field and insert it into  the lower 6 bit address field in the instruction code at  addr . 

 

SWAP            Move addr-field to top of the stack.

OVER @          Fetch the instruction code at addr .

OR              Insert address field.

SWAP !          Put the modified instruction back.

;


: ,OPERAND      (n) addr-field --


Assemble a literal to the dictionary to complete a program  counter addressing instruction. 

 

DUP 67 =            PC relative mode ?

OVER 77 =           Or PC relative deferred mode?

OR IF           In either case,

    SWAP        move operand n to top of the stack.

    HERE 2 + -      Compute offset from n to the next instruction address.

    SWAP        Put the offset value under addr-field.

ENDIF

DUP 27 =            PC immediate mode ?

OVER 37 = OR        Or PC absolute mode ?

SWAP            Get addr-field for another test.

177760 AND 60 = OR Or if it is index addressing mode.

IF , ENDIF          In any of the three cases, assemble the literal

                after the instruction code. 

;               None of above.  The instruction does not need a literal.  It

                is already complete. 

 

: B                 --

 

Modify the instruction code just assembled to the dictionary to make a byte instruction from a cell instruction. 

 

100000          MSB of the byte instruction must be set.

HERE 2 - +!             Toggle the MSB of the instruction code on top of dictionary.

;

 

B is to be used immediately after an instruction definition like op1 op2 MOV,  B to move a byte from op1 to op2.   The byte instruction can be defined separately as MOVB,.   However, the modifier definition B is more elegant in reducing the number of mnemonic definitions by 25%. 

 

5000 1OP CLR,   5100 1OP COM,   5200 1OP INC,   5300 1OP DEC,   5400 1OP NEG,

5500 1OP ADC,   5600 1OP SBC,   5700 1OP TST,   6000 1OP ROR,

6100 1OP ROL,   6200 1OP ASR,   6300 1OP ASL,   6700 1OP SXT,

  100 1OP JMP,
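A few one-operand instructions show how the mode packet ends up in the lower 6 bits.  The codes in the comments are a hand trace of 1OP , FIXMODE , ORMODE and ,OPERAND as defined above, in octal; the fragment is assumed to sit inside a code definition.

```forth
R2 DEC,           ( assembles 005302           DEC R2 )
R1 )+ INC,        ( assembles 005221           INC with post-increment on R1 )
2 R3 I) TST,      ( assembles 005763 000002    TST with index 2 on R3 )
1000 @# JMP,      ( assembles 000137 001000    JMP to absolute address 1000 )
```

In the last two lines ,OPERAND finds an indexed or absolute mode and assembles the extra word after the instruction.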

 

: ROP           n --

 

A defining word to define two operand instructions.  The source operand can only be a register without mode selection.   The destination address field is the lower 6 bits, and the source register is specified by bits 6 to 8. 

 

<BUILDS , DOES>         Make header and store instruction code.

@ ,                 When defined instruction is executed, assemble the basic

                instruction code to the dictionary. 

FIXMODE             Fix the destination address field.

DUP                 Copy the just completed address field value.

HERE 2 -            Address of the instruction.

DUP >R          Save a copy of this address on the return stack to fix the

                source register field underneath it on the stack. 

ORMODE          Insert the destination address field into the instruction.

,OPERAND            If a literal operand is required, assemble it here.

DUP 7 SWAP U<       The register number must not be larger than 7 .

IF ." ERR-REG-B" ENDIF  If the register number is too big, issue an error message. 

100 * R> ORMODE         Justify the source register field value and insert

                it into the instruction. 

;


74000 ROP XOR, 4000 ROP JSR,


: BOP           n --

 

A defining word used to define branching and conditional  branching instructions.  This word is included only for  completeness since the branchings are not structured.  In  Forth code definitions, more powerful branching and looping  structures should be used, as will be discussed shortly. 

 

<BUILDS , DOES>

@ ,

HERE -          The target address is presumably on data stack.  Compute

                the offset value for branching. 

DUP 376 >           If the offset is greater than 376, issue an error message

IF ." ERR-BR+" DUP .  ENDIF     with the out of range offset.

DUP -400 <          If the offset is less than -400, issue an error message

IF ." ERR-BR-" DUP .  ENDIF     with the out of range offset.

2 / 377 AND             The correct offset value is then

HERE 2 - ORMODE     inserted into the instruction code.

;

 

400 BOP BR,     1000 BOP BNE,   1400 BOP BEQ,   2000 BOP BGE,

2400 BOP BLT,   3000 BOP BGT,   3400 BOP BLE,   100000 BOP BPL,

100400 BOP BMI, 101000 BOP BHI,     101400 BOP BLOS,    102000 BOP BVC,

102400 BOP BVS, 103000 BOP BCC,     103400 BOP BCS,     103400 BOP BLO,

103000 BOP BHIS,

 

: 2OP               n --

 

A defining word to define two operand instructions.

 

<BUILDS , DOES>

@ ,

FIXMODE             Fix the mode packet for destination field.

DUP HERE 2 -       Get the address of the instruction to be fixed.

DUP >R                 Save a copy of the instruction address on return stack.

ORMODE              Insert the destination field.

,OPERAND           Assemble a literal after the instruction if required.

FIXMODE             Now process the source mode packet.

DUP 100 *            Justify the source field value.

R ORMODE          Insert the source field into the instruction.

,OPERAND           Assemble a literal if required.

HERE R> - 6 =       If there are two literals assembled after the instruction, they are in the wrong order.

IF SWAPOP ENDIF         The two literals have to be swapped.

;


: SWAPOP            --

Swap the two literals after a two operand instruction.  If  either literal is used for PC addressing, the offset value  will have to be adjusted to reflect the swapping. 

 

HERE 2 - @          Push the last literal on the stack.

HERE 6 - @          This is the instruction code itself.

6700 AND 6700 =         PC relative mode?

IF 2 + ENDIF        Yes, increment the last literal by 2.

HERE 4 - @          Now work on the first literal.

HERE 6 - @          Get the instruction back again.

67 AND 67 =         Is the destination field also of PC relative mode?

IF 2 - ENDIF          If it is, decrement the branching offset by 2.

HERE 2 - !             Put the first offset last,

HERE 4 - ! ;            and the last offset first.


10000 2OP MOV, 20000 2OP CMP, 30000 2OP BIT, 40000 2OP BIC,

50000 2OP BIS, 60000 2OP ADD, 160000 2OP SUB,
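The two-operand words take the source packet first and the destination packet second.  The following fragment is a sketch, with the assembled octal codes traced by hand from 2OP and B as defined above.  The remarks about pushing and popping assume that S (register 5) is the data stack pointer and that the stack grows toward low memory, which this chapter does not state explicitly.

```forth
R0 R1 MOV,        ( 010001           MOV R0,R1 )
S )+ R0 MOV,      ( 012500           pop the top stack item into R0 )
R0 S -) MOV,      ( 010045           push R0 on the data stack )
12 # R0 MOV,      ( 012700 000012    load the literal 12 into R0 )
R0 R1 MOV, B      ( 110001           byte move, the same as MOVB R0,R1 )
```

Note that B patches the word at HERE 2 - , so it gives the intended result only when the instruction has no literal assembled after it.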

 

Two more instructions are defined separately:

 

: RTS, 200 OR , ;

 

: EMT, 104000 + , ;

 

The branching instructions are similar to the GOTO statements in high level languages.  They are not very useful in promoting modular and structured programming.  Therefore, their usage in Forth code definitions is discouraged.  Somewhat modified forms of these branch instructions are defined in the assembler to code IF-ELSE-ENDIF and BEGIN-UNTIL types of structures.  Although these structures are very similar to the structures used in colon definitions, the functions of these words in the assembler are different.  Thus it is a good practice to define them with names ending in commas as all other mnemonic definitions.  However, the comma at the end does not imply that an instruction code is always  assembled by these special definitions. 

 

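The conditional branching instructions are defined as constants to be assembled by the words requiring branching.  Each name is the reverse of the PDP mnemonic whose code it holds, because the branch assembled by IF, , WHILE, or UNTIL, must be taken when the named condition is false.  EQ , for example, holds the code of BNE . 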

 

1000 IS EQ      1400 IS NE      2000 IS LT      2400 IS GE

3000 IS LE      3400 IS GT      100000 IS MI    101000 IS LOS  

101400 IS HI    102000 IS VS    102400 IS VC    103000 IS LO

103400 IS HIS

 

: IF,               n -- addr

 

Take the literal n on stack and assemble it to dictionary  as a conditional branching instruction.  Leave the address of  this branching instruction on the data stack to resolve  the branching offset later. 

 

HERE            Address of the branching instruction.

SWAP ,          Assemble the branching instruction to the dictionary.

;


: IPATCH,           addr1 addr2 --

 

Use the addresses left on the stack to compute the forward  branching offset and patch up the instruction assembled by  IF, . 

 

OVER -                   Byte offset from addr1 to addr2.

2 / 1- 377 AND       The 8 bit instruction offset.

SWAP DUP @         Fetch out the branching instruction at addr1 .

ROT OR                  Insert the offset into the branching instruction.

SWAP !                   Put the completed instruction back.

;
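The offset arithmetic in IPATCH, can be checked at the keyboard with made-up addresses.  The two lines below are only an illustration: 1000 and 1004 are arbitrary octal addresses, and the phrase is the first part of IPATCH, with a print and a DROP added.

```forth
OCTAL
1000 1004 OVER - 2 / 1- 377 AND . DROP   ( prints 1   : forward branch over one word )
1004 1000 OVER - 2 / 1- 377 AND . DROP   ( prints 375 : backward branch, -3 in 8 bits )
```

In the first case the branch sits at 1000, the program counter is 1002 after the fetch, and one more word reaches 1004.  In the second case the branch sits at 1004, the program counter is 1006, and three words back is 1000.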

 

: ENDIF,            addr --

 

Close the conditional structure in a code definition. 

 

HERE IPATCH,        Call on IPATCH, to resolve the forward branching.

;


: ELSE,             addr1 -- addr2

 

Assemble an unconditional branch instruction at HERE ,  and patch up the offset field in the instruction assembled  by IF, .  Leave the address of the current branch instruction  on the stack for ENDIF, to resolve. 

 

400 ,                        Assemble the BR, instruction to the dictionary.

HERE IPATCH,        Patch up the conditional branching instruction at IF, .

HERE 2 -                   Leave address of BR, for ENDIF, to patch up.

;


: BEGIN,            -- addr


HERE            Begin an indefinite loop.  Push DP on stack for backward

                branching. 

;


: UNTIL,            addr n --

 

Assemble the conditional branching instruction n to the  dictionary, taking addr as the address to branch back to. 

 

,                                Assemble n which must be one of the conditional branching instruction codes. 

HERE 2 -                  The address of the above instruction.

SWAP IPATCH,        Patch up the offset in the branching instruction.

;

 

: REPEAT,           addr1 addr2 --

 

Used in the form: BEGIN, .  .  .  WHILE, .  .  .  REPEAT,  inside a code definition.  Assemble an unconditional branch  instruction pointing to BEGIN, at addr1, and resolve the  forward branch offset for WHILE, at addr2 . 

 

HERE                    Save the DP pointing to the current BR, instruction.

400 ,                     Assemble BR, here.

ROT IPATCH,       Patch the BR, instruction to branch back to BEGIN, at addr1 . 

HERE                    This is where the conditional branch at WHILE, should branch to on false condition.

IPATCH,               Patch up the conditional branch at WHILE, .

;


: WHILE,            n -- addr

 

Assemble a conditional jump instruction at HERE .  Push the  address of this instruction addr on the stack for REPEAT,  to resolve the forward jump address. 

 

HERE            Push DP to stack.

SWAP            Move n to top of stack, and

,               assemble it literally as an instruction.

;


: C;                addr --

 

Ending of a code definition started by ENTERCODE . 

 

CURRENT @ CONTEXT !     Restore CONTEXT vocabulary to CURRENT .  This leaves the ASSEMBLER vocabulary and returns to the current vocabulary, where the new code definition was added.  The programmer can now test the new definition. 

OLDBASE @ BASE !    Restore the old base before assembling.

SP@ 2+ =            Compare the current SP with addr on the stack,

IF SMUDGE           if they are the same, the stack was not disturbed.  Restore the smudged header to complete the new definition.  Otherwise, issue an error message.

ELSE ." CODE ERROR, STACK DEPTH CHANGED"

ENDIF

;

 

: NEXT,             --

 

The address interpreter returning execution process to the colon definition which calls the code definition.  This  must be the last word in a code definition before C; . 

 

IP )+ W MOV,        Move the word pointed to by IP into W.  IP is incremented by 2.

W @)+ JMP,          Jump to execute the instruction sequence pointed to by the contents of W.  W is incremented by 2, pointing to the parameter field of the word to be executed.

;

 

FORTH DEFINITIONS  

 

The assembler vocabulary is now completed.  Return  to the FORTH trunk vocabulary by setting both CONTEXT  and CURRENT to FORTH . 

 

DECIMAL             Restore decimal base.  The base was changed to octal when the process of creating the assembler began. 
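With the assembler loaded, complete code definitions can be written.  The two words below are a sketch only: the names DOUBLE and MAGNITUDE are hypothetical, and the code assumes that S (register 5) points at the top item of the data stack.  The octal codes in the comments were traced by hand from the definitions in this section.

```forth
CODE DOUBLE        ( n -- 2n )
   S () ASL,       ( 006315 : shift the top stack item left one bit )
   NEXT,           ( 012402 000132 : return to the address interpreter )
C;

CODE MAGNITUDE     ( n -- u )
   S () TST,       ( 005715 : set the condition codes from the top item )
   MI IF,          ( 100001 once patched : BPL skips the clause if n is not negative )
      S () NEG,    ( 005415 : negate a negative top item )
   ENDIF,
   NEXT,
C;
```

MI IF, assembles the BPL code, so the clause between IF, and ENDIF, is executed only when the minus condition holds.  ENDIF, then fills in the offset of one word, and C; finds the stack at the depth ENTERCODE recorded.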

 

 

14.3.   8080 ASSEMBLER

The assembler is usually defined in an independent vocabulary, separate from the trunk FORTH vocabulary and other vocabularies.  The following commands must be executed to create the ASSEMBLER vocabulary and to make some modifications in the FORTH vocabulary.  This 8080 Assembler was authored by John Cassidy, who also built the 8080 figForth Model.

 

HEX                     All 8080 codes will be represented in hexadecimal base.

VOCABULARY ASSEMBLER        Create a new vocabulary for assembler.

IMMEDIATE        Vocabulary must be of IMMEDIATE type to be used within colon definitions. 

' ASSEMBLER CFA     Get the code field address of ASSEMBLER definition, and

' ;CODE 0A + !      patch up the code in ;CODE .  This is to replace the word SMUDGE with ASSEMBLER , so that the codes following ;CODE can be understood in the context of the assembler.  The function of SMUDGE is deferred to the end of the code sequence in C; . 

 

: CODE          --

 

A more fully developed definition to start a code definition with error checking. 


?EXEC                  If not executing, issue an error message.

CREATE               Create a new dictionary header with the following name.

[COMPILE]           Compile the next IMMEDIATE word.

ASSEMBLER        Switch the CONTEXT to ASSEMBLER vocabulary to search assembly mnemonics first before the current vocabulary. 

!CSP                      Store current stack pointer in CSP for later error checking. 

; IMMEDIATE

 

: C;                --

 

Ending of a new code definition.  Check for error and restore the smudged header. 

 

CURRENT @ CONTEXT !     At the beginning of assembly, CONTEXT was switched to ASSEMBLER, to search for the assembler mnemonics.  After the code definition is completed, CONTEXT must be restored to CURRENT vocabulary to continue program development or testing.

?EXEC                   If not executing, issue an error message.

?CSP                      If the data stack was disturbed, issue an error message.

; IMMEDIATE

 

: LABEL         --

 

Define a subroutine which can be called by the assembler CALL instruction.  It is not necessary in Forth. 

 

?EXEC

0 VARIABLE       Subroutine header is defined as a variable with a dummy value 0.  When the name is executed, the address of its parameter field will be put on the stack to be used by the CALLing instruction. 

SMUDGE             Smudge the header as usual.

-2 ALLOT            Backup the dictionary pointer to overwrite the dummy 0 with the subroutine.

[COMPILE] ASSEMBLER     Get the assembler to process the mnemonics following.

!CSP                     Store SP for error checking.

; IMMEDIATE


: 8*                n -- n*8

 

Multiply top of stack by 8. 

 

DUP + DUP + DUP + ;     Faster than doing real multiplication on an 8080.

 

ASSEMBLER DEFINITIONS

Set both the CONTEXT and CURRENT vocabularies to ASSEMBLER .  Now, all subsequent definitions are put  into the ASSEMBLER vocabulary to be referenced by CODE  and ;CODE . The definitions up to this point went into  the FORTH vocabulary. 

 

: IS                n --

 

CONSTANT ;      Shorthand of CONSTANT . 

 

Following are register name definitions:

 

0 IS B 1 IS C 2 IS D 3 IS E 4 IS H 5 IS L 6 IS M

7 IS A      6 IS PSW    6 IS SP     2A28 IS NEXT

 

In 8080 fig-Forth, NEXT was defined as a code routine  starting at address 2A28 in memory.  With NEXT thus  defined as a constant, NEXT JMP should be the last  instruction in a code definition before C; . 

 

: 1MI               n --

 

A defining word to create single byte 8080 instructions  without operands.  MI stands for machine instruction. 

 

<BUILDS              Create a header with the name following.

C,                          Store instruction code on the stack to the parameter field.

DOES>                  The following words are to be executed when the newly defined mnemonic name is executed during assembly. 

C@ C,                   Fetch the instruction code stored in the parameter field and assemble it into the dictionary as a byte literal. 

;

 

The following single byte instructions are defined by 1MI .

 

76 1MI HLT      07 1MI RLC      0F 1MI RRC      17 1MI RAL

1F 1MI RAR      C9 1MI RET      D8 1MI RC       D0 1MI RNC  

C8 1MI RZ       C0 1MI RNZ      F0 1MI RP       F8 1MI RM

E8 1MI RPE      E0 1MI RPO      2F 1MI CMA      37 1MI STC

3F 1MI CMC      27 1MI DAA      FB 1MI EI       F3 1MI DI      

00 1MI NOP      E9 1MI PCHL     F9 1MI SPHL     E3 1MI XTHL

EB 1MI XCHG

 

: 2MI               n --

 

A defining word to define 8080A instructions with a source  operand.  The source field is the least significant 3 bits. 

 

<BUILDS C, DOES>    Create a header for the mnemonic name following.  Store the instruction code in the parameter field. 

C@ + C,                When the mnemonic defined is executed, the code value is pulled out from the parameter field, the number representing the source register on the stack is added to the code and the completed instruction is assembled to the dictionary. 

;

 

The following 8080 instructions are defined by 2MI :

 

80 2MI ADD      88 2MI ADC      90 2MI SUB      98 2MI SBB

A0 2MI ANA A8 2MI XRA B0 2MI ORA      B8 2MI CMP

 

: 3MI               n --

 

A defining word to define 8080 instructions with destination  register specified in the bits 3, 4, and 5. 

 

<BUILDS C, DOES>

C@                       When the mnemonic is executed during assembly, the basic code value is fetched from the parameter field. 

SWAP                   The operand's register number on the stack is swapped over the code value, and

8*                          multiplied by 8 to line up with the destination field.

+ C,                       Add the register number to the instruction and assemble it.

;

 

Following instructions are defined by 3MI :

 

04 3MI INR      05 3MI DCR      C7 3MI RST      C5 3MI PUSH    

C1 3MI POP      09 3MI DAD      02 3MI STAX     0A 3MI LDAX    

03 3MI INX      0B 3MI DCX
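The single-byte mnemonics defined so far can be tried in the body of a code definition.  The fragment below is a sketch; the byte in each comment is the hexadecimal sum worked out by hand from the basic code and the register number.

```forth
HLT               ( 76 )
B ADD             ( 80 : ADD B    80 plus 0 )
M CMP             ( BE : CMP M    B8 plus 6 )
H PUSH            ( E5 : PUSH H   C5 plus 4 times 8 )
PSW POP           ( F1 : POP PSW  C1 plus 6 times 8 )
D INX             ( 13 : INX D    03 plus 2 times 8 )
H DAD             ( 29 : DAD H    09 plus 4 times 8 )
```

The 2MI words add the register number directly, while the 3MI words shift it left three bits with 8* first.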


: 4MI               n --

 

A defining word to define 8080 instruction with an immediate  byte value following the instruction code. 

 

<BUILDS C, DOES>   

C@ C, C,               The instruction code is fetched from the parameter field and assembled into the dictionary, and the byte value given on the stack is assembled following the instruction code. 

;

 

Examples are:

 

C6 4MI ADI      CE 4MI ACI      D6 4MI SUI      DE 4MI SBI

E6 4MI ANI      EE 4MI XRI      F6 4MI ORI      FE 4MI CPI

DB 4MI IN       D3 4MI OUT


: 5MI               n --

 

A defining word to define 8080 instructions taking a 16 bit value as an operand, either as an address or as an immediate value for operations.

 

<BUILDS C, DOES>

C@ C,                   When the defined mnemonic is executed, the instruction code is assembled to the dictionary. 

,                             The number on the stack is assembled after the instruction.

;

 

Examples are:

 

C3 5MI JMP      CD 5MI CALL     32 5MI STA      3A 5MI LDA

22 5MI SHLD     2A 5MI LHLD
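
The byte layout produced by 4MI and 5MI mnemonics can be seen in the following sketch in standard Forth.  Because a cell on a modern host is wider than 16 bits, a helper W, is assumed here to play the part of the fig-Forth 16 bit , and it stores the low byte first as the 8080 expects.  W, , SHOW , and DEMO are illustrative names only.

```forth
HEX
\ Assumed stand-in for the 16 bit  ,  : low byte first, then high byte
: W,  ( n -- )  DUP FF AND C,  8 RSHIFT FF AND C, ;

: 4MI  ( opcode -- )  CREATE C,  DOES>  ( byte pfa -- )  C@ C,  C, ;
: 5MI  ( opcode -- )  CREATE C,  DOES>  ( n pfa -- )     C@ C,  W, ;

C6 4MI ADI   C3 5MI JMP   32 5MI STA

\ Hypothetical helper: print n bytes starting at addr
: SHOW  ( addr n -- )  0 ?DO  DUP I + C@ .  LOOP  DROP ;

CREATE DEMO    \ the bytes assembled below start at DEMO
05 ADI         \ C6 05       ADI 05
1234 JMP       \ C3 34 12    JMP 1234
8000 STA       \ 32 00 80    STA 8000
DEMO 8 SHOW    \ prints C6 5 C3 34 12 32 0 80
DECIMAL
```

In both cases the operand is given before the mnemonic, so it is already on the stack when the mnemonic executes.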


The 8080 MOV instruction needs two operands to specify the source and destination registers for data movements.  The two register numbers are pushed on the data stack for the MOV definition to pick up and assemble as one instruction code.  The MVI and LXI instructions behave similarly. 

 

: MOV           b1 b2 --

 

Assemble a MOV instruction to the dictionary, with b1 representing the source register and b2 the destination register.

 

8*              b2*8 is the destination field.

40              Basic code for a MOV instruction.

+ +                 Add the source and destination fields to the instruction.

C,              Assemble to dictionary.

;


: MVI               b1 b2 --

 

Assemble an MVI instruction to the dictionary, with b2 specifying the destination field and b1 the immediate byte value following the instruction.

 

8*              Destination field.

6               Basic MVI instruction code.

+ C,                Assemble the instruction.

C,              Assemble the immediate byte value after the instruction.

;


: LXI               n b --

 

Assemble an LXI instruction with b specifying the destination register pair, and n as a two byte immediate value to be loaded into the register pair.

 

8* 1+ C,            Assemble the LXI instruction.

,               Assemble the two byte immediate value after the instruction.

;
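
The three definitions above can be tried out with the following sketch in standard Forth.  The colon definitions follow the listings, with 8 * written for 8* ; the register constants, the 16 bit helper W, , SHOW , and DEMO are assumptions made for the example.

```forth
HEX
\ 8080 register numbers; names assumed for this sketch
0 CONSTANT B   4 CONSTANT H   7 CONSTANT A

\ Assumed stand-in for the 16 bit  ,  : low byte first, then high byte
: W,  ( n -- )  DUP FF AND C,  8 RSHIFT FF AND C, ;

: MOV  ( src dst -- )   8 * 40 + +  C, ;
: MVI  ( byte dst -- )  8 * 6 +  C,  C, ;
: LXI  ( n rp -- )      8 * 1+  C,  W, ;

\ Hypothetical helper: print n bytes starting at addr
: SHOW  ( addr n -- )  0 ?DO  DUP I + C@ .  LOOP  DROP ;

CREATE DEMO    \ the bytes assembled below start at DEMO
B A MOV        \ 40 + 7*8 + 0 = 78       MOV A,B
2A A MVI       \ 06 + 7*8 = 3E, then 2A  MVI A,2A
1234 H LXI     \ 01 + 4*8 = 21, 34 12    LXI H,1234
DEMO 6 SHOW    \ prints 78 3E 2A 21 34 12
DECIMAL
```

Note that the operand order is the reverse of the conventional Intel syntax: the source or immediate value is given first and the destination last.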

 

The foregoing discussion covers most of the 8080 instruction set, with the exception of the conditional jump instructions.  These are set aside because the conditional jumps are used to construct the structured definitions like IF-ELSE-ENDIF and BEGIN-UNTIL.  The non-structured jump instructions such as CALL, RET, and the conditional CALL's and RET's are defined in the assembler for completeness.

Subroutines are better defined as independent colon or code definitions.  Jumps inside a code definition are implemented in the following way.  Instead of the regular conditional jump instructions, a set of Forth words is defined to be used with the conditional structures.  Each word leaves on the stack the code of the jump taken when the named condition is false (C2, for instance, is JNZ), so that the clause following IF is skipped:

 

C2 IS 0=        D2 IS CS        E2 IS PE        F2 IS 0<

   

: NOT               b1 -- b2

 

Negate the conditional b1 to reverse the jumping condition. 

 

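8 + ;               Adding 8 turns the jump code into that of the opposite condition; C2 (JNZ), for example, becomes CA (JZ).  The byte value b2 is to be assembled by the instruction IF , etc., to effect conditional branching.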


: IF                b -- addr 2


Assemble the conditional jump code b into the dictionary.  Leave on the stack the dictionary pointer of the jump's address field, so that the forward branching address can be resolved later, and a flag 2 for error checking.

 

C,                          Assemble the conditional b.

HERE                    Push current DP to stack as addr.

0 ,                          Assemble a dummy 0 here for forward jumping.  The address will be resolved by ELSE or ENDIF . 

2                            Flag for error checking.

;


: ENDIF             addr n --

 

Terminate an IF-ELSE-ENDIF structure in a code definition.   Check n for error.  Use addr to resolve the forward jumping  address at IF or ELSE . 

 

2 ?PAIRS                If n is not 2, issue an error message.

HERE SWAP !        Store the current DP to addr after IF or ELSE to complete the conditional structure. 

;

 

: ELSE          addr1 n -- addr2 2

 

Start a false clause in a code definition.  Resolve the  forward branching at addr1 and leave the present address  addr2 and a flag on the stack to be used by ENDIF . 

 

2 ?PAIRS              If n is not 2, issue an error message.

C3 IF                     Use IF to assemble an unconditional jump instruction (C3) to the dictionary, and also leave addr2 and 2 on the stack.

ROT                      Get addr1 to top of stack.

SWAP                   The stack is now addr2 addr1 n2 .

ENDIF                   Take n2 and addr1 from top of the stack to resolve the jump address at IF .

2                            Leave the flag n2 on the stack for ENDIF .

;
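
To see how the forward addresses get resolved, the sketch below replays IF , ELSE , and ENDIF in standard Forth against a small scratch image.  Several things are assumptions of the sketch and not part of the fig-Forth assembler: the words carry a trailing comma so that they do not hide the host system's own IF and ELSE (in the text they are kept apart by the ASSEMBLER vocabulary); target addresses are offsets into IMAGE , because host addresses do not fit in 16 bits; and Z? is a stand-in name for the condition word 0= .

```forth
HEX
CREATE IMAGE 40 ALLOT            \ scratch target memory; target address 0 is its first byte
VARIABLE TDP   0 TDP !           \ target dictionary pointer (assumed name)
: THERE  ( -- taddr )    TDP @ ;
: TC,    ( b -- )        FF AND  IMAGE THERE + C!  1 TDP +! ;
: T,     ( n -- )        DUP TC,  8 RSHIFT TC, ;      \ 16 bit value, low byte first
: T!     ( n taddr -- )  IMAGE +  OVER FF AND OVER C!
                         SWAP 8 RSHIFT FF AND  SWAP 1+ C! ;
: ?PAIRS ( n1 n2 -- )    <> ABORT" conditionals not paired" ;

C2 CONSTANT Z?                   \ stands in for 0= ; C2 is JNZ

: IF,    ( b -- taddr 2 )          TC,  THERE  0 T,  2 ;
: ENDIF, ( taddr n -- )            2 ?PAIRS  THERE SWAP T! ;
: ELSE,  ( taddr1 n -- taddr2 2 )  2 ?PAIRS  C3 IF,  ROT SWAP ENDIF,  2 ;

\ Hypothetical helper: print the first n bytes of the image
: SHOW   ( n -- )  0 ?DO  IMAGE I + C@ .  LOOP ;

Z? IF,         \ 0000: C2 07 00   JNZ 0007   skip to the false clause
  3C TC,       \ 0003: 3C         INR A      true clause
ELSE,          \ 0004: C3 08 00   JMP 0008   jump over the false clause
  3D TC,       \ 0007: 3D         DCR A      false clause
ENDIF,         \ 0008:            code continues here
8 SHOW         \ prints C2 7 0 3C C3 8 0 3D
DECIMAL
```

The address field at 0001 is filled in by ELSE, and the one at 0005 by ENDIF, which is the two-step resolution described in the listings.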


: BEGIN             -- addr 1


Start an indefinite loop such as

    BEGIN ... UNTIL ,

    BEGIN ... WHILE ... REPEAT ,

or BEGIN ... AGAIN .


HERE            Leave current DP on stack for backward branching from the end of the loop. 

1                    Flag for error checking.

;


: UNTIL             addr n b --

 

End of an indefinite loop.  Assemble a conditional jump  instruction b and address addr of BEGIN for backward  branching. 

 

SWAP                  Get n to top of the stack for error checking.

1 ?PAIRS             If n is not 1 , issue an error message.

C,                         Assemble b literally as a conditional jump instruction.

,                            Assemble the address addr of BEGIN for branching.

;

 

: AGAIN             addr n --

 

End of an infinite loop.  Assemble an unconditional jump  instruction to branch backward to addr . 

 

1 ?PAIRS         Check n for error.

C3 C,               Assemble the JMP instruction,

,                       with the address addr .

;


: WHILE             b -- addr 4

 

Exit an indefinite loop from the middle of the loop.  Assemble a conditional jump instruction b , and leave the DP and a flag on the stack for REPEAT to resolve the forward jump address.

Used in the form:

    BEGIN ... WHILE ... REPEAT


IF              Use IF to do the dirty work.

2+              The flag left by IF is 2.  Change it to 4 for REPEAT to verify.

;


: REPEAT            addr1 n1 addr2 n2 --

 

Assemble JMP addr1 to dictionary to close the loop from  BEGIN .  Resolve forward jump address at addr2 as required  by WHILE . 

 

>R >R                    Get addr2 and n2 out of the way.

AGAIN                  Let AGAIN assemble the backward jump.

R> R> 2-                Bring back addr2 and n2.  Change n2 back to 2.

ENDIF                   Check error.  Resolve jump address for WHILE.

;
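
The loop words can be exercised in the same way.  The sketch below, in standard Forth, assembles a BEGIN-WHILE-REPEAT loop into a scratch image.  As assumptions of the sketch only: the words carry a trailing comma to keep them apart from the host system's own BEGIN and WHILE , target addresses are offsets into IMAGE , 2 + and 2 - replace 2+ and 2- , and Z? stands in for the condition word 0= .

```forth
HEX
CREATE IMAGE 40 ALLOT            \ scratch target memory; target address 0 is its first byte
VARIABLE TDP   0 TDP !           \ target dictionary pointer (assumed name)
: THERE  ( -- taddr )    TDP @ ;
: TC,    ( b -- )        FF AND  IMAGE THERE + C!  1 TDP +! ;
: T,     ( n -- )        DUP TC,  8 RSHIFT TC, ;      \ 16 bit value, low byte first
: T!     ( n taddr -- )  IMAGE +  OVER FF AND OVER C!
                         SWAP 8 RSHIFT FF AND  SWAP 1+ C! ;
: ?PAIRS ( n1 n2 -- )    <> ABORT" conditionals not paired" ;

C2 CONSTANT Z?                   \ stands in for 0= ; C2 is JNZ

: IF,     ( b -- taddr 2 )    TC,  THERE  0 T,  2 ;
: ENDIF,  ( taddr n -- )      2 ?PAIRS  THERE SWAP T! ;
: BEGIN,  ( -- taddr 1 )      THERE 1 ;
: UNTIL,  ( taddr n b -- )    SWAP 1 ?PAIRS  TC,  T, ;
: AGAIN,  ( taddr n -- )      1 ?PAIRS  C3 TC,  T, ;
: WHILE,  ( b -- taddr 4 )    IF,  2 + ;
: REPEAT, ( taddr1 n1 taddr2 n2 -- )  >R >R  AGAIN,  R> R> 2 -  ENDIF, ;

\ Hypothetical helper: print the first n bytes of the image
: SHOW    ( n -- )  0 ?DO  IMAGE I + C@ .  LOOP ;

BEGIN,         \ 0000:            top of the loop
  3D TC,       \ 0000: 3D         DCR A
Z? WHILE,      \ 0001: C2 08 00   JNZ 0008   leave the loop
  04 TC,       \ 0004: 04         INR B
REPEAT,        \ 0005: C3 00 00   JMP 0000   back to BEGIN,
8 SHOW         \ prints 3D C2 8 0 4 C3 0 0
DECIMAL
```

The flag values 1, 2, and 4 never reach the target code; they exist only so that ?PAIRS can reject mismatched structures at assembly time.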

 

FORTH DEFINITIONS

 

The whole ASSEMBLER vocabulary is now complete.  Restore the CONTEXT and CURRENT vocabularies to the trunk FORTH vocabulary for normal programming activity.

 

DECIMAL             Restore base from hexadecimal.

 

 

 
