CHAPTER 14. FORTH ASSEMBLERS
An assembler, which translates assembly mnemonics into machine codes, is at least as complex as a compiler. One might expect the assembler to be simpler because it works at a lower level of construct. However, the large number of mnemonic names, each with many different addressing modes, makes the assembling task much more difficult. In a Forth language system the assembling process cannot be accomplished by the text interpreter alone; all the resources in the Forth system are needed. For this reason the assembler in Forth is often defined as an independent vocabulary, and the assembling process is controlled by the address interpreter: the assembly mnemonics used by the assembler are not just names representing machine codes, but actual Forth instructions executed by the address interpreter. When executed, these instructions cause machine codes to be assembled to the dictionary as literals. The data stack and the return stack are often used to assemble proper codes and to resolve branching addresses.
14.1. Three Levels of Forth Assembler
Before discussing codes in the Forth assemblers, I would like to present assemblers in three levels of complexity:
Level 0: The programmer looks up the machine codes and assembles them to the dictionary;
Level 1: The computer translates the assembly mnemonics to codes with a lookup-table, but the programmer must fill in addresses and literals when needed; and
Level 2: The computer does all the work, with mnemonics and operands supplied by the programmer.
The Level 0 Assembler in Forth uses only three definitions already defined in the Forth Compiler:
    CREATE   Generate the header for a new code definition,
    ,        Assemble a 16 bit literal into the dictionary, and
    C,       Assemble a byte literal into the dictionary, used in byte oriented processors.
These definitions were described as the most primitive compiler in Chapter 9. They might just as well be the most primitive assembler if the new definition were a code definition. The programmer would write down the machine codes first with the help of those small code cards supplied freely by CPU vendors. The machine codes are entered on the top of the data stack and then assembled to the parameter field of the new definition on top of the dictionary.
The Level 1 Assembler would use the defining word CONSTANT to define assembly mnemonics, relating them to their respective machine codes. The text interpreter, when confronted with a mnemonic name, would push the corresponding machine code on the stack. The code will then be assembled to the dictionary by , or C, . An example is:
0 CONSTANT HALT
which defines HALT as a constant of 0. During assembly, the phrase
... HALT , ...
would assemble a HALT instruction into the dictionary. To make it easier for himself, the programmer might want a new definition:
: HALT, HALT , ;
Executing HALT, would then assemble the HALT instruction to the dictionary. Historically, all assembler definitions end their names with a comma for the reason just described: the comma indicates that the definition causes a machine instruction to be assembled to the dictionary. This convention serves very well to distinguish assembler definitions from regular Forth definitions.
This Level 1 scheme is quite adequate if there is a one to one mapping from mnemonics to machine codes. However, where many codes share the same mnemonic and differ only in operands or addressing mode, the basic code must be augmented to accommodate operands or address fields. It is not difficult to modify definitions such as HALT, to make the necessary changes in the code, which has to pass through the data stack anyway. But defining each assembly mnemonic individually is messy and inelegant. A much more appealing method is to use the <BUILDS-DOES> construct in the Forth language to define whole classes of mnemonics with the same characteristics, which brings us to the Level 2 Assembler. In the last example of the HALT instruction, instead of using CONSTANT to relate the mnemonic name with the code, a defining word is created as:
: OP <BUILDS , DOES> @ , ;
The instruction HALT, is then defined by the defining word OP as:
0 OP HALT, 1 OP WAIT, 5 OP RESET, . . .
Now, when HALT, is later processed by the text interpreter, the code 0 is automatically assembled into the dictionary by the runtime routine @ , . The <BUILDS-DOES> construct can be applied to all other types of assembly mnemonics to assemble different classes of instructions, providing some of the finest examples of the extensibility of the Forth language. No other language can possibly offer such a powerful tool to its programmers.
A syntactic problem in using the Forth assembler is that before a mnemonic can be executed to assemble a machine code, all the addressing information and operands must be provided on the data stack. Therefore, operands must precede the instruction mnemonics, resulting in postfix notation. The source listing of a Forth code definition is therefore very different from a conventional assembly source listing, where the operands follow the assembly mnemonic. Using the data stack and the postfix notation greatly facilitates the assembling process in the Forth assembler. This is a very small price to pay for the capability to access the host CPU and to make the fullest use of the resources in a computer system.
Two assemblers will be discussed in this chapter in an effort to cover the widest range of microprocessors. One is for the homely Intel 8080A, a byte oriented machine with a rather primitive instruction set. At the other end is the PDP-11 instruction set, which is extensively micro-coded in a 16 bit wide code field. I feel that these two examples should be sufficient to illustrate how Forth assemblers are constructed for most other microprocessors.
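The defining-word idea behind OP can be mimicked outside Forth. The following Python sketch is only a model and not part of any Forth system: the list `dictionary` and the function `op` are hypothetical stand-ins for the Forth dictionary and the defining word OP.

```python
# Hypothetical model: `dictionary` stands in for the Forth dictionary,
# and op() plays the role of the defining word OP.
dictionary = []                      # assembled 16-bit cells, in order

def op(code):
    """Return a mnemonic that assembles `code` whenever it is executed."""
    def mnemonic():
        dictionary.append(code & 0xFFFF)   # the run-time part:  @ ,
    return mnemonic

halt = op(0)      # 0 OP HALT,
wait = op(1)      # 1 OP WAIT,
reset = op(5)     # 5 OP RESET,

halt()            # executing a mnemonic assembles its code
reset()
print(dictionary)
```

Each mnemonic carries its own code, stored once at definition time, which is the same division of labour as between the <BUILDS part and the DOES> part.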
14.2. PDP-11 ASSEMBLER
The PDP-11 instruction set is typical of that for minicomputers. With a 16 bit instruction field, very flexible and versatile addressing schemes are possible compared with those used in the 8 bit instructions of most common microprocessors. In addition, the PDP-11 is a stack oriented machine in which all registers can be used as stack pointers in addition to their normal accumulator and addressing functions. There are 8 registers in the PDP-11 CPU: registers 0 to 5 are general purpose registers, register 6 is a dedicated stack pointer, and register 7 is the program counter. Registers can be used in many different addressing modes, making it very convenient to host a Forth virtual machine in the PDP-11 computer.
This assembler was programmed by John James and was included in his PDP-11 figForth Model. The following command sequence must be given first to initiate the ASSEMBLER vocabulary and to prepare the Forth system to build the assembler.
OCTAL
PDP-11 instructions are best presented in octal base because address fields are 6 bits wide.
0 VARIABLE OLDBASE
To ease switching base to and from octal, the currently used base will be stored away in OLDBASE, to be restored when the assembly process is completed.
VOCABULARY ASSEMBLER IMMEDIATE
Create the assembler vocabulary to house all the assembly mnemonics and other necessary definitions.
: ENTERCODE --
Invoke ASSEMBLER vocabulary to start the assembly process.
    [COMPILE] ASSEMBLER       Set CONTEXT to ASSEMBLER to search for the mnemonics.
    BASE @ OLDBASE ! OCTAL    Save the old base, to be restored after assembly, and switch base to octal.
    SP@                       Push stack pointer on stack for error checking at end.
    ;
: CODE --
A more refined defining word to start a code definition.
    CREATE       Create a header with the name following CODE .
    ENTERCODE    Invoke ASSEMBLER .
    ;
ASSEMBLER DEFINITIONS
Set both CONTEXT and CURRENT vocabularies to ASSEMBLER . New definitions hereafter will be placed in the assembler vocabulary.
Before discussing the assembler definitions, the PDP-11 CPU registers and their addressing modes should be clarified. An address field uses 6 bits in an instruction. The lower 3 bits specify a register to be referenced for addressing, and the upper 3 bits specify the addressing mode. The register and the addressing mode are combined to form an address field, which is used to specify either a source operand or a destination operand in the assembly instruction as required. Registers and modes are defined as follows:
: IS   n --
    CONSTANT ;   Shorthand for CONSTANT .
0 IS R0   1 IS R1   2 IS R2   3 IS R3   4 IS R4   5 IS R5   6 IS SP   7 IS PC
2 IS W   3 IS U   4 IS IP   5 IS S   6 IS RP
: RTST r mode -- addr-field -1
Test register r for range between 0 and 7. Add r and mode to form address field addr-field . Also leave a flag -1 on stack to indicate that an address field is underneath.
    OVER                                Get r to top for tests.
    DUP 7 >                             Larger than 7 ?
    SWAP 0 <                            Smaller than 0 ?
    OR IF                               In either case, issue an error message,
    ." NOT A REGISTER:" OVER . ENDIF    with the offending number appended.
    +                                   addr-field = r + mode
    -1                                  The flag.
    ;
The addressing modes are defined as executable definitions using names similar to the operand notation used in PDP assembly language, with some twists. The stack effect of each is: r -- addr-field -1 .
    : )+     20 RTST ;    Post-increment register mode.
    : -)     40 RTST ;    Pre-decrement register mode.
    : I)     60 RTST ;    Indexed register mode.
    : @)+    30 RTST ;    Deferred post-increment mode.
    : @-)    50 RTST ;    Deferred pre-decrement mode.
    : @I)    70 RTST ;    Deferred index mode.
The addressing mode using the program counter is somewhat different from the modes using other general purpose registers.
    : #     27 -1 ;    Immediate addressing mode.
    : @#    37 -1 ;    Absolute addressing mode.
: ()   r -- addr-field -1    for register deferred mode.
       n -- n 77 -1          for relative deferred mode.
    DUP 10 U<           Top of stack is between 0 and 7, a register.
    IF 10 + -1          Make the address field.
    ELSE 77 -1 ENDIF    Otherwise, top of stack is an address offset. Make it the relative deferred mode.
    ;
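The way a register number and a mode value combine into a 6 bit address field can be checked with a few lines of Python. This is a sketch under stated assumptions: the names `MODES`, `addr_field`, and `deferred` are hypothetical, the model raises an exception where RTST only prints a message, and it returns the field alone rather than the field plus the -1 flag.

```python
# Register numbers and mode values follow the definitions in the text (octal).
R0, R1, R2, R3, R4, R5, SP, PC = range(8)

MODES = {")+": 0o20, "-)": 0o40, "I)": 0o60,
         "@)+": 0o30, "@-)": 0o50, "@I)": 0o70}

def addr_field(reg, mode):
    """Model of RTST: add a register number to a mode value."""
    if not 0 <= reg <= 7:
        raise ValueError(f"NOT A REGISTER: {reg}")
    return MODES[mode] + reg

def deferred(n):
    """Model of (): register deferred for 0..7, otherwise relative deferred."""
    return 0o10 + n if 0 <= n < 0o10 else 0o77

print(oct(addr_field(R5, ")+")))   # 0o25
print(oct(addr_field(SP, "-)")))   # 0o46
print(oct(deferred(R3)))           # 0o13
```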
The simplest instruction requires no operand. These instructions can be defined by a simple defining word:
: OP n --
A defining word to define instructions without operands.
    <BUILDS    Create a header for a mnemonic definition with the mnemonic name following OP .
    ,          Compile the instruction code on the stack to the parameter field in the new definition.
    DOES>      When the defined mnemonic definition is executed during assembly, execute the following words:
    @ ,        Fetch the instruction code stored in the parameter field and assemble it to the code definition under construction on top of the dictionary.
    ;
0 OP HALT, 1 OP WAIT, 2 OP RTI, 3 OP BPT, 4 OP IOT, 5 OP RESET, 6 OP RTT,
241 OP CLC, 242 OP CLV, 244 OP CLZ, 250 OP CLN, 261 OP SEC, 262 OP SEV, 264 OP SEZ, 270 OP SEN, 277 OP SCC, 257 OP CCC, 240 OP NOP, 6400 OP MARK,
Instructions with operands are of course more involved. Those with only one operand are defined by a defining word 1OP . This word uses many other utility definitions. However, we shall first present the high level 1OP before getting into the nitty gritty details of the other low level definitions.
: 1OP n --
A defining word to define instructions with one operand.
    <BUILDS , DOES>    The same defining word format.
    @ ,                When the defined word is executed during assembly, the basic instruction code is fetched and assembled to the dictionary.
    FIXMODE            Take the mode packet on stack to resolve the address field.
    DUP                Copy the address field.
    HERE 2 - ORMODE    Insert the address field into the lower 6 bit destination field.
    ,OPERAND           If the instruction needs a 16 bit value either as a literal or as an address, assemble it after the instruction.
    ;
: FIXMODE   addr-field -1 -- addr-field
            r -- r
            n -- n 67
Fix the mode packet on the data stack for ORMODE and ,OPERAND to assemble the instruction correctly.
    DUP -1 =          Top of stack = -1 ?
    IF DROP           Yes, drop -1 and leave addr-field on top.
    ELSE              The top of the stack might be a register or a literal.
    DUP 10 SWAP U<    If top of stack is larger than 7 , PC relative mode.
    IF 67 ENDIF       Push 67 on top of n , indicating PC mode. Otherwise, leave the register number on the stack.
    ENDIF
    ;
: ORMODE addr-field addr --
Take the address field value addr-field and insert it into the lower 6 bit address field in the instruction code at addr .
    SWAP       Move addr-field to top of the stack.
    OVER @     Fetch the instruction code at addr .
    OR         Insert address field.
    SWAP !     Put the modified instruction back.
    ;
: ,OPERAND   (n) addr-field --
Assemble a literal to the dictionary to complete a program counter addressing instruction.
    DUP 67 =              PC relative mode ?
    OVER 77 = OR          Or PC relative deferred mode?
    IF                    In either case,
    SWAP                  move operand n to top of the stack.
    HERE 2 + -            Compute offset from n to the next instruction address.
    SWAP                  Put the offset value under addr-field.
    ENDIF
    DUP 27 =              PC immediate mode ?
    OVER 37 = OR          Or PC absolute mode ?
    SWAP                  Get addr-field for another test.
    177760 AND 60 = OR    Or if it is index addressing mode.
    IF , ENDIF            In any of the three cases, assemble the literal after the instruction code.
    ;                     None of the above. The instruction does not need a literal. It is already complete.
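The combined effect of ORMODE and ,OPERAND on a one operand instruction can be followed in a small Python model. The function `one_op` and its calling convention are assumptions made for illustration: `here` is the address at which the instruction word itself is placed, and an operand that needs a trailing literal is passed as a `(field, n)` tuple.

```python
def one_op(opcode, operand, here):
    """Return the 16-bit cells of a one-operand instruction assembled at `here`.

    `operand` is either a 6-bit address field (an int) or a tuple (field, n)
    when a literal n must follow the instruction word.
    """
    if isinstance(operand, tuple):
        field, n = operand
    else:
        field, n = operand, None
    cells = [opcode | field]                 # ORMODE: insert the address field
    if field in (0o67, 0o77):                # PC relative modes store an offset
        n = n - (here + 4)                   # PC value after the literal is fetched
    if field in (0o27, 0o37) or (field & 0o177760) == 0o60:
        cells.append(n & 0xFFFF)             # immediate, absolute, or indexed
    return cells

# TST (R5)+ needs no literal.
print([oct(c) for c in one_op(0o5700, 0o25, here=0o500)])
# INC of the cell at address 1000, PC relative, assembled at address 500.
print([oct(c) for c in one_op(0o5200, (0o67, 0o1000), here=0o500)])
```

The second call yields the instruction word 5267 followed by the offset 274 (octal), since the program counter stands at 504 once the literal has been fetched.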
: B --
Modify the instruction code just assembled to the dictionary to make a byte instruction from a cell instruction.
    100000         MSB of the byte instruction must be set.
    HERE 2 - +!    Toggle the MSB of the instruction code on top of dictionary.
    ;
B is to be used immediately after an instruction definition like op1 op2 MOV, B to move a byte from op1 to op2. The byte instruction can be defined separately as MOVB,. However, the modifier definition B is more elegant in reducing the number of mnemonic definitions by 25%.
5000 1OP CLR,   5100 1OP COM,   5200 1OP INC,   5300 1OP DEC,   5400 1OP NEG,   5500 1OP ADC,   5600 1OP SBC,   5700 1OP TST,   6000 1OP ROR,   6100 1OP ROL,   6200 1OP ASR,   6300 1OP ASL,   6700 1OP SXT,   100 1OP JMP,
: ROP n --
A defining word to define two operand instructions. The source operand can only be a register without mode selection. The destination address field is the lower 6 bits, and the source register is specified by bits 6 to 8.
    <BUILDS , DOES>           Make header and store instruction code.
    @ ,                       When the defined instruction is executed, assemble the basic instruction code to the dictionary.
    FIXMODE                   Fix the destination address field.
    DUP                       Copy the just completed address field value.
    HERE 2 -                  Address of the instruction.
    DUP >R                    Save a copy of this address on the return stack, to fix the source register field from the register number underneath on the data stack.
    ORMODE                    Insert the destination address field into the instruction.
    ,OPERAND                  If a literal operand is required, assemble it here.
    DUP 7 SWAP U<             The register number must not be greater than 7 .
    IF ." ERR-REG-B" ENDIF    If the register number is too big, issue an error message.
    100 * R> ORMODE           Justify the source register field value and insert it into the instruction.
    ;
74000 ROP XOR,   4000 ROP JSR,
: BOP   n --
A defining word used to define branching and conditional branching instructions. This word is included only for completeness since the branchings are not structured. In Forth code definitions, more powerful branching and looping structures should be used, as will be discussed shortly.
    <BUILDS , DOES>
    @ ,
    HERE -                    The target address is presumably on the data stack. Compute the offset value for branching.
    DUP 376 >                 If the offset is greater than 376, issue an error message
    IF ." ERR-BR+" . ENDIF    with the out of range offset.
    DUP -400 <                If the offset is less than -400, issue an error message
    IF ." ERR-BR-" . ENDIF    with the out of range offset.
    2 / 377 AND               The correct offset value is then
    HERE 2 - ORMODE           inserted into the instruction code.
    ;
400 BOP BR, 1000 BOP BNE, 1400 BOP BEQ, 2000 BOP BGE, 2400 BOP BLT, 3000 BOP BGT, 3400 BOP BLE, 100000 BOP BPL, 100400 BOP BMI, 101000 BOP BHI, 101400 BOP BLOS, 102000 BOP BVC, 102400 BOP BVS, 103000 BOP BCC, 103400 BOP BCS, 103400 BOP BLO, 103000 BOP BHIS,
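The offset arithmetic performed by a BOP mnemonic can be reproduced in Python. The function `branch` below is a hypothetical model: `here` is the address of the branch instruction itself, and the range limits are the octal values quoted in the definition of BOP.

```python
def branch(opcode, target, here):
    """Model of a BOP mnemonic assembling a branch at address `here`."""
    offset = target - (here + 2)        # the PC already points past the branch
    if offset > 0o376 or offset < -0o400:
        raise ValueError(f"branch offset out of range: {offset}")
    return opcode | ((offset // 2) & 0o377)   # word offset in the low 8 bits

print(oct(branch(0o400, 0o1000, 0o1000)))    # BR to itself     -> 0o777
print(oct(branch(0o1000, 0o1006, 0o1000)))   # BNE two words on -> 0o1002
```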
: 2OP n --
A defining word to define two operand instructions.
    <BUILDS , DOES>
    @ ,
    FIXMODE            Fix the mode packet for destination field.
    DUP HERE 2 -       Get the address of the instruction to be fixed.
    DUP >R             Save a copy of the instruction address on return stack.
    ORMODE             Insert the destination field.
    ,OPERAND           Assemble a literal after the instruction if required.
    FIXMODE            Now process the source mode packet.
    DUP 100 *          Justify the source field value.
    R ORMODE           Insert the source field into the instruction.
    ,OPERAND           Assemble a literal if required.
    HERE R> - 6 =      If there are two literals assembled after the instruction, they are in the wrong order.
    IF SWAPOP ENDIF    The two literals have to be swapped.
    ;
: SWAPOP   --
Swap the two literals after a two operand instruction. If either literal is used for PC addressing, the offset value will have to be adjusted to reflect the swapping.
    HERE 2 - @         Push the last literal on the stack.
    HERE 6 - @         This is the instruction code itself.
    6700 AND 6700 =    PC relative mode?
    IF 2 + ENDIF       Yes, increment the last literal by 2.
    HERE 4 - @         Now work on the first literal.
    HERE 6 - @         Get the instruction back again.
    67 AND 67 =        Is the destination field also of PC relative mode?
    IF 2 - ENDIF       If it is, decrement the branching offset by 2.
    HERE 2 - !         Put the first offset last,
    HERE 4 - !         and the last offset first.
    ;
10000 2OP MOV,   20000 2OP CMP,   30000 2OP BIT,   40000 2OP BIC,   50000 2OP BIS,   60000 2OP ADD,   160000 2OP SUB,
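For operands that need no trailing literal, the word built by a 2OP mnemonic is just the basic code with the source field shifted into bits 6 to 11 and the destination field in bits 0 to 5. The Python function `two_op` below is a hypothetical model of that packing; its `byte` flag imitates the modifier B.

```python
def two_op(opcode, src_field, dst_field, byte=False):
    """Pack a two-operand PDP-11 instruction word (no trailing literals)."""
    word = opcode | (src_field << 6) | dst_field   # source field * 100 octal
    if byte:
        word += 0o100000                           # what B adds to the MSB
    return word

print(oct(two_op(0o10000, 0o24, 0o02)))              # MOV (R4)+,R2  -> 0o12402
print(oct(two_op(0o60000, 0o00, 0o01)))              # ADD R0,R1     -> 0o60001
print(oct(two_op(0o10000, 0o00, 0o11, byte=True)))   # MOVB R0,(R1)  -> 0o110011
```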
Two more instructions need to be patched:
: RTS,   200 OR , ;
: EMT,   104000 + , ;
The branching instructions are similar to the GOTO statements in high level languages. They are not very useful in promoting modular and structured programming. Therefore, their usage in Forth code definitions is discouraged. Somewhat modified forms of these branch instructions are defined in the assembler to code IF-ELSE-ENDIF and BEGIN-UNTIL types of structures. Although these structures are very similar to the structures used in colon definitions, the functions of these words in the assembler are different. Thus it is a good practice to define them with names ending in commas as all other mnemonic definitions. However, the comma at the end does not imply that an instruction code is always assembled by these special definitions.
The conditional branching instructions are defined as constants to be assembled by the words requiring branching. The notation is reversed from the PDP mnemonics because of this assembling procedure.
1000 IS EQ 1400 IS NE 2000 IS LT 2400 IS GE 3000 IS LE 3400 IS GT 100000 IS MI 101000 IS LOS 101400 IS HI 102000 IS VS 102400 IS VC 103000 IS LO 103400 IS HIS
: IF, n -- addr
Take the literal n on stack and assemble it to dictionary as a conditional branching instruction. Leave the address of this branching instruction on the data stack to resolve the branching offset later.
    HERE       Address of the branching instruction.
    SWAP ,     Assemble the branching instruction to the dictionary.
    ;
: IPATCH,   addr1 addr2 --
Use the addresses left on the stack to compute the forward branching offset and patch up the instruction assembled by IF, .
    OVER -            Byte offset from addr1 to addr2.
    2 / 1- 377 AND    The 8 bit instruction offset.
    SWAP DUP @        Fetch out the branching instruction at addr1 .
    ROT OR            Insert the offset into the branching instruction.
    SWAP !            Put the completed instruction back.
    ;
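How IF, leaves an unfinished branch and IPATCH, completes it can be traced with a small model. In this Python sketch the dictionary `memory` (address to 16 bit cell) and the function names are assumptions for illustration only.

```python
def if_comma(memory, here, cond):
    """IF, : assemble the branch code `cond` at `here` and return its address."""
    memory[here] = cond
    return here

def ipatch(memory, addr1, addr2):
    """IPATCH, : make the branch at addr1 land on addr2."""
    offset = ((addr2 - addr1) // 2 - 1) & 0o377   # word offset from addr1 + 2
    memory[addr1] |= offset

memory = {}
addr = if_comma(memory, 0o1000, 0o1000)   # EQ IF,  assembles a BNE at 1000
# ... two one-word instructions would be assembled at 1002 and 1004 ...
ipatch(memory, addr, 0o1006)              # ENDIF,  executed with HERE = 1006
print(oct(memory[0o1000]))                # 0o1002: BNE skipping two words
```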
: ENDIF, addr --
Close the conditional structure in a code definition.
    HERE IPATCH,    Call on IPATCH, to resolve the forward branching.
    ;
: ELSE,   addr1 -- addr2
Assemble an unconditional branch instruction at HERE , and patch up the offset field in the instruction assembled by IF, . Leave the address of the current branch instruction on the stack for ENDIF, to resolve.
    400 ,           Assemble the BR, instruction to the dictionary.
    HERE IPATCH,    Patch up the conditional branching instruction at IF, .
    HERE 2 -        Leave address of BR, for ENDIF, to patch up.
    ;
: BEGIN,   -- addr
    HERE            Begin an indefinite loop. Push DP on stack for backward branching.
    ;
: UNTIL,   addr n --
Assemble the conditional branching instruction n to the dictionary, taking addr as the address to branch back to.
    ,               Assemble n , which must be one of the conditional branching instruction codes.
    HERE 2 -        The address of the above instruction.
    SWAP IPATCH,    Patch up the offset in the branching instruction.
    ;
: REPEAT, addr1 addr2 --
Used in the form: BEGIN, . . . WHILE, . . . REPEAT, inside a code definition. Assemble an unconditional branch instruction pointing to BEGIN, at addr1, and resolve the forward branch offset for WHILE, at addr2 .
    HERE           Save the DP pointing to the current BR, instruction.
    400 ,          Assemble BR, here.
    ROT IPATCH,    Patch the BR, instruction to branch back to BEGIN, at addr1 .
    HERE           This is where the conditional branch at WHILE, should branch to on false condition.
    IPATCH,        Patch up the conditional branch at WHILE, .
    ;
: WHILE,   n -- addr
Assemble a conditional jump instruction at HERE . Push the address of this instruction addr on the stack for REPEAT, to resolve the forward jump address.
    HERE    Push DP to stack.
    SWAP    Move n to top of stack, and
    ,       assemble it literally as an instruction.
    ;
: C;   addr --
Ending of a code definition started by ENTERCODE .
    CURRENT @ CONTEXT !    Restore CONTEXT vocabulary to CURRENT . Thus abandon the ASSEMBLER vocabulary to the current vocabulary where the new code definition was added. The programmer can now test the new definition.
    OLDBASE @ BASE !       Restore the base that was in use before assembly.
    SP@ 2+ =               Compare the current SP with addr on the stack,
    IF SMUDGE              if they are the same, the stack was not disturbed. Restore the smudged header to complete the new definition.
    ELSE ." CODE ERROR, STACK DEPTH CHANGED"    Otherwise, issue an error message.
    ENDIF
    ;
: NEXT, --
The address interpreter returning execution process to the colon definition which calls the code definition. This must be the last word in a code definition before C; .
    IP )+ W MOV,    Move the cell addressed by IP into W. IP is incremented by 2.
    W @)+ JMP,      Jump to execute the instruction sequence pointed to by the contents of W. W is incremented by 2, pointing to the parameter field of the word to be executed.
    ;
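As a check on the definitions above, the two words assembled by NEXT, can be worked out by hand. This Python sketch only repeats the arithmetic, using the register numbers, mode values, and basic codes given in the text. The constant names are hypothetical.

```python
IP, W = 4, 2                          # virtual machine registers, as defined by IS
POST_INC, DEFERRED_POST_INC = 0o20, 0o30
MOV, JMP = 0o10000, 0o100             # basic instruction codes

next_code = [
    MOV | ((POST_INC + IP) << 6) | W,     # IP )+ W MOV,   i.e. MOV (R4)+,R2
    JMP | (DEFERRED_POST_INC + W),        # W @)+ JMP,     i.e. JMP @(R2)+
]
print([oct(word) for word in next_code])  # ['0o12402', '0o132']
```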
FORTH DEFINITIONS
The assembler vocabulary is now completed. Return to the FORTH trunk vocabulary by setting both CONTEXT and CURRENT to FORTH .
DECIMAL
Restore decimal base. The base was changed to octal when entering the process of creating the assembler.
14.3. 8080 ASSEMBLER
The assembler is usually defined in an independent vocabulary separated from the trunk FORTH vocabulary and other vocabularies. To generate the ASSEMBLER vocabulary and to make some modifications in the FORTH vocabulary, the following words must be executed. These words are commands to set up the ASSEMBLER vocabulary. This 8080 Assembler was authored by John Cassidy, who also built the 8080 figForth Model.
    HEX                     All 8080 codes will be represented in hexadecimal base.
    VOCABULARY ASSEMBLER    Create a new vocabulary for assembler.
    IMMEDIATE               Vocabulary must be of IMMEDIATE type to be used within colon definitions.
    ' ASSEMBLER CFA         Get the code field address of ASSEMBLER definition, and
    ' ;CODE 0A + !          patch up the code in ;CODE . This is to replace the word SMUDGE with ASSEMBLER , so that the codes following ;CODE can be understood in the context of the assembler. The function of SMUDGE is deferred to the end of the code sequence in C; .
: CODE --
A more fully developed definition to start a code definition with error checking.
    ?EXEC        If not executing, issue an error message.
    CREATE       Create a new dictionary header with the following name.
    [COMPILE]    Compile the next IMMEDIATE word.
    ASSEMBLER    Switch the CONTEXT to ASSEMBLER vocabulary to search assembly mnemonics first before the current vocabulary.
    !CSP         Store current stack pointer in CSP for later error checking.
    ; IMMEDIATE
: C; --
Ending of a new code definition. Check for error and restore the smudged header.
    CURRENT @ CONTEXT !    At the beginning of assembly, CONTEXT was switched to ASSEMBLER, to search for the assembler mnemonics. After the code definition is completed, CONTEXT must be restored to CURRENT vocabulary to continue program development or testing.
    ?EXEC                  If not executing, issue an error message.
    ?CSP                   If the data stack was disturbed, issue an error message.
    ; IMMEDIATE
: LABEL --
Define a subroutine which can be called by the assembler CALL instruction. It is not necessary in Forth.
    ?EXEC
    0 VARIABLE             Subroutine header is defined as a variable with a dummy value 0. When the name is executed, the address of its parameter field will be put on the stack to be used by the CALLing instruction.
    SMUDGE                 Smudge the header as usual.
    -2 ALLOT               Back up the dictionary pointer to overwrite the dummy 0 with the subroutine.
    [COMPILE] ASSEMBLER    Get the assembler to process the mnemonics following.
    !CSP                   Store SP for error checking.
    ; IMMEDIATE
: 8*   n -- n*8
Multiply top of stack by 8.
    DUP + DUP + DUP + ;    Faster than doing real multiplication on an 8080.
ASSEMBLER DEFINITIONS
Set both the CONTEXT and CURRENT vocabularies to ASSEMBLER . Now, all subsequent definitions are put into the ASSEMBLER vocabulary to be referenced by CODE and ;CODE . The definitions up to this point went into the FORTH vocabulary.
: IS n --
CONSTANT ; Shorthand of CONSTANT .
Following are register name definitions:
0 IS B 1 IS C 2 IS D 3 IS E 4 IS H 5 IS L 6 IS M 7 IS A 6 IS PSW 6 IS SP 2A28 IS NEXT
In 8080 fig-Forth, NEXT was defined as a code routine starting at address 2A28 in memory. With NEXT thus defined as a constant, NEXT JMP should be the last instruction in a code definition before C; .
: 1MI n --
A defining word to create single byte 8080 instructions without operands. MI stands for machine instruction.
    <BUILDS    Create a header with the name following.
    C,         Store instruction code on the stack to the parameter field.
    DOES>      The following words are to be executed when the newly defined mnemonic name is executed during assembly.
    C@ C,      Fetch the instruction code stored in the parameter field and assemble it into the dictionary as a byte literal.
    ;
The following single byte instructions are defined by 1MI .
76 1MI HLT   07 1MI RLC   0F 1MI RRC   17 1MI RAL   1F 1MI RAR   C9 1MI RET   D8 1MI RC   D0 1MI RNC   C8 1MI RZ   C0 1MI RNZ   F0 1MI RP   F8 1MI RM   E8 1MI RPE   E0 1MI RPO   2F 1MI CMA   37 1MI STC   3F 1MI CMC   27 1MI DAA   FB 1MI EI   F3 1MI DI   00 1MI NOP   E9 1MI PCHL   F9 1MI SPHL   E3 1MI XTHL   EB 1MI XCHG
: 2MI n --
A defining word to define 8080A instructions with a source operand. The source field is the least significant 3 bits.
    <BUILDS C, DOES>    Create a header for the mnemonic name following. Store the instruction code in the parameter field.
    C@ + C,             When the mnemonic defined is executed, the code value is pulled out from the parameter field, the number representing the source register on the stack is added to the code, and the completed instruction is assembled to the dictionary.
    ;
The following 8080 instructions are defined by 2MI :
80 2MI ADD 88 2MI ADC 90 2MI SUB 98 2MI SBB A0 2MI ANA A8 2MI XRA B0 2MI ORA B8 2MI CMP
: 3MI n --
A defining word to define 8080 instructions with the destination register specified in bits 3, 4, and 5.
    <BUILDS C, DOES>
    C@       When the mnemonic is executed during assembly, the basic code value is fetched from the parameter field.
    SWAP     The operand's register number on the stack is swapped over the code value, and
    8*       multiplied by 8 to line up with the destination field.
    + C,     Add the register number to the instruction and assemble it.
    ;
The following instructions are defined by 3MI :
04 3MI INR   05 3MI DCR   C7 3MI RST   C5 3MI PUSH   C1 3MI POP   09 3MI DAD   02 3MI STAX   0A 3MI LDAX   03 3MI INX   0B 3MI DCX
: 4MI   n --
A defining word to define 8080 instructions with an immediate byte value following the instruction code.
    <BUILDS C, DOES>
    C@ C, C,    The instruction code is fetched from the parameter field and assembled into the dictionary, and the byte value given on the stack is assembled following the instruction code.
    ;
Examples are:
C6 4MI ADI   CE 4MI ACI   D6 4MI SUI   DE 4MI SBI   E6 4MI ANI   EE 4MI XRI   F6 4MI ORI   FE 4MI CPI   DB 4MI IN   D3 4MI OUT
: 5MI   n --
A defining word to define 8080 instructions taking a 16 bit value as an operand, either as an address or as an immediate value for operations.
    <BUILDS C, DOES>
    C@ C,    When the defined mnemonic is executed, the instruction code is assembled to the dictionary.
    ,        The number on the stack is assembled after the instruction.
    ;
Examples are:
C3 5MI JMP   CD 5MI CALL   32 5MI STA   3A 5MI LDA   22 5MI SHLD   2A 5MI LHLD
The 8080 MOV instruction needs two operands to specify the source and destination registers for data movements. The two register numbers are pushed on the data stack for the MOV definition to pick up and assemble as one instruction code. The MVI and LXI instructions behave similarly.
: MOV b1 b2 --
Assemble a MOV instruction to the dictionary with b1 representing source register and b2 destination register.
    8*     b2*8 is the destination field.
    40     Basic code for a MOV instruction.
    + +    Add the source and destination fields to the instruction.
    C,     Assemble to dictionary.
    ;
: MVI   b1 b2 --
Assemble a MVI instruction to dictionary, with b2 specifying the destination field and b1 the immediate byte value following the instruction.
    8*      Destination field.
    6       Basic MVI instruction code.
    + C,    Assemble the instruction.
    C,      Assemble the immediate byte value after the instruction.
    ;
: LXI   n b --
Assemble a LXI instruction with b specifying the destination register pair, and n as a two byte immediate value to be loaded into the register pair.
    8* 1+ C,    Assemble the LXI instruction.
    ,           Assemble the two byte immediate value after the instruction.
    ;
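The register-field arithmetic used by 2MI, 3MI, MOV, MVI, and LXI is easy to verify against an 8080 code card. The Python functions below are hypothetical models of one mnemonic from each group. They assume, as on the 8080, that a 16 bit value is stored low byte first.

```python
B, C, D, E, H, L, M, A = range(8)     # register numbers, as defined by IS

def add(src):                         # a 2MI word: source in bits 0..2
    return bytes([0x80 + src])

def inr(dst):                         # a 3MI word: register in bits 3..5
    return bytes([0x04 + 8 * dst])

def mov(src, dst):                    # b1 = source, b2 = destination
    return bytes([0x40 + 8 * dst + src])

def mvi(value, dst):                  # instruction byte, then the immediate byte
    return bytes([0x06 + 8 * dst, value & 0xFF])

def lxi(value, pair):                 # instruction byte, then low and high bytes
    return bytes([0x01 + 8 * pair]) + (value & 0xFFFF).to_bytes(2, "little")

print(add(C).hex(), inr(C).hex(), mov(B, A).hex())   # 81 0c 78
print(mvi(0x12, C).hex(), lxi(0x1234, H).hex())      # 0e12 213412
```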
The foregoing discussion covers most of the 8080 instruction set with the exception of conditional jump instructions. The reason is that the conditional jumps are used to construct the more structured definitions like IF-ELSE-ENDIF and BEGIN-UNTIL. The non-structured jump instructions such as CALL, RET, conditional CALL's and RET's are defined in the assembler for completeness. Subroutines are better defined as independent colon or code definitions. The short jumps in code definitions are implemented in the following way. Instead of the regular conditional jump instruction, a set of Forth words are defined to be used with the conditional structures:
C2 IS 0= D2 IS CS E2 IS PE F2 IS 0<
: NOT b1 -- b2
Negate the conditional b1 to reverse the jumping condition.
    8 + ;    The byte value b2 is to be assembled by the instruction IF , etc., to effect conditional branching.
: IF   b -- addr 2
Assemble the conditional b into the dictionary. Leave on the stack the current dictionary pointer to resolve later the forward branching address, and a flag 2 for error checking.
    C,      Assemble the conditional b.
    HERE    Push current DP to stack as addr.
    0 ,     Assemble a dummy 0 here for forward jumping. The address will be resolved by ELSE or ENDIF .
    2       Flag for error checking.
    ;
: ENDIF   addr n --
Terminate an IF-ELSE-ENDIF structure in a code definition. Check n for error. Use addr to resolve the forward jumping address at IF or ELSE .
    2 ?PAIRS       If n is not 2, issue an error message.
    HERE SWAP !    Store the current DP to addr after IF or ELSE to complete the conditional structure.
    ;
: ELSE addr1 n -- addr2 2
Start a false clause in a code definition. Resolve the forward branching at addr1 and leave the present address addr2 and a flag on the stack to be used by ENDIF .
    2 ?PAIRS    If n is not 2, issue an error message.
    C3 IF       Use IF to assemble an unconditional jump instruction (C3) to the dictionary, and also leave addr2 and 2 on the stack.
    ROT         Get addr1 to top of stack.
    SWAP        The stack is now addr2 addr1 n2 .
    ENDIF       Take n2 and addr1 from top of the stack to resolve the jump address at IF .
    2           n2 the flag.
    ;
: BEGIN   -- addr 1
Start an indefinite loop such as BEGIN . . . UNTIL , BEGIN ... WHILE ... REPEAT , or BEGIN ... AGAIN .
    HERE    Leave current DP on stack for backward branching from the end of the loop.
    1       Flag for error checking.
    ;
: UNTIL   addr n b --
End of an indefinite loop. Assemble a conditional jump instruction b and address addr of BEGIN for backward branching.
    SWAP        Get n to top of the stack for error checking.
    1 ?PAIRS    If n is not 1 , issue an error message.
    C,          Assemble b literally as a conditional jump instruction.
    ,           Assemble the address addr of BEGIN for branching.
    ;
: AGAIN addr n --
End of an infinite loop. Assemble an unconditional jump instruction to branch backward to addr .
    1 ?PAIRS    Check n for error.
    C3 C,       Assemble the JMP instruction,
    ,           with the address addr .
    ;
: WHILE   b -- addr 4
Abort an infinite loop from the middle inside the loop. Assemble a conditional jump instruction b , and leave the DP and a flag on the stack for REPEAT to resolve the forward jump address. Used in the form: BEGIN . . . WHILE . . . REPEAT
    IF    Use IF to do the dirty work.
    2+    The flag left by IF is 2. Change it to 4 for REPEAT to verify.
    ;
: REPEAT   addr1 n1 addr2 n2 --
Assemble JMP addr1 to dictionary to close the loop from BEGIN . Resolve forward jump address at addr2 as required by WHILE .
    >R >R       Get addr2 and n2 out of way.
    AGAIN       Let AGAIN assemble the backward jump.
    R> R> 2-    Bring back addr2 and n2. Change n2 back to 2.
    ENDIF       Check error. Resolve jump address for WHILE.
    ;
FORTH DEFINITIONS
The whole ASSEMBLER vocabulary is now completed. Restore the CONTEXT and CURRENT vocabularies to the trunk FORTH vocabulary for normal programming activity.
DECIMAL Restore base from hexadecimal.
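The pairing flags and address patching used by the 8080 structure words IF , ENDIF , BEGIN , AGAIN , WHILE , and REPEAT can be traced in a short model. This Python sketch is illustrative only: the function names are hypothetical, the code is assumed to be assembled starting at address 0 so that the length of `code` plays the role of HERE, and a plain exception stands in for ?PAIRS .

```python
def pairs(flag, expected):
    """Stand-in for ?PAIRS: complain when structure words are mismatched."""
    if flag != expected:
        raise ValueError("conditionals not paired")

def if_(code, cond):
    code.append(cond)                 # the conditional (or C3) jump byte
    addr = len(code)                  # where the jump address will go
    code.extend(b"\x00\x00")          # dummy 0, resolved later
    return addr, 2

def endif(code, addr, flag):
    pairs(flag, 2)
    code[addr:addr + 2] = len(code).to_bytes(2, "little")   # HERE SWAP !

def begin(code):
    return len(code), 1

def again(code, addr, flag):
    pairs(flag, 1)
    code.append(0xC3)                 # JMP back to BEGIN
    code.extend(addr.to_bytes(2, "little"))

def while_(code, cond):
    addr, flag = if_(code, cond)
    return addr, flag + 2             # flag 4 marks a WHILE for REPEAT

def repeat(code, addr1, n1, addr2, n2):
    again(code, addr1, n1)
    endif(code, addr2, n2 - 2)

code = bytearray()
a1, n1 = begin(code)
code.append(0x00)                     # NOP as a stand-in loop body
a2, n2 = while_(code, 0xC2)           # 0= WHILE  assembles JNZ
code.append(0x00)                     # another NOP
repeat(code, a1, n1, a2, n2)
print(code.hex())                     # 00c2080000c30000
```

The conditional jump at address 1 ends up pointing at address 8, just past the JMP that closes the loop, which is the forward resolution performed by REPEAT through ENDIF .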