9.  P24 CPU Architecture

 

 

 

P24 is a “Minimal Instruction Set Computer” design patterned after Mr. Chuck Moore's MuP21.  P24 has a 24-bit CPU core with a dual-stack architecture intended to execute Forth-like instructions efficiently.  The processor design is kept simple to allow implementation in field programmable gate arrays.  P24 employs a RISC-like instruction set with four 6-bit instructions packed into each 24-bit word.  A 6-bit opcode can accommodate 64 machine instructions.  Currently only 26 are implemented; the rest are reserved for users to define their own instructions.

 

Following is a list of unique features of P24:

 

*              24-bit address and data buses

*              6-bit RISC-like CPU instructions

*              4-deep instruction cache

*              17-deep data stack

*              33-deep return stack

*              Current implementation runs at 10 MHz in FPGA

 

 

9.1          Registers and Stacks

 

P24 has the following registers:

 

A             Address Register, supplying address for memory read and write

I               Instruction Latch, holding instructions to be executed

P              Program Counter, pointing to the next program word in memory

R             Top of Return Stack

S              Top of Data stack

T             Accumulator for ALU

 

All registers are 25 bits wide.  The most significant bit in T, T(24), is the carry produced by the 24-bit adder.  This carry bit is preserved as data in T when it is transferred to other registers and to the stacks.  Preserving the carry bit this way greatly simplifies the logic, and allows interrupts to be serviced when the next program word is fetched from memory, without having to save the carry bit and restore it on return.
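As a rough illustration of how the carry travels with the data, the Python sketch below models a 25-bit register value whose bit 24 is the adder carry.  The names add24, carry and value are illustrative assumptions and do not come from CPU24.VHD.

```python
MASK24 = (1 << 24) - 1   # low 24 data bits, T(23..0)

def add24(s, t):
    """Add the 24-bit data fields of s and t; the adder carry lands in bit 24."""
    return (s & MASK24) + (t & MASK24)   # result is at most 25 bits wide

def carry(t):
    """Return the carry bit T(24) of a 25-bit register value."""
    return (t >> 24) & 1

def value(t):
    """Return the 24-bit data field T(23..0)."""
    return t & MASK24

total = add24(0xFFFFFF, 0x000001)    # overflows the 24-bit data field
print(hex(value(total)), carry(total))
```

Because the carry is simply bit 24 of the stored word, copying the 25-bit value to a stack and back keeps the carry with it.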

 

P24 has two stacks:

 

S_stack  Data stack, 17 levels deep

R_stack  Return stack, 33 levels deep

 

The return stack is used to preserve return addresses on subroutine calls.  The data stack is used to pass parameters among the nested subroutine calls.  With these two stacks in the CPU hardware, P24 is optimized to support the Forth programming language.

 

To summarize: four 6-bit instructions are packed into one 24-bit word and are executed consecutively after the word is fetched from memory.  The data stack is 17 levels deep (including T), and the return stack is 33 levels deep.

 

The following diagram shows the architecture of the P24 processor.  It shows the registers, the stacks, and the data paths among them.

 

Not shown in the diagram is the connection between the T register and the external data bus.  When reading data from memory, the A register supplies the memory address to the address bus, and data is latched from the data bus into the T register.  When writing data into memory, the address is supplied by the A register, and data is written to the data bus from the T register.

 

 

Figure 1.               The architecture of P24

 

 

 Data Bus          Address Bus
     |            ^            ^
     |            |            |
     v            |            |
  |-----|      |-----|      |-----|
  |  I  |      |  P  |      |  A  |
  |-----|      |-----|      |-----|
                  ^            ^
                  |            |
                  v            v
|--------------|-----|      |-----|      |-----|------------|
| Return Stack |  R  |------|  T  |------|  S  | Data Stack |
|--------------|-----|      |-----|      |-----|------------|
                             ^   |          |
                             |   v          v
                             | |---------------|
                             | |      ALU      |
                             | |---------------|
                             |         |
                             |<--------|

 

                                                                                               

9.2          Functional Block Diagram of P24

 

 

These data path diagrams should be read with the CPU24.VHD file. 

 

The instruction decoding logic simply applies the proper control signals to the following register-load and multiplexer-select lines:

 

Clr                           Master reset

Clk                          Master clock, 0-40 MHz

t_sel                       Select input to T register

tload                       If set, load t_in into T register

spop                       If set, pop the data stack

spush                     If set, push T on the data stack

a_sel                       Select input to A register

aload                      If set, load a_in into A register

r_sel                       Select input to R register

rload                       If set, load r_in into R register

rpop                        If set, pop the return stack

rpush                      If set, push R on the return stack

p_sel                      Select input to P register

pload                      If set, load p_in into P register

m_sel                      Select output to Address bus

iload                       If set, load instruction from data bus to I register

reset                       Clear the machine instruction counter

slot                         Output of machine instruction counter to select instruction

 

The synchronous program execution unit clocks the slot signal, which selects the proper 6-bit instruction in the I register to produce the above control signals.  At the rising clock edge, the selected data are latched into the proper registers and stacks.  All data signals must stabilize before the next rising clock edge.

 

The architecture is simple and its components closely resemble one another, so a good layout should be easy to produce and the routing should not be difficult.

 

 

Figure 2.               The block diagrams of P24 components

 

 

The T and Data Stack Data Path

 

not t-------|                |-----------|                |-----------|
s xor t-----|                |           |                |           |
s and t-----|----t_in--------|     T     |----t-----------|  s_stack  |----s
s + t-------|                |           |       spop-----|           |
(s+t)/2-----|       tload----|           |       spush----|           |
t/2---------|       clk------|           |       clk------|           |
c&t/2-------|       clr------|           |       clr------|           |
(s+t)*2-----|                |-----------|                |-----------|
t*2&a-------|
t*2---------|
s-----------|
a-----------|
r-----------|
data--------|
            |
t_sel-------^

 

 

The A Register and A-Mux

 

                             |-----------|
t-----------|----a_in--------|           |----a
a+1---------|                |           |
(s+t)&a/2---|       aload----|     A     |
a*2+c-------|       clk------|           |
            |       clr------|           |
a_sel-------^                |-----------|

 

 

The Return Stack Data Path

 

            |                |-----------|                |-----------|
r_out-------|----r_in--------|     R     |----r-----------|  r_stack  |----r_out
r+1---------|                |           |       rpop-----|           |
p-----------|       rload----|           |       rpush----|           |
            |       clk------|           |       clk------|           |
            |       clr------|           |       clr------|           |
r_sel-------^                |-----------|                |-----------|

 

 

The Program Counter Data Path

 

            |                |-----------|
interrupt---|                |           |                |
p&i(17.0)---|----p_in--------|     P     |----p-----------|----address
p+1---------|                |           |       a--------|
r-----------|       pload----|           |                |
            |       clk------|           |                |
            |       clr------|           |                |
p_sel-------^                |-----------|       m_sel----^

 

 

The Instruction Latch and Decoder Data Path

 

                             |-----------|
                             |           |                |
data-------------------------|     I     |----i(23.0)-----|----code(5.0)
                    iload----|           |                |
                    clk------|           |                |
                    clr------|           |                |
                             |-----------|                |
                                                          |
                             |-----------|                |
                             |           |                |
                    reset----|   sync    |----slot(2.0)---^
                    clk------|           |
                    clr------|           |
                             |-----------|

 

On power-up, all registers and the stacks are cleared to zero while "clr" is held high.  When "clr" is lowered to zero, the master clock "clk" starts the CPU executing from memory location 0, where the cleared P register points.

 

 

 

9.3          Input/Output Signals and System Timing

 

P24 is very flexible in packaging, depending on the memory configuration.  These are the signals normally brought to I/O pins.  In certain applications, the memory is included on chip and the address bus and data bus do not have to be brought out.

 

CLK                        1-40 MHz master clock

A0-23                     Address bus to RAM, SRAM and I/O devices

D0-23                      Data bus for RAM, SRAM and I/O devices

CLR                        Low system reset (active low)

Vdd                         5V power supply

Vss                         Ground

WE                         Write enable (active low)

INT0-4                    External interrupt inputs

UART_IN              RS232 serial input pin

UART_OUT         RS232 serial output pin

 

All time periods noted in the following timing diagrams are in periods of the master clock.

 

Figure 3.               Timing of P24 instruction executions

 

 

Master Clock

|----|    |----|    |----|    |----|    |----|    |----|    |----|
|    |    |    |    |    |    |    |    |    |    |    |    |    |    |
     |----|    |----|    |----|    |----|    |----|    |----|    |----|

 

 

Slot0 Signal

|---------|                                       |---------|
|  slot0  |  slot1  |  slot2  |  slot3  |  slot4  |  slot0  |  slot1  |
|         |---------------------------------------|         |---------
   fetch    execute   execute   execute   execute    fetch    execute

 

 

call, jump, jz, jnc

|---------|         |---------|
|  slot0  |  slot1  |  slot0  |  slot1  |  slot2  |  slot3  |  slot4  |
|         |---------|         |---------------------------------------
   fetch    execute    fetch    execute   execute   execute   execute

 

 

NOP and RET instructions can be in any of the four slots.  When either instruction is executed, the next slot is forced to slot0, so the next instruction word is fetched and then executed.

 

The P24 implementation contains a very simple interrupt controller.  If an interrupt is pending on slot0, the program counter is pushed onto the return stack and the interrupt vector is placed in the program counter.  The interrupt vector is the current state of INT0-INT4.  Once an interrupt is serviced via execution of slot4, servicing of interrupts is automatically disabled until the execution of an RET instruction.  Immediately after the RET executes, any pending interrupt will be serviced.

 

When executing a right shift instruction SHR, the sign bit T(23) is preserved.  Bits T(23..1) are shifted to the right by one bit.  Bit T(0) is latched onto the UART_OUT pin, and the UART_IN pin is latched into the carry bit T(24).  This very simple mechanism allows a simple RS232 serial port to be built into the P24 core.  As the serial port is the only peripheral device required by eForth, this simple serial port opens a window for the user to access the resources provided by P24, and supports a powerful embedded Forth system to control and to program the P24 system.
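The SHR behavior described above can be sketched in Python as follows.  This is only a bit-level model under stated assumptions: the function names shr and send_bits are hypothetical, T is treated as a 25-bit integer, and start/stop bits and baud-rate timing are ignored entirely.

```python
MASK24 = 0xFFFFFF
SIGN = 0x800000          # T(23)

def shr(t, uart_in):
    """One SHR step on a 25-bit T; returns (new_t, uart_out)."""
    uart_out = t & 1                   # T(0) goes to the UART_OUT pin
    sign = t & SIGN                    # T(23) is preserved
    shifted = (t & MASK24) >> 1        # T(23..1) move down to T(22..0)
    new_t = ((uart_in & 1) << 24) | sign | shifted
    return new_t, uart_out

def send_bits(value, count):
    """Shift out `count` bits of value, least significant bit first."""
    bits = []
    t = value
    for _ in range(count):
        t, out = shr(t, 1)             # UART_IN assumed idle high here
        bits.append(out)
    return bits

print(send_bits(0x41, 8))
```

Each call to shr emits one bit and samples one bit, which is why a software loop of SHR instructions is enough to drive a serial line.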

 

 

9.4          P24 Instruction Set

 

The P24 instruction set is best explained using the register and data flow diagrams shown in Figures 1 and 2.  The T register is the center of the ALU, which takes data from the T and S registers and routes the results back to the T register.  The contents of T can be moved to the A register, pushed on the data stack S, and pushed on the return stack R.

 

The T register connects the data stack and the return stack as a large shift register.  Data can be shifted towards the return stack by the PUSH instruction, and shifted towards the data stack by the POP instruction.
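A minimal Python sketch of this shift-register view is given below.  It assumes each stack is a list whose last element is the top (S or R); the function names push and pop mirror the PUSH and POP instructions but are otherwise illustrative.

```python
def push(t, s_stack, r_stack):
    """PUSH: move T onto the return stack; T is refilled from the data stack."""
    r_stack.append(t)
    return s_stack.pop()

def pop(t, s_stack, r_stack):
    """POP: move T onto the data stack; T is refilled from the return stack."""
    s_stack.append(t)
    return r_stack.pop()

s_stack, r_stack = [1, 2], []
t = 3
t = push(t, s_stack, r_stack)    # data moves toward the return stack
print(t, s_stack, r_stack)
t = pop(t, s_stack, r_stack)     # and back toward the data stack
print(t, s_stack, r_stack)
```

A PUSH followed by a POP restores the original arrangement, which is what makes the return stack usable as temporary storage.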

 

Register A holds a memory address, which is used to read data from memory into the T register, or write the data in T register to external memory.  The address in A can be auto incremented, so that P24 can conveniently access data arrays in memory.

 

P is the program counter and it holds the address of the next instruction to be fetched from memory.  After an instruction is fetched, P is auto incremented and ready to read the next instruction.  When a CALL instruction is executed, the address in P is pushed on the return stack.  When a return (RET) instruction is executed, the previously saved address in R is popped back into P.  The execution sequence interrupted by CALL can now be resumed.

 

P24 is a microprocessor with 24-bit instructions.  Each instruction contains up to four 6-bit machine codes.  The instruction fields in a program word can be shown as follows:

 

 

Bits:  23 22 21 20 19 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
      | Slot1           | Slot2           | Slot3           | Slot4           |
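The field layout above can be expressed as a pair of helper functions.  The Python sketch below uses hypothetical names (pack_slots, unpack_slots); the opcodes in the example are the ones listed in Table 1 (DUP 1A, SHL 11, ADD 17, RET 01).

```python
def pack_slots(s1, s2, s3, s4):
    """Pack four 6-bit opcodes into one 24-bit word, Slot1 in bits 23-18."""
    word = 0
    for code in (s1, s2, s3, s4):
        word = (word << 6) | (code & 0x3F)
    return word

def unpack_slots(word):
    """Split a 24-bit word into [Slot1, Slot2, Slot3, Slot4]."""
    return [(word >> shift) & 0x3F for shift in (18, 12, 6, 0)]

word = pack_slots(0x1A, 0x11, 0x17, 0x01)    # dup shl add ret
print(hex(word), [hex(c) for c in unpack_slots(word)])
```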

 

 

There are 64 possible instructions in a 6-bit field.  The upper 32 are reserved for user applications; only the lower 32 are specified in P24.  These instructions fall into four classes:

 

0              Transfer Instructions

1              Memory Access Instructions

2              ALU Instructions

3              Register Instructions

 

JUMP, CALL, JZ and JNC instructions must appear in Slot1 of a program word, i.e. bits 23-18.  The remaining 18 bits, 17-0, contain the target address inside the current 256K word page, so these instructions can only reach code within the current page.  To reach other pages of memory, you have to push a 24-bit address on the return stack and execute the RET instruction.

 

The transfer instructions thus have the following forms:

 

                JUMP     aaaaaa aaaaaa aaaaaa

                CALL     aaaaaa aaaaaa aaaaaa

                JZ            aaaaaa aaaaaa aaaaaa

                JNC         aaaaaa aaaaaa aaaaaa
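To make the page restriction concrete, the Python sketch below encodes a transfer word and computes the resulting program counter.  It assumes the new P is formed from the upper 6 bits of the current P and the 18-bit address field, as suggested by the p&i(17.0) input in Figure 2; the function names are hypothetical.

```python
PAGE_MASK = 0xFC0000     # upper 6 bits of P select the 256K word page
ADDR_MASK = 0x03FFFF     # lower 18 bits hold the in-page address

def encode_transfer(opcode, target):
    """Build a transfer word: opcode in Slot1, 18-bit address in bits 17-0."""
    return ((opcode & 0x3F) << 18) | (target & ADDR_MASK)

def transfer_target(p, word):
    """New P after a taken transfer: current page bits plus the address field."""
    return (p & PAGE_MASK) | (word & ADDR_MASK)

call_word = encode_transfer(0x04, 0x12345)       # CALL, opcode 04 in Table 1
print(hex(call_word), hex(transfer_target(0x7C0010, call_word)))
```

Since only 18 address bits fit in the word, the page bits of P never change on a transfer, which is why a long jump needs the return stack and RET.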

 

The conditional jump instruction JZ is used to implement the IF, WHILE, and UNTIL words in Forth; it pops the number being tested in T.  The conditional jump instruction JNC causes a jump if the carry bit T(24) is cleared, and is useful in multiple precision math operations.  Unlike JZ, JNC does not pop the T register, so its contents can be tested again.
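The difference between the two conditional jumps can be sketched in Python as below.  This is a simplified model: the function names are hypothetical, the data stack is a list whose last element is S, and it is an assumption that JZ tests only the 24 data bits of T.

```python
def jz(t, s_stack):
    """JZ: test T for zero, then pop the data stack into T.
    Returns (jump_taken, new_t)."""
    taken = (t & 0xFFFFFF) == 0        # assumption: carry bit is not tested
    return taken, s_stack.pop()

def jnc(t):
    """JNC: jump when carry T(24) is clear; T is left untouched.
    Returns (jump_taken, new_t)."""
    taken = ((t >> 24) & 1) == 0
    return taken, t

print(jz(0, [5]))            # T consumed, S becomes the new T
print(jnc(0x1000000))        # carry set: no jump, T unchanged
```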

 

 


Table 1.  P24 Machine Code

 

Code       Name                      Function

 

Transfer Instructions

00            JUMP                     Jump to 18-bit address.  Must be in Slot1.

01            RET                        Subroutine return.

02            JZ                            Jump if T is 0.  Must be in Slot1.

03            JNC                         Jump if carry is reset.  Must be in Slot1.

04            CALL                     Call subroutine.  Must be in Slot1.

05                                            Reserved

06                                            Reserved

07                                            Reserved

 

Memory Access Instructions

08                                            Reserved

09            LDP                        Push memory at A to T.  Increment A.

0A           LDI                         Push in-line literal to T.

0B           LD                           Push memory at A to T.

0C                                           Reserved

0D           STP                         Pop T to memory at A.  Increment A.

0E                                            Reserved

0F            ST                           Pop T to memory at A.

 

ALU Instructions

10            COM                      Complement all bits in T.

11            SHL                        Shift T left 1 bit.

12            SHR                        Shift T right 1 bit.

13            MUL                       Multiplication step.

14            XOR                       Pop S and Exclusive OR it to T.

15            AND                       Pop S and AND it to T.

16            DIV                         Division step.

17            ADD                       Pop S and add it to T.

 

Register Instructions

18            POP                        Pop R to push T.

19            LDA                       Push A to T.

1A           DUP                        Duplicate T.

1B           OVER                     S to T, push original T.

1C           PUSH                     Pop T to push R.

1D           STA                        Pop T to A.

1E            NOP                        Do nothing.

1F            DROP                     Pop T.

 

 

 

 

Individual instructions and their functions are listed as follows:

 

 

 


JUMP     (SKIP, ELSE, AGAIN, REPEAT)

 

Code:                                      0

 

Usage:                                    000000 aaaaaa aaaaaa aaaaaa

 

Stack Effects:                        none

Carry:                                     no change

 

Function:                                              

 

Jump to the 18-bit address in the bit field 17-0 in the current 256K word page of memory.  It must be in Slot1 of a word (bits 23-18).

 

Restriction:

 

This instruction allows the program to be redirected to any location within a 256K word page of memory.  It does not cross page boundaries.  To jump to locations outside of a memory page, one has to push the target address on the return stack and execute the RET instruction to effect a long jump.  This restriction also applies to CALL, JZ and JNC.  See also RET.

 

Coding Example:

 

CODE 50us

        2 ldi skip

CODE 100us

        1 ldi

        then

        sta -138 ldi

        begin lda add

        -until

        drop

        ret

 

SKIP makes an unconditional jump to THEN, so that 50us shares the delay loop with 100us.

 

 


RET   (;)

 

Code:                                      1

 

Usage:                                    000001 xxxxxx xxxxxx xxxxxx

                                                cccccc 000001 xxxxxx xxxxxx

                                                cccccc cccccc 000001 xxxxxx

                                                cccccc cccccc cccccc 000001

 

Stack Effects:                        ( -- ; R: a -- )

Carry:                                     no change

 

Function:                                              

 

Pop the address on top of the return stack into the program counter P, thus resuming the execution sequence interrupted by the last CALL instruction.  Besides terminating a subroutine, this instruction may be used to effect a long jump to a location outside of the current memory page.

 

This instruction can be placed in any slot of a word.  The instructions before return are executed.  The instructions following return are ignored.
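The interplay of CALL, RET and the long jump can be sketched in Python as follows.  The class Flow and its method names are hypothetical; the model assumes P has already been incremented past the CALL word when the call executes, and it stands in for the push of a 24-bit target with a plain list append.

```python
class Flow:
    """Tiny model of the program counter and return stack."""

    def __init__(self, p=0):
        self.p = p
        self.r_stack = []

    def call(self, word):
        self.r_stack.append(self.p)              # address of the next word
        self.p = (self.p & 0xFC0000) | (word & 0x03FFFF)

    def ret(self):
        self.p = self.r_stack.pop()              # resume after the CALL

    def long_jump(self, target):
        self.r_stack.append(target & 0xFFFFFF)   # full 24-bit address
        self.ret()                               # RET loads all 24 bits of P

f = Flow(p=0x000101)            # CALL word was fetched from 0x000100
f.call((0x04 << 18) | 0x200)
print(hex(f.p))
f.ret()
print(hex(f.p))
f.long_jump(0x040000)           # a different 256K word page
print(hex(f.p))
```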

 

 

Coding Example:

 

In the subroutine thread model, RET is used to terminate all code words and colon words.  The Forth word ; simply compiles a RET to end a Forth word.

 

 

 


JZ            (IF, WHILE, UNTIL)

 

Code:                                      2

 

Usage:                                    000010 aaaaaa aaaaaa aaaaaa

 

Stack Effects:                        ( n -- )

Carry:                                     no change

 

Function:                                              

 

Conditionally jump to the 18-bit address in the bit field 17-0 in the current 256K word page of memory, if the T register contains a 0.  It must be in Slot1 of a word.

 

The T register is destroyed and the data stack is popped back to T.  This distinguishes it from JNC, which neither removes T nor pops the data stack.

 

Coding Example:

 

CODE ?DUP ( w -- w w | 0 )

   dup

   if dup ret then

   ret

 

 


JNC         (-UNTIL, -IF, -WHILE)

 

Code:                                      3

 

Usage:                                    000011 aaaaaa aaaaaa aaaaaa

 

Stack Effects:                        ( n -- n )

Carry:                                     no change

 

Function:                                              

 

Conditionally jump to the 18-bit address in the bit field 17-0 in the current 256K word page of memory, if the Carry flag (Bit 24 of T) is reset.  It must be in Slot1 of a word.

 

The T register and the data stack are preserved.  This distinguishes it from JZ, which removes T and pops the data stack.

 

 

Coding Example:

 

To test the negative flag T(23), it is shifted into carry T(24) and tested using JNC compiled by -IF.

 

CODE ABS ( n -- +n )

   dup shl

   -if drop com 1 ldi add

       ret

   then

   drop ret

 

 


CALL

 

Code:                                      4

 

Usage:                                    000100 aaaaaa aaaaaa aaaaaa

 

Stack Effects:                        ( -- ; R: -- a )

Carry:                                     no change

 

Function:                                              

 

Call a subroutine whose address is in the bit field 17-0 in the current 256K word page of memory.  It must be in Slot1 of a word.

 

The address of the next word is pushed on the return stack.  When a return instruction in the subroutine is encountered, this address is popped off the return stack and the next word is executed to resume the interrupted execution sequence.

 

Restriction:

 

This instruction allows the program to call any subroutine within the current 256K word page of memory.  It does not cross page boundaries.

 

Coding Example:

 

All Forth words are compiled as subroutine calls.  This is the most efficient way to build lists in Forth.

 

 

 

               

 


LDP

 

Code:                                      9

 

Usage:                                    001001 cccccc cccccc cccccc

                                                cccccc 001001 cccccc cccccc

                                                cccccc cccccc 001001 cccccc

                                                cccccc cccccc cccccc 001001

 

Stack Effects:                        ( -- n )

Carry:                                     reset to 0

 

Function:                                              

 

Fetch the contents of a memory location whose 24-bit address is in the A register and push that number onto the data stack.  The address in the A register is then incremented to facilitate accessing the next memory location.  It is most useful in reading values from a table in memory.

 

This fetch instruction is different from the @ instruction in Forth, which uses the address on the top of the data stack.

 

This instruction also resets the carry flag (Bit 24) in the T register.

 

Coding Example:

 

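Increment T                           sta ldp drop lda

This moves T into A, reads and discards the word at A so that A is auto incremented, and copies A back into T.  It costs 4 slots.

Otherwise,                             cccccc cccccc ldi add

                                                000000 000000 000000 000001

costs 6 slots: two for LDI and ADD, plus the four slots taken by the literal word.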
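The increment trick can be checked with a small register model.  The Python sketch below is an illustration only: the class Machine, its attribute names, and the use of a dictionary for memory are assumptions, and the data stack is a list whose last element is S.

```python
class Machine:
    """Minimal model of A, T and the data stack for STA, LDP, DROP, LDA."""

    def __init__(self, memory=None):
        self.mem = memory or {}    # address -> 24-bit word (assumed 0 if absent)
        self.a = 0
        self.t = 0
        self.s = []                # data stack below T

    def sta(self):                 # STA: pop T to A
        self.a = self.t & 0xFFFFFF
        self.t = self.s.pop()

    def ldp(self):                 # LDP: push memory at A to T, increment A
        self.s.append(self.t)
        self.t = self.mem.get(self.a, 0) & 0xFFFFFF   # carry bit 24 is reset
        self.a = (self.a + 1) & 0xFFFFFF

    def drop(self):                # DROP: pop T
        self.t = self.s.pop()

    def lda(self):                 # LDA: push A to T
        self.s.append(self.t)
        self.t = self.a

def increment_t(m):
    """Run the sequence: sta ldp drop lda."""
    for op in (m.sta, m.ldp, m.drop, m.lda):
        op()

m = Machine()
m.s, m.t = [99], 41
increment_t(m)
print(m.t, m.s)
```

The item below T is untouched by the sequence; only A is used as scratch storage.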

 

 

 


LDI

 

Code:                                      0A

 

Usage:                                    001010 cccccc cccccc cccccc

                                                nnnnnn nnnnnn nnnnnn nnnnnn

 

                                                cccccc 001010 cccccc cccccc

                                                nnnnnn nnnnnn nnnnnn nnnnnn

 

                                                cccccc cccccc 001010 cccccc

                                                nnnnnn nnnnnn nnnnnn nnnnnn

 

                                                cccccc cccccc cccccc 001010

                                                nnnnnn nnnnnn nnnnnn nnnnnn

 

Stack Effects:                        ( -- n )

Carry:                                     reset to 0

 

Function:                                              

 

Fetch the contents of the next word and push that number onto the data stack.  The program counter P is incremented past that word.  This instruction allows a program to place numbers on the data stack for later use.

 

This instruction also resets the carry flag (Bit 24) in the T register.

 

Coding Example:

 

Push 1 2 3 4 on data stack:

 

                                                ldi ldi ldi ldi

                                                1

                                                2

                                                3

                                                4
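The way in-line literals follow the instruction word can be sketched in Python as below.  The function run_ldi_word is hypothetical, memory is modeled as a list of 24-bit words, and the sketch assumes each LDI takes the next word at P in slot order, as the example above implies; opcodes other than LDI are simply skipped.

```python
LDI = 0x0A

def run_ldi_word(memory, p):
    """Execute the LDI slots of the word at address p.
    Returns (values pushed, new P)."""
    word = memory[p]
    p += 1                                   # P now points past the word
    pushed = []
    for shift in (18, 12, 6, 0):             # Slot1 .. Slot4
        if (word >> shift) & 0x3F == LDI:
            pushed.append(memory[p] & 0xFFFFFF)   # literal is the word at P
            p += 1                                # step over the literal
    return pushed, p

memory = [0x28A28A, 1, 2, 3, 4]              # ldi ldi ldi ldi, then literals
print(run_ldi_word(memory, 0))
```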

 

 

 


LD

 

Code:                                      0B

 

Usage:                                    001011 cccccc cccccc cccccc

                                                cccccc 001011 cccccc cccccc

                                                cccccc cccccc 001011 cccccc

                                                cccccc cccccc cccccc 001011

 

Stack Effects:                        ( -- n )

Carry:                                     reset to 0

 

Function:                                              

 

Fetch the contents of a memory location whose 24-bit address is in the A register and push that number onto the data stack.  The address in the A register is not modified.

 

This fetch instruction is different from the @ instruction in Forth, which uses the address on the top of the data stack.

 

This instruction also resets the carry flag (Bit 24) in the T register.

 

Coding Example:

 

 


STP

 

Code:                                      0D

 

Usage:                                    001101 cccccc cccccc cccccc

                                                cccccc 001101 cccccc cccccc

                                                cccccc cccccc 001101 cccccc

                                                cccccc cccccc cccccc 001101

 

Stack Effects:                        ( n -- )

Carry:                                     restored from data stack

 

Function:                                              

 

Pop the number off the data stack and store it into the memory location whose 24-bit address is in Register A.  The address in the A register is then incremented to facilitate the next memory access.  It is most useful in storing values to a table in the memory.

 

This store instruction is different from the ! instruction in Forth, which uses the address on the top of the data stack.

 

Coding Example:

 

See the copying program shown in LDP.

 

 


ST

 

Code:                                      0F

 

Usage:                                    001111 cccccc cccccc cccccc

                                                cccccc 001111 cccccc cccccc

                                                cccccc cccccc 001111 cccccc

                                                cccccc cccccc cccccc 001111

 

Stack Effects:                        ( n -- )

Carry:                                     restored from data stack

 

Function:                                              

 

Pop the number off the data stack and store it into the memory location whose 24-bit address is in Register A.  The address in the A register is not modified.

 

This store instruction is different from the ! instruction in Forth, which uses the address on the top of the data stack.

 

Coding Example:

 

CODE ! ( n a -- )

sta st ret

 

 


COM

 

Code:                                      10

 

Usage:                                    010000 cccccc cccccc cccccc

                                                cccccc 010000 cccccc cccccc

                                                cccccc cccccc 010000 cccccc

                                                cccccc cccccc cccccc 010000

 

Stack Effects:                        ( n1 -- n1* )

Carry:                                     no change

 

Function:                                              

 

Complement all 24 bits in the T register.  This is a one's complement operation.

 

Coding Example:

 

To generate a -1 in T register:

 

                                                zero com

 

OR has to be synthesized from COM and AND, using:

A or B = not( not(A) and not(B))

 

CODE OR ( n n -- n )          ( this looks pretty awkward, maybe )

   com push com               ( the last available opcode or NIP )

   pop and com ret            ( should be replaced with OR )
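The identity behind this definition can be checked with a short Python sketch.  The function names are invented for illustration, and values are masked to 24 bits to mimic the data width of T.

```python
MASK24 = 0xFFFFFF

def com(x):
    """One's complement of a 24-bit value, as COM does to T."""
    return ~x & MASK24

def or_from_com_and(a, b):
    """A or B = not( not(A) and not(B) ), using only COM and AND."""
    return com(com(a) & com(b))
```

For example, or_from_com_and(0x0F0F0F, 0xF00000) gives the same result as 0x0F0F0F | 0xF00000.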

 

 


SHL

 

Code:                                      11

 

Usage:                                    010001 cccccc cccccc cccccc

                                                cccccc 010001 cccccc cccccc

                                                cccccc cccccc 010001 cccccc

                                                cccccc cccccc cccccc 010001

 

Stack Effects:                        ( n -- 2n )

Carry:                                     Bit 23 of T is shifted into carry

 

Function:                                              

 

Shift the lower 24 bits in the T register to the left by 1 bit.  Bit-0 is cleared.

 

Coding Example:

 

Multiply T by 3:                   dup shl add

Multiply by 5:                       dup shl shl add

Multiply by 6:                       dup shl add shl
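These shift-and-add sequences can be mirrored in Python as a sanity check.  The sketch below is a model only: it ignores the carry bit, masks results to 24 bits, and uses invented function names.

```python
MASK24 = 0xFFFFFF

def shl(t):
    """Shift left by one bit within 24 bits (carry is ignored in this sketch)."""
    return (t << 1) & MASK24

def times3(n):
    return (n + shl(n)) & MASK24          # dup shl add

def times5(n):
    return (n + shl(shl(n))) & MASK24     # dup shl shl add

def times6(n):
    return shl((n + shl(n)) & MASK24)     # dup shl add shl
```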

 

SHL allows the sign bit T(23) to be tested as the carry T(24):

 

CODE 0< ( n -- f )

   shl

   -if drop -1 ldi ret

   then

   dup xor ( 0 ldi )

   ret
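The sign test can be sketched in Python, treating T as a 25-bit value whose bit 24 is the carry.  The names shl25 and zero_less are invented for this illustration, and the true flag is taken to be $FFFFFF, the 24-bit form of -1.

```python
MASK24 = 0xFFFFFF
CARRY = 1 << 24

def shl25(t):
    """Shift the low 24 bits left; the old bit 23 lands in bit 24 (carry)."""
    return (t & MASK24) << 1

def zero_less(n):
    """Model of 0< : true flag if bit 23 of n is set, else 0."""
    t = shl25(n)
    return MASK24 if t & CARRY else 0
```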

 

 


SHR

 

Code:                                      12

 

Usage:                                    010010 cccccc cccccc cccccc

                                                cccccc 010010 cccccc cccccc

                                                cccccc cccccc 010010 cccccc

                                                cccccc cccccc cccccc 010010

 

Stack Effects:                        ( n -- n/2 )

Carry:                                     loaded from serial input

 

Function:                                              

 

Shift the contents of the T register right by one bit.  Bit-0 is shifted to the bit-banged UART serial output. The sign (Bit23) is preserved.

 

Coding Example:

 

SHR is used to implement a simple UART.  The lowest bit in T, T(0) is shifted out to the UART serial output pin, and the UART serial input pin is loaded into carry for testing.

 

CODE EMIT ( c -- )

        $7F ldi and

        shl $FFFF01 ldi xor

        $0A ldi

        FOR shr 100us NEXT

        drop ret

CODE KEY ( -- c )

        $FFFFFF ldi

        begin   shr

        -until

        repeat   ( wait for start bit )

        50us

        7 ldi

        FOR

          100us shr

          -if $80 ldi xor then

        NEXT

        $FF ldi and

        100us ret
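A single SHR step, as described under Function above, can be modelled in Python.  This sketch assumes T is held as a 25-bit integer with the carry in bit 24; the serial pins are plain function arguments and return values, and the names shr and shift_out are hypothetical.  It does not model the timing delays or the start and stop framing used in EMIT and KEY.

```python
MASK24 = 0xFFFFFF
SIGN = 0x800000

def shr(t, serial_in):
    """Return (new_t, serial_out) for one SHR step.

    Bit 0 goes to the serial output, bit 23 (sign) is kept,
    and the serial input is loaded into the carry (bit 24).
    """
    data = t & MASK24
    serial_out = data & 1
    shifted = (data >> 1) | (data & SIGN)
    return ((serial_in & 1) << 24) | shifted, serial_out

def shift_out(value, count):
    """Collect the bits that appear on the serial output, lowest bit first."""
    bits = []
    t = value
    for _ in range(count):
        t, out = shr(t, serial_in=1)
        bits.append(out)
    return bits
```

Calling shift_out(0x41, 8) yields the eight data bits of the character in the order they would leave the output pin.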

 

 

 


MUL

 

Code:                                      13

 

Usage:                                    010011 cccccc cccccc cccccc

                                                cccccc 010011 cccccc cccccc

                                                cccccc cccccc 010011 cccccc

                                                cccccc cccccc cccccc 010011

 

Stack Effects:                        ( n1 n2 -- n1 n3 )

Carry:                                     unchanged

 

Function:                                              

 

Add the S register on the data stack to the T register if Bit-0 in A is set.  If Bit-0 in A is reset, the T register is not modified.  The T-A register pair is then shifted to the right by one bit.

 

This MUL instruction is useful as a multiplication step in implementing a fast software multiplication routine.  Repeating this instruction 24 times will multiply A and S and produce a 48-bit product in the T-A pair. (T is normally initialized to zero prior to the multiply sequence. However any non-zero initial value in T adds to the final result in the T-A pair.)

 

Coding Example:

 

Multiply two 24-bit unsigned integers.  Multiplicand is in S.  Multiplier is in A.

 

                                                mul mul mul mul

                                                mul mul mul mul

                                                mul mul mul mul

                                                mul mul mul mul

                                                mul mul mul mul

                                                mul mul mul mul

 

The 48-bit product is in T-A register pair and the multiplicand in S is preserved.
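The multiply step can be sketched in Python to show why 24 repetitions give the full product.  This model makes one assumption the text does not spell out: the carry out of the 24-bit add is shifted into T(23) along with the rest of the T-A pair.  The names mul_step and um_star are invented for illustration.

```python
MASK24 = 0xFFFFFF

def mul_step(t, a, s):
    """One MUL step: conditional add of S to T, then shift T-A right by one."""
    if a & 1:
        t = t + s                        # may carry into bit 24
    a = (a >> 1) | ((t & 1) << 23)       # T(0) moves into A(23)
    t = t >> 1                           # assumed: carry moves into T(23)
    return t, a

def um_star(u1, u2):
    """Multiply two unsigned 24-bit numbers with 24 MUL steps."""
    s = u1 & MASK24                      # multiplicand stays in S
    t, a = 0, u2 & MASK24                # multiplier starts in A, T is cleared
    for _ in range(24):
        t, a = mul_step(t, a, s)
    return (t << 24) | a                 # 48-bit product held in the T-A pair
```

Each step consumes one multiplier bit from the bottom of A and frees one bit at the top of A for the low half of the product, which is why no extra register is needed.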

 

Primitive multiplication routines are thus defined:

 

CODE UM* ( u u -- ud )

   sta 0 ldi

   mul mul mul mul

   mul mul mul mul

   mul mul mul mul

   mul mul mul mul

   mul mul mul mul

   mul mul mul mul

   push drop lda pop

   ret

 

 


XOR

 

Code:                                      14

 

Usage:                                    010100 cccccc cccccc cccccc

                                                cccccc 010100 cccccc cccccc

                                                cccccc cccccc 010100 cccccc

                                                cccccc cccccc cccccc 010100

 

Stack Effects:                        ( n1 n2 -- n3 )

Carry:                                     unchanged

 

Function:                                              

 

Pop S on the data stack and exclusive-OR it to the T register.  All 24 bits in T are affected.

 

Coding Example:

 

To clear T to zero:

 

                                                dup xor   ( now use more transparent “drop zero” )

 

To generate a zero in T register:

 

                                                dup dup xor           ( now use faster “zero” )

 

T is duplicated twice so that its original contents are saved on the stack.  The two top copies are then XOR'ed together: all reset bits remain reset and all set bits are reset.  Thus a 0 is created in T.
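The effect of dup dup xor on the data stack can be traced with a small Python model.  The stack is assumed to be a list whose last element is T, and the helper names are hypothetical.

```python
MASK24 = 0xFFFFFF

def dup(stack):
    """Push a copy of T."""
    stack.append(stack[-1])

def xor(stack):
    """Pop S and exclusive-OR it with T."""
    t = stack.pop()
    stack.append((stack.pop() ^ t) & MASK24)

stack = [0x5A5A5A]
dup(stack)
dup(stack)
xor(stack)
```

After the sequence the original value is still on the stack, with a zero above it in T.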

 

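Loading -1 as a literal costs 5 slots, one for ldi plus a full 24-bit word for the value, compared with 4 slots for the xor and com sequence: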

 

                                                ldi cccccc cccccc cccccc

                                                -1

vs                           

                                                dup dup xor com  ( now use faster “zero com” )

 

 

 


AND

 

Code:                                      15

 

Usage:                                    010101 cccccc cccccc cccccc

                                                cccccc 010101 cccccc cccccc

                                                cccccc cccccc 010101 cccccc

                                                cccccc cccccc cccccc 010101

 

Stack Effects:                        ( n1 n2 -- n3 )

Carry:                                     unchanged

 

Function:                                              

 

Pop S on the data stack and AND it to the T register.  All 24 bits in T are affected.

 

Coding Example:
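No coding example is given for AND.  As an illustration only, the masking idiom used in EMIT under SHR ($7F ldi and) can be modelled in Python; the function name and the sample value are invented for this sketch.

```python
MASK24 = 0xFFFFFF

def and_op(s, t):
    """Model of AND: combine S and T, keeping 24 bits."""
    return s & t & MASK24

# Keep only the low 7 bits of a value, as EMIT does before sending a character.
low7 = and_op(0xABCDEF, 0x7F)
```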

 

 


DIV

 

Code:                                      16

 

Usage:                                    010110 cccccc cccccc cccccc

                                                cccccc 010110 cccccc cccccc

                                                cccccc cccccc 010110 cccccc

                                                cccccc cccccc cccccc 010110

 

Stack Effects:                        ( n1 n2 -- n1 n3 )

Carry:                                     unchanged (I think – need to check.)

 

Function:                                              

 

Add the S register on the data stack to the T register.  If the addition produces a carry, place the sum in T; otherwise leave T unchanged.  The T-A register pair is then shifted to the left by one bit, and the carry is shifted into A(0).

 

This DIV instruction is useful as a division step in implementing a fast software division routine.  Repeating this instruction 25 times will divide a 48-bit number originally in the T-A register pair by the negative of the number in S, leaving the result in A and remainder in T.

 

Coding Example:

 

Divide a 48-bit positive integer by a positive divisor.  The negated divisor is in S.

 

                                                div div div div

                                                div div div div

                                                div div div div

                                                div div div div

                                                div div div div

                                                div div div div

                                                div shr

 

(Note: I think that this last shr undoes the most recent shl that is part of div, aligning the remainder properly in T.  Also, I think this division actually only works properly for 47-bit unsigned numbers in T-A. -- WRC)

 

Primitive division routines are thus defined:

 

CODE UM/MOD ( ud u -- ur uq )

   com 1 ldi add sta

   push lda push sta

   pop pop

   skip

CODE /MOD ( n n -- r q )

   com 1 ldi add push

   sta pop 0 ldi

   then

   div div div div

   div div div div

   div div div div

   div div div div

   div div div div

   div div div div

   div 1 ldi xor shr

   push drop pop lda

   ret
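The division step described under Function can be sketched in Python.  This is a model under several assumptions, none confirmed by the text: T is kept to 24 bits after each shift (the bit leaving T(23) is dropped), the final alignment is a plain logical right shift (which matches SHR only while bit 23 of T is clear), and the high half of the dividend is smaller than the divisor.  The names div_step and um_mod are invented for illustration.

```python
MASK24 = 0xFFFFFF

def div_step(t, a, s):
    """One DIV step: trial add of S (the negated divisor), then shift T-A left."""
    total = t + s
    carry = total >> 24                  # carry out of the 24-bit adder
    if carry:
        t = total & MASK24               # keep the sum only when it carried
    t = ((t << 1) | (a >> 23)) & MASK24  # A(23) moves into T(0)
    a = ((a << 1) | carry) & MASK24      # carry becomes the new quotient bit
    return t, a

def um_mod(dividend, divisor):
    """Divide an unsigned double-length number; return (remainder, quotient)."""
    s = (-divisor) & MASK24              # negated divisor, as held in S
    t, a = (dividend >> 24) & MASK24, dividend & MASK24
    for _ in range(25):
        t, a = div_step(t, a, s)
    return t >> 1, a                     # final shift realigns the remainder
```

The first of the 25 steps only tests the high half of the dividend; the remaining 24 steps each produce one quotient bit, which is consistent with the note above that the trailing shr undoes the last shift.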


ADD

 

Code:                                      17

 

Usage:                                    010111 cccccc cccccc cccccc

                                                cccccc 010111 cccccc cccccc

                                                cccccc cccccc 010111 cccccc

                                                cccccc cccccc cccccc 010111

 

Stack Effects:                        ( n1 n2 -- n1+n2 )

Carry:                                     set or cleared by the carry out of n1+n2

 

Function:                                              

 

Pop S on the data stack and add it to the T register. 

 

Coding Example:

 

The primitive addition in eForth is thus defined:

 

CODE UM+  ( n n -- n carry )        ( don’t use this if you want speed -- WRC )

   add

   -if 1 ldi ret

   then

   dup dup xor ( 0 )

   ret
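What UM+ returns can be modelled in Python by treating the adder result as a 25-bit value whose bit 24 is the carry.  The function name um_plus is hypothetical and the sketch ignores the branch structure of the code above.

```python
MASK24 = 0xFFFFFF

def um_plus(n1, n2):
    """Return (sum, carry) for an unsigned 24-bit addition."""
    total = (n1 & MASK24) + (n2 & MASK24)   # up to 25 bits
    return total & MASK24, total >> 24
```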

 


POP

 

Code:                                      18

 

Usage:                                    011000 cccccc cccccc cccccc

                                                cccccc 011000 cccccc cccccc

                                                cccccc cccccc 011000 cccccc

                                                cccccc cccccc cccccc 011000

 

Stack Effects:                        ( -- n ; R: n -- )

Carry:                                     unchanged

 

Function:                                              

 

Pop the R register on the return stack to the T register.  Original contents in T are pushed on the data stack.

 

Coding Example:

 

Exchanging A and T            lda push sta pop

Exchanging A and R            lda pop sta push

Increment T                           sta ldp drop lda                    ( now use “one add” )

Decrement T                         dup dup xor com add           ( now use “zero com add” )
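The two register exchanges can be traced with a small Python model.  The class below is only a sketch: each stack is a list whose last element is the top register (T or R), A is a plain attribute, and the class and method names are invented for illustration.

```python
class P24Model:
    """Tiny model of the data stack, return stack and A register."""

    def __init__(self, data, ret, a):
        self.data = list(data)   # last element is T
        self.ret = list(ret)     # last element is R
        self.a = a

    def lda(self):
        self.data.append(self.a)             # push A onto the data stack

    def sta(self):
        self.a = self.data.pop()             # pop T into A

    def push(self):
        self.ret.append(self.data.pop())     # move T to the return stack

    def pop(self):
        self.data.append(self.ret.pop())     # move R to the data stack

    def run(self, program):
        for name in program.split():
            getattr(self, name)()
        return self

swap_a_t = P24Model(data=[1, 2], ret=[9], a=7).run("lda push sta pop")
swap_a_r = P24Model(data=[1, 2], ret=[9], a=7).run("lda pop sta push")
```

In the first run T and A trade values while R is untouched; in the second, A and R trade values while the data stack ends as it started.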

 

 

 


LDA

 

Code:                                      19

 

Usage:                                    011001 cccccc cccccc cccccc

                                                cccccc 011001 cccccc cccccc

                                                cccccc cccccc 011001 cccccc

                                                cccccc cccccc cccccc 011001

 

Stack Effects:                        ( -- a )

Carry:                                     unchanged

 

Function:                                              

 

Copy the contents in the A register to the T register.  The original content of the T register is pushed on the data stack.  With LDA and STA, the A register can serve as a scratch pad register to save and restore the contents of the T register.

 

Coding Example: (see example for POP)

 

 

 


DUP

 

Code:                                      1A

 

Usage:                                    011010 cccccc cccccc cccccc

                                                cccccc 011010 cccccc cccccc

                                                cccccc cccccc 011010 cccccc

                                                cccccc cccccc cccccc 011010

 

Stack Effects:                        ( n -- n n )

Carry:                                     unchanged

 

Function:                                              

 

Duplicate the T register: a copy of T is pushed on the data stack and T itself is unchanged.

 

Coding Example:

 

Decrement T                         dup dup xor com add           ( now use “zero com add” )

 

 


OVER

 

Code:                                      1B

 

Usage:                                    011011 cccccc cccccc cccccc

                                                cccccc 011011 cccccc cccccc

                                                cccccc cccccc 011011 cccccc

                                                cccccc cccccc cccccc 011011

 

Stack Effects:                        ( n1 n2 -- n1 n2 n1 )

Carry:                                     unchanged

 

Function:                                              

 

S is copied into the T register.  The original contents of the T register are pushed onto the data stack.

 

Coding Example:

 

CODE 2DUP ( n1 n2 -- n1 n2 n1 n2 )

   over over ret
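The stack effect of two OVER instructions can be checked with a short Python sketch.  The data stack is assumed to be a list whose last element is T; the function names are hypothetical.

```python
def over(stack):
    """Push a copy of the second item (S) on top of the stack."""
    stack.append(stack[-2])

def two_dup(stack):
    """Model of 2DUP as over over."""
    over(stack)
    over(stack)

stack = [1, 2]
two_dup(stack)
```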

 

 


PUSH

 

Code:                                      1C

 

Usage:                                    011100 cccccc cccccc cccccc

                                                cccccc 011100 cccccc cccccc

                                                cccccc cccccc 011100 cccccc

                                                cccccc cccccc cccccc 011100

 

Stack Effects:                        ( n -- ; R: -- n )

Carry:                                     unchanged

 

Function:                                              

 

Push the contents of the T register onto the return stack, then pop S from the data stack into the T register.

 

Coding Example:

 

CODE ROT ( w1 w2 w3 -- w2 w3 w1 )

   push push sta pop

   pop lda ret
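The ROT sequence can be followed step by step in Python.  This sketch assumes an empty return stack on entry and models both stacks as lists whose last element is the top; the function name rot is invented for illustration.

```python
def rot(data):
    """Trace push push sta pop pop lda on a copy of the data stack."""
    data = list(data)
    ret = []
    ret.append(data.pop())   # push: w3 to the return stack
    ret.append(data.pop())   # push: w2 to the return stack
    a = data.pop()           # sta:  w1 into A
    data.append(ret.pop())   # pop:  w2 back to the data stack
    data.append(ret.pop())   # pop:  w3 back to the data stack
    data.append(a)           # lda:  w1 on top
    return data
```

The A register serves as the one extra holding place that lets the third item be lifted over the other two.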

 

 


STA

 

Code:                                      1D

 

Usage:                                    011101 cccccc cccccc cccccc

                                                cccccc 011101 cccccc cccccc

                                                cccccc cccccc 011101 cccccc

                                                cccccc cccccc cccccc 011101

 

Stack Effects:                        ( a -- )

Carry:                                     no change

 

Function:                                              

 

Copy the contents of the T register into the A register, then pop S from the data stack into the T register.  This instruction initializes the A register so that it can be used to fetch data from memory or store data into memory.

 

Coding Example:

 

CODE ! ( n a -- )

      sta st ret

 

 


NOP

 

Code:                                      1E

 

Usage:                                    011110 xxxxxx xxxxxx xxxxxx

                                                cccccc 011110 xxxxxx xxxxxx

                                                cccccc cccccc 011110 xxxxxx

                                                cccccc cccccc cccccc 011110

 

Stack Effects:                        (  -- )

Carry:                                     no change

 

Function:                                              

 

No operation.  This instruction forces the execution state back to slot 0, so that any remaining slots in the current word are skipped and the next word is fetched and executed.

 

Coding Example: usually inserted by assembler.
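The slot layout shown under Usage can be illustrated with a Python sketch that unpacks a 24-bit word into its four 6-bit slots.  The sketch assumes, from the don't-care bits in the Usage patterns, that slots following a NOP are not executed; the function names are invented, and the opcodes for DUP ($1A) and ADD ($17) are taken from their sections above.

```python
NOP = 0x1E

def slots(word):
    """Split a 24-bit program word into four 6-bit opcodes, slot 0 first."""
    return [(word >> shift) & 0x3F for shift in (18, 12, 6, 0)]

def executed_slots(word):
    """Opcodes that do work before a NOP ends the word (assumed behaviour)."""
    done = []
    for opcode in slots(word):
        if opcode == NOP:
            break
        done.append(opcode)
    return done

# dup add nop, with an arbitrary pattern left in the last slot
word = (0x1A << 18) | (0x17 << 12) | (NOP << 6) | 0x3F
```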

 

 


DROP

 

 

Code:                                      1F

 

Usage:                                    011111 cccccc cccccc cccccc

                                                cccccc 011111 cccccc cccccc

                                                cccccc cccccc 011111 cccccc

                                                cccccc cccccc cccccc 011111

 

Stack Effects:                        ( n -- )

Carry:                                     unchanged

 

Function:                                              

 

Pop S on the data stack and store it to the T register.  The original contents in the T register are lost.

 

Coding Example: see example for jump.