P24 Microprocessor User's Manual

9. P24 CPU Architecture

P24 is a “Minimal Instruction Set Computer” design patterned after Mr. Chuck Moore's MuP21. P24 has a 24-bit CPU core with a dual-stack architecture intended to execute Forth-like instructions efficiently. The processor design is kept simple so that it can be implemented in field programmable gate arrays. P24 employs a RISC-like instruction set with four 6-bit instructions packed into each 24-bit word. A 6-bit code can accommodate 64 machine instructions. Currently only 26 are implemented. The rest are reserved for users to define their own instructions.

Following is a list of unique features of P24:

* 24-bit address and data buses

* 6-bit RISC-like CPU instructions

* 4-deep instruction cache

* 17-deep data stack

* 33-deep return stack

* Current implementation runs at 10 MHz in FPGA

9.1 Registers and Stacks

P24 has the following registers:

A Address Register, supplying address for memory read and write

I Instruction Latch, holding instructions to be executed

P Program Counter, pointing to the next program word in memory

R Top of Return Stack

S Top of Data stack

T Accumulator for ALU

All registers are 25 bits wide. The most significant bit in T, T(24), is the carry produced by the 24-bit adder. This carry bit is preserved as data in T when T is transferred to other registers and to the stacks. Preserving the carry bit greatly simplifies the logic processing of data, and allows interrupts to be serviced when the next program word is fetched from memory, without having to save the carry bit and restore it on return.
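The role of bit 24 can be illustrated with a small Python model. This is only a sketch: the helper names add25 and carry_of are invented here, and it assumes the adder uses only the 24 data bits of its two operands.

```python
MASK24 = 0xFFFFFF   # the 24 data bits of a register

def add25(s, t):
    """Add the 24 data bits of s and t.

    The result is at most 25 bits wide; bit 24 is the carry out,
    which stays with the value just as T(24) stays with T.
    """
    return (s & MASK24) + (t & MASK24)

def carry_of(reg):
    """Return the carry bit (bit 24) of a 25-bit register value."""
    return (reg >> 24) & 1

t = add25(0xFFFFFF, 0x000001)
print(hex(t & MASK24), carry_of(t))
```

Because the carry travels with the value, copying t to another variable (or to a stack) keeps the carry available for a later test.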

P24 has two stacks:

S_stack Data stack, 17 levels deep

R_stack Return stack, 33 levels deep

The return stack is used to preserve return addresses on subroutine calls. The data stack is used to pass parameters among the nested subroutine calls. With these two stacks in the CPU hardware, P24 is optimized to support the Forth programming language.

The 24-bit P24 CPU sports a small, RISC-like instruction set. Four 6-bit instructions are packed into one 24-bit word, and are executed consecutively after a word is fetched from memory. The P24 CPU has a two-stack architecture that is easily programmed in Forth. The data stack is 17 levels deep (including T), and the return stack is 33 levels deep.

The following diagram shows the architecture of the P24 processor. It shows the registers, the stacks, and the data paths among them.

Not shown in the diagram is the connection between the T register and the external data bus. When reading data from memory, the A register supplies the memory address to the address bus, and data is latched from the data bus into the T register. When writing data into memory, the address is supplied by the A register, and data is written to the data bus from the T register.

Figure 1. The architecture of P24

Data Bus Address Bus

| ^ ^

| | |

v | |

|-----| |-----| |-----|

| I | | P | | A |

|-----| |-----| |-----|

^ ^

| |

v v

|------------------------------------------|-----| |-----| |-----|-------------------------||

| Return Stack | R | -------------- | T | -------------- | S | Data Stack |

|------------------------------------------|-----| |-----| |-----|--------|----------------- |

^ | |

| v v

| |------------------|

| | ALU |

| |------------------|

|<------|

9.2 Functional Block Diagram of P24

These data path diagrams should be read with the CPU24.VHD file.

The instruction decoding logic simply applies the proper control signals to the following register loading and multiplexer selecting signals:

Clr Master reset

Clk Master clock, 0-40 MHz

t_sel Select input to T register

tload If set, load t_in into T register

spop If set, pop the data stack

spush If set, push T on the data stack

a_sel Select input to A register

aload If set, load a_in into A register

r_sel Select input to R register

rload If set, load r_in into R register

rpop If set, pop the return stack

rpush If set, push R on the return stack

p_sel Select input to P register

pload If set, load p_in into P register

m_sel Select output to Address bus

iload If set, load instruction from data bus to I register

reset Clear the machine instruction counter

slot Output of machine instruction counter to select instruction

The synchronous program execution unit clocks the slot signal, which selects the proper 6-bit instructions in the I register to produce the above control signals. At the rising clock edge, the selected data are latched into the proper register and stacks. All data signals must stabilize before the next rising clock edge strikes.

The architecture is very simple and components are very similar to one another. It should be very easy to do a good layout, and the routing should not be difficult.

Figure 2. The block diagrams of P24 components

The T and Data Stack Data Path

not t-------| |-----------| |-----------|

s xor t-----| | | | |

s and t-----|-- t_in-----| T |-- t--------| s_stack |----s

s + t-------| | | spop-----| |

(s+t)/2-----| tload----| | spush----| |

t/2 --------| clk------| | clk------| |

c&t/2-------| clr------| | clr------| |

(s+t)*2-----| |-----------| |-----------|

t*2&a-------|

t*2---------|

s --------|

a ---------|

r-----------|

data -------|

t_sel-------^

The A Register and A-Mux

|-----------|

t ----------|-- a_in-----| |---a

a+1---------| | |

(s+t)&a/2---| aload----|A |

a*2+c-------| clk------| |

| clr------| |

a_sel-------^ |-----------|

The Return Stack Data Path

| |-----------| |-----------|

r_out-------|--r_in-----| R |--r--------| r_stack |---r_out

r+1---------| | | rpop-----| |

p-----------| rload----| | rpush----| |

| clk------| | clk------| |

| clr------| | clr------| |

r_sel-------^ |-----------| |-----------|

The Program Counter Data Path

| |-----------|

interrupt---| | | |

p&i(17.0)---|--p_in-----| P |--p--------|---address

p+1---------| | | a--------|

r-----------| pload----| | |

| clk------| | |

| clr------| | |

p_sel-------^ |-----------| m_sel----^

The Instruction Latch and Decoder Data Path

|-----------|

| | |

data--------------------| I |--i(23.0)- |---code(5.0)

iload-----| | |

clk-------| | |

clr-------| | |

|-----------| |

| | |

reset-----| sync |-slot(2.0)-^

| |

clk-------| |

clr-------| |

|--------- |

On power-up, all registers and the stacks are cleared to zero while "clr" is held high. When "clr" is lowered to zero, the master clock "clk" starts the CPU from memory location 0, which is where the cleared P register points.

9.3 Input/Output Signals and System Timing

P24 is very flexible in packaging, depending on the memory configuration. These are the signals normally brought to I/O pins. In certain applications, the memory is included on chip and the address bus and data bus do not have to be brought out.

CLK 1-40 MHz master clock

A0-23 Address bus to RAM, SRAM and I/O devices

D0-23 Data bus for RAM, SRAM and I/O devices

CLR Low system reset (active low)

Vdd 5V power supply

Vss Ground

WE Write enable (active low)

INT0-4 External interrupt inputs

UART_IN RS232 serial input pin

UART_OUT RS232 serial output pin

All time periods noted in the following timing diagrams are in periods of the master clock.

Figure 3. Timing of P24 instruction executions

Master Clock

|----------| |----------| |----------| |----------| |----------| |----------| |----

| | | | | | | | | | | | |

| |----------| |----------| |----------| |----------| |----------| |----------|

Slot0 Signal

|----------| |----------| |----------|

| |-------------------------------------------| |-------------------------------------------| |------

fetch execute execute execute execute fetch execute execute execute execute fetch

call, jump, jz, jnc

|----------| |----------|

| |----------| |----------------------------------------

fetch execute execute execute execute execute ...

NOP and RET instructions can be in any of the four slots. When either of these instructions is executed, slot0 is forced to be the next slot, so the next instruction word is fetched and then executed.

The P24 implementation contains a very simple interrupt controller. If an interrupt is pending on slot0, the program counter is pushed on the return stack and the interrupt vector is placed in the program counter. The interrupt vector is the current state of INT0-INT4. Once an interrupt is serviced via execution of slot4, servicing of interrupts is automatically disabled until the execution of an RET instruction. Immediately after the RET execution, any pending interrupt will be serviced.

When executing a right shift instruction SHR, the sign bit T(23) is preserved. Bits T(23..1) are shifted to the right by one bit. Bit T(0) is latched onto the UART_OUT pin, and UART_IN pin is latched into the carry bit T(24). This very simple mechanism allows a simple RS232 serial port to be built in P24 core. As the serial port is the only peripheral device required by eForth, this simple serial port opens a window for the user to access the resources provided by P24, and supports a powerful embedded Forth system to control and to program the P24 system.
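The SHR behavior described above can be modeled in a few lines of Python. This is a sketch of the text only, not of the VHDL; the names shr_step and shift_out are invented for illustration.

```python
MASK24 = 0xFFFFFF

def shr_step(t, uart_in):
    """Model one SHR: return (new_t, uart_out)."""
    data = t & MASK24
    uart_out = data & 1                      # T(0) goes to UART_OUT
    sign = data & 0x800000                   # T(23) is preserved
    shifted = sign | (data >> 1)             # T(23..1) move right one bit
    new_t = ((uart_in & 1) << 24) | shifted  # UART_IN lands in carry T(24)
    return new_t, uart_out

def shift_out(value, count):
    """Shift 'count' bits of 'value' out, least significant bit first."""
    bits = []
    t = value
    for _ in range(count):
        t, out = shr_step(t, uart_in=1)      # idle input line assumed high
        bits.append(out)
    return bits

print(shift_out(0x41, 8))
```

Repeating SHR with a fixed delay between steps is what turns this shifter into a serial transmitter; the delay itself is left out of the model.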

9.4 P24 Instruction Set

The P24 instruction set can be best explained using the register and data flow diagrams shown in Figures 1 and 2. The T register is the center of the ALU, which takes data from the T and S registers and routes the results back to the T register. The contents of T can be moved to the A register, pushed on the data stack S, and pushed on the return stack R.

The T register connects the data stack and the return stack as a large shift register. Data can be shifted towards the return stack by the PUSH instruction, and shifted towards the data stack by the POP instruction.

Register A holds a memory address, which is used to read data from memory into the T register, or write the data in T register to external memory. The address in A can be auto incremented, so that P24 can conveniently access data arrays in memory.

P is the program counter and it holds the address of the next instruction word to be fetched from memory. After a word is fetched, P is auto incremented and ready to read the next word. When a CALL instruction is executed, the address in P is pushed on the return stack. When a return (RET) instruction is executed, the previously saved address in R is popped back into P. The execution sequence interrupted by CALL can now be resumed.

P24 is a microprocessor with 24-bit instruction words. Each word contains up to four 6-bit machine codes. The instruction fields in a program word can be shown as follows:

Bits: 23 22 21 20 19 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
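As an illustration of the field layout, the following Python sketch packs four 6-bit codes into one 24-bit word and unpacks them again. The function names are invented here; the slot positions (bits 23-18, 17-12, 11-6 and 5-0) follow the usage patterns shown for each instruction in this chapter.

```python
def pack_word(slot1, slot2, slot3, slot4):
    """Pack four 6-bit machine codes into one 24-bit program word."""
    for code in (slot1, slot2, slot3, slot4):
        if not 0 <= code < 64:
            raise ValueError("machine code must fit in 6 bits")
    return (slot1 << 18) | (slot2 << 12) | (slot3 << 6) | slot4

def unpack_word(word):
    """Split a 24-bit program word into its four 6-bit fields."""
    return tuple((word >> shift) & 0x3F for shift in (18, 12, 6, 0))

# Machine codes taken from Table 1.
DUP, SHL, ADD, RET = 0x1A, 0x11, 0x17, 0x01

word = pack_word(DUP, SHL, ADD, RET)   # "dup shl add ret"
print(f"{word:06X}", unpack_word(word))
```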

There are 64 possible instructions in a 6-bit field. Half of these are reserved for user applications. Only the lower 32 instructions are specified in P24. These instructions consist of four classes:

0 Transfer Instructions

1 Memory Access Instructions

2 ALU Instructions

3 Register Instructions

JUMP, CALL, JZ and JNC instructions must appear in Slot1 of a program word, i.e. bits 23-18. The remaining 18 bits, 17-0, contain the address inside the current 256K word page, so these instructions can only reach code within the current page. To reach other pages of memory, you have to push a 24-bit address on the return stack and execute the RET instruction.

The transfer instructions thus have the following forms:

JUMP aaaaaa aaaaaa aaaaaa

CALL aaaaaa aaaaaa aaaaaa

JZ aaaaaa aaaaaa aaaaaa

JNC aaaaaa aaaaaa aaaaaa
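The page restriction can be sketched in Python as follows. The sketch assumes that the target address is formed by keeping bits 23-18 of P and replacing bits 17-0 with the address field of the word; the function names are hypothetical.

```python
PAGE_MASK = 0xFC0000   # bits 23-18: selects the 256K word page
ADDR_MASK = 0x03FFFF   # bits 17-0: address field of a transfer word

def transfer_target(p, word):
    """Combine the page bits of P with the 18-bit address field."""
    return (p & PAGE_MASK) | (word & ADDR_MASK)

def needs_long_jump(p, target):
    """True when target lies outside the page P is in."""
    return (p & PAGE_MASK) != (target & PAGE_MASK)

call_word = (0x04 << 18) | 0x00456        # CALL to address 00456 in the page
print(f"{transfer_target(0x040123, call_word):06X}")
print(needs_long_jump(0x040123, 0x080000))
```

When needs_long_jump is true, the target has to be reached by pushing the full 24-bit address on the return stack and executing RET.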

The conditional jump instruction JZ is used to implement the IF, WHILE, and UNTIL words in Forth, because it pops the number being tested in T. The conditional jump instruction JNC causes a jump if the carry bit T(24) is cleared. It is useful in multiple precision math operations. Unlike JZ, JNC does not pop the T register, so its contents can be tested again.

Table 1. P24 Machine Code

Code Name Function

Transfer Instructions

00 JUMP Jump to 18 bit address. Must be in Slot1.

01 RET Subroutine return.

02 JZ Jump if T is 0. Must be in Slot1.

03 JNC Jump if carry is reset. Must be in Slot1.

04 CALL Call subroutine. Must be in Slot1.

05 Reserved

06 Reserved

07 Reserved

Memory Access Instructions

08 Reserved

09 LDP Push memory at A to T. Increment A.

0A LDI Push in-line literal to T.

0B LD Push memory at A to T.

0C Reserved

0D STP Pop T to memory at A. Increment A.

0E Reserved

0F ST Pop T to memory at A.

ALU Instructions

10 COM Complement all bits in T.

11 SHL Shift T left 1 bit.

12 SHR Shift T right 1 bit.

13 MUL Multiplication step.

14 XOR Pop S and Exclusive OR it to T.

15 AND Pop S and AND it to T.

16 DIV Division step.

17 ADD Pop S and add it to T.

Register Instructions

18 POP Pop R to push T.

19 LDA Push A to T.

1A DUP Duplicate T.

1B OVER S to T, push original T.

1C PUSH Pop T to push R.

1D STA Pop T to A.

1E NOP Do nothing.

1F DROP Pop T.
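Table 1 translates directly into a lookup table. The Python sketch below uses it to list the mnemonics in a program word. It is an illustration only: the function name is invented, and the text does not say how reserved codes or transfer codes outside Slot1 behave, so the sketch simply prints what it finds.

```python
MNEMONICS = {
    0x00: "JUMP", 0x01: "RET", 0x02: "JZ", 0x03: "JNC", 0x04: "CALL",
    0x09: "LDP", 0x0A: "LDI", 0x0B: "LD", 0x0D: "STP", 0x0F: "ST",
    0x10: "COM", 0x11: "SHL", 0x12: "SHR", 0x13: "MUL", 0x14: "XOR",
    0x15: "AND", 0x16: "DIV", 0x17: "ADD", 0x18: "POP", 0x19: "LDA",
    0x1A: "DUP", 0x1B: "OVER", 0x1C: "PUSH", 0x1D: "STA", 0x1E: "NOP",
    0x1F: "DROP",
}
TRANSFERS_WITH_ADDRESS = {"JUMP", "JZ", "JNC", "CALL"}

def disassemble(word):
    """Return the mnemonics found in one 24-bit program word."""
    first = MNEMONICS.get((word >> 18) & 0x3F, "reserved")
    if first in TRANSFERS_WITH_ADDRESS:
        # Slot1 transfer: the other 18 bits are an address, not codes.
        return [f"{first} {word & 0x3FFFF:05X}"]
    names = []
    for shift in (18, 12, 6, 0):
        name = MNEMONICS.get((word >> shift) & 0x3F, "reserved")
        names.append(name)
        if name in ("RET", "NOP"):
            break   # the remaining slots are not executed
    return names

print(disassemble(0x6915C1))
print(disassemble(0x100456))
```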

Individual instructions and their functions are listed as follows:

JUMP (SKIP, ELSE, AGAIN, REPEAT)

Code: 0

Usage: 000000 aaaaaa aaaaaa aaaaaa

Stack Effects: none

Carry: no change

Function:

Jump to the 18 bit address in the bit field 17-0 in the current 256K word page of memory. It must be in Slot1 (bits 23-18) of a word.

Restriction:

This instruction allows the program to be redirected to any location within a 256K word page of memory. It does not cross page boundaries. To jump to locations outside of a memory page, one has to push the target address on the return stack and execute the RET instruction to effect a long jump. This restriction also applies to CALL, JZ and JNC. See also RET.

Coding Example:

CODE 50us

2 ldi skip

CODE 100us

1 ldi

then

sta -138 ldi

begin lda add

-until

drop

ret

SKIP makes an unconditional jump to THEN, letting 50us share the delay loop with 100us.

RET (;)

Code: 1

Usage: 000001 xxxxxx xxxxxx xxxxxx

cccccc 000001 xxxxxx xxxxxx

cccccc cccccc 000001 xxxxxx

cccccc cccccc cccccc 000001

Stack Effects: ( -- ; R: a -- )

Carry: no change

Function:

Pop the address on the top of the return stack into the program counter P, thus resuming the execution sequence interrupted by the last CALL instruction. Besides terminating a subroutine, this instruction may be used to effect a long jump to a location outside of the current memory page.

This instruction can be placed in any slot of a word. The instructions before return are executed. The instructions following return are ignored.

Coding Example:

In the subroutine thread model, RET is used to terminate all code words and colon words. The Forth word ; simply compiles a RET to end a Forth word.

JZ (IF, WHILE, UNTIL)

Code: 2

Usage: 000010 aaaaaa aaaaaa aaaaaa

Stack Effects: ( n -- )

Carry: no change

Function:

Conditionally jump to the 18 bit address in the bit field 17-0 in the current 256K word page of memory, if the T register contains a 0. It must be in Slot1 (bits 23-18) of a word.

The T register is destroyed and the data stack is popped back to T. This instruction is different from JNC, which neither pops the data stack nor removes T.

Coding Example:

CODE ?DUP ( w -- w w | 0 )

dup

if dup ret then

ret

JNC (-UNTIL, -IF, -WHILE)

Code: 3

Usage: 000011 aaaaaa aaaaaa aaaaaa

Stack Effects: ( n -- n )

Carry: no change

Function:

Conditionally jump to the 18 bit address in the bit field 17-0 in the current 256K word page of memory, if the Carry flag (Bit 24 of T) is reset. It must be in Slot1 (bits 23-18) of a word.

The T register and the data stack are preserved. This instruction is different from JZ, which pops the data stack and removes T.

Coding Example:

To test the negative flag T(23), it is shifted into carry T(24) and tested using JNC compiled by -IF.

CODE ABS ( n -- +n )

dup shl

-if drop com 1 ldi add

ret

then

drop ret

CALL

Code: 4

Usage: 000100 aaaaaa aaaaaa aaaaaa

Stack Effects: ( -- ; R: -- a )

Carry: no change

Function:

Call a subroutine whose address is in the bit field 17-0 in the current 256K word page of memory. It must be in Slot1 (bits 23-18) of a word.

The address of the next word is pushed on the return stack. When a return instruction in the subroutine is encountered, this address is popped off the return stack and the next word is executed to resume the interrupted execution sequence.

Restriction:

This instruction allows the program to call to any subroutine within the current 256K page of memory. It does not cross page boundaries.

Coding Example:

All Forth words are compiled as subroutine calls. This is the most efficient way to build lists in Forth.

LDP

Code: 9

Usage: 001001 cccccc cccccc cccccc

cccccc 001001 cccccc cccccc

cccccc cccccc 001001 cccccc

cccccc cccccc cccccc 001001

Stack Effects: ( -- n )

Carry: reset to 0

Function:

Fetch the contents of a memory location whose 24-bit address is in the A register and push that number onto the data stack. The address in the A register is then incremented to facilitate accessing the next memory location. It is most useful in reading values from a table in the memory.

This fetch instruction is different from the @ instruction in Forth, which uses the address on the top of the data stack.

This instruction also resets the carry flag (Bit 24) in the T register.

Coding Example:

Increment T sta ldp drop lda

Otherwise, cccccc cccccc ldi add

000000 000000 000000 000001

costs 6 slots.

LDI

Code: 0A

Usage: 001010 cccccc cccccc cccccc

nnnnnn nnnnnn nnnnnn nnnnnn

cccccc 001010 cccccc cccccc

nnnnnn nnnnnn nnnnnn nnnnnn

cccccc cccccc 001010 cccccc

nnnnnn nnnnnn nnnnnn nnnnnn

cccccc cccccc cccccc 001010

nnnnnn nnnnnn nnnnnn nnnnnn

Stack Effects: ( -- n )

Carry: reset to 0

Function:

Fetch the contents of the next word and push that number onto the data stack. The program counter P is incremented past that word. This instruction allows a program to enter numbers onto the data stack for later use.

This instruction also resets the carry flag (Bit 24) in the T register.

Coding Example:

Push 1 2 3 4 on data stack:

ldi ldi ldi ldi

LD

Code: 0B

Usage: 001011 cccccc cccccc cccccc

cccccc 001011 cccccc cccccc

cccccc cccccc 001011 cccccc

cccccc cccccc cccccc 001011

Stack Effects: ( -- n )

Carry: reset to 0

Function:

Fetch the contents of a memory location whose 24-bit address is in the A register and push that number onto the data stack. The address in the A register is not modified.

This fetch instruction is different from the @ instruction in Forth, which uses the address on the top of the data stack.

This instruction also resets the carry flag (Bit 24) in the T register.

Coding Example:

STP

Code: 0D

Usage: 001101 cccccc cccccc cccccc

cccccc 001101 cccccc cccccc

cccccc cccccc 001101 cccccc

cccccc cccccc cccccc 001101

Stack Effects: ( n -- )

Carry: restore from data stack

Function:

Pop the number off the data stack and store it into the memory location whose 24-bit address is in Register A. The address in the A register is then incremented to facilitate the next memory access. It is most useful in storing values to a table in the memory.

This store instruction is different from the ! instruction in Forth, which uses the address on the top of the data stack.

Coding Example:

See the copying program shown in LDP.

ST

Code: 0F

Usage: 001111 cccccc cccccc cccccc

cccccc 001111 cccccc cccccc

cccccc cccccc 001111 cccccc

cccccc cccccc cccccc 001111

Stack Effects: ( n -- )

Carry: restore from data stack

Function:

Pop the number off the data stack and store it into the memory location whose 24-bit address is in Register A. The address in the A register is not modified.

This store instruction is different from the ! instruction in Forth, which uses the address on the top of the data stack.

Coding Example:

CODE ! ( n a -- )

sta st ret

COM

Code: 10

Usage: 010000 cccccc cccccc cccccc

cccccc 010000 cccccc cccccc

cccccc cccccc 010000 cccccc

cccccc cccccc cccccc 010000

Stack Effects: ( n1 -- n1* )

Carry: no change

Function:

Complement all 24 bits in the T register. This is a one's complement operation.

Coding Example:

To generate a -1 in T register:

zero com

OR has to be synthesized from COM and AND using:

A or B = not( not(A) and not(B))

CODE OR ( n n -- n ) ( this looks pretty awkward, maybe )

com push com ( the last available opcode or NIP )

pop and com ret ( should be replaced with OR )
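The identity behind this OR routine can be checked with a short Python sketch on 24-bit values. The function names are chosen here for illustration and are not part of any P24 tool.

```python
MASK24 = 0xFFFFFF

def com(x):
    """One's complement of the 24 data bits, like the COM instruction."""
    return ~x & MASK24

def or_via_com_and(a, b):
    """A or B = not( not(A) and not(B) ), using only COM and AND."""
    return com(com(a) & com(b))

a, b = 0x0F0F0F, 0xF00001
print(f"{or_via_com_and(a, b):06X}", f"{a | b:06X}")
```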

SHL

Code: 11

Usage: 010001 cccccc cccccc cccccc

cccccc 010001 cccccc cccccc

cccccc cccccc 010001 cccccc

cccccc cccccc cccccc 010001

Stack Effects: ( n -- 2n )

Carry: Bit 23 of T is shifted into carry

Function:

Shift all lower 24 bits in the T register to the left by 1 bit. The lowest Bit-0 is cleared.

Coding Example:

Multiply T by 3: dup shl add

Multiply by 5: dup shl shl add

Multiply by 6: dup shl add shl

SHL allows the sign bit T(23) to be tested as carry T(24):

CODE 0< ( n -- f )

shl

-if drop -1 ldi ret

then

dup xor ( 0 ldi )

ret

SHR

Code: 12

Usage: 010010 cccccc cccccc cccccc

cccccc 010010 cccccc cccccc

cccccc cccccc 010010 cccccc

cccccc cccccc cccccc 010010

Stack Effects: ( n -- n/2 )

Carry: loaded from serial input

Function:

Shift the contents of the T register right by one bit. Bit-0 is shifted to the bit-banged UART serial output. The sign (Bit23) is preserved.

Coding Example:

SHR is used to implement a simple UART. The lowest bit in T, T(0), is shifted out to the UART serial output pin, and the UART serial input pin is loaded into carry for testing.

CODE EMIT ( c -- )

$7F ldi and

shl $FFFF01 ldi xor

$0A ldi

FOR shr 100us NEXT

drop ret

CODE KEY ( -- c )

$FFFFFF ldi

begin shr

-until

repeat ( wait for start bit )

50us

7 ldi

FOR

100us shr

-if $80 ldi xor then

$FF ldi and

100us ret

MUL

Code: 13

Usage: 010011 cccccc cccccc cccccc

cccccc 010011 cccccc cccccc

cccccc cccccc 010011 cccccc

cccccc cccccc cccccc 010011

Stack Effects: ( n1 n2 -- n1 n3 )

Carry: unchanged

Function:

Conditionally add the S register on the data stack to the T register if Bit-0 in A is set. If Bit-0 in A is reset, T register is not modified. The T-A register pair is now shifted to the right by one bit.

This MUL instruction is useful as a multiplication step in implementing a fast software multiplication routine. Repeating this instruction 24 times will multiply A and S and produce a 48-bit product in the T-A pair. (T is normally initialized to zero prior to the multiply sequence. However any non-zero initial value in T adds to the final result in the T-A pair.)

Coding Example:

Multiply two 24-bit unsigned integers. Multiplicand is in S. Multiplier is in A.

mul mul mul mul ( this word is repeated 6 times, for 24 steps )

The 48-bit product is in T-A register pair and the multiplicand in S is preserved.

Primitive multiplication routines are thus defined:

CODE UM* ( u u -- ud )

sta 0 ldi

mul mul mul mul ( this word is repeated 6 times, for 24 steps )

push drop lda pop

ret
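The multiplication step can be checked with a small Python model. This is a sketch under stated assumptions: it follows the description above and the (s+t)/2 and (s+t)&a/2 paths in Figure 2, so the carry of the addition is shifted into T(23) and the low bit of the sum is shifted into A(23). The function names are invented here.

```python
MASK24 = 0xFFFFFF

def mul_step(t, a, s):
    """One MUL step: conditional add, then shift the T-A pair right."""
    total = t + s if a & 1 else t            # up to 25 bits with carry
    new_a = ((total & 1) << 23) | (a >> 1)   # low bit of T moves into A(23)
    new_t = total >> 1                       # carry moves into T(23)
    return new_t, new_a

def um_star(multiplicand, multiplier):
    """Multiply two unsigned 24-bit numbers with 24 MUL steps."""
    t, a, s = 0, multiplier & MASK24, multiplicand & MASK24
    for _ in range(24):
        t, a = mul_step(t, a, s)
    return (t << 24) | a                     # 48-bit product in the T-A pair

print(um_star(5, 3))
```

Starting with a non-zero t adds that value to the product, which matches the remark about the initial value of T.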

XOR

Code: 14

Usage: 010100 cccccc cccccc cccccc

cccccc 010100 cccccc cccccc

cccccc cccccc 010100 cccccc

cccccc cccccc cccccc 010100

Stack Effects: ( n1 n2 -- n3 )

Carry: unchanged

Function:

Pop S on the data stack and exclusive-OR it to the T register. All 24 bits in T are affected.

Coding Example:

To clear T to zero:

dup xor ( now use more transparent “drop zero” )

To generate a zero in T register:

dup dup xor ( now use faster “zero” )

T is duplicated twice to save its contents. The two duplicated copies of T are XOR'ed together. All the reset bits remain reset. All set bits get reset. Thus a 0 is created in T.

It costs 5 slots to produce a -1:

ldi cccccc cccccc cccccc

-1

dup dup xor com ( now use faster “zero com” )

AND

Code: 15

Usage: 010101 cccccc cccccc cccccc

cccccc 010101 cccccc cccccc

cccccc cccccc 010101 cccccc

cccccc cccccc cccccc 010101

Stack Effects: ( n1 n2 -- n3 )

Carry: unchanged

Function:

Pop S on the data stack and AND it to the T register. All 24 bits in T are affected.

Coding Example:

DIV

Code: 16

Usage: 010110 cccccc cccccc cccccc

cccccc 010110 cccccc cccccc

cccccc cccccc 010110 cccccc

cccccc cccccc cccccc 010110

Stack Effects: ( n1 n2 -- n1 n3 )

Carry: unchanged (I think – need to check.)

Function:

Add the S register on the data stack to the T register. If the addition produces a carry place the sum in T, otherwise leave T unchanged. The T-A register pair is now shifted to the left by one bit. Carry is shifted into A(0).

This DIV instruction is useful as a division step in implementing a fast software division routine. Repeating this instruction 25 times will divide a 48 bit number originally in the T-A register pair by the negative of the number in S, leaving the result in A and remainder in T.

Coding Example:

Divide a 48-bit positive integer by a positive divisor. The negated divisor is in S.

div div div div ( this word is repeated 6 times; with the div below, 25 steps )

div shr

(Note: I think that this last shr undoes the most recent shl that is

part of div, aligning the remainder properly in T. Also I think

this division actually only works properly for 47 bit unsigned

numbers in T-A. -- WRC)

Primitive division routines are thus defined:

CODE UM/MOD ( ud u -- ur uq )

com 1 ldi add sta

push lda push sta

pop pop

skip

CODE /MOD ( n n -- r q )

com 1 ldi add push

sta pop 0 ldi

then

div div div div ( this word is repeated 6 times; with the div below, 25 steps )

div 1 ldi xor shr

push drop pop lda

ret

ADD

Code: 17

Usage: 010111 cccccc cccccc cccccc

cccccc 010111 cccccc cccccc

cccccc cccccc 010111 cccccc

cccccc cccccc cccccc 010111

Stack Effects: ( n1 n2 -- n1+n2 )

Carry: changed according to n1 and n2

Function:

Pop S on the data stack and add it to the T register.

Coding Example:

The primitive addition in eForth is thus defined:

CODE UM+ ( n n -- n carry ) ( don’t use this if you want speed – WRC )

add

-if 1 ldi ret

then

dup dup xor ( 0 )

ret

POP

Code: 18

Usage: 011000 cccccc cccccc cccccc

cccccc 011000 cccccc cccccc

cccccc cccccc 011000 cccccc

cccccc cccccc cccccc 011000

Stack Effects: ( -- n ; R: n -- )

Carry: unchanged

Function:

Pop the R register on the return stack to the T register. Original contents in T are pushed on the data stack.

Coding Example:

Exchanging A and T lda push sta pop

Exchanging A and R lda pop sta push

Increment T sta ldp drop lda ( now use “one add” )

Decrement T dup dup xor com add ( now use “zero com add” )

LDA

Code: 19

Usage: 011001 cccccc cccccc cccccc

cccccc 011001 cccccc cccccc

cccccc cccccc 011001 cccccc

cccccc cccccc cccccc 011001

Stack Effects: ( -- a )

Carry: unchanged

Function:

Copy the contents in the A register to the T register. The original content of the T register is pushed on the data stack. With LDA and STA, the A register can serve as a scratch pad register to save and restore the contents of the T register.

Coding Example: (see example for POP)

DUP

Code: 1A

Usage: 011010 cccccc cccccc cccccc

cccccc 011010 cccccc cccccc

cccccc cccccc 011010 cccccc

cccccc cccccc cccccc 011010

Stack Effects: ( n -- n n )

Carry: unchanged

Function:

Duplicate T register and push it on the data stack.

Coding Example:

Decrement T dup dup xor com add ( now use “zero com add” )

OVER

Code: 1B

Usage: 011011 cccccc cccccc cccccc

cccccc 011011 cccccc cccccc

cccccc cccccc 011011 cccccc

cccccc cccccc cccccc 011011

Stack Effects: ( n1 n2 -- n1 n2 n1 )

Carry: unchanged

Function:

S is transferred into the T register. The original contents of the T register are pushed onto the data stack.

Coding Example:

CODE 2DUP ( n1 n2 -- n1 n2 n1 n2 )

over over ret

PUSH

Code: 1C

Usage: 011100 cccccc cccccc cccccc

cccccc 011100 cccccc cccccc

cccccc cccccc 011100 cccccc

cccccc cccccc cccccc 011100

Stack Effects: ( n -- ; R: -- n )

Carry: unchanged

Function:

Pop S on the data stack and store it to the T register. The original contents of the T register are pushed onto the return stack.

Coding Example:

CODE ROT ( w1 w2 w3 -- w2 w3 w1 )

push push sta pop

pop lda ret
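The register shuffling in ROT can be traced with a toy Python model of the two stacks and the A register. The class below is hypothetical and models only PUSH, POP, STA and LDA; it ignores stack depth limits and the carry bit.

```python
class P24Stacks:
    """Toy model: the last list element is the top of each stack."""

    def __init__(self, data=(), a=0):
        self.data = list(data)   # data stack, top of stack is T
        self.ret = []            # return stack, top of stack is R
        self.a = a               # A register

    def push(self):
        """PUSH: move T to the return stack."""
        self.ret.append(self.data.pop())

    def pop(self):
        """POP: move R to the data stack."""
        self.data.append(self.ret.pop())

    def sta(self):
        """STA: pop T into A."""
        self.a = self.data.pop()

    def lda(self):
        """LDA: push A onto the data stack."""
        self.data.append(self.a)

    def run(self, program):
        """Execute a space-separated list of instruction names."""
        for name in program.split():
            getattr(self, name)()
        return self.data

# ROT ( w1 w2 w3 -- w2 w3 w1 )
print(P24Stacks([1, 2, 3]).run("push push sta pop pop lda"))
```

The same model reproduces the "Exchanging A and T" sequence given under POP: with A holding 9 and T holding 5, running "lda push sta pop" leaves 9 in T and 5 in A.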

STA

Code: 1D

Usage: 011101 cccccc cccccc cccccc

cccccc 011101 cccccc cccccc

cccccc cccccc 011101 cccccc

cccccc cccccc cccccc 011101

Stack Effects: ( a -- )

Carry: no change

Function:

Pop S on the data stack and store it to the T register. The original contents of the T register are copied into the A register. This instruction initializes the A register so that it can be used to fetch data from memory or store data into memory.

Coding Example:

CODE ! ( n a -- )

sta st ret

NOP

Code: 1E

Usage: 011110 xxxxxx xxxxxx xxxxxx

cccccc 011110 xxxxxx xxxxxx

cccccc cccccc 011110 xxxxxx

cccccc cccccc cccccc 011110

Stack Effects: ( -- )

Carry: no change

Function:

No operation. This instruction will force the execute state to slot 0, to get the next word to be fetched and executed.

Coding Example: usually inserted by assembler.

DROP

Code: 1F

Usage: 011111 cccccc cccccc cccccc

cccccc 011111 cccccc cccccc

cccccc cccccc 011111 cccccc

cccccc cccccc cccccc 011111

Stack Effects: ( n -- )

Carry: unchanged

Function:

Pop S on the data stack and store it to the T register. The original contents in the T register are lost.

Coding Example: see example for JUMP.