CHAPTER 13.

AN INTERACTIVE OPERATING SYSTEM

FOR AP120B ARRAY PROCESSOR

1. INTRODUCTION

Developing software for array processors (AP) is a laborious and tedious task because one has to deal with a host of software tools to write and test a program. Typically, AP programs are written in microcode and assembled with a microcode assembler, and the result has to be bound with a mainline program to be executed by the host computer. During execution, AP programs have to be loaded into the AP program memory. Arrays of data have to be formatted and moved into the AP data memory. After the AP routines are executed, the resulting data have to be moved back to the host for examination or further processing.

Most AP manufacturers provide FORTRAN-callable library packages with the hardware, assuming that the user will use FORTRAN to do all the work with the array processor. Since FORTRAN is a compiled language, it is very difficult to build an interactive system which allows the user to control the array processor intimately. One has to go through the editing-compiling-linking-loading process to try out the program, and repeat the whole process if anything has to be changed.

Floating Point Systems supplies a large software toolbox for the AP120B Array Processor, which has been a workhorse for many scientific and engineering applications. Among the tools, APDBUG and APSIM do allow the user to try out various things interactively and observe the reaction of the array processor. However, their command set is severely limited and does not give the user full control over the entire range of the AP120B's capabilities. Ideally, one should be able to access all the facilities in the array processor, direct the array processor to do elementary operations as defined by programs loaded in the program memory, and also construct and execute high level commands built from these elementary commands in an interactive fashion.

FORTH is a very powerful software tool, allowing the user to access and utilize all the facilities in a computer system. It was originally developed for instrument control and programming. The user can program at the lowest machine code level to take advantage of the resources provided in the computer system. Yet it is also a high level language which enables the user to express an algorithm as strings of English-like commands, which can either be executed by an interpreter to cause immediate action in the computer system, or be compiled to construct more powerful high level commands to be executed later when called.

In our laboratory, an AP120B Array Processor from Floating Point Systems was installed on a Harris 80 Computer for experiments in real time signal processing. A FORTH interpreter-compiler was written for the Harris 80 Computer to facilitate the testing and maintenance of the interface between this computer and a number of peripheral devices used to collect real time data and to control some experiments actively. Since the AP120B is a very important link in closing the control loop in this type of experiment, an effort was initiated to put the AP120B under the control of this FORTH system so that we might be able to test the entire system interactively.

2. THE FORTH OPERATING SYSTEM

This implementation of FORTH on the Harris 80 Computer was essentially a transcription of the fig-FORTH for the NOVA computer (1), released by the FORTH Interest Group in 1981. The NOVA FORTH model was chosen because the architecture of the Harris computer is very similar to that of the NOVA computer produced by Data General Corp. These two CPUs have the same types of registers and similar addressing modes. A requirement was to follow the fig-FORTH model as closely as possible so that programs developed on the Harris computer could be transported to other computers with minimal modifications.

The most serious problem in adapting the fig-FORTH model to the Harris computer is that the Harris computer is a 24-bit machine, while the fig-FORTH model presumes a 16-bit machine with byte addressing capability. Standard fig-FORTH commands use the same address to access bytes and 16-bit words. This is not feasible on the Harris 80 because memory is accessed in 24-bit words. The solution adopted here was to make a clear distinction between byte-addressing commands and word-addressing commands. For the byte-addressing commands, addresses are manipulated in a byte memory space. Two special commands are used to convert addresses from the byte space to the word space and vice versa. If proper care is exercised with the addressing modes, programs can be transported from the fig-FORTH system to this Harris FORTH system.
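The relationship between the byte space and the word space can be illustrated with a small sketch. It is written in Python rather than FORTH, and it assumes three 8-bit bytes per 24-bit word with byte 0 as the most significant byte. The function names byte_to_word, word_to_byte, and fetch_byte are hypothetical; they are not the names of the two conversion commands in the Harris FORTH system, whose exact behavior is not described here.

```python
BYTES_PER_WORD = 3  # a 24-bit word holds three 8-bit bytes


def byte_to_word(byte_addr):
    """Split a byte-space address into (word address, byte offset in word)."""
    return divmod(byte_addr, BYTES_PER_WORD)


def word_to_byte(word_addr, offset=0):
    """Return the byte-space address of a byte inside a given word."""
    if not 0 <= offset < BYTES_PER_WORD:
        raise ValueError("offset must be 0, 1, or 2")
    return word_addr * BYTES_PER_WORD + offset


def fetch_byte(memory, byte_addr):
    """Fetch one byte from a list of 24-bit words.

    Assumption: byte 0 of a word is its most significant byte.
    """
    word_addr, offset = byte_to_word(byte_addr)
    shift = 8 * (BYTES_PER_WORD - 1 - offset)
    return (memory[word_addr] >> shift) & 0xFF
```

In this model a byte-addressing command works entirely in the byte space and converts to a word address only at the moment memory is touched, which is why a program that keeps the two kinds of address apart remains portable.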

The dictionary in this FORTH system occupies about 3 Kwords of core. With stacks and buffers, the entire FORTH load module is about 8 Kwords in size. Two disk buffers are allocated to interface with disk files. Each one is 1 Kword (3 Kbytes) in length, equivalent to three standard FORTH blocks. This is a convenient size for the Harris operating system to handle the data transfer between a random access disk file and the FORTH module.

The FORTH module contains the FORTH nucleus, the text interpreter, and the colon compiler, with some extra utilities. A text editor and a Harris machine code assembler were written in FORTH and saved in a disk file, which can be loaded if needed. It was anticipated that the assembler would be necessary to write the device driver for the FPS Array Processor. It turned out that we only had to implement a small code routine, APDR, in the FORTH nucleus to call the array processor service routines already installed in the Harris computer. Everything else needed to interface to the AP120B was written in high level FORTH.

This FORTH implementation is described more fully elsewhere in this book(2).

3. PROGRAMMING MODEL OF AP120B ARRAY PROCESSOR

The AP120B array processor is a very complicated machine, with many processing elements, a multitude of interconnecting buses, and a large number of memories and registers. It takes a fair amount of time to learn the machine well enough to use it successfully. In our design of the AP operating system using FORTH as the underlying control executive, one major design goal was to simplify the AP software as much as possible from the programmer's point of view, and to hide the internal complexity of the AP in order to shorten the learning curve. The programming model, that is, the set of structures in the AP relevant to its use, is shown schematically in Fig. 23.

In this model of AP120B, there are only four elements in the AP to be dealt with by the user: the main data memory MD, the program storage memory PS, the scratchpad registers SPAD, and the interface registers including the switch register SWR, the lights register LITES, and the function register FN. The user only has to manage these few elements in order to command the array processor to perform desired functions.

Among these elements, the most crucial is the program storage memory PS, where AP programs or subroutine modules are stored. To simplify the implementation and the use of this AP control system, it is assumed that the program storage memory is preloaded with a collection of AP subroutines selected from the AP math library. To ease the task of calling these subroutines from the FORTH operating system, a table of subroutine entry points is constructed and stored at the beginning of the PS memory. Each entry in this table contains two AP instructions: a JSR instruction to call the appropriate subroutine, and a HALT instruction to stop the AP activity after the subroutine is successfully executed. This way, the AP math library routines can be used without any modification. It is possible to optimize the library subroutines to eliminate the calling overhead required by APEX and to halt the AP at the end of the subroutine. These modifications would speed up the execution of AP functions and reduce the program memory size. However, this optimization is left as a future project.
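Because every table entry occupies exactly two PS locations, the entry address of a subroutine follows directly from its position in the table. The following Python sketch shows the arithmetic. It assumes that the table starts at PS address 0 and that entries are packed in the order listed; the function names and the particular selection of library routines are illustrative only.

```python
ENTRY_SIZE = 2  # each table entry holds one JSR and one HALT instruction


def build_entry_table(subroutines, table_base=0):
    """Map each subroutine name to the PS address of its table entry."""
    return {name: table_base + ENTRY_SIZE * i
            for i, name in enumerate(subroutines)}


def table_listing(subroutines, table_base=0):
    """Return (ps_address, instruction) pairs for the whole table."""
    listing = []
    for i, name in enumerate(subroutines):
        addr = table_base + ENTRY_SIZE * i
        listing.append((addr, "JSR " + name))  # call the library routine
        listing.append((addr + 1, "HALT"))     # stop the AP on return
    return listing
```

With such a table, the host only needs to know one small integer per function in order to start the AP at the right place, regardless of where the library routine itself was linked.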

In the main data memory MD, the first 100 memory elements are reserved to store scalar constants and variables, as needed by some AP functions. Starting at memory element 100 is the vector stack, which is used implicitly by most AP functions. The vector stack is managed by two variables in FORTH: the variable VP, the vector stack pointer, pointing to the top of the vector stack, and the variable FRAME, defining the size of the vector elements. In this FORTH system, all vector elements are of the same size. This may be a limitation if the application requires vector elements of different sizes. However, for dedicated real time applications in which data are generally of the same size and format, this uniform vector format may be quite adequate. The advantage is that the language syntax is greatly simplified, because the user does not have to specify explicitly the addresses, lengths, and increments of the vectors involved in an AP function, as in most AP FORTRAN subroutine calls.
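The bookkeeping implied by VP and FRAME can be modelled in a few lines. The Python sketch below is only a host-side illustration, not the FORTH source. It assumes that VP holds the MD address of the next free frame, so that an empty stack has VP equal to 100; the class and method names are hypothetical stand-ins for the FORTH commands named in the comments.

```python
STACK_BASE = 100  # first MD location of the vector stack


class VectorStackPointer:
    """Model of the two FORTH variables VP and FRAME."""

    def __init__(self, frame=10):
        self.vp = STACK_BASE   # assumed to address the next free frame
        self.frame = frame     # number of elements in every vector

    def plus_vp(self):         # counterpart of +VP
        self.vp += self.frame

    def minus_vp(self):        # counterpart of -VP
        self.vp -= self.frame

    def check_vp(self):        # counterpart of ?VP
        if self.vp < STACK_BASE:
            self.vp = STACK_BASE
            raise RuntimeError("vector stack underflow")

    def depth(self):
        """Number of frames currently on the vector stack."""
        return (self.vp - STACK_BASE) // self.frame
```

Since every frame has the same length, the address of any frame is a simple function of VP and FRAME, which is what makes implicit operand addressing possible.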

The scratchpad registers SPAD are used by the AP to retrieve the parameters needed in executing an AP function. Up to 16 parameters can be passed to an AP subroutine via SPAD. Before the FORTH system commands the AP to start executing an AP subroutine, it has to place the necessary list of parameters in a buffer in the Harris computer. The address of this buffer is then passed to the FORTH command SPLDGO (SPAD LoaD & Go) so that this parameter list is read into the SPAD registers in the AP before the AP function is executed.

The AP120B is attached to the Harris 80 computer as an I/O peripheral device. From the side of the Harris computer, the AP120B appears as three registers in an I/O device: two read/write registers SWR and FN, and a read-only register LITES. Commands and parameters are transferred to the AP and AP status can be examined by the Harris computer through these registers. The programming task is essentially developing specific commands which control these three registers.

4. IMPLEMENTATION OF THE AP OPERATING SYSTEM

The entire program to control the AP120B from the FORTH operating system is only three pages of FORTH source code, which is shown in its entirety in Listing 14. It is equivalent to about a hundred pages of FORTRAN code providing the same functions. For readers who are not familiar with the FORTH operating system and its peculiar language syntax, a few guidelines are offered here to help in reading the source code.

1. Source codes are arranged in screens of 1024 bytes. In each screen, codes are grouped into 16 lines, each of 64 bytes in length. A line number precedes the line of code, but is not part of the code.

2. Words are separated by one or more spaces. A word can be a command, a number, or a string which must follow a string command.

3. ( is a comment command, causing the FORTH interpreter to ignore all the text up to and including the delimiter ')'.

4. : is a command causing the FORTH interpreter to construct a new command and add it to the dictionary in the FORTH operating system. The syntax of a new definition is:

: <name> <list of valid FORTH words> ;

; is the command to terminate a new command and make it available for execution or compilation.

5. The stack effect of a command is documented as a comment after the name of the new command. Items before the --- marks are those on the top of the stack before executing the command, and the items after the --- marks are those numbers left on the stack after executing the command.

6. DECIMAL and OCTAL switch the number base between the regular decimal system and the octal system often used in addressing AP registers and the Harris memory.

7. EXIT terminates the interpretation of a screen of text. Any text after EXIT is ignored by the interpreter.
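Guidelines 2 through 4 can be made concrete with a toy interpreter. The Python sketch below is not a FORTH system: it stores the body of a new definition as text instead of compiling it, knows only three primitive commands, and omits strings, number bases, and EXIT. It is meant only to show how space-separated words, comments, and colon definitions interact.

```python
def run(source, stack=None, words=None):
    """Interpret a line of FORTH-like text and return the data stack."""
    stack = [] if stack is None else stack
    words = {} if words is None else words
    primitives = {
        "+": lambda s: s.append(s.pop() + s.pop()),
        "*": lambda s: s.append(s.pop() * s.pop()),
        "DUP": lambda s: s.append(s[-1]),
    }
    tokens = source.split()          # guideline 2: words separated by spaces
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "(":               # guideline 3: skip text through ')'
            while not tokens[i].endswith(")"):
                i += 1
        elif tok == ":":             # guideline 4: add a new command
            end = tokens.index(";", i)
            words[tokens[i + 1]] = " ".join(tokens[i + 2:end])
            i = end
        elif tok in words:           # user-defined command: run its body
            run(words[tok], stack, words)
        elif tok in primitives:
            primitives[tok](stack)
        else:
            stack.append(int(tok))   # anything else must be a number
        i += 1
    return stack
```

For example, run(": SQUARE ( n --- n*n ) DUP * ; 7 SQUARE") defines a new command with a stack-effect comment and then executes it, leaving 49 on the stack.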

5. AP DEVICE DRIVER

Two physical devices are assigned to the AP120B in the Harris I/O structure. Device 65 is used to handle the SWR, LITES and FN interface registers, and device 66 is used to handle interrupts from the AP120B. The AP device driver routines are installed in the Harris I/O service package, and the elementary AP functions can be called directly through the standard Harris I/O calling protocol:

        TLO   IOPAR            Transfer address of parameter list to the K register.
        BLU   $IOW             Call the I/O service.
        ...
IOPAR   DATA  'xxxyy           Device number xxx and function code yy in octal.
        DATA  word count
        DAC   buffer-address

Two FORTH commands, I/O and APDR, were implemented in the FORTH module. I/O handles general input/output service to all the Harris peripheral devices, and APDR is an adaptation of I/O to the AP120B. I/O requires three parameters as input on the FORTH data stack: the device-function code, the word count, and the parameter list address. APDR also requires three parameters on the data stack: the AP driver function code, the pattern to be copied into the SWR register, and the command pattern to be copied into the FN register. APDR assumes that the device to be addressed is the AP120B, Device 65 in the Harris system. I/O was used only once, to define the function APOPEN, because it has to address Device 66 to initialize the interrupt handler for the AP120B. All other elementary AP functions are derived from APDR.

I/O ( buffer-addr word-count function-code --- )

The most elementary I/O command, passing control to the I/O service routines in the Harris operating system. The top item specifies the I/O channel and the function to be performed. If the function requires the transfer of additional parameters, the buffer address and the size of the parameter buffer must be specified as the next two items under the function code. If the address and count are not needed, dummy numbers must be supplied.

APDR ( parameter1 parameter2 function --- )

Fill the I/O parameter buffer with the two parameter values and the function code on the stack and execute the AP function. It calls the AP driver routines installed in the Harris computer.
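The first word of the parameter list in the calling protocol above combines a three-digit octal device number with a two-digit octal function code. The Python sketch below shows one way such a word and the three-word list could be formed. It assumes that the two fields are simply packed side by side in the low-order bits and that device numbers such as 65 are octal; the function names and the function code used in the example are hypothetical.

```python
def device_function_word(device, function):
    """Pack a device number (3 octal digits) and a function code (2 octal digits)."""
    if not 0 <= device <= 0o777:
        raise ValueError("device number must fit in three octal digits")
    if not 0 <= function <= 0o77:
        raise ValueError("function code must fit in two octal digits")
    return (device << 6) | function   # two octal digits occupy six bits


def io_parameter_block(device, function, word_count, buffer_addr):
    """Return the three-word list whose address is passed in the K register."""
    return [device_function_word(device, function), word_count, buffer_addr]
```

For instance, device_function_word(0o65, 0o01) yields the octal pattern 6501, which reads directly as device 065 with function 01. A command like APDR can keep the device field fixed and vary only the function code and the two parameter words.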

6. ELEMENTARY AP FUNCTIONS

Elementary AP functions are simple derivatives of APDR. For some functions, the two parameters and the function code used by APDR are sufficient to specify the operation, and APDR is executed immediately. For more complicated functions, more parameters have to be moved into a buffer called IOPAR, and one of the parameters is used to point to IOPAR. APDR then picks up these additional parameters from IOPAR for its execution. A few supporting functions are also included in this category. They are used to move data among buffers, memory, and registers.

APOPEN ( --- )

Open the logical Devices 65 and 66, which must first be assigned by Harris system commands to the physical devices associated with the AP120B hardware interface to the Harris computer. This command must be executed before any other AP command.

APIN ( register --- value )

Read the contents of the AP interface register whose number is placed on the stack. The value returned on the stack is its contents.

APOUT ( value register --- )

Store the value into the AP interface register whose number is given on the top of the stack.

RREG ( function --- lights )

Examine an AP register or memory by writing the FUNCTION register and reading the LITES register.

WREG ( function switch --- )

Deposit into an AP register or memory by writing the SWITCH register and the FUNCTION register.

WTRUN ( --- error )

Wait for the current AP program to finish and return the completion code. An error has occurred if the returned code is not zero.

WTDMA ( --- error )

Wait for the completion of a DMA transfer. A completion code is returned on the stack.

@REG ( register --- value )

Use RREG to fetch the contents of an AP register.

!REG ( register value --- )

Store a value into an AP register.

APBUF ( --- buffer-addr )

Return the address of an array where the I/O parameters are stored to be retrieved by the command I/O.

SPAD ( --- address )

Return the address of an array where parameters are stored and moved into the SPAD in AP before an array processing function is executed.

!PAR ( heap addr count --- )

A set of numbers piled as a heap on the stack is stored in memory starting at addr. The number of items moved is given on the top of the stack as a count.

!DMA ( control word-count ap-addr host-addr --- )

The four parameters on the stack are stored in the I/O parameter buffer to specify the DMA actions to be followed.

RUNDMA ( --- )

Execute a DMA transfer. The detailed actions must be specified by an appropriate !DMA command.

!SPAD ( nspads slist start function errloc noload psa --- )

Store the seven parameters required by an AP process into the I/O parameter buffer. Nspads is the number of SPAD items to be used, slist is the address of the array SPAD from which SPAD items are to be passed into the AP, start is the starting address of the executable code in the AP, function is the functional command, errloc is the address where an error code will be returned, noload indicates whether code is to be loaded from the host, and psa is the address of the PS memory.

RUNAP ( --- )

Execute an AP process. The process must be specified by the !SPAD command.

APERR ( --- error )

Return the error code produced by the last AP operation.

APRSET ( --- )

Reset the AP by stopping any DMA activity, halting the AP, resetting the interface, and initializing various flags and data values.

7. VECTOR STACK MANAGEMENT

Most AP math library subroutines require long lists of parameters to specify the addresses, lengths, and increments of the vectors involved in their operations. These parameters are passed to the AP subroutines via the SPAD registers. Using the elementary AP functions defined above, we can pass all the necessary parameters explicitly with the !SPAD command. However, passing long lists of parameters on the data stack in FORTH is very messy and often makes the program unreadable. Assuming that the vectors to be processed are of the same size and format, we can construct a vector stack in the MD memory to manipulate these vectors. AP stack operations remove their required vector operands from the top of the vector stack, and leave only the explicit results on the vector stack for subsequent operations to use as operands. Thus all references to the vector operands are implicit and the user does not have to supply the parameter lists. This vector stack greatly simplifies the syntax and the programming of this AP operating system, much as the data stack does in FORTH.

VP and FRAME are two basic tools for managing the vector stack in the MD memory. They are defined as variables so that the structure and the location of the vector stack can be changed dynamically. Other commands are defined to support AP functions to handle the vector stack more efficiently.

VP ( --- pointer )

Return the current pointer to the top of the vector stack. It is initialized to point at MD location 100.

FRAME ( --- size )

Return the frame size of the vectors on the vector stack. It is initialized to 10 for demonstration purposes.

?VP ( --- )

Check the vector stack pointer. If it points below 100, abort the current AP process and re-initialize VP to 100.

+VP ( --- )

Increase the vector stack pointer VP by one frame.

-VP ( --- )

Decrease the vector stack pointer VP by one frame. The net effect of this command is popping the top vector off the vector stack.

V@ ( host-address --- )

Read one frame of data from the buffer starting at host-address in the Harris computer into AP through DMA transfer and push this frame of data on the vector stack.

V! ( host-address --- )

Pop the top frame on the vector stack and write this frame of data to the host buffer starting at host-address.

V? ( --- )

Remove the top frame on the vector stack and print its contents on the CRT terminal.

V. ( --- )

Display the contents of the entire vector stack on the CRT terminal. This is a very useful command for inspecting the vector stack without disturbing its size and contents. It calls a low level command (V.) to do the printing.

SAME ( --- )

Fill the SPAD with three parameters: the address, size, and increment of the top frame on the vector stack. This command is used to initialize the SPAD for AP functions involving only one vector.

BINARY ( --- )

Fill the SPAD with 7 parameters, specifying that the topmost two frames are to be the source vectors and the second frame on the vector stack to be the destination vector for the following AP function. It is used to set up SPAD for binary vector functions like add, subtract, multiply, and divide.

INPLACE ( --- )

Fill the SPAD with 5 parameters, specifying an in-place AP function which uses the topmost vector frame as the source and the destination.
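The three supporting commands above differ only in which frames they name as sources and destination. The Python sketch below computes the corresponding parameter sets from VP and FRAME. Several details are assumptions rather than facts from the listing: VP is taken to address the next free frame, every increment is taken to be 1, and the parameters are returned as named fields because the order in which the real commands load the SPAD registers is not reproduced here. The function and field names are hypothetical.

```python
def same(vp, frame):
    """Three parameters describing the top frame (compare SAME)."""
    top = vp - frame
    return {"addr": top, "size": frame, "inc": 1}


def inplace(vp, frame):
    """Five parameters: the top frame is both source and destination (compare INPLACE)."""
    top = vp - frame
    return {"src": top, "src_inc": 1,
            "dst": top, "dst_inc": 1,
            "count": frame}


def binary(vp, frame):
    """Seven parameters: top two frames are sources, second frame is destination (compare BINARY)."""
    top = vp - frame
    second = vp - 2 * frame
    return {"src_a": second, "src_a_inc": 1,
            "src_b": top, "src_b_inc": 1,
            "dst": second, "dst_inc": 1,
            "count": frame}
```

Writing the result of a binary operation into the second frame means that the operator only has to lower VP by one frame afterwards to leave the result on top of the vector stack.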

8. VECTOR STACK OPERATIONS

Most of the vector stack operators call their corresponding library subroutines installed in the PS memory. Because the operators take their operands off the vector stack and leave their results on it, the parameter lists required by the library subroutines can be generated automatically by supporting commands like SAME, BINARY, and INPLACE. All the vector operators eventually call the SPLDGO command, which passes the SPAD parameter list and executes the AP operation at the PS address given to it on the data stack.

The mnemonic names chosen in this FORTH system for AP operations mimic the names of their corresponding subroutines defined in the FPS math library, except those for which FORTH-type generic names are more appropriate, like V+, V-, V* and V/. The AP operations included in this sample system are very limited in scope. In general, the available AP functions are limited only by the availability of the subroutine library and the size of the PS memory.

SPLDGO ( address --- )

Reset the AP120B, copy parameters into SPAD, and start the AP120B at the PS address given on the data stack.

V0 ( --- )

Clear the top frame on the stack to zeros.

VDUP ( --- )

Duplicate the top frame on the vector stack.

VOVER ( --- )

Duplicate the second topmost frame and push it on the top.

VDROP ( --- )

Discard the topmost frame on the vector stack.

V+ ( --- )

Add the top two frames on the vector stack, pop the topmost frame, and replace the original second frame with the sum.

V- ( --- )

Remove the topmost two frames on the vector stack and push the difference vector on the stack (second-first).

V* ( --- )

Remove the topmost two frames on the vector stack and push the product vector on the stack.

V/ ( --- )

Remove the topmost two frames on the vector stack and push the quotient vector on the stack (second/first).

VSIN ( --- )

Convert the top frame on the stack to its sines.

VCOS ( --- )

Convert the top frame on the stack to its cosines.

VFILL ( source --- )

Push a new frame on the vector stack and fill this frame with the value stored at the MD address given by the number on the data stack.

VRAMP ( source increment --- )

Push a new frame on the vector stack and fill this frame with a ramp function whose starting value and increment value are stored at the MD addresses given by the top two numbers on the data stack.
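The stack behavior of these operators, in particular the operand order of V- and V/, can be checked with a host-side model. The Python sketch below keeps each frame as a plain list and performs the arithmetic on the host; it does not call the AP or the FPS library. Unlike VFILL and VRAMP as described above, its vfill and vramp methods take the values directly rather than MD addresses. All class and method names are hypothetical.

```python
import math


class VectorStack:
    """Host-side model of the vector stack; frames are Python lists."""

    def __init__(self, frame=10):
        self.frame = frame
        self.frames = []

    def push(self, values):
        if len(values) != self.frame:
            raise ValueError("every vector must be exactly one frame long")
        self.frames.append(list(values))

    def _binary(self, op):
        first = self.frames.pop()    # topmost frame
        second = self.frames.pop()   # frame below it
        self.frames.append([op(s, f) for s, f in zip(second, first)])

    def vadd(self): self._binary(lambda s, f: s + f)   # models V+
    def vsub(self): self._binary(lambda s, f: s - f)   # models V- (second-first)
    def vmul(self): self._binary(lambda s, f: s * f)   # models V*
    def vdiv(self): self._binary(lambda s, f: s / f)   # models V/ (second/first)

    def vdup(self): self.frames.append(list(self.frames[-1]))    # models VDUP
    def vover(self): self.frames.append(list(self.frames[-2]))   # models VOVER
    def vdrop(self): self.frames.pop()                           # models VDROP

    def vsin(self):                                              # models VSIN
        self.frames[-1] = [math.sin(x) for x in self.frames[-1]]

    def vcos(self):                                              # models VCOS
        self.frames[-1] = [math.cos(x) for x in self.frames[-1]]

    def vfill(self, value):                                      # models VFILL
        self.frames.append([value] * self.frame)

    def vramp(self, start, step):                                # models VRAMP
        self.frames.append([start + i * step for i in range(self.frame)])
```

With a frame size of 3, pushing the ramp 1, 2, 3 and then a frame filled with 2 and calling vsub leaves the single frame -1, 0, 1, because the subtraction is second minus first.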

9. DIAGNOSTIC TOOLS

A few commands for diagnosing the AP system and its operations are also defined for the user to inspect the registers and some parts of the memories in the AP120B.

STATUS ( --- )

Display the contents of all the registers in the AP120B addressable by the Harris computer. It is a powerful tool for inspecting the current status of the AP for diagnostics.

APSTEP ( --- )

Cause the AP to execute the next instruction in the PS memory.

APCONT ( --- )

Continue the AP execution from the point of last interruption.

10. A DEMONSTRATION SESSION

To demonstrate the interactive features of this AP operating system, a short terminal session was recorded and is shown in Figure 24. Prior to entering this AP operating system, an AP load module must be loaded into the PS memory. The load module listing, shown in Figure 25, contains only a set of entry points with subroutine calls to the AP math library. It was assembled using APAL, and the load module was generated by APLINK. The module with the library routines is then loaded into the PS memory by APDBUG. A list of constants, shown in Figure 26, is also loaded via APDBUG into the MD memory for testing. After the AP is thus configured, the FORTH module is loaded into the Harris 80 computer. The AP control program shown in Listing 14 is then compiled into the FORTH dictionary. At this point all the AP commands defined in the control program are available for execution and compilation, as commanded by the user through the Harris console terminal.

In the demonstration session shown in Figure 24, we started by pushing some constant vectors on the vector stack and performed a vector add. The contents of the stack were displayed using the V. command. Then a ramp vector was pushed on the stack and converted to a sine vector. The last part of the demonstration showed a new function, VTAN, being defined and tested. The ability to compile new commands from the existing command set interactively is one of the very powerful features of the FORTH language.

11. CONCLUSION

It has been demonstrated that the FPS AP120B array processor can be used interactively under the control of a FORTH operating system. Here the AP120B is used as a fixed instruction set slave processor to process vector arrays of fixed length and identical format. These restrictions reduce the APEX overhead to a minimum because the library subroutines are loaded into the PS memory only once and remain static there. The FORTH operating system only has to start the AP at known entry points in order to invoke the needed AP functions. The fixed vector format allows vector arrays to be manipulated using a LIFO vector stack structure. The vector stack greatly reduces the overhead in passing SPAD parameters because vector addresses can be computed automatically. AP commands can therefore be issued using very simple syntax, making the AP a very friendly computing device.

The limitations on the fixed instruction set and on the fixed data format are artificial for this demonstration implementation. It is possible to reload or overlay the PS memory with new subroutines, so that the AP can be reconfigured dynamically for multi-task applications. Any vector data structure can be accommodated if the user is willing to specify all the SPAD parameters explicitly.

A microassembler (3) to develop microcode for bit-slice microprocessors was completed in this laboratory, under a FORTH operating system. It can be adapted to the Harris-AP120B system so that AP microcode can be written and tested interactively. This facility, once completed, will allow us to bypass all the vendor software tools and use the array processor under a single operating system using a single language.

REFERENCES

1. C. H. Ting, 'fig-FORTH for NOVA-Computer', FORTH Interest Group, San Carlos, CA, 1981.

2. See 'FORTH for the Harris 80 Computer' in this volume.

3. C. H. Ting, 'Microassembler' in 'Forth Notebook', Offete Enterprises, 1983, pp. 136-169.
