
Programmer's report for my Texas Instruments TMS320C30 simulator.

by Chris Moy

December 5, 1996

Supervising Professor:  Dr. Brian L. Evans


Commands

The C30 simulator is controlled through a command-line interface.
These are the current commands and their functions.

reset
flushes the pipeline, clears the status register, and resets the cycle count

load <filename> <address>
loads the contents of the file into simulated memory, starting at the
specified address, and sets the program counter to that address

run
executes the current program until a breakpoint is encountered

step
executes one C30 cycle

+break <address>
adds a breakpoint at the specified address

-break <address>
deletes the breakpoint at the specified address

modify <location> <value>
sets the contents of the specified location (register or address)
to the new value

read <location>
displays the contents of the specified location (register or address)

cycle
displays the number of C30 cycles executed since the last reset

+dasm
turns on instruction decode display

-dasm
turns off instruction decode display (default)

Opcode Status

The instruction routines are the least developed part of the code.
Most of the opcodes are implemented, but not completely.  The opcodes
that are not completed are:

	PUSHF
	RETI
	RETIc
	RETS
	RETSc
	ROR
	ROL
	RORC
	ROLC
	RND
	STF
	SUBC
	SUBF
	TRAP
	TRAPc
	TSTB
	SWI

The floating-point instructions that are implemented are not 100%
compatible because their values are stored as 32-bit numbers instead
of the required 40 bits.  All opcodes should be tested and examined
carefully; this was not my primary focus in this project.  Completing
the instruction routines should, perhaps, be the next development
stage of the simulator.
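
The effect of the 32-bit storage can be sketched in C.  This is only an
illustration, not the simulator's code.  It assumes the extended-precision
register layout is an 8-bit exponent followed by a 32-bit mantissa field
(sign bit plus 31 fraction bits), and that the 32-bit single-precision
format keeps the exponent and the upper 24 bits of that field; check both
against the User's Guide.  The type and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical 40-bit register image: raw exponent and mantissa fields. */
typedef struct {
    uint8_t  exp;   /* 8-bit exponent field (raw bits)              */
    uint32_t man;   /* 32-bit mantissa field: sign + 31 fraction    */
} c30_ext_t;

/* Narrow to a 32-bit word: exponent in bits 31-24, upper 24 mantissa
 * bits in bits 23-0.  The low 8 mantissa bits are simply dropped
 * (assumed truncation, no rounding). */
uint32_t c30_ext_to_single(c30_ext_t x)
{
    return ((uint32_t)x.exp << 24) | (x.man >> 8);
}

/* Widen a 32-bit word back to the 40-bit image.  The 8 bits lost by
 * narrowing come back as zeros. */
c30_ext_t c30_single_to_ext(uint32_t w)
{
    c30_ext_t x;
    x.exp = (uint8_t)(w >> 24);
    x.man = w << 8;
    return x;
}
```

A value that passes through the 32-bit form keeps its exponent but loses
the low 8 mantissa bits, which is where results can differ from the
real chip.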

Timing

I tested the simulator in a single-repeat loop, executing an XOR3
instruction 10,000 times, on a Pentium-100 processor.  XOR3 is a
slower-than-average operation.  The total execution time was about
1.1 seconds, or roughly 9,000 cycles per second (10,000 / 1.1).  This
was better than was hoped for.

Improvements

The disassembler/decode stage creates a bottleneck for the simulator.
Most of the time is spent finding the opcode in a table.  Each look-up
step consists of a mask, then a comparison, plus the overhead of
incrementing a counter.  I believe this bottleneck can be completely
removed by using a tree instead of a table.  To find an opcode, the
routine would only have to traverse nine branches of the tree, because
there are nine determining bits in the instruction word.  The result
would be a constant look-up time.
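
A minimal sketch of such a tree in C follows.  It is an illustration under
stated assumptions, not the simulator's code: it assumes the nine
determining bits are the top nine bits of the 32-bit instruction word, and
the names node_t, tree_insert, tree_lookup, and the integer opcode
identifiers are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define KEY_BITS 9   /* nine determining bits, as stated in the report */

typedef struct node {
    struct node *child[2];   /* branch on one key bit            */
    int opcode_id;           /* valid only at depth KEY_BITS     */
} node_t;

node_t *node_new(void)
{
    node_t *n = malloc(sizeof *n);
    if (n == NULL)
        abort();
    n->child[0] = n->child[1] = NULL;
    n->opcode_id = -1;
    return n;
}

/* Assumption: the key is the top nine bits of the instruction word. */
static unsigned key_of(uint32_t word)
{
    return (unsigned)(word >> 23) & 0x1FFu;
}

/* Add one opcode under its 9-bit key, creating nodes as needed. */
void tree_insert(node_t *root, unsigned key, int opcode_id)
{
    assert(key < (1u << KEY_BITS));
    node_t *n = root;
    for (int bit = KEY_BITS - 1; bit >= 0; bit--) {
        unsigned b = (key >> bit) & 1u;
        if (n->child[b] == NULL)
            n->child[b] = node_new();
        n = n->child[b];
    }
    n->opcode_id = opcode_id;
}

/* Follow at most nine branches; return -1 if no opcode matches. */
int tree_lookup(const node_t *root, uint32_t word)
{
    unsigned key = key_of(word);
    const node_t *n = root;
    for (int bit = KEY_BITS - 1; bit >= 0 && n != NULL; bit--)
        n = n->child[(key >> bit) & 1u];
    return n != NULL ? n->opcode_id : -1;
}
```

Every look-up follows at most nine pointers regardless of how many
opcodes are registered.  A flat 512-entry array indexed by the same nine
bits would give the same constant-time behavior without the pointer
chasing, at the cost of filling every slot up front.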

Future additions

There are several possible future development projects for this simulator.
These include:

Graphical user interface - for several platforms

Step specification - the run routine accepts the number of cycles to
run as input, but the command-line interface does not provide an option
to specify this number; add this feature

Memory map update - include peripheral bus

Cache - write an algorithm in the fetch stage that checks for cache hits
and misses and writes to the cache when necessary

Interrupts - include interrupt register checking, timers, external
port checking, DMA

Alternate modes - currently C30 microcomputer mode only; add C31 and
microprocessor modes

Power consumption - add feature to calculate the power consumption of
the chip

Error checking - currently there is very little user-error checking;
add error checking to improve stability

Breakpoint improvements - currently the simulator checks every
breakpoint value between every cycle; make breakpoints a linked list
with insertions, or allow the user to specify a maximum number of
breakpoints
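
The linked-list idea for breakpoints can be sketched in C as follows.
This is one possible design, not the simulator's code: it keeps the list
sorted by address so that a check can stop early, and the names bp_t,
bp_add, bp_remove, and bp_hit are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct bp {
    uint32_t   addr;
    struct bp *next;
} bp_t;

/* Insert addr in ascending order.  Returns 0 on success (or if the
 * breakpoint already exists), -1 if allocation fails. */
int bp_add(bp_t **head, uint32_t addr)
{
    bp_t **p = head;
    while (*p != NULL && (*p)->addr < addr)
        p = &(*p)->next;
    if (*p != NULL && (*p)->addr == addr)
        return 0;
    bp_t *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->addr = addr;
    n->next = *p;
    *p = n;
    return 0;
}

/* Delete the breakpoint at addr.  Returns 0 if removed, -1 if absent. */
int bp_remove(bp_t **head, uint32_t addr)
{
    bp_t **p = head;
    while (*p != NULL && (*p)->addr < addr)
        p = &(*p)->next;
    if (*p == NULL || (*p)->addr != addr)
        return -1;
    bp_t *dead = *p;
    *p = dead->next;
    free(dead);
    return 0;
}

/* Return 1 if pc matches a breakpoint.  Because the list is sorted,
 * the walk stops at the first address greater than pc. */
int bp_hit(const bp_t *head, uint32_t pc)
{
    for (; head != NULL && head->addr <= pc; head = head->next)
        if (head->addr == pc)
            return 1;
    return 0;
}
```

With this layout, +break and -break would map to bp_add and bp_remove,
and the run loop would call bp_hit with the program counter once per
cycle.  There is no fixed limit on the number of breakpoints.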

Source

Programmers should refer to the TMS320C3x User's Guide to add features.
This will be essential for completing the instruction set and adding
hardware emulation features.
