ADD: SD/MMC interface for SCSI hard disc emulation.
ENH: FPU Overhaul. Pipelined operations.
ENH: PIPE5 fast stores.
And various tweaks here and there.
Fast stores for PIPE5
This is a small optimization for the forms of the ST (store) instruction that need only two register reads.
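To make "two register reads" concrete, here is a minimal sketch that counts the register operands a store needs. It assumes the SPARC V8 format-3 encoding (op in bits 31:30, rd in 29:25, op3 in 24:19, rs1 in 18:14, the i bit at 13, rs2 in 4:0), which this post does not state explicitly. The helper names are hypothetical, and exactly which forms take the fast path in PIPE5 is not specified here.

```python
# Integer store opcodes (op3 field) in SPARC V8, assumed for this sketch.
ST_OP3 = {0x04: "ST", 0x05: "STB", 0x06: "STH", 0x07: "STD"}

def encode_store(op3, rd, rs1, rs2=None, simm13=0):
    """Build a format-3 store word (hypothetical helper, for illustration only)."""
    word = (3 << 30) | (rd << 25) | (op3 << 19) | (rs1 << 14)
    if rs2 is None:
        word |= (1 << 13) | (simm13 & 0x1FFF)   # i=1: address = rs1 + simm13
    else:
        word |= rs2                             # i=0: address = rs1 + rs2
    return word

def store_register_reads(word):
    """Return how many registers a store must read, or None if not an integer store."""
    op = word >> 30
    op3 = (word >> 19) & 0x3F
    if op != 3 or op3 not in ST_OP3:
        return None
    uses_imm = (word >> 13) & 1
    data_regs = 2 if op3 == 0x07 else 1   # STD stores a register pair
    addr_regs = 1 if uses_imm else 2      # rs1 alone, or rs1 and rs2
    return data_regs + addr_regs

st_imm = encode_store(0x04, rd=8, rs1=9, simm13=8)   # st %o0, [%o1 + 8]
st_reg = encode_store(0x04, rd=8, rs1=9, rs2=10)     # st %o0, [%o1 + %o2]
```

Under these assumptions, the immediate-offset form reads two registers (data and base) while the register-offset form reads three, which is presumably why only some forms qualify.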
From the beginning, the FPU was meant to be pipelined. Floating-point instructions already lasted several cycles, but each one was processed on its own, without overlapping the next. The FPU is a large combinatorial mess (larger than the IU) with double-precision data paths, and it is too “expensive” not to be fully exploited. It needed a big overhaul.
Now, many instructions have ‘one cycle throughput’: as long as their operands are independent, the FPU can begin a new instruction every cycle. The exceptions are double-precision multiplies, divisions, square roots and operations on denormal numbers.
The new architecture also features a few optimizations, like faster forwarding of results between dependent instructions.
Last but not least, instructions now take one cycle less to complete: an FADD now lasts 4 cycles instead of 5.
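The difference between latency and throughput can be illustrated with a toy scheduling model. This is only a sketch under simplifying assumptions: in-order issue, at most one instruction started per cycle, a uniform 4-cycle latency, and a result usable by a dependent instruction exactly when it completes (the real forwarding paths are not described here). The function and register names are made up for the example.

```python
def schedule(instrs, latency=4):
    """Return the cycle at which the last result becomes available.

    instrs is a list of (dest, src1, src2) register names.
    """
    ready = {}        # register name -> cycle at which its value is available
    next_issue = 0    # earliest cycle at which the next instruction may start
    finish = 0
    for dest, src1, src2 in instrs:
        # Wait for the issue slot and for both source operands.
        issue = max(next_issue, ready.get(src1, 0), ready.get(src2, 0))
        done = issue + latency
        ready[dest] = done
        next_issue = issue + 1
        finish = max(finish, done)
    return finish

# Four additions with independent operands: one starts every cycle.
independent = [("f8", "f0", "f1"), ("f9", "f2", "f3"),
               ("f10", "f4", "f5"), ("f11", "f6", "f7")]

# Four additions where each one needs the previous result.
dependent = [("f1", "f0", "f0"), ("f2", "f1", "f1"),
             ("f3", "f2", "f2"), ("f4", "f3", "f3")]

cycles_independent = schedule(independent)  # 7 cycles in this model
cycles_dependent = schedule(dependent)      # 16 cycles in this model
```

In this model the independent sequence finishes in 7 cycles, while the dependent chain needs 16 because each instruction has to wait for the full latency of the one before it.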
In addition to the existing Xilinx SystemACE CompactFLASH interface, it is now possible to use an SD/SDHC/MicroSD/MMCplus/MMCmobile card as the [SCSI] hard disk, using a small daughterboard with an SD connector and a few resistors, plugged into the SP605 board.
This new interface is potentially faster (with good-quality SD/MMC cards) and, since it does not rely on obsolete proprietary chips like SystemACE, it could help with porting the design to other boards…
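For readers unfamiliar with SD cards, the sketch below builds the 48-bit command frame that a host sends to the card: a start bit and a transmission bit, a 6-bit command index, a 32-bit argument, a CRC7 and an end bit. It is not taken from this design, and the post does not say which bus mode the controller uses. The function names are illustrative only.

```python
def crc7(data):
    """CRC7 with polynomial x^7 + x^3 + 1, as used for SD/MMC command frames."""
    crc = 0
    for byte in data:
        for bit in range(7, -1, -1):
            in_bit = (byte >> bit) & 1
            top = (crc >> 6) & 1
            crc = (crc << 1) & 0x7F
            if top ^ in_bit:
                crc ^= 0x09
    return crc

def command_frame(index, argument):
    """Return the 6-byte frame for command `index` with a 32-bit `argument`."""
    body = bytes([0x40 | (index & 0x3F)]) + argument.to_bytes(4, "big")
    # The CRC occupies the upper 7 bits of the last byte; the end bit is always 1.
    return body + bytes([(crc7(body) << 1) | 1])

cmd0 = command_frame(0, 0)  # GO_IDLE_STATE, the first command sent after power-up
```

The CMD0 frame comes out as the bytes 40 00 00 00 00 95, which is the value usually quoted for it.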
The hardware debugger can control CPU operations from a serial port. It can run and stop the CPU, read some registers and make the CPU execute any instruction; it also manages one data breakpoint and one instruction breakpoint.
As there is only one serial port on the SP605 board, it must be shared between the console and the debugger. Toggling between the two is done using either the CTS signal or special “break” characters.
The archive also contains a C program that is both a terminal communicating over the serial port and a debug monitor able to control the hardware debugger. Using this program, you can read and write registers, read and write virtual and physical memory, disassemble code, manage breakpoints, etc.
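The idea of sharing one serial stream between two consumers can be sketched as a small router. Everything specific here is hypothetical: the toggle byte 0x1D is an arbitrary placeholder, not the “break” character actually used, and the real hardware may treat these characters differently (or use CTS instead).

```python
TOGGLE = 0x1D  # hypothetical toggle byte, chosen only for this example

def route(stream, toggle=TOGGLE):
    """Split a byte stream into (console_bytes, debugger_bytes).

    Bytes go to the console until a toggle byte is seen, then to the
    debugger until the next toggle byte, and so on. The toggle byte
    itself is consumed and never forwarded.
    """
    console = bytearray()
    debugger = bytearray()
    to_debugger = False
    for b in stream:
        if b == toggle:
            to_debugger = not to_debugger
            continue
        (debugger if to_debugger else console).append(b)
    return bytes(console), bytes(debugger)

# Console text, then a debugger command, then console text again.
mixed = b"ls\n" + bytes([TOGGLE]) + b"r\n" + bytes([TOGGLE]) + b"pwd\n"
console_out, debugger_out = route(mixed)
```

An in-band toggle like this costs one byte value that the console can no longer use for itself, which is one reason a side-band signal such as CTS is a useful alternative.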
This debugger program is not complete, but it is already quite useful.
The most important missing feature is probably a GDB interface.
Some blocks were also moved to simplify retargeting the design to other boards, and there are a few tweaks for compatibility with Altera tools (Quartus can compile the non-Xilinx-specific parts of R4)…
All these subjects will be detailed in future articles. The FPU deserves many.
This development cycle was quite difficult; the release comes much later than expected.
And there will certainly be an R4.1 soon!