Introduction

The sf32 family of 32-bit microprocessors targets applications where high performance and small core size matter most. Fixed-length 32-bit instruction coding keeps decoding simple, which enables high clock rates and small core footprints. Multiple ISAs are available, so that control & computing applications as well as DSP applications each get an optimized solution.

sf32b: Base ISA for general purpose control & computing
sf32d: DSP ISA with extensions for high precision DSP & audio applications

Resources

Quick_Reference_Guide

Evaluation_Package (contains free sf32bu)

Details

ISAs

Base DSP

Implementations

sf32bu sf32bl sf32dl

ISAs (Instruction Set Architectures)

Base ISA features

The sf32b is a 32-bit microprocessor architecture for embedded control & computing applications. The main focus of the ISA definition is on high clock rates and small core implementations.

The sf32b is a load/store architecture. All operands of computation instructions are either constants or contained in registers. Load/store instructions are used to transfer operands between registers and memory.

The sf32b defines a generic and complete instruction set that high-level-language compilers can target efficiently.

Features

Harvard architecture with separate instruction and data buses
4 GB instruction address space
4 GB data address space
Fixed-length 32-bit instruction coding
16 interrupts with programmable start addresses
24 x 32-bit general purpose registers plus 7 special registers
System (protected) and application operation modes
Native support for 8-bit, 16-bit and 32-bit signed and unsigned integer data types
Higher-precision integer and floating-point data types supported by multi-instruction sequences
Rich set of load/store addressing modes, including indirect with index and update
Little-endian byte ordering
Load/store-multiple instructions for code-efficient copying and function prologue/epilogue
Bit manipulation & test instructions: set, clear, toggle & test
32 x 32-bit multiply with either the high 32-bit word or the low 32-bit word of the result
Instructions for endianness conversion
Flexible debug concept with application specific debug modules
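
The multiply, bit manipulation and endianness features listed above can be illustrated with a small behavioural model. This is only a sketch of the arithmetic, not the sf32b instruction set: all function names are hypothetical, and whether the high-word multiply is signed, unsigned or both is an assumption here (the ISA_Reference_Manual is the authority).

```python
MASK32 = 0xFFFFFFFF  # all values are modelled as 32-bit register contents


def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement signed value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mul_low(a, b):
    """Low 32-bit word of the 64-bit product (identical for signed and unsigned)."""
    return (a * b) & MASK32


def mul_high_unsigned(a, b):
    """High 32-bit word of the unsigned 64-bit product (assumed variant)."""
    return (((a & MASK32) * (b & MASK32)) >> 32) & MASK32


def mul_high_signed(a, b):
    """High 32-bit word of the signed 64-bit product (assumed variant)."""
    return ((to_signed32(a) * to_signed32(b)) >> 32) & MASK32


def byte_swap32(x):
    """Endianness conversion: reverse the four bytes of a 32-bit word."""
    return int.from_bytes((x & MASK32).to_bytes(4, "little"), "big")


def bit_set(x, n):
    return (x | (1 << n)) & MASK32


def bit_clear(x, n):
    return x & ~(1 << n) & MASK32


def bit_toggle(x, n):
    return (x ^ (1 << n)) & MASK32


def bit_test(x, n):
    return (x >> n) & 1


# A full 64-bit product needs two multiplies: one for each result word.
a, b = 0x12345678, 0x9ABCDEF0
full = (mul_high_unsigned(a, b) << 32) | mul_low(a, b)
print(hex(full), hex(byte_swap32(a)))
```

Splitting the product into separate high-word and low-word results keeps every instruction at a single 32-bit destination register, which fits the load/store, fixed-format design described above.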

Resources

ISA_Reference_Manual

DSP extension ISA

The sf32d is an extension of the sf32b base ISA and is fully backward compatible with it. Its main targets are 32-bit DSP in general and audio applications in particular. The DSP extension adds a few special registers, but no general purpose registers, to the programming model of the sf32b ISA. The main additions are addressing modes with memory source operands and special add/subtract instructions that improve the performance of audio and general DSP algorithms. Given the same pipeline architecture and computation resources, implementations of sf32d processors are only slightly larger than base ISA implementations.

The sf32d deviates from the pure load/store architecture of the sf32b: some performance-critical extension instructions take one source operand from memory.

The instructions with additional addressing modes and the special add/subtract instructions cannot easily be used from high-level languages. The intended use model is therefore hand-optimized assembler routines for performance-critical DSP functions, which use the special addressing modes and instructions. The less performance-critical higher layers and control code are written in C and compiled to the sf32b base ISA instruction set.

Extension Features

Multiply-high and MAC (Multiply & Accumulate) instructions with one source operand in memory
Optional 1-bit or 2-bit left-shift before accumulation
Multiple indirect addressing modes for memory source operands with offset, index and auto-update
Add/sub instructions with preceding left-shift of one source operand
32-bit iterative divide instruction
Clip to signed 16-bit, clip to signed maximum and clip to unsigned byte instructions
Dual-entry accumulation extension cache (patented) for sum-of-products calculation with 64-bit precision
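
The clip and MAC features above can be sketched as plain arithmetic. The exact instruction semantics are not given on this page, so the following is a model under stated assumptions: the optional shift is applied to the product before it is added, the accumulator wraps as a 64-bit two's-complement value, and all names are hypothetical. The "clip to signed maximum" instruction is not modelled because its definition is not stated here.

```python
def clip(value, lo, hi):
    """Saturate value into the closed range [lo, hi]."""
    return max(lo, min(hi, value))


def clip_s16(value):
    """Clip to the signed 16-bit range, e.g. for 16-bit audio output samples."""
    return clip(value, -32768, 32767)


def clip_u8(value):
    """Clip to the unsigned byte range."""
    return clip(value, 0, 255)


def wrap_s64(x):
    """Keep x as a 64-bit two's-complement value (assumed accumulator width)."""
    x &= (1 << 64) - 1
    return x - (1 << 64) if x >> 63 else x


def mac(acc, reg_operand, mem_operand, pre_shift=0):
    """Multiply & accumulate: acc + ((reg * mem) << pre_shift).

    pre_shift models the optional 1-bit or 2-bit left shift before
    accumulation; applying it to the product is an assumption.
    """
    if pre_shift not in (0, 1, 2):
        raise ValueError("pre_shift must be 0, 1 or 2")
    return wrap_s64(acc + ((reg_operand * mem_operand) << pre_shift))


def sum_of_products(samples, coeffs, pre_shift=0):
    """FIR-style inner loop: one MAC per (sample, coefficient) pair."""
    acc = 0
    for s, c in zip(samples, coeffs):
        acc = mac(acc, s, c, pre_shift)
    return acc


print(sum_of_products([1, 2, 3], [4, 5, 6]))               # 1*4 + 2*5 + 3*6
print(clip_s16(sum_of_products([300, 300], [100, 100])))   # saturates at 32767
```

In a real routine one operand of each MAC would come straight from memory through one of the auto-update addressing modes, so the loop body needs no separate load instruction for it.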

Implementations

sf32bu

The sf32bu ultra light implementation is focused on low resource consumption. A dual-ported RAM implements the general-purpose registers and most special registers. Instructions with two register source operands have an effective execution time of 2 cycles. A combined ALU/load/store unit shares its resources between computation and load/store instructions. Load/store instructions have an effective execution time of 2 cycles. Shift instructions are executed iteratively at 1 bit per cycle. Together this leads to an average IPC of ~0.5, which is still sufficient for many embedded control & compute applications.
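
To see how these cycle counts add up to an IPC around 0.5, the following sketch estimates the IPC of a short instruction trace. The 2-cycle costs and the 1-bit-per-cycle shift come from the paragraph above; the 1-cycle cost for all other instructions, the instruction kind names and the example trace are assumptions made for illustration.

```python
def sf32bu_cycles(kind, shift_amount=0):
    """Effective cycles of one instruction on the sf32bu (simplified model)."""
    if kind in ("alu_reg_reg", "load", "store"):
        return 2                      # stated: 2 cycles effective
    if kind == "shift":
        return max(1, shift_amount)   # stated: 1 bit per cycle; minimum of 1 is assumed
    return 1                          # assumption: everything else takes 1 cycle


def estimate_ipc(trace):
    """trace is a list of (kind, shift_amount) tuples; returns instructions per cycle."""
    cycles = sum(sf32bu_cycles(kind, amount) for kind, amount in trace)
    return len(trace) / cycles


# Hypothetical trace: load, add constant, add two registers, shift by 4, store.
trace = [
    ("load", 0),
    ("alu_reg_imm", 0),
    ("alu_reg_reg", 0),
    ("shift", 4),
    ("store", 0),
]
print(f"estimated IPC: {estimate_ipc(trace):.2f}")
```

Under this model, code dominated by loads, stores and two-register operations settles at an IPC of 0.5, and long shifts pull it lower.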

Features

Focus on low resource consumption
Size: ~2300 LEs + 1 block RAM on FPGAs
32-bit/32-bit instruction/data buses
Register-file with 1/1 read/write ports, can be implemented as dual-ported RAM
Average IPC (instructions per cycle) of ~0.5
Max clock ~110 MHz on low-end FPGAs
Iterative shift execution, 1 bit per clock

Resources

IMA_Reference_Manual

sf32bl

The sf32bl light implementation is focused on high performance at moderate resource consumption. A register file with 3 read ports ensures that every instruction can potentially execute in one effective cycle. A decoupled unit for instruction fetch and flow-instruction execution, together with separate execution units for computation and for load/store instructions, enables high pipeline throughput. Branch speculation, a loop cache and conditional instructions minimize the performance penalties of program flow changes. Average IPC depends strongly on the instruction sequence, e.g. on branches and operand dependencies. Performance-optimized sequences can get close to an IPC of 1; with the loop cache, loops can execute with an IPC > 1.

Features

Focus on performance
Size: ~5600 LEs on FPGAs
32-bit/32-bit instruction/data buses
Register-file with 3/2 read/write ports
Max clock ~110 MHz on low-end FPGAs
Average IPC (instructions per cycle) ~0.8
Decoupled unit for instruction fetch and flow-instruction execution
Branch speculation
Separate execution pipelines for computation and load/store instructions
Barrel-Shifter, single cycle effective shift execution
Loop Cache, zero-cycle loop branch from 2nd iteration
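
A zero-cycle loop branch is what makes an IPC above 1 possible: the branch still counts as an executed instruction but consumes no cycle of its own. The sketch below works through that arithmetic. It assumes an ideal loop body with one effective cycle per instruction and a hypothetical cost of one cycle for the loop branch in the first iteration; neither figure is taken from this page.

```python
def loop_ipc(body_instructions, iterations, first_branch_cycles=1):
    """IPC of a loop whose branch is free from the 2nd iteration on.

    Assumptions: every body instruction takes one effective cycle and
    the loop branch costs first_branch_cycles only in the first iteration.
    """
    instructions = (body_instructions + 1) * iterations   # body + branch
    cycles = body_instructions * iterations + first_branch_cycles
    return instructions / cycles


for body in (2, 4, 8):
    print(f"{body}-instruction body, 100 iterations: IPC ~{loop_ipc(body, 100):.2f}")
```

The gain is largest for short loop bodies, where the branch makes up a larger share of the executed instructions.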

Resources

IMA_Reference_Manual

sf32dl

The sf32dl light implementation has the same basic pipeline architecture as the sf32bl. The only difference is an extra execution unit for multiply/MAC instructions. To support this unit, the register file is upgraded to 4/3 read/write ports.

Features

Focus on DSP performance and precision
Size: ~7500 LEs on FPGAs (estimated)
32-bit/32-bit instruction/data buses
Register-file with 4/3 read/write ports
Max clock ~100 MHz on low-end FPGAs
Average IPC (instructions per cycle) ~0.9
Barrel-Shifter, single cycle effective shift execution
Loop Cache, zero-cycle loop-back branch from 2nd iteration
Single cycle effective MAC instructions with one register and one memory source operand
Non-blocking divide
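
The three implementations can be compared with a rough throughput estimate of average IPC times maximum clock. The sketch below uses only the approximate figures quoted in the feature lists; the dictionary layout and function names are hypothetical. Real throughput depends on the workload, the sf32dl size is an estimate, and the block RAM used by the sf32bu is not counted.

```python
# Approximate figures quoted in the feature lists of each implementation.
IMPLEMENTATIONS = {
    "sf32bu": {"ipc": 0.5, "mhz": 110, "les": 2300},  # plus 1 block RAM
    "sf32bl": {"ipc": 0.8, "mhz": 110, "les": 5600},
    "sf32dl": {"ipc": 0.9, "mhz": 100, "les": 7500},  # size is estimated
}


def mips(ipc, mhz):
    """Million instructions per second at the given average IPC and clock."""
    return ipc * mhz


def summarize(implementations):
    """Return MIPS and MIPS per 1000 LEs for each implementation."""
    rows = {}
    for name, p in implementations.items():
        m = mips(p["ipc"], p["mhz"])
        rows[name] = {"mips": m, "mips_per_kle": m / (p["les"] / 1000)}
    return rows


SUMMARY = summarize(IMPLEMENTATIONS)
for name, row in SUMMARY.items():
    print(f"{name}: ~{row['mips']:.0f} MIPS, "
          f"~{row['mips_per_kle']:.1f} MIPS per 1000 LEs")
```

On these numbers the sf32bl and sf32dl deliver similar raw instruction throughput, while the sf32bu gives the most throughput per logic element. The comparison counts instructions only, so it does not reflect the extra work a single sf32dl MAC instruction performs.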

sf32 family of 32-bit processors
