Update README.md
gabriel-tenma-white authored Sep 1, 2019
1 parent a7cd9be commit 8b4cfad
Showing 1 changed file with 10 additions and 12 deletions.
22 changes: 10 additions & 12 deletions README.md
@@ -43,7 +43,7 @@ Resource usage is on par with Xilinx FFT IP core, and Fmax is up to 30% higher f
 # Architecture
 The basic architecture is based on subdividing a size N = N1*N2 FFT into N2 FFTs of size N1 followed by reordering and multiplication by twiddle factors, then N1 FFTs of size N2.

-![block diagram](overview.png)
+![block diagram](docs/diagrams/overview.png)

 # Usage
 Top level VHDL code is generated by the script codegen/gen_fft.py. The VHDL sub-blocks in this repository are referenced by the generated code.
@@ -56,25 +56,23 @@ fft4096 = \
 FFT4Step(4096,
 FFT4Step(64,
 FFT4Step(16,
-FFTBase(4, 'fft4_serial3', 'SCALE_NONE'),
-FFTBase(4, 'fft4_serial3', 'SCALE_NONE')),
-FFTBase(4, 'fft4_serial3', 'SCALE_NONE')),
+fft4_scale_none,
+fft4_scale_none),
+fft4_scale_none),
 FFT4Step(64,
 FFT4Step(16,
-FFTBase(4, 'fft4_serial3', 'SCALE_DIV_N'),
-FFTBase(4, 'fft4_serial3', 'SCALE_DIV_N')),
-FFTBase(4, 'fft4_serial3', 'SCALE_DIV_N')),
+fft4_scale_div_n,
+fft4_scale_div_n),
+fft4_scale_div_n),
 16); # twiddleBits
 ```
-FFTBase represents a base FFT implementation (butterfly), and FFT4Step represents the combination of two sub-FFTs to form a larger FFT of size N1*N2.
-
-Scaling modes for fft4 are SCALE_NONE (do not scale), SCALE_DIV_N (divide by N), and SCALE_DIV_SQRT_N (divide by sqrt(n)). For best accuracy defer scaling until it is necessary (like shown above).
+fft4_scale_none is a base FFT instance (butterfly), and FFT4Step represents the combination of two sub-FFTs to form a larger FFT of size N1\*N2. See user guide for more details.

 To use a generated FFT core it is necessary to generate all the twiddle ROM sizes used (twiddle ROM size is equal to N). For N <= 32 use gen_twiddle_rom_simple.py, and gen_twiddle_rom.py otherwise. On Linux the shell script gen_roms.sh will generate all common twiddle bit sizes and depths.

-Data input and output order are described as an address bit permutation. The exact permutation varies by layout and can be found in the comments at the top of generated files. The script also has an option to generate input/output reorderers for a given FFT definition. For a natural order FFT use do_gen_fft which also generates reorderers and a wrapper (that instantiates and wires up the reorders and the FFT core).
+Data input and output order are described as an address bit permutation. The exact permutation varies by layout and can be found in the comments at the top of generated files. The do_gen_fft script will generate reorderers and natural order wrappers in addition to the FFT core.

-When composing a FFT layout prefer 4-FFT butterflies over 2-FFT. The new 4-FFT butterfly (fft4_serial7) internally uses a single double-rate 2-FFT butterfly multiplexed between input and internal transposer data for the lowest possible LUT usage. No DSP units are used inside a 4-FFT butterfly. A 2-FFT butterfly is only needed for FFT sizes that are not perfect square.
+When composing a FFT layout prefer 4-FFT butterflies over 2-FFT. Several 4-FFT butterfly implementations are provided, each with a different tradeoff between area and speed. A 2-FFT butterfly can be used for FFT sizes that are not perfect square.

 **Timing constraints**

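For readers who want to check the N = N1*N2 decomposition described in the Architecture section of the README, the following is a minimal software sketch in plain Python. It is not the repository's generator or its VHDL, and the names `dft` and `fft_four_step` are hypothetical helpers introduced only for this illustration. It assumes input index n = N2*n1 + n2 and output index k = k1 + N1*k2, which is one common convention; the hardware's actual data ordering is given by the address bit permutation in the generated files.

```python
import cmath


def dft(x):
    """Naive O(n^2) DFT, used both as the base transform and as a reference."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def fft_four_step(x, n1, n2):
    """Compute a size n1*n2 DFT from sub-transforms of size n1 and n2.

    Hypothetical illustration only; index convention assumed:
    input n = n2*i1 + i2, output k = k1 + n1*k2.
    """
    n = n1 * n2
    assert len(x) == n

    # Step 1: n2 sub-FFTs of size n1, one per stride-n2 slice of the input.
    cols = [dft(x[i2::n2]) for i2 in range(n2)]  # cols[i2][k1]

    # Step 2: multiply by twiddle factors W_N^(i2*k1).
    for i2 in range(n2):
        for k1 in range(n1):
            cols[i2][k1] *= cmath.exp(-2j * cmath.pi * i2 * k1 / n)

    # Step 3: reorder, then n1 sub-FFTs of size n2 (one per k1).
    out = [0j] * n
    for k1 in range(n1):
        row = dft([cols[i2][k1] for i2 in range(n2)])  # row[k2]
        for k2 in range(n2):
            out[k1 + n1 * k2] = row[k2]
    return out
```

Comparing `fft_four_step(x, 4, 4)` against `dft(x)` for a 16-point input should agree to within floating-point rounding, and nesting the same step recursively mirrors how the `FFT4Step` layout composes larger sizes from size-4 butterflies.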
