doc/transforms.md: add document title and fix heading structure

Add a top-level title and demote former section headings (MD041-style hierarchy).

Add blank lines around headings and fenced code blocks where appropriate (MD022 and MD031-style). Some Markdown parsers, including kramdown, only recognize headings that are preceded by a blank line.
Arien Shibani
2026-04-18 16:05:43 +02:00
committed by Gyan Doshi
parent 6e3366e9bc
commit 849e8307ce
+17 -9
@@ -1,14 +1,17 @@
The basis transforms used for FFT and various other derived functions are based
on the following unrollings.
# Transforms
The basis transforms used for FFT and various other derived functions are based on the following unrollings.
The functions can be easily adapted to double precision floats as well.
# Parity permutation
## Parity permutation
The basis transforms described here all use the following permutation:
``` C
void ff_tx_gen_split_radix_parity_revtab(int *revtab, int len, int inv,
int basis, int dual_stride);
```
Parity means even and odd complex numbers will be split, e.g. the even
coefficients will come first, after which the odd coefficients will be
placed. For example, a 4-point transform's coefficients after reordering:
@@ -33,7 +36,8 @@ register or 0. This allows to reuse SSE functions as dual-transform
functions in AVX mode.
If length is smaller than basis/2 this function will not do anything.
# 4-point FFT transform
## 4-point FFT transform
The only permutation this transform needs is to swap the `z[1]` and `z[2]`
elements when performing an inverse transform, which in the assembly code is
hardcoded with the function itself being templated and duplicated for each
@@ -80,7 +84,8 @@ static void fft4(FFTComplex *z)
}
```
# 8-point AVX FFT transform
## 8-point AVX FFT transform
Input must be pre-permuted using the parity lookup table, generated via
`ff_tx_gen_split_radix_parity_revtab`.
@@ -193,7 +198,8 @@ This theme continues throughout the document. Note that in the actual assembly c
the paths are interleaved to improve unit saturation and CPU dependency tracking, so
to more clearly see them, you'll need to deinterleave the instructions.
# 8-point SSE/ARM64 FFT transform
## 8-point SSE/ARM64 FFT transform
Input must be pre-permuted using the parity lookup table, generated via
`ff_tx_gen_split_radix_parity_revtab`.
@@ -305,7 +311,8 @@ static void fft8(FFTComplex *z)
Most functions here are highly tuned to use x86's addsub instruction to save on
external sign mask loading.
# 16-point AVX FFT transform
## 16-point AVX FFT transform
This version expects the output of the 8 and 4-point transforms to follow the
even/odd convention established above.
@@ -445,7 +452,8 @@ static void fft16(FFTComplex *z)
}
```
# AVX split-radix synthesis
## AVX split-radix synthesis
To create larger transforms, the following unrolling of the C split-radix
function is used.
@@ -705,8 +713,8 @@ beginning to overlap, particularly `[o1]` with `[0]` after the second iteration.
To iterate further, set `z = &z[16]` via `z += 8` for the second iteration. After
the 4th iteration, the layout resets, so repeat the same.
## 15-point AVX FFT transform
# 15-point AVX FFT transform
The 15-point transform is based on the following unrolling. The input
must be permuted via the following loop: