This series of blog posts describes the development of a pure Haskell arbitrary length fast Fourier transform library. This goes well beyond the usual textbook treatment of the FFT, dealing with efficient decomposition of transforms of arbitrary lengths (i.e. not just powers of two), specialised transforms for small sizes and treatment of prime-length transforms using Rader’s algorithm.

As well as implementation of the main algorithms, there’s quite a bit of material covering empirical optimisation of this type of numerical code, using profiling and benchmarking to guide the optimisation process, and using gradual refinement with localised mutability, unboxing and other optimisations to eventually get within a factor of five or so of the vector-fftw package (“the fastest Fourier transform in the West”).

1. The Discrete Fourier Transform: Introduction to the DFT and a naïve implementation.

2. Matrix factorisation: The Cooley-Tukey approach to factorising Fourier matrices.

3. Cooley-Tukey for Powers of Two: The “textbook” Cooley-Tukey algorithm for powers-of-two input lengths.

4. A Simple Application: Audio filtering example.

5. Transforming Vectors of Arbitrary Lengths: Extending the Cooley-Tukey idea to arbitrary input lengths.

6. Implementing the Mixed-Radix FFT: Haskell code (and testing) for the arbitrary length Cooley-Tukey algorithm.

7. Where We’ve Got To, Where We’re Going: Taking stock.

8. Benchmarking Experiments: Benchmarking with Criterion, and a plan for optimisation.

9. Prime-Length FFT With Rader’s Algorithm: Dealing with prime input lengths– some number theory, some group theory, and achieving $O(N \log N)$ scaling.

10. Building a package: Code organisation.

11. Optimisation Part 1: Mutable vectors, unboxing.

12. Optimisation Part 2: Plan generation: heuristics and empirical plan selection.

13. Optimisation Part 3: Improvements to Rader’s algorithm, compiler flags, final benchmarking.

14. Wrap-up: What we’ve done, lessons learned, what’s left to do.