Haskell FFT 1: The Discrete Fourier Transform
This is the first of a series of posts about implementing the Fast Fourier Transform in Haskell (and trying to make it go fast). This part is going to be pretty pedestrian, but we need to lay some groundwork before the interesting stuff.
Fourier transforms turn up everywhere. They’re important in pure mathematics, but they’re also used in pretty much any application that deals with time series: filtering, signal analysis, numerical solution of differential equations and many others. While the classical theory of Fourier analysis is based on functions, in applications we most often deal with discrete series of values sampled from a function. The discrete Fourier transform is how we do Fourier analysis in this setting.
In this series of posts, we’re not going to look very much at applications (well, maybe we’ll have one little exception a bit later) and we’re not going to talk a lot about the theory behind the Fourier transform. Instead, after briefly introducing the discrete Fourier transform, we’re going to use the fast Fourier transform algorithm and some variations to explore some aspects of Haskell programming. In particular, we’ll look at how Haskell deals with complex numbers, vectors, profiling and benchmarking, meta-programming and finally, as our pièce de résistance, we’ll use all of this to build a Haskell fast Fourier transform package that does compile-time empirical optimisation for arbitrary-sized transforms.
In practice, if you want to do Fourier transforms of discrete data, you use something called FFTW (“the Fastest Fourier Transform in the West”). Our code has no chance of competing with FFTW, which is incredibly cleverMatteo Frigo and Steven G. Johnson, “The Design and Implementation of FFTW3,” Proceedings of the IEEE 93 (2), 216-231 (2005). Invited paper, Special Issue on Program Generation, Optimization, and Platform Adaptation. PDF here, but we can demonstrate some of the methods that they use and learn some techniques along the way that are applicable to more complicated problems. (There are Haskell bindings for FFTW–the vector-fftw
package is a good choice.)
Visualising Large Numbers
I was out walking Winnie the other morning and got to thinking about the storage capacity of neural systems (I’m reading Nils Nilsson’s The Quest for Artificial Intelligence at the moment, which was what triggered this line of thought). I’ll write about that in another article, but I realised as I was figuring through this that I would end up talking about some rather large numbers. Everyone throws around big numbers (millions, billions, trillions, and up and up), but how often do you stop to visualise what these things mean?
There are some nice animations around that help with thinking about relative scales (for instance, this one is really good). What these things don’t do is to give you a sense of numbers. What does a billion of something look like? Can you get a really strong physical feeling for numbers of this magnitude and larger?
Here, I’ll show you what I do. It’s nothing ground-breaking, and it’s kind of obvious once you get started, but it is effective. It’s also quite a nice mental exercise.
Data Analysis in Haskell
I’m in the process of starting up a blogging series, to write about Data Analysis in Haskell. This is a sort of introduction/manifesto...
Cruel Britannia
by Ian Cobain
It used to be that there was a tiny splinter of a fragment of me that was proud to be British. Tolerance, fair play, all that nonsense. Glossing over Blighty’s imperial adventures, surely we could be just a little proud of the legacy of our great nation? Nothing jingoistic, of course, just a low-key sort of inoffensive smugness.
That splinter of a fragment had been feeling a bit dull and lustreless for a while now. We had Afghanistan, Iraq and associated war crimes, the “War on Terror” (a.k.a. Bash A Brown Person For Britain), all the NSA/GCHQ spying revelations, and on and on it went. And of course Tony Blair isn’t rotting in a cell in The Hague which, if you ever needed another one, is a glaring signal that whatever justice there is in the world, it never intrudes into the worlds of the “great” and the “good”.
And then I read Cruel Britannia. Oh jolly day.
Oh, to be a Dutchman
Part of the “flat route”
We’ve started running recently, with the slightly ridiculous goal of doing a marathon up a mountain in almost exactly a year’s time. (Idiots.)
All the stuff you read about running for beginners says “Stay away from the hills! Do nice easy runs on the flat!”. They talk about keeping to a conversational pace, or running so that you take one in-and-out breath for eight steps, or something like that. Holland would be perfect for beginner runners–nothing very steep, and certainly no long hills to wear your legs out.
Back to School!
The last couple of weeks have been a bit of a shock to the system. Rita went back to school two weeks ago, which meant that I went back to walking Winnie a couple of times a day (not quite every day, because Rita’s schedule often has gaps, but certainly a lot more than I’d done over the summer). Of course, I wanted to keep on doing all the other things I’d been doing: work (standing at my desk for 8+ hours per day), yoga (an hour or so every morning), circuit training (3-4 times per week), running (4 times a week), and so on.
Anyone who wasn’t an idiot would have seen where this was going. I was completely knackered by the end of the first week, was suffering a bit from “being on my feet”-related malaise (pain while running, pain while walking, pain while standing: oops). Silly boy.
Multilingual Blinkenlights!
Lights, mid-blink
I got hold of a BeagleBone Black a few weeks ago (courtesy of Tom Nielsen, who’d had it for a while, but had no time to play with it). This is a small (credit card sized) Linux machine with an ARM processor, intended for, well, pretty much anything you can use a little computer for. It has 512 Mb of RAM, plus 2 Gb of flash storage, so it’s not a completely trivial machine.
Obviously, to do anything really significant with it requires some hardware work (there are lots of general purpose I/O pins to play with, plus UARTs, Ethernet, USB, SPI, a couple of analogue-to-digital converters, PWM drivers for motor control, and even HDMI video output!), but there are a few LEDs on the board itself that lend themselves to a blinkenlights demo...
The BeagleBone comes with some built-in software to let people write code quickly, using JavaScript of all things. There’s also a bit of documentation about programming the thing in C. But (of course) I wanted Haskell blinkenlights!
So, the goal was to write a little bit of code to count in binary on the four user-addressable LEDs on the board, first in Javascript, then in C, then (somehow) in Haskell.
Learning the Haskell FFI with C2HS
I’ve recently started helping with the maintenance of C2HS, a tool for generating Haskell foreign function interface (FFI) bindings from C header files. I started using C2HS because of a Haskell library I was writing to read Unidata NetCDF files. I didn’t fancy writing all of the bindings to the hundreds of functions in the C NetCDF library by hand and C2HS seemed like a good way to get started with the Haskell FFI.
GitHub: Merging Repos, Migrating Issues
For the BayesHive project, we have code split over a number of different Git repositories, all hosted on GitHub. We use GitHub for issue tracking and keeping notes on wikis.
Some of the splitting into separate repositories makes sense (our web app is in its own repository, for example), but some of the splitting is mostly historical and makes things a little inconvenient–the infrastructure for the Baysig statistical modelling language lives in three different repositories, which made sense at one time, but doesn’t now. Having these closely related bits of code in different repositories makes it more difficult to track down regressions (git bisect
doesn’t work across multiple repositories) and can make branching messy (if you’re implementing a feature that requires work in multiple different repositories, you need to remember to create a branch in each of them, otherwise it’s easy to commit inconsistent changes to the master branch in one repo and your topic branch in another repo).
As a result of all this, we decided to merge some of our repositories together. Based on a bit of Googling, this seems to be a relatively common requirement, and the advice that’s on offer out there is a little conflicting and confusing. In this article, I’ll try to reduce some of the confusion related to repository merging, as well as showing how to migrate GitHub issues from the pre-existing multiple repositories into the new merged repository. I’ve written a couple of small Haskell programs to do these tasks–they’re available here.
Marmots!
Here be marmots!
On our trip to Kärnten the other week, we drove up past the Großglockner, the highest mountain in Austria. We didn’t actually see the thing, since it was wreathed in clouds, but there was some other cool stuff to see. There were marmots!
Now, we’ve seen marmots before. They’re pretty common in the mountains here, way up above the treeline. They live in among the rock piles up there and eat the little alpine plants. And Winnie loves them. And hates them.
Haskell Quasiquotation
For BayesHive, one of the things we need to do a lot of is program transformation. The main reason for this is to generate code to represent probability distribution functions derived from descriptions of probability models–the models are described using code written in the probability monad, and this needs to be manipulated quite extensively to determine the PDF, which is needed for the Markov chain Monte Carlo sampling we use for estimating parameters in probability models.
Program transformation is something that Haskell is really great at, since it basically just involves pattern matching against lots of different cases of syntax trees in the language you’re transforming. Unfortunately, this leads to lots of code that looks like this (picking a simple case at random):
... case last rvs of Observed (EApp (EApp (EVar "map") (EVar f)) (EVar nm)) -> rewriteFst f nm _ -> rvs ...
All that stuff with EApp
and EVar
is the explicit AST representation of the expression map f nm
in the language (Baysig) that we’re manipulating.
At first, doing things like this is pretty unavoidable, but wouldn’t it be nice if you could write something like this instead?
... case last rvs of Observed [baysig|map $f $nm|] -> rewriteFst f nm _ -> rvs ...
More Steel, Less Custard
We just spent a few days visiting friends in Kärnten and staying in their little mountain house. Apart from a little bit of rain and needing to extract Thomas’s car from the ditch where he’d decided to park it, it was a fun weekend. The best bit was that I got to spend lots of time talking to Rita, since there was quite a lot of driving. Work and Winnie things often result in us spending less time together than I’d like, so this was a nice change.
One day, the conversation came round to what it means to be “hard-core”. Rita said “We used to be tough!” and she was right. We used to do lots of things like white water kayaking, mountain biking and (especially) caving. Caving is funny: at least in the UK, if someone is a good caver, you don’t call them a “good” caver, you call them a “hard” caver. The sort of activity that often involves lying in cold water in the dark, crawling through mud for hours on end, has more of a requirement for toughness than most other things.
So, we did use to be pretty tough. Complaining was allowed, but you still had to suck it up and get on with whatever it was we were doing, even if it involved 600 metres of crawling through a narrow rocky tube resulting later in full-body bruising (Rita’s first and only trip down the notorious Daren Cilau in Wales). We even had a slogan: “More steel, less custard!” was the cry when spirits were flagging.
Angular ui-router with Yesod
For BayesHive, we have a fairly big Yesod web application plus a lot of client-side JavaScript code. We’ve gone through a couple of iterations of how we organise all this. We started with a quick and nasty manual approach to chaining state between different pages of the app. That was pretty horrible. Then, as described in an earlier article, we switched to using AngularJS in the browser, which made the JavaScript side of things much nicer, along with Michael Snoyman’s yesod-angular
module for managing the interaction between the client and server. Eventually, we found that Angular’s default routing provider wasn’t flexible enough for our needs in the browser (we have quite a few “wizard”-type stateful interactions, plus a fairly complex dashboard with lots of more independent states).
So we decided to switch over to using the Angular ui-router
state-based routing system. This is pretty good and is definitely flexible enough for our current needs. It allows you to define a hierarchical structure of UI states, cleanly controlling transitions between states and assigning URLs to states to allow good interaction with the browsers back and forward buttons and to allow bookmarking of application states. The only problem? Not something that the yesod-angular
code supports...
Twenty Palaces
by Harry Connolly
Child of Fire, Game of Cages, Circle of Enemies
I have a bit of a science fiction and fantasy habit. If you read a lot of SF&F, it can be hard to find the diamonds among the dross. For some reason, the sub-genre usually called urban fantasy has more than its fair share of dross, which made it a pleasant surprise to discover Harry Connolly’s Twenty Palaces series. These three novels (plus a more recently published prequel) hit a lot of the usual urban fantasy tropes (fairly violent, noirish characters, magic that isn’t all ponies and rainbows, a “world behind the world”) but it confounds a lot of others in ways that make it anything but dross.
All of which makes it hard to understand why these books haven’t been much more successful than they have, and why Connolly has had to abandon the series for the moment to concentrate on other things. From what he says, his publisher (Del Rey) was unfailingly supportive, but the market just doesn’t seem to like what he was writing. He wrote a long blog article about why this might be, but it makes little sense–the most common complaints about the series are about the things that make it interesting and different!
Getting down and dirty with Haskell
I’ve started using c2hs recently, for a Haskell NetCDF library I’m writing (which might see the light of day in a couple of months). I like c2hs, so I volunteered to help out with maintenance. The first thing we did was transfer the source code repository from Darcs to GitHub, which was easy, using Steve Purcell’s darcs-to-git utility. Once that was done, the next task was to transfer all the tickets on the c2hs Trac site to GitHub’s issue tracking. That’s what I want to talk about here.