I found myself doing something weird recently. Not for the first time, of course, but this was a special kind of weird.
I was editing a file that’s part of my website (in Emacs, of course; Emacs abides). It’s the implementation of the search form (reachable from the little magnifying glass in the nav bar). This ends up as a normal HTML page, for sure, but it’s written using an unholy mishmash of Pollen syntax, Alpine.js custom attributes and JavaScript. That makes it unenjoyable to edit, since syntax highlighting and language-aware indentation don’t really work.
This mixed-language thing is well known from React, Vue, Svelte and all those JavaScript frameworks that (one way or another) put HTML templates, CSS and JavaScript in the same file. People using React, Vue or Svelte enjoy the economies of scale that come from using popular things, and there are nice editor modes for them.
If you do weird stuff, you’re on your own.
Actually, you’re not on your own. Polymode has your back. You can read about it below the fold.
Click here to read Effortless Emacs Multiple Major Modes with Polymode
One thing I wanted to add to my new and shiny Pollen website was full-text search. I have about 200 old blog articles on here, and it would be convenient to be able to find things now and then. It seemed like an easy enough thing to do, especially since I only intended to do it at the level of software bricolage, using existing tools.
Click here to read Full-Text Search with Minisearch and Alpine.js
Let’s start with a digression.
It will come as no surprise to anyone who knows me even a little bit that I’m just a tad disappointed with the modern internet. You look back at the visions of the early workers in hypertext and networking, and their open-hearted dreams of a world of free information bring a little tear to the eye. Fast forward to 2021, and the web is 90% spam, social media giants and ad-tech. It’s not the shiny future we were promised! It’s a swamp.
The alternative title for this article was “The FrankenBlog: It Lives!”. You’ll see why “FrankenBlog” in a minute.
The idea here was to convert my website and blog over to use Pollen, Matthew Butterick’s marvellous Racket-based authoring system. The main reason for wanting to do that is that I have a couple of longer-form writing projects in mind, and Pollen gives much more flexibility than writing in Markdown or something similar.
Starting a new blog with Pollen is easy. Unfortunately, I had some history I wanted to preserve, some new ideas I wanted to try, and a whole row of yaks lined up waiting for a shave.
I’ve been doing some work recently with a little Silicon Labs EFM8UB3 microcontroller. This is a nice 8051 clone with a bunch of peripherals all in a QFN24 package. I’ve been using a minimal development board Silicon Labs calls the Thunderboard. It’s mostly fine, but it has one failure mode that’s pretty annoying. This article is about how to back out of that failure mode if you encounter it.
The Thunderboard has a built-in JLink debugger, and it’s unfortunately possible to put the MCU into a state where the debugger can’t talk to it. That means that you can’t program the board any more, so you might think you’re stuck. You’re not!
One thing I need to do for my pump monitor is to determine when the pump is running. I’m going to do this by measuring the current flowing in the power cable to the pump. That seems like it ought to be an easy thing to do, but doing it safely (it’s a 240V AC mains cable to the pump) and reliably is more complicated than I expected. There might be a simpler way to do things than what I’m trying to do, but this is a good opportunity to do a little bit of analogue electronics and to learn some things. (My analogue electronics skills are negligible, so I’m always on the lookout for small projects or parts of projects that have the right “challenge level” to help me get better!)
We live on the side of a hill where rainwater runs down into our garden, gathering at the lowest point to make a “sumpy” patch. The soil has a high clay content, so the runoff is efficient. We also have a cellar. (You can see where this is going already, right?) We have a shaft next to the house that contains a pump to clear out water draining into the sumpy area. A couple of months ago I took the lid off the shaft and found the pump under two metres of water, not running. Opening up the stairs down to the cellar revealed half a metre of water down there... Not good.
Kolyma is a region in the far east of Russia. It’s brutally cold, sparsely vegetated, mostly covered in permafrost, and has huge mineral reserves. It was a “favourite” destination of convicts in the Soviet penal system.
I’d not even really heard of the place until a few months ago, but since then I’ve read two books set in Kolyma. Two very different books, one a modern thriller and one something completely other.
In this final article of three, we’re going to make our DMA-based ADC example from the second article run off a timer, and we’ll do a small demonstration of how this might be used in a realistic application.
There’s a video demonstrating each of the examples covered in this series of articles. It’s probably most useful to watch it in conjunction with reading the articles.
In this second article of three, we’re going to change our polling ADC example from the first article to use DMA.
DMA is a subject that is unavoidably a little complicated. It’s useful to understand what DMA really is to get an idea of where that complexity comes from. That might help to build some enthusiasm to battle your way through the DMA section of the reference manual!
There’s a video demonstrating each of the examples covered in this series of articles. It’s probably most useful to watch it in conjunction with reading the articles.
I’ve been doing some STM32 programming recently as part of my Mini-Mapper project (using an STM32F767ZI). I needed to collect samples from several analog inputs at a fixed frequency, for monitoring motor torque. A simple thing to do, right? But the obvious way to do it isn’t necessarily the best.
In this series of three articles, I’m going to try to show a better way. Some of this will be quite boring (it’s just configuring microcontroller peripherals, after all), so I allowed myself a bit of time for a fun “finisher” at the end.
There’s a video demonstrating each of the examples covered in this series of articles. It’s probably most useful to watch it in conjunction with reading the articles.
More datasheets and application notes: starting to learn about DACs in the Analog Mini-Tutorials series, another DC/DC converter (in a slightly less silly package), plus a couple of randoms.
An interesting application note about crystals (I know very little about them, and need to know more) and a paper about how best to solder QFN packages, plus some jellybeanish analogue ICs and a tiny tiny DC/DC converter.
Bit behind on this again, but I have some interesting ones for next week, so maybe that will make me stick with it better.
Got a bit distracted with reading microcontroller datasheets and reference manuals for some projects, mostly for the ST Micro STM32F767ZI and for the Silicon Labs “Universal Bee” series (I’m using the EFM8UB3 for a project at the moment). But here’s some “normal” datasheet reading.
I just this week finished a project that has been hanging around for a while, mostly because it turned out to be much more work than I estimated. Earlier this year, I did some work using Nordic Semiconductor’s nRF52840 ARM chip. Although the chip is nice, I didn’t enjoy the software development experience using the tools that Nordic recommends, so I decided to do a “little” project to review some of the alternatives.
Turned out not to be so little, after all.
In any case, you can read the results on the project page. The highlights of the work were definitely learning about the Zephyr project and CircuitPython. I’m going to try to do some Rust stuff on the nRF52840 later to get an idea of how practical that is too, but I need a break from this stuff for a while first!
The whole “datasheet a day” thing got slightly derailed this month. I have been reading datasheets for projects, but my “datasheet a day” reading has been spent on Op Amps For Everyone, which is 464 pages of opampy goodness from Ron Mancini at TI.
I certainly didn’t understand it all, and some sections I just skimmed, but I definitely have a better understanding of some things to do with opamp design. I was treating it mostly as a warm-up for Chapter 4 of The Art of Electronics, which I hope to start on this week or next.
Next week, back to normal.
Slightly fewer datasheets this week, because I’m reading “Op Amps For Everyone”, which is a 464-page thing about op amps from TI. That’s a bit too long to get through in a day!
I tried one new thing this week with the datasheets. I took a number of datasheets for “standard” diodes, and just looked through them to get a sense of the kinds of parameters that are usually quoted for these things. It was a useful exercise, and I’ll do the same for some transistors next week. Otherwise, I continued with ploughing through the Analog Devices “mini-tutorials” series. They’re of varying difficulty and relevance, but they all feel like things it would be good to know about.
I’ve had some time to work on my Teensy Load project recently. I got the boards a couple of weeks ago, assembled one, found that it didn’t really work, and tracked that problem down to a wrong-way-round diode (due to a weird symbol definition I’d used from the wrong KiCad library).
And yesterday I did some end-to-end testing using the tl-meter software that I wrote. And it (partially) worked!
More datasheets!
My regular Sunday datasheets round-up...
More datasheets/app notes/unclassifiable electronics things. I should get around to writing some blog articles about something else as well, eh?
The daily datasheet thing continues. It’s been surprisingly easy to get in at least one a day, mostly because a lot of them are quite short. This week, one day was taken up with reading a big pile of datasheets for motor driver ICs, which was repetitive but educational.
Here’s this week’s datasheet-a-day reading...
I started a new thing this last week. I’ve been impressed by Adrian Colyer’s the morning paper for a long time. It’s a project where Colyer read and commented on a computer science paper every weekday during term time, and wrote a summary of the paper. Some of the summaries are quite long. It was really useful, for those of us who were working in a similar field, because it meant that we could get a quick overview of papers without going to the trouble of reading them.
I have no intention of displaying the level of diligence that Colyer did, but it did make me think about a less ambitious “read a thing every day” project. I’ve been thinking of it as “datasheet a day”, but I’ve been reading a mixture of datasheets and application notes.
I completed a fun little electronics project yesterday: a small solder fume extractor based on a PC fan. I’ve been calling it the Solder Snorter, in homage to Jon Thomasson’s Solder Sniffer 9000—this thing is more or less a redo of that idea.
Just as I was starting to make some progress on personal projects, we had to move house. Our landlord got divorced and needed to move into our flat herself. (That’s about the only reason that you can be forced to move out of rented property before the end of a contract here.) We were pretty annoyed about it to start with, because we liked the place a lot and the rental property market in Villach is not great right now.
However, we lucked out. Oh, how we lucked out. After seeing a few not-so-nice places, we realised we were going to need to spend a little more than we’d been planning, so we expanded what we were looking at. And we found a little house for rent in Drobollach, a couple of minutes’ walk from Faaker See.
I’ve been making some progress with Contextual Electronics, and have been learning a lot, although some of the learning has been a little painful and frustrating.
I’ve started learning electronics recently, following Chris Gammell’s Contextual Electronics course, which is really good. One of the first exercises there is called “Getting to Blinky”, and it’s mostly about getting used to using the KiCad EDA suite, and getting over the initial barrier to getting PCBs made.
I worked through the tutorial, but didn’t really feel like sending boards out just for a blinky. I wanted to do something more entertaining. It seemed like it might be fun to make blinkies that blinked messages. In Morse code? Yeah, why not?
It’s hard to believe that I’ve not written anything here since 2017. That’s a long break. But now I’m back, with lots of new ideas and projects and stuff!
Anyway, what’s happened in the last couple of years? Quite a lot. Here are the highlights:
Houses sold: 1
Jobs quit: 2
Passports surrendered: 1
Last spring, I was taking a look at what I’d been reading recently. I read a lot of novels. Reading a novel seems to be my default state. I read a lot of science fiction, some crime novels, some historical fiction, some “straight” fiction (whatever that really is). But what I noticed was that the authors I tend to read skew very male. I decided I needed to do something about that, if only because I was probably missing some great stuff. So for six months I decided to read only novels written by women.
And I was right! I was missing some great stuff. I read about 60 novels in those six months, and discovered some amazing things that I really wish I’d known about before. Here are a few of the stand-outs.
I had to do a slightly weird bit of system admin recently, and there was one step that was kind of sneaky and not something I’d seen mentioned anywhere else. So I thought I’d better write it up...
What I wanted to do was make my desktop machine dual boot Arch Linux and Windows 10 (I needed Windows to use some CAD software). That’s not so unusual, but the recommended procedure seems to be to start from a clean machine, install Windows, then install Linux. I was starting from a pre-existing Arch Linux installation and didn’t want to lose anything. I also had things set up using syslinux in BIOS mode, which wasn’t going to work with Windows. So I needed to switch to UEFI booting first.
Click here to read Setting up dual boot Arch Linux/Windows 10
I’ve just finished my first week in a new job, which is looking like it’s going to be a lot of fun. I’m working for MemCachier, a small company (I’m employee #4) that offers a multi-tenant cloud-based memcache service. The code is mostly Haskell and Go (and maybe in the future some Rust), and the people are all very smart and friendly and committed to trying to do things the right way. I’m still (obviously, after one week) in that initial state of confusion that comes with dropping into a new codebase, but it looks very promising indeed.
The work should be a good mix of web programming, DevOps automation things and distributed systems problems. There’s going to be lots to learn, which is ideal for me! I’m also very happy to be back working in Haskell after a year-long (or a bit more) diversion into Python.
So, as far as I’m concerned, 2017 has started off very well.
Well, that’s pretty poor. It’s a year and a half since I last wrote an article here. A lot has happened in that time, and I have some cool new things coming up that I want to write about, so I’d better dust this off.
Quick summary of the last 18 months:
I started a new job in September last year, and had my last day there on December 9. It was OK, but obviously not perfect, otherwise I wouldn’t have left. There were some great people there, but the management of the organisation left a lot to be desired. I’m starting something new tomorrow (!) which I’m quite excited about and that I’ll write about in the next couple of weeks.
We bought a house! It has some doer-upper aspects to it, but it’s pretty great. Very big garden, plenty of space (we have more rooms here than in all the places we’ve lived before now put together!), good dog walking nearby, quiet but handy for transport (we’re about two minutes’ drive from a motorway, but you don’t hear it much), work for Rita nearby, etc. We moved in in July and had an initial spasm of renovation work where we transformed the rather manky kitchen into something nice. We’re just finishing up renovating the living room, and have lots of ideas of more things to do. One thing is to build a separate office building for me to use, which is going to be my big project for the spring. Lots of things to learn! (As well as having more rooms now than we’ve ever had before, we also have more power tools than I’ve ever owned in my life...)
I recently did some work for Andy Ridgwell, an old colleague from Bristol, writing a build and configuration system and GUI for a medium-sized climate model called GENIE. GENIE is an EMIC, an Earth system Model of Intermediate Complexity. It’s about 55,000 lines of Fortran and includes models of the atmosphere and ocean plus models of atmospheric chemistry and biogeochemistry in the ocean and ocean sediments.
This model had been in use for some years by different groups, and the infrastructure around it had become quite baroque. Andy wanted this tidied up and made nice (i.e. rewritten...) to make the model easier to set up and use. He also wanted a cross-platform GUI for configuring and running the model, allowing you to keep track of the model state in real time, pause and restart model runs, change the model configuration in between, and so on.
A major consideration for this work was that as well as being easy to use the new system had to be easy to install (on both Linux and Windows) and easy for scientists to hack on. That ruled out Haskell, my usual tool of choice. I decided to use Python instead, for a couple of reasons.
One of the things that C2HS is lacking is a good tutorial. So I’m going to write one (or try to, anyway).
I’ve started doing a new thing this year to try to help with “getting things done”. I normally have a daily to-do list and a list of weekly goals from which I derive my daily tasks, but I’ve also now started having a list of quarterly goals to add another layer of structure. Three months is a good timespan for medium-term planning, and it’s very handy to have that list of quarterly goals in front of you (I printed it out and stuck it to the front of my computer so it’s there whenever I’m working). Whatever you’re doing, you can think “Is this contributing to fulfilling one of my goals?” and if the answer is “No, watching funny cat videos is not among my goals for this quarter”, it can be a bit of a boost to get you back to work.
So, how did I do? Not all that badly, although there were a couple of things that fell by the wayside.
OK, so we’re done with this epic of climate data analysis. I’ve prepared an index of the articles in this series, on the off chance that it might be useful for someone.
The goal of this exercise was mostly to try doing some “basic” climate data analysis tasks in Haskell, things that I might normally do using R or NCL or some cobbled-together C++ programs. Once you can read NetCDF files, a lot of the data manipulation is pretty easy, mostly making use of standard things from the hmatrix package. It’s really not any harder than doing these things using “conventional” tools. The only downside is that most of the code that you need to write to do this stuff in Haskell already exists in those “conventional” tools. A bigger disadvantage is that data visualisation tools for Haskell are pretty thin on the ground–diagrams and Chart are good for simpler two-dimensional plots, but maps and geophysical data plotting aren’t really supported at all. I did all of the map and contour plots here using UCAR’s NCL language, which, although it’s not a very nice language from a theoretical point of view, has built-in capabilities for generating more or less all the plot types you’d ever need for climate data.
I think that this has been a reasonably useful exercise. It helped me to fix a couple of problems with my hnetcdf package and it turned up a bug in hmatrix. But it went on a little long–my notes are up to 90 pages. (Again: the same thing happened with the FFT stuff.) That’s too long to maintain interest in a problem you’re just using as a finger exercise. The next thing I have lined up should be quite a bit shorter. It’s a problem using satellite remote sensing data, which is always fun.
Click here to read Non-diffusive atmospheric flow #15: Wrap-up
This is going to be the last substantive post of this series (which is probably as much of a relief to you as it is to me...). In this article, we’re going to look at phase space partitioning for our dimension-reduced $Z_{500}$ PCA data and we’re going to calculate Markov transition matrices for our partitions to try to pick out consistent non-diffusive transitions in atmospheric flow regimes.
Click here to read Non-diffusive atmospheric flow #14: Markov matrix calculations
I took over the day-to-day support for C2HS about 18 months ago and have now finally cleaned up all the issues on the GitHub issue tracker. It took a lot longer than I was expecting, mostly due to pesky “real work” getting in the way. Now seems like a good time to announce the 0.25.1 “Snowmelt” release of C2HS and to summarise some of the more interesting new C2HS features.
This is going to be the oldest of old hat for the cool Haskell kids who invent existential higher-kinded polymorphic whatsits before breakfast, but it amused me, and it’s the first time I’ve used some of these more interesting language extensions for something “real”.
(There’s no code in this post, just some examples to explain what we’re going to do next.)
Suppose we define the state of the system whose evolution we want to study by a probability vector $\mathbf{p}(t)$–at any moment in time, we have a probability distribution over a finite partition of the state space of the system (so that if we partition the state space into $N$ components, then $\mathbf{p}(t) \in \mathbb{R}^N$). Evolution of the system as a Markov chain is then defined by the evolution rule
$$ \mathbf{p}(t + \Delta{}t) = \mathbf{M} \mathbf{p}(t), \qquad (1) $$ where $\mathbf{M} \in \mathbb{R}^{N \times N}$ is a Markov matrix. This approach to modelling the evolution of probability densities has the benefit both of being simple to understand and to implement (in terms of estimating the matrix $\mathbf{M}$ from data) and, as we’ll see, of allowing us to distinguish between random “diffusive” evolution and conservative “non-diffusive” dynamics.
We’ll see how this works by examining a very simple example.
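To make “diffusive” versus “non-diffusive” concrete before we get there, here’s a toy three-state illustration of my own (not the example from the article). Compare the column-stochastic matrices

$$ \mathbf{M}_{\mathrm{diff}} = \begin{pmatrix} 0.8 & 0.1 & 0.1 \\ 0.1 & 0.8 & 0.1 \\ 0.1 & 0.1 & 0.8 \end{pmatrix}, \qquad \mathbf{M}_{\mathrm{cyc}} = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}. $$

The symmetric $\mathbf{M}_{\mathrm{diff}}$ smears any initial density out towards the uniform distribution, which is diffusive behaviour; $\mathbf{M}_{\mathrm{cyc}}$ carries density around the cycle $1 \to 2 \to 3 \to 1$ without spreading it at all, which is conservative, non-diffusive behaviour. It’s the asymmetric part of an estimated $\mathbf{M}$ that signals non-diffusive dynamics.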
Click here to read Non-diffusive atmospheric flow #13: Markov matrix examples
The analysis of preferred flow regimes in the previous article is all very well, and in its way quite illuminating, but it was an entirely static analysis–we didn’t make any use of the fact that the original $Z_{500}$ data we used was a time series, so we couldn’t gain any information about transitions between different states of atmospheric flow. We’ll attempt to remedy that situation now.
What sort of approach can we use to look at the dynamics of changes in patterns of $Z_{500}$? Our $(\theta, \phi)$ parameterisation of flow patterns seems like a good start, but we need some way to model transitions between different flow states, i.e. between different points on the $(\theta, \phi)$ sphere. Each of our original $Z_{500}$ maps corresponds to a point on this sphere, so we might hope that we can come up with a way of looking at trajectories of points in $(\theta, \phi)$ space that will give us some insight into the dynamics of atmospheric flow.
Click here to read Non-diffusive atmospheric flow #12: dynamics warm-up
A quick post today to round off the “static” part of our atmospheric flow analysis.
Now that we’ve satisfied ourselves that the bumps in the spherical PDF in article 8 of this series are significant (in the narrowly defined sense of the word “significant” that we’ve discussed), we might ask what sort of atmospheric flow regimes these bumps correspond to. Since each point on our unit sphere is really a point in the three-dimensional space spanned by the first three $Z_{500}$ PCA eigenpatterns that we calculated earlier, we can construct composite maps to look at the spatial patterns of flow for each bump just by combining the first three PCA eigenpatterns in proportions given by the “$(x, y, z)$” coordinates of points on the unit sphere.
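In hmatrix terms, the composites are just linear combinations of the eigenpatterns. A minimal sketch (the names here are mine, not the article’s code):

```haskell
import Numeric.LinearAlgebra

-- Composite flow pattern for a point (x, y, z) on the unit sphere,
-- given the first three PCA eigenpatterns as flattened spatial maps.
composite :: (Double, Double, Double)
          -> Vector Double -> Vector Double -> Vector Double
          -> Vector Double
composite (x, y, z) e1 e2 e3 = scale x e1 + scale y e2 + scale z e3
```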
Click here to read Non-diffusive atmospheric flow #11: flow pattern visualisations
The spherical PDF we constructed by kernel density estimation in the article before last appeared to have “bumps”, i.e. it’s not uniform in $\theta$ and $\phi$. We’d like to interpret these bumps as preferred regimes of atmospheric flow, but before we do that, we need to decide whether these bumps are significant. There is a huge amount of confusion that surrounds this idea of significance, mostly caused by blind use of “standard recipes” in common data analysis cases. Here, we have some data analysis that’s anything but standard, and that will rather paradoxically make it much easier to understand what we really mean by significance.
Click here to read Non-diffusive atmospheric flow #10: significance of flow patterns
The Haskell kernel density estimation code in the last article does work, but it’s distressingly slow. Timing with the Unix time command (not all that accurate, but it gives a good idea of orders of magnitude) reveals that this program takes about 6.3 seconds to run. For a one-off, that’s not too bad, but in the next article, we’re going to want to run this type of KDE calculation thousands of times, in order to generate empirical distributions of null hypothesis PDF values for significance testing. So we need something faster.
Click here to read Non-diffusive atmospheric flow #9: speeding up KDE
Up to this point, all the analysis that we’ve done has been what might be called “normal”, or “pedestrian” (or even “boring”). In climate data analysis, you almost always need to do some sort of spatial and temporal subsetting and you very often do some sort of anomaly processing. And everyone does PCA! So there’s not really been anything to get excited about yet.
Now that we have our PCA-transformed $Z_{500}$ anomalies though, we can start to do some more interesting things. In this article, we’re going to look at how we can use the new representation of atmospheric flow patterns offered by the PCA eigenpatterns to reduce the dimensionality of our data, making it much easier to handle. We’ll then look at our data in an interesting geometrical way that allows us to focus on the patterns of flow while ignoring the strengths of different flows, i.e. we’ll be treating strong and weak blocking events as being the same, and strong and weak “normal” flow patterns as being the same. This simplification of things will allow us to do some statistics with our data to get an idea of whether there are statistically significant (in a sense we’ll define) flow patterns visible in our data.
Click here to read Non-diffusive atmospheric flow #8: flow pattern distribution
Although the basics of the “project onto eigenvectors of the covariance matrix” prescription do hold just the same in the case of spatio-temporal data as in the simple two-dimensional example we looked at in the earlier article, there are a number of things we need to think about when we come to look at PCA for spatio-temporal data. Specifically, we need to think about data organisation, the interpretation of the output of the PCA calculation, and the interpretation of PCA as a change of basis in a spatio-temporal setting. Let’s start by looking at data organisation.
Click here to read Non-diffusive atmospheric flow #7: PCA for spatio-temporal data
The pre-processing that we’ve done hasn’t really got us anywhere in terms of the main analysis we want to do–it’s just organised the data a little and removed the main source of variability (the seasonal cycle) that we’re not interested in. Although we’ve subsetted the original geopotential height data both spatially and temporally, there is still a lot of data: 66 years of 181-day winters, each day of which has $72 \times 15$ $Z_{500}$ values. This is a very common situation to find yourself in if you’re dealing with climate, meteorological, oceanographic or remote sensing data. One approach to this glut of data is something called dimensionality reduction, a term that refers to a range of techniques for extracting “interesting” or “important” patterns from data so that we can then talk about the data in terms of how strong these patterns are instead of what data values we have at each point in space and time.
I’ve put the words “interesting” and “important” in quotes here because what’s interesting or important is up to us to define, and that choice determines the dimensionality reduction method we use. Here, we’re going to side-step the question of determining what’s interesting or important by using the de facto default dimensionality reduction method, principal components analysis (PCA). We’ll take a detailed look at what kinds of “interesting” and “important” PCA gives us a little later.
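As a preview, here’s a sketch of the PCA calculation itself, done via the singular value decomposition in hmatrix. This is a toy version under my own naming, not the code from the article:

```haskell
import Numeric.LinearAlgebra

-- Rows are time steps, columns are spatial grid points.
-- Returns the singular values and the spatial eigenpatterns (one per row).
pcaSketch :: Matrix Double -> (Vector Double, Matrix Double)
pcaSketch x = (s, tr v)
  where
    -- Centre each column by removing its mean over time.
    means     = fromList [ sumElements c / fromIntegral (rows x)
                         | c <- toColumns x ]
    centred   = x - fromRows (replicate (rows x) means)
    (_, s, v) = svd centred
```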
Click here to read Non-diffusive atmospheric flow #6: principal components analysis
Note: there are a couple of earlier articles that I didn’t tag as “haskell” so they didn’t appear in Planet Haskell. They don’t contain any Haskell code, but they cover some background material that’s useful to know (#3 talks about reanalysis data and what $Z_{500}$ is, and #4 displays some of the characteristics of the data we’re going to be using). If you find terms here that are unfamiliar, they might be explained in one of these earlier articles.
The code for this post is available in a Gist.
Update: I missed a bit out of the pre-processing calculation here first time round. I’ve updated this post to reflect this now. Specifically, I forgot to do the running mean smoothing of the mean annual cycle in the anomaly calculation–doesn’t make much difference to the final results, but it’s worth doing just for the data manipulation practice...
Before we can get into the “main analysis”, we need to do some pre-processing of the $Z_{500}$ data. In particular, we are interested in large-scale spatial structures, so we want to subsample the data spatially. We are also going to look only at the Northern Hemisphere winter, so we need to extract temporal subsets for each winter season. (The reason for this is that winter is the season where we see the most interesting changes between persistent flow regimes. And we look at the Northern Hemisphere because it’s where more people live, so it’s more familiar to more people.) Finally, we want to look at variability about the seasonal cycle, so we are going to calculate “anomalies” around the seasonal cycle.
We’ll do the spatial and temporal subsetting as one pre-processing step and then do the anomaly calculation separately, just for simplicity.
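Here’s a scalar toy version of the anomaly idea (the real code works on gridded data, and I’ve omitted the running-mean smoothing mentioned in the update; the names are mine):

```haskell
import qualified Data.Map.Strict as M

type DayOfYear = Int

-- Mean annual cycle: average all values that share a day of year.
annualCycle :: [(DayOfYear, Double)] -> M.Map DayOfYear Double
annualCycle obs =
  M.map (\vs -> sum vs / fromIntegral (length vs))
        (M.fromListWith (++) [ (d, [v]) | (d, v) <- obs ])

-- Anomalies: departures of each observation from the annual cycle.
anomalies :: [(DayOfYear, Double)] -> [(DayOfYear, Double)]
anomalies obs = [ (d, v - cyc M.! d) | (d, v) <- obs ]
  where cyc = annualCycle obs
```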
Click here to read Non-diffusive atmospheric flow #5: pre-processing
In the last article, I talked a little about geopotential height and the $Z_{500}$ data we’re going to use for this analysis. Earlier, I talked about how to read data from the NetCDF files that the NCEP reanalysis data comes in. Now we’re going to take a look at some of the features in the data set to get some idea of what we might see in our analysis. In order to do this, we’re going to have to produce some plots. As I’ve said before, I tend not to be very dogmatic about what software to use for plotting–for simple things (scatter plots, line plots, and so on) there are lots of tools that will do the job (including some Haskell tools, like the Chart library), but for more complex things, it tends to be much more efficient to use specialised tools. For example, for 3-D plotting, something like Paraview or Mayavi is a good choice. Here, we’re mostly going to be looking at geospatial data, i.e. maps, and for this there aren’t really any good Haskell tools. Instead, we’re going to use something called NCL (NCAR Command Language). This isn’t by any stretch of the imagination a pretty language from a computer science point of view, but it has a lot of specialised features for plotting climate and meteorological data and is pretty perfect for the needs of this task (the sea level pressure and $Z_{500}$ plots in the last post were made using NCL). I’m not going to talk about the NCL scripts used to produce the plots here, but I might write about NCL a bit more later since it’s a very good tool for this sort of thing.
Click here to read Non-diffusive atmospheric flow #4: exploring Z500
In this article, we’re going to look at some of the details of the data that we’re going to be using in our study of non-diffusive flow in the atmosphere. This is still all background material, so there’s no Haskell code here!
Click here to read Non-diffusive atmospheric flow #3: reanalysis data and Z500
As I said in the last article, the next bit of this data analysis series is going to attempt to use Haskell to reproduce the analysis in the paper: D. T. Crommelin (2004). Observed nondiffusive dynamics in large-scale atmospheric flow. J. Atmos. Sci. 61(19), 2384–2396. Before we can do this, we need to cover some background, which I’m going to do in this and the next couple of articles. There won’t be any Haskell code in any of these three articles, so I’m not tagging them as “Haskell” so that they don’t end up on Planet Haskell, annoying category theorists who have no interest in atmospheric dynamics. I’ll refer to these background articles from the later “codey” articles as needed.
Click here to read Non-diffusive atmospheric flow #2: outline & plan
I never really intended the FFT stuff to go on for as long as it did, since that sort of thing wasn’t really what I was planning as the focus for this Data Analysis in Haskell series. The FFT was intended primarily as a “warm-up” exercise. After fourteen blog articles and about 10,000 words, everyone ought to be sufficiently warmed up now...
Instead of trying to lay out any kind of fundamental principles for data analysis before we get going, I’m just going to dive into a real example. I’ll talk about generalities as we go along when we have some context in which to place them.
All of the analysis described in this next series of articles closely follows that in the paper: D. T. Crommelin (2004). Observed nondiffusive dynamics in large-scale atmospheric flow. J. Atmos. Sci. 61(19), 2384–2396. We’re going to replicate most of the data analysis and visualisation from this paper, maybe adding a few interesting extras towards the end.
It’s going to take a couple of articles to lay out some of the background to this problem, but I want to start here with something very practical and not specific to this particular problem. We’re going to look at how to gain access to meteorological and climate data stored in the NetCDF file format from Haskell. This will be useful not only for the low-frequency atmospheric variability problem we’re going to look at, but for other things in the future too.
Click here to read Haskell data analysis: Reading NetCDF files
Here’s a mixed bag of interesting links, some sciencey, some mathsy, some miscellany:
Network Rail Virtual Archives: OK, this might not, at first sight, sound like something interesting, but it really is. This site has original Victorian-era engineering drawings for a whole range of British railway infrastructure. Bridges, viaducts, stations, tunnels. All rendered in lovely 19th Century penmanship. The Forth Bridge is particularly nice.
open.NASA: A couple of years ago, NASA started a project to open-source code and data from their Earth observing and planetary missions. Open.NASA is gateway to these resources. I’ve not had a chance to look at it in huge detail yet, but there is a lot of stuff there. The list of projects on the code.NASA part looks particularly entertaining.
Game of Primes: Giganotosaurus is a science fiction site that publishes one (longish) short story each month. They’re often very good, and this one was particularly striking–it’s quite beautifully done, full of mystery, and feels like it could be a part of something much larger and deeper.
Surprising connections in mathematics: This one is a bit more technical, from the Math Overflow Q&A website. A lot of the connections people mention are very technical, but some are more accessible, for instance the link between algebra and geometry developed by Descartes and others in the 17th Century. This is something we learn about in school, and something that we don’t think about too much because it seems “obvious”. Only obvious in retrospect, of course, since it took hundreds of years for the connection to be discovered!
De Bruijn grids and tilings: Another technical one, but very interesting. Aperiodic tilings of the plane, like Penrose tilings, are slightly mysterious. This article gives a really clear description of one systematic method for generating such tilings. It’s a very odd and intriguing little bit of mathematics.
Atul Gawande on end-of-life care: Atul Gawande is one of my favourite writers on medical and ethical issues. This article is quite long, but well worth a read.
Command and Control by Eric Schlosser
My reading list recently has been chock-full of light-hearted and mood-lifting material: some Irvine Welsh novels (always guaranteed to shed a gentle light on all that’s best about the human condition), a long book about clinical depression, M. R. Carey’s interesting sort-of-zombie apocalypse/extreme mycology novel, The Girl With All The Gifts, de Becker’s The Gift Of Fear, a book all about fear and violence, and Piper Kerman’s prison memoir, Orange Is The New Black (which did spoil the mood a little having a few sparks of hope in among the gloom).
Among all this bleakness and blackness, Command and Control somehow manages to stand out as a particularly grim monument to human folly and our collective crimes against all sense and reason. It’s a book about nuclear weapons, so it never really had much chance of being too jolly, but even so, Schlosser’s decision to focus in parallel on US nuclear doctrine and nuclear weapons safety makes for some horrifying reading. It’s something of a mystery how we made it through the Cold War without either a “hot” war or at least some sort of unintended detonation of a nuclear weapon.
Second round of “many books”...
I’ve been doing quite a bit of reading lately, so I have 28 novels to review! All but one are from series of novels, so that’s not quite as daunting as it sounds. Still, I’ll split this into two posts to make it manageable.
In particular, getting from where you are now to where you want to be, in terms of your career.
As a result of an email I sent to the Haskell-Cafe mailing list a couple of weeks ago looking for someone to take over a contract I had been working on, someone contacted me asking for career advice. Clearly not someone who knew me at all, otherwise they would have known what a crazy idea that was. Anyway, this person was asking about one of the fundamental problems when you’re starting out in more or less any profession: how do you acquire the experience you need to apply for jobs that say “experience required”, which is more or less all of them?
They asked: “What is the path to getting involved in this stuff? How do I bridge the gap from just playing around with these technologies to having real world experience? It seems that most opportunities are for people with experience.” And this is exactly right. Particularly for contracting, no-one wants to hire someone they think will have to learn on the job. You need to know what you’re doing, which means getting experience somehow. And it would of course be nice to be able to eat and have a life while getting that experience.
I wrote an epic email in reply, and was told that it would have worked better as a blog post (or perhaps a short novel). So here I am, turning it into a blog post!
It’s more than two months since I last wrote a blog article. I’ve been ridiculously busy since then and things are only just now calming down. It now looks as though I’m going to try something new, at least for three months or so, and that should provide more time for blogging. I had to drop more or less all of my personal projects for the last couple of months, which has been frustrating (no work on my data analysis book, no work on arb-fft, very little work on C2HS, a huge backlog of technical reading piling up and up and up like some Tower of Techno-Babel). Things should get back to something more like normal from now on though.
One benefit of working like a donkey for the last couple of months is that I now have a bit of money in the bank, and I’m planning to use that financial window to push some personal projects forwards. I have a few ideas, starting with “finishing” arb-fft and getting back to some work on my book. I’ll do a couple of days of paid work a week, do a bit of open-source stuff (C2HS and Hackage mostly) and work on those personal projects. And blogging. There will be blogging.
Starting tomorrow. Now though, I’m going to go outside and lie myself down in the sunshine.
Fatherland by Robert Harris; HHhH by Laurent Binet & Sam Taylor
These two books are tied together by the name of Reinhard Heydrich. I can’t think of a polite way of describing Heydrich. He was one of the architects of the Holocaust, a fervid Nazi, and an all-round total bastard. Hitler called him “the man with the iron heart”, which gives you some kind of idea of what kind of a git he was.
In the real world, Heydrich dies in 1942 from injuries sustained during an assassination attempt by Czech and Slovak commandos while he was “Acting Reich Protector of Bohemia and Moravia”. In the world of Fatherland, he survives, the Germans win the Second World War and Europe languishes under Nazi rule, with a sympathetic administration in the USA led by Kennedy (père rather than fils) providing no effective check on their activities. Heydrich continues in his role as second-in-command of the SS and goes on being as much of a bastard as ever.
A useful but little-used feature of Haddock is the ability to include inline images in Haddock pages. Here are a few examples. You can use images for diagrams or for inserting mathematics into your documentation in a readable way. In order to use an image, you just put <<path-to-image>> in a Haddock comment. The image can be of any format that browsers support: PNG, JPEG, SVG, whatever.
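For example (the module and image path here are invented for illustration):

```haskell
module Example where

-- | Fast Fourier transform (stub; only the Haddock comment matters here).
--
--   The recursive structure is much easier to see in a picture:
--
--   <<images/fft-butterfly.png butterfly diagram>>
fft :: [Double] -> [Double]
fft = id
```

The optional text after the image path becomes the image’s title.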
Click here to read Using Images in Haddock Documentation on Hackage
Functional Differential Geometry by Gerald Jay Sussman & Jack Wisdom (with Will Farr)
This book follows very much in the mould of Sussman & Wisdom’s Structure and Interpretation of Classical Mechanics in that it’s an attempt to take a body of mathematics (differential geometry here, classical mechanics in SICM) and to “computationalise” it, i.e. to use computer programs to make the mathematical structures involved manifest in a way that a more traditional approach to these subjects doesn’t.
This computational viewpoint is a very powerful one to adopt, for a number of reasons.
I recently read The Quest for Artificial Intelligence by Nils Nilsson. Interrupting a fairly linear history of the AI field is an interlude on some more philosophical questions about artificial intelligence, minds and thought. Searle’s Chinese Room, things like that.
I got to thinking about some of these questions on my daily peregrinations with Winnie, and I started wondering about the scales that are involved in most discussions of minds and AI. Searle talks about an individual person in his Chinese Room, which makes the idea of some sort of disembodied intelligence actually understanding the Chinese sentences it’s responding to seem pretty absurd. But is that in any sense a realistic representation of a brain or a human mind?
In keeping with my upbringing as a baby physicist, I’m going to take a very reductionist approach to this question. I want to get some idea of exactly what the information storage capacity of a human brain is. I’m deliberately not going to think about information processing speeds (because that would involve too much thinking about switching rates, communication between different parts of brains, and other biological things about which I know very little). I’ll treat this in the spirit of a Fermi problem, which means I’ll be horrifyingly slapdash with anything other than powers of ten. There will be big numbers.
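As a flavour of the kind of arithmetic involved (my own back-of-envelope numbers, with one byte per synapse as a deliberately crude assumption):

$$ \underbrace{10^{11}}_{\text{neurons}} \times \underbrace{10^{4}}_{\text{synapses per neuron}} \times \underbrace{1 \text{ byte}}_{\text{per synapse}} \sim 10^{15} \text{ bytes}, $$

i.e. of the order of a petabyte, give or take a couple of powers of ten.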
After releasing the arb-fft package last week, a couple of issues were pointed out by Daniel Díaz, Pedro Magalhães and Carter Schonwald. I’ve made another couple of releases to fix these things. They were mostly little Haskell things that I didn’t know about, so I’m going to quickly describe them here. (I’m sure these are things that most people know about already...)
Click here to read Haskell FFT 15: Some small Haskell things
I was obscurely disappointed to discover that I wasn’t the first person to come up with “MOOCopalypse”. I think I could claim first dibs on “MOOCaclysm” if I wanted, since there are no Google hits for it, but it’s not quite as good.
Unless you’ve been living under a rock for the last couple of years, you’ve probably heard of MOOCs (Massive Open Online Courses). These are courses offered online by universities or other suppliers, intended to bring high-quality higher education resources to a wide and varied audience. I’ve done a few of these courses: Peter Norvig and Sebastian Thrun’s Stanford AI course, one on fantasy and science fiction, and more recently Jeff Ullman’s automata theory course and another on general game playing (about which I’ll write something once I’ve had some time to work on the software I developed as part of the course).
After thirteen blog articles and about two-and-a-half weeks’ equivalent work time spread out over three months, I think it’s time to stop with the FFT stuff for a while. I told myself when I started all this that if I could get within a factor of 20 of the performance of FFTW, I would be “quite obscenely pleased with myself”. For most $N$, the performance of our FFT code is within a factor of 5 or so of the performance of the vector-fftw package. For a pure Haskell implementation with some but not lots of optimisation work, that’s not too shabby.
But now, my notes are up to nearly 90 pages and, while there are definitely things left to do, I’d like to move on for a bit and do some other data analysis tasks. This post is just to give a quick round-up of what we’ve done, and to lay out the things remaining to be done (some of which I plan to do at some point and some others that I’d be very happy for other people to do!). I’ve released the latest version of the code to Hackage. The package is called arb-fft. There is also an index of all the posts in this series.
In this article, we’ll finish up with optimisation, tidying up a couple of little things and looking at some “final” benchmarking results. The code described here is the pre-release-4 version in the GitHub repository. I’m not going to show any code snippets in this section, for two reasons: first, if you’ve been following along, you should be familiar enough with how things work to find the sections of interest; and second, some of the optimisation means that the code is getting a little more unwieldy, and it’s getting harder to display what’s going on in small snippets. Look in the GitHub repository if you want to see the details.
In the last article, we did some basic optimisation of our FFT code. In this article, we’re going to look at ways of reordering the recursive Cooley-Tukey FFT decomposition to make things more efficient. In order to make this work well, we’re going to need more straight-line transform “codelets”. We’ll start by looking at our $N=256$ example in detail, then we’ll develop a general approach.
I haven’t done one of these for a long time, but I have a big old pile of interesting things I’ve read recently. I might make this a weekly thing–Harry Connolly does a Friday Randomness roundup and there’s always something interesting or weird there. I don’t think I can quite attain his level of randomness, so I’ll have to settle for “worth reading” as a criterion.
Dogs Are People, Too: Functional MRI for dogs! And the dogs get to choose whether or not to play, which is really great.
The Mother of All Squid Builds a Library: Lovely short story.
What the World Eats: Twelve families from very different parts of the world photographed with a week’s worth of groceries. I keep going back to look at them.
Meat Atlas: Friends of the Earth and the Böll Foundation produced this report about global meat production, the economics, the environmental impacts, and so on.
Hofstadter interview in The Atlantic: After reading this long interview with Douglas Hofstadter, I was inspired to go back and re-read Gödel, Escher, Bach and I now have a pile of other Hofstadter things to read. He’s a very interesting guy.
The Guardian: Ask a Grown-Up: This is something The Guardian has been doing for a while, but I’ve only just seen it. It’s a great idea. Children can write in with questions about anything at all, and they get answers from experts in whatever field they’re asking about (or sometimes slightly random celebrities, but mostly the answerers know what they’re on about). Best one I’ve seen so far: Douglas Hofstadter (yes, him again!) answering a six-year-old’s question “Does my cat Oscar know he’s a cat?”!
The UK National Archives: I’ve not had much of a chance to rummage around in here yet, but it looks like an amazing resource where I could probably waste infinite amounts of time.
How Doctors Die: End-of-life care choices and the differences between what doctors recommend to their patients and what they do themselves. Pretty grim, but needs to be talked about.
Early Onset Dementia: More grimness, also needing to be talked about. Early onset dementia (not just Alzheimer’s–the article is about something different) is just terrifying. Imagine losing the ability to recognise all of your loved ones, to know who you are or where you are, and this not to be something that happens towards the end of your life, but at a time when you might conceivably have to live like that for 40 or 50 years. Now imagine that it’s not you that has this problem, but your partner...
Implementing a JIT Compiled Language with Haskell and LLVM: On a completely different topic, this is a Haskell rendering of a long LLVM tutorial that looks really useful. I’ve only read part of it so far, but it’s good.
Based on what we saw concerning the performance of our FFT code in the last article, we have a number of avenues of optimisation to explore. Now that we’ve got reasonable looking $O(N \log N)$ scaling for all input sizes, we’re going to try to make the basic Danielson-Lanczos step part of the algorithm faster, since this will provide benefits for all input sizes. We can do this by looking at the performance for a single input length (we’ll use $N=256$). We’ll follow the usual approach of profiling to find parts of the code to concentrate on, modifying the code, then looking at benchmarks to see if we’ve made a positive difference.
Once we’ve got some way with this “normal” kind of optimisation, there are some algorithm-specific things we can do: we can include more hard-coded base transforms for one thing, but we can also try to determine empirically what the best decomposition of our input vector length is–for example, for $N=256$, we could decompose as $2 \times 2 \times 2 \times 2 \times 2 \times 2 \times 2 \times 2$, using length-2 base transforms and seven Danielson-Lanczos steps to form the final transform, or as $16 \times 16$, using a length-16 base transform and a single Danielson-Lanczos step to form the final result.
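A toy sketch of the search space involved (this is not arb-fft’s planner, just an enumeration of the candidate plans one could benchmark against each other):

```haskell
-- All ordered factorisations of n into factors >= 2. Each one is a
-- candidate decomposition plan: a sequence of base-transform sizes
-- joined by Danielson-Lanczos steps.
factorisations :: Int -> [[Int]]
factorisations 1 = [[]]
factorisations n = [ f : rest
                   | f <- [2 .. n]
                   , n `mod` f == 0
                   , rest <- factorisations (n `div` f) ]
```

For example, factorisations 256 includes both [2,2,2,2,2,2,2,2] and [16,16] (along with many others), and timing each plan is one way to pick a winner empirically.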
So far in this series, we’ve been looking at code piecemeal, writing little test modules to explore the algorithms we’ve been developing. Before we start trying to optimise this code, it makes sense to put it into a Cabal package with a more organised module structure, to provide a sensible API and to make a few other small changes.
From now on, the code we’re going to be talking about will be the arb-fft package, which is hosted on GitHub. In this article, we’ll be talking about the version of the code tagged pre-release-1. (We’ll be uploading a version of the code to Hackage once it’s ready, but there will be a few pre-release versions that we’ll look at before we get to that point.)
Gödel, Escher, Bach: An Eternal Golden Braid by Douglas Hofstadter
I first read GEB about 25 years ago, as far as I can remember. I’m not sure how much I took from it then (I was about 16 years old and knew a little bit of maths, but it was my first exposure to logic and the theory of computation), although I do remember the dialogues and all the wonderful Escher pictures.
I picked up a copy again recently as a result of reading an interview with Hofstadter in The Atlantic. The book has aged really well, with just a few technological fossils from the 1970s (record players playing records that they cannot play, for instance). Hofstadter has a timeless and almost overwhelmingly fecund imagination, bubbling over with novel ideas that seem to pop up from nowhere. His description of the thought process that led to the construction of one of the dialogues (each of which is modelled on a piece of Bach’s music and attempts to explore questions of linguistics or metamathematics on a number of levels at once) gives an impression of a mind of almost obsessional twistiness, trying to pack as many ideas as possible into the smallest possible amount of text using puns, structure and patterns at all available levels. It’s quite wonderful.
Click here to read Gödel, Escher, Bach: An Eternal Golden Braid
[The code from this article is available as a Gist]
In the last article in this series, benchmarking results gave a pretty good indication that we needed to do something about prime-length FFTs if we want to get good scaling for all input sizes. Falling back on the $O(N^2)$ DFT algorithm just isn’t good enough.
In this article, we’re going to look at an implementation of Rader’s FFT algorithm for prime-length inputs. Some aspects of this algorithm are not completely straightforward to understand since they rely on a couple of results from number theory and group theory. We’ll try to use small examples to motivate what’s going on.
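One of those number-theoretic results is that the multiplicative group of integers modulo a prime $p$ is cyclic: Rader’s algorithm uses a generator (primitive root) of that group to turn the hard part of the transform into a convolution. Here’s a toy sketch of finding a generator (my own code, not what arb-fft does internally):

```haskell
import Data.List (nub)

-- Distinct prime factors by trial division (fine for small inputs).
primeFactors :: Int -> [Int]
primeFactors = nub . go 2
  where
    go _ 1 = []
    go f m | f * f > m      = [m]
           | m `mod` f == 0 = f : go f (m `div` f)
           | otherwise      = go (f + 1) m

-- Modular exponentiation by repeated squaring.
powMod :: Int -> Int -> Int -> Int
powMod b e m = go (b `mod` m) e 1
  where
    go _  0 acc = acc
    go b' e' acc
      | odd e'    = go ((b' * b') `mod` m) (e' `div` 2) ((acc * b') `mod` m)
      | otherwise = go ((b' * b') `mod` m) (e' `div` 2) acc

-- Smallest generator of the multiplicative group mod a prime p:
-- g is a generator iff g^((p-1)/q) /= 1 (mod p) for every prime q
-- dividing p - 1.
primitiveRoot :: Int -> Int
primitiveRoot p =
  head [ g | g <- [2 .. p - 1]
           , all (\q -> powMod g ((p - 1) `div` q) p /= 1)
                 (primeFactors (p - 1)) ]
```

For example, primitiveRoot 7 is 3, since the powers of 3 mod 7 run through 3, 2, 6, 4, 5, 1, visiting every non-zero residue.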
Click here to read Haskell FFT 9: Prime-Length FFT With Rader's Algorithm
I’ve just finished the final exam for Stanford’s Automata Theory online course, run by Jeff Ullman. I originally chose to follow this course because (1) I wanted to learn (and in some cases re-learn) about finite automata and context-free grammars, and (2) it’s Jeff Ullman (duh). The finite automata and context-free grammar stuff was fun and useful and pretty easy, but the most interesting part of the course ended up being the material about decidability, computability and tractability (P vs. NP and all that jazz). I was re-reading Hofstadter’s Gödel, Escher, Bach at the same time as doing the course, and so questions of decidability and computability were bouncing around in my mind.
I wrote this piece of silliness (a sort of Tolkien/Lovecraft mash-up) a while ago as a writing exercise. It’s been languishing on my computer ever since. I’ve been getting behind on Haskell FFT stuff, so I thought I’d release it into the wild...
We’re going to do a number of different things to optimise the code developed in the previous article, but before we do that, we need to do some baseline performance measurements, so that we know that our “improvements” really are improving things. The performance results we see will also help to guide us a little in the optimisations that we need to do.
Benchmarking is a fundamental activity when working with numerical code, but it can be quite difficult to do it correctly in a lazy language like Haskell. It’s very easy to get into a situation where a benchmarking loop (perhaps to run a computation 1000 times to collect accurate timing information) performs the computation on the first time round the loop, then reuses the computed result for all the other 999 times round.
We can bypass these worries quite effectively using Bryan O’Sullivan’s Criterion library. As well as providing a clean framework for ensuring that pure calculations really do get rerun for timing purposes, Criterion gives us a nice API for setting up benchmarks, running them and collecting results.
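Here’s a minimal sketch of what a Criterion benchmark looks like (the function being timed is a placeholder of mine, not the FFT code):

```haskell
import Criterion.Main

-- A deliberately naive O(N^2) placeholder, just to have something to time.
naiveSum :: [Double] -> Double
naiveSum xs = sum [ x * y | x <- xs, y <- xs ]

main :: IO ()
main = defaultMain
  [ bgroup "naiveSum"
      [ -- 'nf' forces the result to normal form on every run, so
        -- laziness can't quietly reuse one computed result 999 times.
        bench (show n) $ nf naiveSum [1 .. fromIntegral n]
      | n <- [64, 128, 256 :: Int]
      ]
  ]
```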
I’ve read a couple of books about North Korea recently, neither of which was much fun. The first was Nothing to Envy: Real Lives in North Korea by Barbara Demick. This was interesting partially for the “arm’s length” approach that Demick had to take–the only North Koreans she could find to talk to were escapees from the Kim regime living in the US or South Korea. Despite those limitations, Demick does a good job (as far as I can tell) of portraying the day-to-day challenges of life in North Korea. The most striking thing? The pettiness of the regime: the megalomania and the monuments to Kim père et fils are one thing, but the Songbun system of what sociologists call “ascribed status” is almost biblical in its horror. So your grandfather found himself on the wrong side of the border at the end of the Korean War? Tough luck, sister, you’re on the Dear Leader’s shit list. Three generations dumped into the “hostile class”. Fuck yeah. I’d be hostile too. No opportunities for employment, constant suspicion from the authorities and the stooges they recruit from the population (who can blame anyone for what they do in that sort of environment? Any one of us would probably end up being a stooge too if we could–choice is a luxury in North Korea), and the constant fear of incarceration for one misspoken word. And all the while, the Kims live in luxury and mismanage the country into famine and destitution. Demick’s description of the kochebi, “wandering swallows”, homeless orphans whose parents had died in the famine of the 1990s, is depressing as hell.
Let’s summarise what we’ve done so far, as well as pointing out some properties of our general mixed-radix FFT algorithm that will be important going forward. We’ll also think a little bit about just what we are going to do next. Here’s what we’ve done:
Written down the basic expression for the DFT and implemented it directly in Haskell.
Built some “toy algebra” tools to help us understand the Fourier matrix decomposition at the heart of the Cooley-Tukey FFT algorithm.
Implemented a basic powers-of-two FFT.
Extended the “toy algebra” code to the case of general input vector lengths.
Implemented a full mixed-radix FFT in Haskell for general input vector lengths.
Click here to read Haskell FFT 7: Where We've Got To, Where We're Going
Tryfan, North Wales, 2006
In our house, there are four dates that are important. We don’t do birthdays so much, and religious festivals aren’t really of interest. However, the solstices are important–they’re the “corners of the year” and if you spend as much time outside as we do, in all weathers, the length of the day makes a big difference to your mood.
Then there’s September 4th, which is the reason we spend so much time outside and why the solstices are important! That was the day a couple of years ago when we adopted our dog Winnie, changing our lives beyond all recognition, almost all for the better.
And then there’s today. Ten years ago today, Rita and I went on our first date, to a small South American restaurant in Bristol, where we were both living at the time. Ten years later, asking Rita out to dinner seems like one of the best decisions I ever made (and I’m glad she said yes!). Since then, I think that the longest period we’ve had apart was about six weeks, when I went on a long trip to Canada (before we lived there). Now, spending just a day apart feels like too much.
What do I love most about Rita? She’s compassionate, honest, funny. She cares about people and the world. Her compassion rubs off on me and makes me a better person. We laugh together so much, sometimes it hurts. After ten years, we know each other’s moods pretty well. We’ve learnt to deal with each other’s little foibles, and we support and care for each other as much as we ever can.
When I see her smile, my world just lights up. I feel as though I spend most of my days awash in an ocean of love and warmth.
So, as tough as it may be for her to deal with, I think she may very well be stuck with me now...
by Peter Seibel
I like the premise of this book–find some of the best programmers out there and talk to them about their craft (or art, or science, if you prefer). The first thing to say, though, is that the list of interviewees tells you something about the state of the computing industry today: out of fifteen people interviewed, only one is a woman, and the other fourteen are all white men. You can argue that most of these people are pretty senior and are mostly more a product of the culture of the 1960s, 1970s and 1980s than now, but still. I’ll come back to this in a minute, because it turns out that the interview with the sole woman, Fran Allen, was one of the most interesting.
[The code from this article is available as a Gist]
In this article, we’re going to implement the full “sorted prime factorisation mixed-radix decimation-in-time Cooley-Tukey FFT”. That’s a bit of a mouthful, and you might want to review some of the algebra in the last article to remind yourself of some of the details of how it works.
The approach we’re going to take to exploring the implementation of this mixed-radix FFT is to start with a couple of the smaller and simpler parts (prime factor decomposition, digit reversal ordering), then to look at the top-level driver function, and finally to look at the generalised Danielson-Lanczos step, which is the most complicated part of things. This code has an unavoidably large number of moving parts (which is probably why this algorithm is rarely covered in detail in textbooks!), so we’ll finish up by writing some QuickCheck properties to make sure that everything works. (An admission: it took me quite a while to get everything working correctly here. The powers-of-two FFT took about an hour to write an initial version and perhaps another hour to tidy things up. The mixed-radix FFT took several hours of off-line thinking time (mostly walking my dog), a couple of hours to put together an initial (non-working) version, then several more hours of debugging time.)
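The simplest and most useful property is just that the fast algorithm agrees with the direct DFT. Something along these lines (a sketch with assumed names: fft is the mixed-radix code from this article, dft the direct transform from the first):

    import Data.Complex
    import Test.QuickCheck

    -- Approximate equality, because floating point.
    prop_fftMatchesDft :: [Complex Double] -> Property
    prop_fftMatchesDft xs = not (null xs) ==>
      and (zipWith close (fft xs) (dft xs))
      where close a b = magnitude (a - b) < 1e-6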
Click here to read Haskell FFT 6: Implementing the Mixed-Radix FFT
For my blog, I use Jasper Van der Jeugt’s Hakyll system. When I was first looking for a blogging platform, I rejected all the usual choices (Wordpress, etc.), mostly because I’m a borderline obsessive control freak and they didn’t give enough configurability. Plus, they weren’t Haskell. I started writing my own blogging system, but then I found Hakyll. “Great,” I thought, “that’s perfect, except I want something almost completely different!”
[The code from this article is available as a Gist]
So far, we’ve been dealing only with input vectors whose lengths are powers of two. This means that we can use the Danielson-Lanczos result repeatedly to divide our input data into smaller and smaller vectors, eventually reaching vectors of length one, for which the discrete Fourier transform is just the identity. Then, as long as we take account of the bit reversal ordering due to the repeated even/odd splitting we’ve done, we can use the Danielson-Lanczos lemma or its matrix equivalent to efficiently reassemble the entries in our data vector to form the DFT (at this point, you may want to review the second article to remind yourself how this matrix decomposition works).
This is all standard stuff that you’ll find in a lot of textbooks. Starting in this article though, we’re going to veer “off piste” and start doing something a bit more interesting. It used to be that, if you wanted to do FFTs of vectors whose lengths weren’t powers of two, the professional advice was “don’t do that” (that’s what my copy of Numerical Recipes says!). However, if you’re prepared to brave a little bit of algebra, extending the Cooley-Tukey approach to vectors of arbitrary lengths is not conceptually difficult. It is a bit fiddly to get everything just right, so in this article, we’ll start exploring the problem with some more “toy algebra”.
Click here to read Haskell FFT 5: Transforming Vectors of Arbitrary Lengths
[The code from this article is available as a Gist.]
Before we go on to look at how to deal with arbitrary input vector lengths (i.e. not just powers of two), let’s try a simple application of the powers-of-two FFT from the last article.
We’re going to do some simple frequency-domain filtering of audio data. We’ll start with reading WAV files. We can use the Haskell WAVE package to do this–it provides functions to read and write WAV files and has data structures for representing the WAV file header information and samples.
We’ll use a sort of sliding window FFT to process the samples from each channel in the WAV file. Here’s how this works:
We split the audio samples up into fixed-length “windows” (each $w$ samples long), with adjacent windows overlapping by $o$ samples.
We select $w$ to be a power of 2 so that we can use our power-of-two FFT algorithm from the last section.
We calculate the FFT of each window and apply a filter to the spectral components (just by multiplying each spectral component by a frequency-dependent number).
Then we do an inverse FFT, cut off the overlap regions and reassemble the modified windows to give a filtered signal.
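The windowing step on its own might look something like this (a list-based sketch for clarity, assuming $o < w$; the real code works on vectors):

    -- Split samples into windows of length w, with adjacent windows
    -- overlapping by o samples; a final partial window is discarded.
    windows :: Int -> Int -> [a] -> [[a]]
    windows w o xs
      | length xs < w = []
      | otherwise     = take w xs : windows w o (drop (w - o) xs)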
I’m not going to talk much about the details of audio filtering here. Suffice it to say that what we’re doing isn’t all that clever and is mostly just for a little demonstration–we’ll definitely be able to change the sound of our audio, but this isn’t a particularly good approach to denoising speech. There are far better approaches to that, and I may talk about them at some point in the future.
I’ve been doing some work on interactive features for the Radian plotting library recently, mostly just floating UI elements to allow you to switch between linear and log axes, to change the number of bins in histograms, and so on. There’s an article on the BayesHive blog showing some examples of how this works.
Now I just have to write some documentation to go with the spiffy new features...
[The code from this article is available as a Gist. I’m going to go on putting the code for these articles up as individual Gists as long as we’re doing experimentation–eventually, there will be a proper repo and a Hackage package for the full “bells and whistles” code.]
The decomposition of the Fourier matrix described in the previous article allows us to represent the DFT of a vector of length $N$ by two DFTs of length $N/2$. If $N$ is a power of two, we can repeat this decomposition until we reach DFTs of length 1, which are just an identity transform. We can then use the decomposition
$$F_{2N} = \begin{pmatrix} I_N & D_N \\ I_N & -D_N \end{pmatrix} \begin{pmatrix} F_N & \\ & F_N \end{pmatrix} P_{2N} \qquad (1)$$
to build the final result up from these decompositions.
At each step in this approach, we treat the even and odd indexed elements of our input vectors separately. If we think of the indexes as binary numbers, at the first step we decompose based on the lowest order bit in the index, at the next step we decompose based on the next lowest bit and so on.
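In code, the whole scheme collapses into the familiar recursive radix-2 transform. Here’s a minimal list-based sketch, using the same sign convention as our DFT definition (the article’s real code is vector-based and handles the index reordering iteratively):

    import Data.Complex

    fft :: [Complex Double] -> [Complex Double]
    fft [x] = [x]
    fft xs  = zipWith (+) evens twiddled ++ zipWith (-) evens twiddled
      where
        evens = fft [x | (i, x) <- zip [0 :: Int ..] xs, even i]
        odds  = fft [x | (i, x) <- zip [0 :: Int ..] xs, odd i]
        n     = length xs
        -- Twiddle factors from the diagonal matrix D_N in equation (1).
        twiddled = zipWith (\k o -> cis (2 * pi * fromIntegral k
                                           / fromIntegral n) * o)
                           [0 :: Int ..] odds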
Click here to read Haskell FFT 3: Cooley-Tukey for Powers of Two
by Nils Nilsson
Nils Nilsson has been involved in artificial intelligence research more or less since the inception of the field in the late 1950s. (He’s probably best known for his work on Shakey the robot, A* search, and his idea of teleo-reactive planning.) This makes him ideally placed to write a history of artificial intelligence.
[The code from this article is available as a Gist.]
In the first article in this series, we considered a set of $N$ data values from a function $h(t)$ sampled at a regular interval $\Delta$:
$$h_n = h(n \Delta) \qquad n = 0, 1, 2, \dots, N-1.$$ and defined the discrete Fourier transform of this set of data values as
$$H_n = \sum_{k=0}^{N-1} h_k e^{2\pi i k n/N} \qquad n = 0, 1, 2, \dots, N-1. \qquad (1)$$
The original Cooley and Tukey fast Fourier transform algorithm is based on the observation that, for even $N$, the DFT calculation can be broken down into two subsidiary DFTs, one for elements of the input vector with even indexes and one for elements with odd indexes:
$$\begin{aligned} H_n &= \sum_{k=0}^{N-1} h_k e^{2\pi i k n/N} \\ &= \sum_{k=0}^{N/2-1} h_{2k} e^{2\pi i (2k) n/N} + \sum_{k=0}^{N/2-1} h_{2k+1} e^{2\pi i (2k+1) n/N} \\ &= \sum_{k=0}^{N/2-1} h_{2k} e^{2\pi i k n/(N/2)} + \omega^n \sum_{k=0}^{N/2-1} h_{2k+1} e^{2\pi i k n/(N/2)} \\ &= H^e_n + \omega_N^n H^o_n, \end{aligned}$$
where we write $\omega_N = e^{2\pi i / N}$, and where $H^e_n$ is the $n$th component of the DFT of the evenly indexed elements of $h$ and $H^o_n$ is the $n$th component of the DFT of the oddly indexed elements of $h$. This decomposition is sometimes called the Danielson-Lanczos lemma.
In this article, we’re going to look in detail at this decomposition to understand just how it works and how it will help us to calculate the DFT in an efficient way. In the next article, we’ll see how to put a series of these even/odd decomposition steps together to form a full DFT.
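For reference, the direct $O(N^2)$ evaluation of the sum above fits in a few lines of Haskell (a list-based sketch):

    import Data.Complex

    dft :: [Complex Double] -> [Complex Double]
    dft hs = [ sum [ h * cis (2 * pi * fromIntegral (k * n) / len)
                   | (k, h) <- zip [0 :: Int ..] hs ]
             | n <- [0 .. length hs - 1] ]
      where len = fromIntegral (length hs)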
This is the first of a series of posts about implementing the Fast Fourier Transform in Haskell (and trying to make it go fast). This part is going to be pretty pedestrian, but we need to lay some groundwork before the interesting stuff.
Fourier transforms turn up everywhere. They’re important in pure mathematics, but they’re also used in pretty much any application that deals with time series: filtering, signal analysis, numerical solution of differential equations and many others. While the classical theory of Fourier analysis is based on functions, in applications we most often deal with discrete series of values sampled from a function. The discrete Fourier transform is how we do Fourier analysis in this setting.
In this series of posts, we’re not going to look very much at applications (well, maybe we’ll have one little exception a bit later) and we’re not going to talk a lot about the theory behind the Fourier transform. Instead, after briefly introducing the discrete Fourier transform, we’re going to use the fast Fourier transform algorithm and some variations to explore some aspects of Haskell programming. In particular, we’ll look at how Haskell deals with complex numbers, vectors, profiling and benchmarking, meta-programming and finally, as our pièce de résistance, we’ll use all of this to build a Haskell fast Fourier transform package that does compile-time empirical optimisation for arbitrary-sized transforms.
In practice, if you want to do Fourier transforms of discrete data, you use something called FFTW (“the Fastest Fourier Transform in the West”). Our code has no chance of competing with FFTW, which is incredibly clever (see Matteo Frigo and Steven G. Johnson, “The Design and Implementation of FFTW3”, Proceedings of the IEEE 93(2), 216-231 (2005), an invited paper in the Special Issue on Program Generation, Optimization, and Platform Adaptation), but we can demonstrate some of the methods that they use and learn some techniques along the way that are applicable to more complicated problems. (There are Haskell bindings for FFTW–the vector-fftw package is a good choice.)
Click here to read Haskell FFT 1: The Discrete Fourier Transform
I was out walking Winnie the other morning and got to thinking about the storage capacity of neural systems (I’m reading Nils Nilsson’s The Quest for Artificial Intelligence at the moment, which was what triggered this line of thought). I’ll write about that in another article, but I realised as I was figuring through this that I would end up talking about some rather large numbers. Everyone throws around big numbers (millions, billions, trillions, and up and up), but how often do you stop to visualise what these things mean?
There are some nice animations around that help with thinking about relative scales (for instance, this one is really good). What these things don’t do is to give you a sense of numbers. What does a billion of something look like? Can you get a really strong physical feeling for numbers of this magnitude and larger?
Here, I’ll show you what I do. It’s nothing ground-breaking, and it’s kind of obvious once you get started, but it is effective. It’s also quite a nice mental exercise.
I’m in the process of starting up a blogging series, to write about Data Analysis in Haskell. This is a sort of introduction/manifesto...
by Ian Cobain
It used to be that there was a tiny splinter of a fragment of me that was proud to be British. Tolerance, fair play, all that nonsense. Glossing over Blighty’s imperial adventures, surely we could be just a little proud of the legacy of our great nation? Nothing jingoistic, of course, just a low-key sort of inoffensive smugness.
That splinter of a fragment had been feeling a bit dull and lustreless for a while now. We had Afghanistan, Iraq and associated war crimes, the “War on Terror” (a.k.a. Bash A Brown Person For Britain), all the NSA/GCHQ spying revelations, and on and on it went. And of course Tony Blair isn’t rotting in a cell in The Hague which, if you ever needed another one, is a glaring signal that whatever justice there is in the world, it never intrudes into the worlds of the “great” and the “good”.
And then I read Cruel Britannia. Oh jolly day.
Part of the “flat route”
We’ve started running recently, with the slightly ridiculous goal of doing a marathon up a mountain in almost exactly a year’s time. (Idiots.)
All the stuff you read about running for beginners says “Stay away from the hills! Do nice easy runs on the flat!”. They talk about keeping to a conversational pace, or running so that you take one in-and-out breath for eight steps, or something like that. Holland would be perfect for beginner runners–nothing very steep, and certainly no long hills to wear your legs out.
The last couple of weeks have been a bit of a shock to the system. Rita went back to school two weeks ago, which meant that I went back to walking Winnie a couple of times a day (not quite every day, because Rita’s schedule often has gaps, but certainly a lot more than I’d done over the summer). Of course, I wanted to keep on doing all the other things I’d been doing: work (standing at my desk for 8+ hours per day), yoga (an hour or so every morning), circuit training (3-4 times per week), running (4 times a week), and so on.
Anyone who wasn’t an idiot would have seen where this was going. I was completely knackered by the end of the first week and was suffering a bit from “being on my feet”-related malaise (pain while running, pain while walking, pain while standing: oops). Silly boy.
Lights, mid-blink
I got hold of a BeagleBone Black a few weeks ago (courtesy of Tom Nielsen, who’d had it for a while, but had no time to play with it). This is a small (credit card sized) Linux machine with an ARM processor, intended for, well, pretty much anything you can use a little computer for. It has 512 MB of RAM, plus 2 GB of flash storage, so it’s not a completely trivial machine.
Obviously, to do anything really significant with it requires some hardware work (there are lots of general purpose I/O pins to play with, plus UARTs, Ethernet, USB, SPI, a couple of analogue-to-digital converters, PWM drivers for motor control, and even HDMI video output!), but there are a few LEDs on the board itself that lend themselves to a blinkenlights demo...
The BeagleBone comes with some built-in software to let people write code quickly, using JavaScript of all things. There’s also a bit of documentation about programming the thing in C. But (of course) I wanted Haskell blinkenlights!
So, the goal was to write a little bit of code to count in binary on the four user-addressable LEDs on the board, first in JavaScript, then in C, then (somehow) in Haskell.
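The Haskell version can be surprisingly small, since the user LEDs appear as files under the Linux LED class in sysfs. A sketch (the LED names beaglebone:green:usr0 to usr3 are an assumption about the kernel in use; you need to run as root, and you may have to disable the default LED triggers first):

    import Control.Concurrent (threadDelay)
    import Control.Monad (forM_, forever)
    import Data.Bits (testBit)

    -- Hypothetical sysfs path for user LED n on the BeagleBone Black.
    ledPath :: Int -> FilePath
    ledPath n = "/sys/class/leds/beaglebone:green:usr" ++ show n ++ "/brightness"

    -- Count 0..15 in binary on the four LEDs, forever.
    main :: IO ()
    main = forever $ forM_ [0 :: Int .. 15] $ \count -> do
      forM_ [0 .. 3] $ \led ->
        writeFile (ledPath led) (if testBit count led then "1" else "0")
      threadDelay 500000  -- half a second per step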
I’ve recently started helping with the maintenance of C2HS, a tool for generating Haskell foreign function interface (FFI) bindings from C header files. I started using C2HS because of a Haskell library I was writing to read Unidata NetCDF files. I didn’t fancy writing all of the bindings to the hundreds of functions in the C NetCDF library by hand and C2HS seemed like a good way to get started with the Haskell FFI.
For the BayesHive project, we have code split over a number of different Git repositories, all hosted on GitHub. We use GitHub for issue tracking and keeping notes on wikis.
Some of the splitting into separate repositories makes sense (our web app is in its own repository, for example), but some of the splitting is mostly historical and makes things a little inconvenient–the infrastructure for the Baysig statistical modelling language lives in three different repositories, which made sense at one time, but doesn’t now. Having these closely related bits of code in different repositories makes it more difficult to track down regressions (git bisect doesn’t work across multiple repositories) and can make branching messy (if you’re implementing a feature that requires work in multiple different repositories, you need to remember to create a branch in each of them, otherwise it’s easy to commit inconsistent changes to the master branch in one repo and your topic branch in another repo).
As a result of all this, we decided to merge some of our repositories together. Based on a bit of Googling, this seems to be a relatively common requirement, and the advice that’s on offer out there is a little conflicting and confusing. In this article, I’ll try to reduce some of the confusion related to repository merging, as well as showing how to migrate GitHub issues from the pre-existing multiple repositories into the new merged repository. I’ve written a couple of small Haskell programs to do these tasks–they’re available here.
Here be marmots!
On our trip to Kärnten the other week, we drove up past the Großglockner, the highest mountain in Austria. We didn’t actually see the thing, since it was wreathed in clouds, but there was some other cool stuff to see. There were marmots!
Now, we’ve seen marmots before. They’re pretty common in the mountains here, way up above the treeline. They live in among the rock piles up there and eat the little alpine plants. And Winnie loves them. And hates them.
For BayesHive, one of the things we need to do a lot of is program transformation. The main reason for this is to generate code to represent probability distribution functions derived from descriptions of probability models–the models are described using code written in the probability monad, and this needs to be manipulated quite extensively to determine the PDF, which is needed for the Markov chain Monte Carlo sampling we use for estimating parameters in probability models.
Program transformation is something that Haskell is really great at, since it basically just involves pattern matching against lots of different cases of syntax trees in the language you’re transforming. Unfortunately, this leads to lots of code that looks like this (picking a simple case at random):
    ...
    case last rvs of
      Observed (EApp (EApp (EVar "map") (EVar f)) (EVar nm)) ->
        rewriteFst f nm
      _ -> rvs
    ...
All that stuff with EApp and EVar is the explicit AST representation of the expression map f nm in the language (Baysig) that we’re manipulating.
At first, doing things like this is pretty unavoidable, but wouldn’t it be nice if you could write something like this instead?
    ...
    case last rvs of
      Observed [baysig|map $f $nm|] -> rewriteFst f nm
      _ -> rvs
    ...
We just spent a few days visiting friends in Kärnten and staying in their little mountain house. Apart from a little bit of rain and needing to extract Thomas’s car from the ditch where he’d decided to park it, it was a fun weekend. The best bit was that I got to spend lots of time talking to Rita, since there was quite a lot of driving. Work and Winnie things often result in us spending less time together than I’d like, so this was a nice change.
One day, the conversation came round to what it means to be “hard-core”. Rita said “We used to be tough!” and she was right. We used to do lots of things like white water kayaking, mountain biking and (especially) caving. Caving is funny: at least in the UK, if someone is a good caver, you don’t call them a “good” caver, you call them a “hard” caver. The sort of activity that often involves lying in cold water in the dark, crawling through mud for hours on end, has more of a requirement for toughness than most other things.
So, we did use to be pretty tough. Complaining was allowed, but you still had to suck it up and get on with whatever it was we were doing, even if it involved 600 metres of crawling through a narrow rocky tube resulting later in full-body bruising (Rita’s first and only trip down the notorious Daren Cilau in Wales). We even had a slogan: “More steel, less custard!” was the cry when spirits were flagging.
For BayesHive, we have a fairly big Yesod web application plus a lot of client-side JavaScript code. We’ve gone through a couple of iterations of how we organise all this. We started with a quick and nasty manual approach to chaining state between different pages of the app. That was pretty horrible. Then, as described in an earlier article, we switched to using AngularJS in the browser, which made the JavaScript side of things much nicer, along with Michael Snoyman’s yesod-angular module for managing the interaction between the client and server. Eventually, we found that Angular’s default routing provider wasn’t flexible enough for our needs in the browser (we have quite a few “wizard”-type stateful interactions, plus a fairly complex dashboard with lots of more independent states).
So we decided to switch over to using the Angular ui-router state-based routing system. This is pretty good and is definitely flexible enough for our current needs. It allows you to define a hierarchical structure of UI states, cleanly controlling transitions between states and assigning URLs to states to allow good interaction with the browser’s back and forward buttons and to allow bookmarking of application states. The only problem? Not something that the yesod-angular code supports...
by Harry Connolly
I have a bit of a science fiction and fantasy habit. If you read a lot of SF&F, it can be hard to find the diamonds among the dross. For some reason, the sub-genre usually called urban fantasy has more than its fair share of dross, which made it a pleasant surprise to discover Harry Connolly’s Twenty Palaces series. These three novels (plus a more recently published prequel) hit a lot of the usual urban fantasy tropes (fairly violent, noirish characters, magic that isn’t all ponies and rainbows, a “world behind the world”) but it confounds a lot of others in ways that make it anything but dross.
All of which makes it hard to understand why these books haven’t been much more successful than they have, and why Connolly has had to abandon the series for the moment to concentrate on other things. From what he says, his publisher (Del Rey) was unfailingly supportive, but the market just doesn’t seem to like what he was writing. He wrote a long blog article about why this might be, but it makes little sense–the most common complaints about the series are about the things that make it interesting and different!
I’ve started using c2hs recently, for a Haskell NetCDF library I’m writing (which might see the light of day in a couple of months). I like c2hs, so I volunteered to help out with maintenance. The first thing we did was transfer the source code repository from Darcs to GitHub, which was easy, using Steve Purcell’s darcs-to-git utility. Once that was done, the next task was to transfer all the tickets on the c2hs Trac site to GitHub’s issue tracking. That’s what I want to talk about here.
When we lived in Canada, we spent a lot of time biking. Road biking, mountain biking, a bit of touring, we were always on our bikes (when we weren’t in kayaks...).
A recent video on PinkBike brought it all back to us. We sat there (me with a little tear in my eye, I have to say) watching a couple of guys MTB touring around British Columbia, ending up on Vancouver Island, where they visited a bunch of places we used to go to all the time (the Riding Fool in Cumberland!), took ferries all over the place (it’s the BC way) and rode in the rain, in the sun and everything in between.
It looked amazing, and brought back so many memories. I do still miss Canada.
I remember being pretty blown away by the constraint-based drawing tools in AutoCAD the first time I used them, and I’ve been trying for a long time to make some space to work on this sort of thing for myself (I got started a while ago, but stalled through lack of time).
Of course, there really is nothing new under the sun. Constraint-based drawing started in the 1960s, using hardware built in the 1950s (built using individual transistors: no integrated circuits back then!). Ivan Sutherland’s Sketchpad was where it all started. I was reminded of this by a short article about Sutherland I only got around to reading this morning. The article is only so-so, but there are links to some videos taken from a TV programme made in the 1960s about the MIT Lincoln lab where Sutherland worked. They knew how to make science TV in those days. The steady hum of the air conditioning in the machine room, the formal “Well, here’s how it works, John” presentation, the suits and ties. Ah, I can almost smell the Bakelite and ozone. It’s great. Go and watch them and wallow in the techno-nostalgia.
So, this TV programme was recorded in 1963. And they had: GUI display with light pen interaction; 2-D constraint-based drawing; zooming user interfaces; 3-D graphics with hidden line removal; 3-D constraint-based drawing; object linking and embedding. 1963! How the hell did they do all that? Most of those ideas didn’t make it into commercial software until the 1990s, and a lot of people still don’t really know about the constraint-based stuff.
There’s a great quote on the Wikipedia page about Sutherland:
When asked, “How could you possibly have done the first interactive graphics program, the first non-procedural programming language, the first object oriented software system, all in one year?” Ivan replied: “Well, I didn’t know it was hard.”
That’s the attitude you need: forge ahead in ignorance of what received wisdom says is possible. That’s a credo I could live by. If it doesn’t contradict the laws of physics, it’s not impossible, so get on and try it!
My Radian HTML plotting library is now open source! You can read about it on the BayesHive website here.
For the past few months, I’ve been spending most of my working hours on BayesHive, the brainchild of Tom Nielsen, CEO of OpenBrain. This is a web platform for doing Bayesian data analysis, intended to make these powerful methods more accessible for scientists and other data analysts.
We opened the system to some alpha users a while ago, but this week we’ve decided we’re ready for wider beta testing, for people to have a look, kick the tyres and tell us what they think.
I’ve mostly been working on the web app part of the system, but in this post I want to talk a little bit about the general idea behind the system and the Bayesian approach to statistics. (As far as BayesHive goes, the system is made up of a reasonably complicated single page web app using the AngularJS framework, a Yesod server process to deal with the management of documents, data and models, and a model compiler and inference engine that does all the clever stuff.)
I’ve been working on BayesHive with Tom Nielsen for something like six months now, and it’s been a lot of fun. We’ve been ironing bugs out of the back-end of the system, which implements a probabilistic functional language (called Baysig). We found an entertaining thing the other day–I raised the bug report, Tom found out what was going on, and I laughed quite a lot when he told me.
We use the Scrap Your Boilerplate generics framework for doing various transformations of the AST of the Baysig language. It’s a great way to define transformations of complex hierarchical data structures in a nice compact way. It does have a trap for the unwary though...
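For anyone who hasn’t seen Scrap Your Boilerplate in action, the flavour is something like this (a toy AST, not Baysig’s): you write the one interesting case and SYB’s generic traversal applies it everywhere in the tree.

    {-# LANGUAGE DeriveDataTypeable #-}
    import Data.Generics (Data, Typeable, everywhere, mkT)

    data Expr = EVar String | EApp Expr Expr
      deriving (Show, Data, Typeable)

    -- Rename every variable "x" to "y", wherever it occurs.
    renameX :: Expr -> Expr
    renameX = everywhere (mkT rename)
      where rename (EVar "x") = EVar "y"
            rename e          = e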
Well, who’d have thought it? Cut back on public services, work as hard as you can to entrench the inequalities in society that lead to unemployment, homelessness and mental illness, and what do you get? Increased suicide rates, widespread depression and suffering. But no matter. We have to hold the course. Austerity is the watchword. Belt tightening all round. Jonquil might not even get a new pony this year! What are a few community centres closing, a few old people being ever more isolated, prisoners in their own homes, compared to that?
Seriously, it’s pretty damn obvious that a policy of rigid austerity (for the little people) is not an appropriate response to a recession and widespread unemployment. But, if you’re too stupid to attend to lessons like that of the US post-World War II public works programme, there is now some more recent and pretty unequivocal evidence.
Remember Iceland? A few years ago, they were faced with financial disaster, and all the predictions were dire. In the face of widespread condemnation from other European governments and the banking sector, the Icelandic public essentially voted to tell the banks holding their national debt to piss off. And now? They’re doing fine, thank you very much. Unemployment down, economy booming.
That’s just one example of many. Across the board, austerity measures are correlated with negative outcomes, both economically and socially. Investment in public health and social welfare provide a better return, euro for euro, than investment in any other sector, including tax cuts for industry and defence, the two holy cows of the right.
Go and read the article. It’s pretty depressing.
In Kim Stanley Robinson’s Antarctica, there is a character, Ta Shu, who is a famous Chinese feng shui expert. Ta Shu is travelling around Antarctica with an adventure tour group, relaying his experiences back to an audience at home. His descriptions of the landscape are detailed and insightful and extremely poetic. In Chinese, at least. His English is more limited, which leads to some of the other characters in the book underestimating his intelligence, particularly as he responds to most of their queries about what he thinks of places with the phrase “This is a good place” as a substitute for the detailed explanation he would give in Chinese.
Anyway, yesterday we went to a place that Ta Shu would have described as a “good place”. We were spending the day with Rita’s parents, her brother and his girlfriend, and the girlfriend’s parents. They were all visiting our part of the country for a “parents meet the parents” holiday. We first drove from Igls to Achensee, a long lake in the mountains about 50 km east of Innsbruck. While the others went on a boat trip, Rita’s parents, Rita, me and Winnie went up on a cable car to do a bit of walking. The weather wasn’t great and there was still a lot of snow around, so we didn’t get all that far (although Winnie did a bit of marmot-chasing, which was more fun for her than for me, as I had to then chase her...). After coffee and schnapps in the hut at the top of the cable car, we headed back down to meet the others.
TL;DR: use package version number bounds in the build-depends clauses of your Cabal files. You may save yourself some time and frustration.
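Concretely, that means stanzas like this (a hypothetical example; the point is the lower and upper bound on each dependency):

    library
      -- Bounds keep the solver from picking a version you never tested.
      build-depends: base   >= 4.5  && < 4.7,
                     vector >= 0.9  && < 0.11,
                     text   >= 0.11 && < 0.12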
Most Haskell programmers seem to have Cabal war stories to tell. Library version management in Haskell is complicated by the aggressive inter-module optimisation that GHC tries to do, which makes Cabal’s life tough and sometimes necessitates blowing away both Cabal and GHC’s knowledge of installed packages to start over. It’s also frustratingly easy to inadvertently get into a state where you have versions of packages installed that prevent Cabal from finding a consistent set of packages to install to satisfy the constraints for a new package you’re trying to install.
I thought I’d cleverly avoided most of these problems by using the hsenv sandboxing tool, which allows you to isolate the set of packages needed for a particular project, and even to maintain projects that use different versions of GHC, and cabal-meta, which allows you to build multiple packages from source together so that Cabal can find a set of package versions that are acceptable for all of the packages at once. I often found myself succumbing to a smug sense of self-satisfaction as I read about other peoples’ Cabal problems on haskell-cafe.
Of course, pride goeth before a fall. My hubristic little bubble was burst the other day when I decided to reset my sandbox for a project I’m currently working on. For some reason, typing cabal-meta install just wasn’t working, and I was being told that there was no set of package versions that could consistently satisfy the requirements of the packages I was trying to build from source.
In the last three days, I’ve averaged 10 hours of billable time per day. That’s ten hours per day actually standing at my desk coding, perhaps 10-20% more than that “on the job”. I’ve been working on social and search aspects of the BayesHive web app: we have a pile of different kinds of items (data sets, documents, models, and some others) that are all searchable and shareable in slightly different ways, and this all needs to be managed in a sensible way.
After a frenzy of hacking, I merged the branch I’d been working on just before dinnertime. It all works pretty well, I think, although there are probably some holes. It’s pretty inefficient, but we’re planning to move to a different storage system for documents and data soon, and that will make full-text indexing much easier, streamlining all of this stuff, so I was happy to produce a clean design and not worry too much about optimisation for now.
I’ve been really enjoying this work. I think what we have now may be one of the more complex Haskell web apps out there, and we’re gradually converging on a design that looks really pretty nice. The combination of Haskell, Yesod and AngularJS is very effective, and there’s enough Haskell work to balance out the horror that is JavaScript.
Tom should be inviting some people to start playing with BayesHive some time in the next week or so. It will be good to get some other eyes on what we’re doing. I’m going to spend some of tomorrow trying to do some example analyses, but I’m too familiar with the foibles of the web app to be a really good tester.
The only question now is whether, after three glasses of wine, I’ll still wake up at dawn (about 5:30 here right now) tomorrow morning, as I’ve done for the last few days...
I’m a good dog, I am!
Our dog Winnie is a great little dog. Of course, we would say that, but it really is true. The more I read about dog development and behaviour, the more pleased I feel about how far she’s come since we adopted her.
Sweet as she is though, there’s one thing she does that drives both me and Rita spare, something that can make a nice outing into a big pile of stress. Winnie is a hunting dog, and oh, does she love to hunt. She never catches anything, but she loves to chase things. Birds and squirrels mostly, but deer, hares, cats, anything that runs away. (She will stand and watch deer that aren’t moving or that haven’t seen her, but if they run, she’s off.) She has in the past thought about chasing Gämse (chamois) in the high mountains, but I managed to put her off that–she had no chance of getting anywhere near them and she would probably have killed herself falling off a cliff trying to keep up with them...
Copyright 2010 Raymond Long
When I started doing yoga, it looked simple. Twist yourself into a little pretzel, sit there for a little while, then choose a different pretzel shape and repeat. It turns out that things aren’t quite so simple. You have to engage muscles to remain in these twisty positions and even to pull yourself further into the postures.
Knowing which muscles to engage and where to focus your attention in each posture is something that a good teacher would help with, of course, but I’m a bit teacherless. There is some advice available online, much of it expressed in terms of analogies (“embrace your thigh”, and so on). That’s OK, but it’s not a substitute for a precise anatomical description of what’s going on.
UPDATE: Radian is now open source. Read about it on the BayesHive website here.
Courtesy of Hideharu Sakai, this article is also available in Japanese.
One of the things we needed to be able to do for the BayesHive software I’m working on with Tom Nielsen of OpenBrain is easily produce plots within our front-end web app. There are lots of JavaScript libraries for doing this: Tom started off using Flot, which is very quick and easy to use, but wasn’t really flexible enough for what we wanted.
I’d just been working on another part of the software, a tool for constructing algebraic representations of dynamical systems (defined by systems of ordinary or stochastic differential equations), and I’d been using the AngularJS framework. It took a while to get used to, but it seemed as though it might, in conjunction with D3.js, provide a way to produce a declarative plotting API that would be really easy to use and could take advantage of the clever data binding features of Angular.
Thus was Radian born: a declarative extension to HTML for rendering data and functional plots as SVG graphics. You can go straight to a gallery of examples here. Details and some more examples below the fold. (Note that, since Radian uses SVG graphics, you’ll need a fairly recent browser to see anything...)
When we lived in Canada, we exercised a lot: mountain and road biking, running, sea kayaking, CrossFit. That dropped off in France, but I managed to do some gymnastics and basic conditioning, which continued for the first summer here in Austria (lots of gymnastics: I even managed to do a few muscle-ups on my rings!).
Over the winter in Igls, I’ve not been doing a whole lot, partially for lack of time (I keep telling myself that three hours of dog-walking a day has to be worth something), partially for lack of inclination, and partially because the pull-up bar is too cold to hold onto in the mornings when it’s -10°C outside (that sounds like a better excuse than “lack of inclination”!). I’d been thinking for a while that I needed to start doing something again, and Rita had a yoga book lying around, so I thought I’d give that a go, if only for a while.
So, I lied a little in my last post. I didn’t quite get to go off right away and update my Beeminder totals, because I had a bit of a problem with my blogging software. I use Jasper Van der Jeugt’s Hakyll, a Haskell static site generator. Jasper recently (well, not that recently...) released a new major version of Hakyll, Hakyll 4. I had been putting off upgrading because I have so much custom code that it’s kind of a big job–my blog is a little different to most Hakyll sites, so I had to write quite a bit of extra stuff to work around that.
I learnt a new word yesterday: akrasia (rhymes with “aphasia” and “Malaysia”). It means not doing the things you want to do and know that you want to do; acting against your better judgement. The classic example of akratic behaviour is weight loss (or weight control). That extra slice of pie always looks like such a good idea at the time and there’s no short-term consequence. It’s only later that you look back with a queasy feeling of regret at the empty pie dish and think of how your rolls of kidney flab are just not fashionable any more (and probably haven’t been since the 1830s).
Everyone is akratic to some extent. I know that I am. Some days I just can’t face the work I have to do (even though it’s usually fun once I get started) and just doodle around on the internet. I’ve tried various methods for organising myself (I use org-mode in Emacs for pretty much everything: TODO lists, scheduling, time tracking for invoicing), I’ve tried structured procrastination. Sometimes these things work and sometimes they don’t. The main problem is that, while these methods can help you to organise yourself, they don’t impose any short-term consequences on you to make you do things.
So, I’ve not done any blogging since the ill-fated Midwinter Polyphasic Sleep Experiment. I have a backlog of articles I want to write, but most of those involve some significant investment of time and effort. So you get this instead...
I’ve been doing a lot of JavaScript programming recently. One of the two contracts I’m working on at the moment is for a UK university spin-off called OpenBrain, who are trying to bring Bayesian statistics to the masses by making tools that allow scientists to build complex (or simple!) Bayesian models and have them analysed using Markov chain Monte Carlo methods. All the really clever stuff is in the server-side MCMC inference code, but the user-facing side of things is a web app with a Yesod backend. My mission is to prettify this and make it ready for prime-time.
It’s a fairly complicated application that needs to manage literate Markdown documents describing statistical models, data sets for analysis and a bunch of other stuff. The first thing I looked at was a tool for building statistical models based on dynamical systems represented as systems of ordinary or stochastic differential equations. This tool allows you to give algebraic representations of ODEs or SDEs, which are then rendered as MathML (for prettiness and familiarity), you can do simulations of the systems and visualise the results, you can specify data sets that correspond to observations of system variables, you can set up prior distributions for system parameters, and you can then get some code (in a proprietary Haskell-like language specialised for statistical modelling) that can be used to drive MCMC inference for the parameter values.
All that stuff happens on the client side, except for the listing of the available data sets and the rendering of the model into code. This means that you need client side code for expression parsing, analysis of coupled systems of ODEs and SDEs (to turn them into a canonical form that you can simulate from), numerical integration of ODEs and SDEs, graphing, and a framework to tie it all together. That’s a lot of client side code!
Click here to read JavaScript Choices for Haskell Programmers
Urgh. Well, that didn’t go quite as I was hoping. I discovered that I didn’t much enjoy being awake on my own through the 16 hours of darkness we currently have here. Not as a habit anyway. The trade-off between the vistas of potentially infinite extra time and feeling like a very slightly warmed-over turd in human form sent me off to bed for a proper sleep at about 4:00 this morning.
Right now, it’s early afternoon on “Day 2”, just about 24 hours after I started this experiment. Things haven’t gone completely smoothly, but I’ve learnt a few things:
I need a louder alarm than the one I started with! I went to bed for a 30-minute nap at 1 a.m. and woke up at about 5:10. Not really to plan! I slept right through the alarm. I’ve turned the volume up now and it seems to work better.
People weren’t joking when they said the adaptation period wasn’t much fun. I’m feeling a bit poo and not able to get much work or serious thinking done. Fortunately, Rita is away visiting her parents this week, so she won’t be exposed to an even more grumpy than usual me.
It’s going to feel like pulling an all-nighter every night for a few days. Urgh.
I’ve decided to work with a 1-5-9 nap plan, i.e. a 20-30 minute nap at 1:00, 5:00, 9:00, 13:00, 17:00 and 21:00 (and no other sleep, unless I feel like crap and need to insert another nap somewhere).
by Octavia Butler
I think I remember reading part of one of these books when I was very much younger. I don’t remember getting a lot out of it, so I’m very glad that I reread them again now. There’s a lot to think about here.
Since I’ve found some contracting work, my only real problem in life has been lack of time. I spend three or four hours a day walking and caring for our dog Winnie (she’s very high energy and needs a lot of exercise to be happy), I sleep for eight hours or so, and I try to work for about eight hours per day. That doesn’t leave a lot of time for “everything else”.
So what to do? Winnie time isn’t really negotiable, and Rita’s schedule during the week doesn’t allow her to help too much, especially in the winter when it gets dark early. She does that stuff on the weekend, but then I also want to hang out with her and Winnie. Reducing work time isn’t really an option, both because it’s interesting and because I need to earn enough to support us.
That leaves sleep. It’s interesting that monophasic sleep, the idea of going to bed between ten and midnight and getting up between six and eight, with the whole intervening period ideally spent asleep, seems to be something of a modern invention, dating from the Industrial Revolution. In Europe in medieval times, a pattern called segmented sleep or biphasic sleep was more common–people who were tired after a long day’s work in the fields would come home and go to sleep immediately for “first sleep”, then would wake up some time in the night to do whatever (eat, have sex, visit neighbours, pray, etc.). Then they would go back to bed for “second sleep” before getting up for the next day’s work. And obviously, among nomadic or pastoralist people, sleep was often taken when and where it was possible, perhaps only as short naps.
It turns out that there are people who have experimented with a whole bunch of different sleep patterns. Most of this isn’t really validated scientific research, and there seems to be a very wide range of individual adaptability to different patterns, so it might be difficult to come up with any conclusions that would apply to everyone out there, but some of the anecdotal evidence is very interesting indeed.
Ah, another involuntary blogging hiatus. This time, for a good reason though. I have work. And it’s work that pays money, is interesting and has some future in it. It’s pretty cool. I’ve also had an offer of more work that I’ve had to defer until next year, which is encouraging. It seems as though this contracting thing might actually work out.
I have two clients I’m working with at the moment. The first is my old research group in Bristol. I’m doing some Fortran programming for them, extending a tool they use for generating input files for the main climate model that they use. It’s not a very complicated job, apart from the historical aspects–there are three main versions of the code for this tool, one of which comes in 269 distinct, slightly different versions... So sorting out which version is most “canonical” is an interesting problem. I’ve calculated Levenshtein distances between all the distinct versions and have used hierarchical clustering to get some idea of the historical relationships between all these versions. I think that this is probably going to be the trickiest part of the job! (There’s also a Tcl/Tk GUI that will need to be modified, which will be about as much fun as a root canal, but the changes needed should be pretty localised.)
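The distance computation itself is the easy part–a standard dynamic programming recurrence, memoised with a lazy array. A sketch (my own illustration, not the actual tool):

    import Data.Array

    -- Edit distance between two strings, O(m*n) time and space.
    levenshtein :: String -> String -> Int
    levenshtein xs ys = table ! (m, n)
      where
        (m, n)   = (length xs, length ys)
        (ax, ay) = (listArray (1, m) xs, listArray (1, n) ys)
        table    = listArray ((0, 0), (m, n))
                     [dist i j | i <- [0 .. m], j <- [0 .. n]]
        dist i 0 = i
        dist 0 j = j
        dist i j = minimum
          [ table ! (i - 1, j) + 1                     -- deletion
          , table ! (i, j - 1) + 1                     -- insertion
          , table ! (i - 1, j - 1)                     -- substitution
              + (if ax ! i == ay ! j then 0 else 1) ]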
The second client is a start-up that’s doing some really interesting stuff. They’re still in a pre-release development phase, so I’m not sure how much I’ll be able to blog about for now, but suffice it to say that there’s Haskell, Bayesian statistics, Markov chain Monte Carlo and data visualisation aspects to it. I’ve spent the last few weeks working on part of the web GUI front end, which has been an eye-opener, since it involved rather more JavaScript than is good for my mental health. I’ll write about some of that in general terms over the next few days.
I have another mad idea that I’m going to start on Monday. I’ll write about the details of that tomorrow, but it will either result in a 100% productivity increase or some sort of institutionalisation. We’ll see. More from the Department of Irresponsible Human Experimentation tomorrow...
This won’t sound like fun (or Fay!) to start with, but we’ll get there...
In the previous article, we looked at some of the details of a scheme for calculating the Lyapunov exponents of a system of differential equations. In this post, we’ll see how to implement this scheme in Haskell. The code we describe is available as a Gist here.
The archetypal chaotic dynamical system is the Lorenz system, defined by the differential equation system

$$\begin{aligned} \dot{x} &= \sigma (y - x), \\ \dot{y} &= x (\rho - z) - y, \\ \dot{z} &= x y - \beta z, \end{aligned}$$

where $\sigma$, $\rho$ and $\beta$ are constant parameters (the classic chaotic case takes $\sigma = 10$, $\rho = 28$ and $\beta = 8/3$).
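As a Haskell value, the right-hand side is just a function from the state to its time derivative–here with the classic parameter values baked in (a sketch, in the form a numerical integrator might consume):

    -- Lorenz system right-hand side: state (x, y, z) to its derivative.
    lorenz :: (Double, Double, Double) -> (Double, Double, Double)
    lorenz (x, y, z) = ( sigma * (y - x)
                       , x * (rho - z) - y
                       , x * y - beta * z )
      where (sigma, rho, beta) = (10, 28, 8 / 3)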
Determining whether a given dynamical system will exhibit chaotic behaviour is, in general, difficult. (In fact, the Lorenz system was only proven to be chaotic in 2002–see W. Tucker (2002), “A rigorous ODE solver and Smale’s 14th problem”, Found. Comp. Math. 2(1), 53-117.) The orbit structure of chaotic systems can be very complex, and it can be difficult to distinguish true chaos from a number of related phenomena.
One approach to characterising the orbit structure of dynamical systems is via Lyapunov exponents, a spectrum of values that measure the stretching and squashing of orbits: “normal” chaotic systems have at least one positive Lyapunov exponent and are dissipative, which means that the sum of all of their Lyapunov exponents is negative. While it is relatively straightforward to define Lyapunov exponents, in practice calculating them is tricky. We’ll see how to do this in some practical cases in this and the next couple of articles, following mostly the approach of Rangarajan et al. (1998).
The motivation for this series of articles was basically that I have recently been wondering about how practical it would be to use Haskell for more “traditional” mathematics: there is a big body of work and code relating to category theory and type theory in Haskell, but I’ve not seen so much about matrix algebra, differential equations, dynamical systems and so on. I implemented the Rangarajan method in Mathematica some years ago and thought it might be a good test case. I’m going to gloss over a lot of technical details about the computation of Lyapunov exponents since my main interest is in the Haskell implementation issues.
We woke up to snow in the garden, snow all around this morning. There had been some rather slushy sleety attempts at snow yesterday evening, but nothing that would have made us believe that we would wake up to the first white blanketing of the year in the morning.
It was pretty grey when I set out from the house with Winnie for our morning walk. After some crazy sniffing for mice in the fields below our house, we climbed up into the forest near Lanser Kopf, and the sun came out through the clouds. It was quite beautiful and very mysterious, mist rising from the ground, wisps of cloud around the mountains, beams of bright sunlight shining through the trees.
The sun came and went during the two hours of our walk, and we both got pretty wet from snow falling from the trees as it melted. Winnie also got dirty enough to need a rinse in the bathtub when we got back too!
Most of the snow has gone again this evening, at least down here in Igls, although the mountains all around are still coated. I guess we’ll be seeing a lot more of the white stuff soon!
So, GHC 7.6.1 was released last month. And there was a new release of Cabal too, bumping the version number to 1.16 from 0.14. The Arch Haskell people did a great job of quickly producing Arch packages for the new version, which meant that everyone pulled the new versions when they did an update via pacman -Su. Unfortunately, there seems to be a problem. The new cabal-install version can’t read its own default configuration file and just fails more or less immediately. There seems to be a fix, but it’s not yet percolated into the Arch packages.
Michael Snoyman recently had a blog post about composability in the Yesod web framework for Haskell, where he responded to comments about it being difficult to build reusable components for Yesod by, well, building one. I’m pretty much a newbie with Yesod, but the composability aspect of things had never looked too difficult to me, so I thought I’d also have a go at the same kind of exercise.
I decided to implement a slider form field, using one of the nicest jQuery-based slider widgets I’ve found. All the code for this example is available as a Gist. I’ll follow the usual convention here of putting more or less all of the code in the blog post too.
Some recent-ish papers of mine, that is. Although I didn’t enjoy my last post-doc in Montpellier a whole lot, there was a small amount of output from it, which probably deserves recording. So here goes.
So, we’ve embarked on a new life here in Igls. Rita has started school, and I have done most of the paperwork to set myself up as an independent software contractor. It’s all rather exciting!
There will be some changes here over the next couple of weeks, since I want to use www.skybluetrades.net as my “professional” website. I’m also going to start blogging again, after a break over the summer.
After the two days we spent in Vorarlberg, we headed for Rita’s parents’ place in Lower Austria, in the village of Ravelsbach. The drive wasn’t too bad, going through the Arlberg tunnel (13 km long, the longest in Austria), then past Innsbruck (very nice), a bit through Germany (now with speed limits on the motorway, although they appear to be entirely optional), past Linz and towards Vienna, then a little “short cut” courtesy of Rita once we got closer to Ravelsbach...
So, over the weekend Rita and myself made a big step. We both sent in resignation letters for our jobs in France! Neither of us likes living in Montpellier and, although Rita’s job is more interesting for her than mine is for me, she wants to have a change of career and is trying to get a place to study occupational therapy in Innsbruck.
For me, my current post-doc hasn’t really worked out the way I wanted it to, and I don’t want to apply for a permanent job with CNRS in France, so I’m going to branch out a bit and try to get some freelance contracting work, either programming or writing (or more likely, a bit of both), spend some time working on open-source and personal projects, and learn to make beer and wine and schnapps!
We’re going to spend the summer at Rita’s parents’ place in Ravelsbach, so as to have a bit of time to configure our brains for a complete change of scene. We have lots of things to do over the summer, and we both have that feeling you get when you’ve made a big decision, that you just want, in some sense, to move on to the next thing. We’ll have three weeks more of work in France when we get back from holiday to clear everything up and get all our projects into a state where they can be continued by other people, then we’ll be packing up and moving here to Austria.
We’re both kind of excited by the whole thing. Even if it does mean that I really really have to learn German properly this time!
So, we’ve come to Austria on holiday for a month. We drove from Montpellier and stopped off in Vorarlberg in western Austria for a couple of days of mountain climbing before coming here to Ravelsbach, where Rita’s parents live.
The drive from Montpellier through France and Switzerland was fine, and Winnie was well-behaved. We stayed in a village called Hittisau, just over the border into Austria. Navigation was no problem until we actually reached the village itself, which probably has a population of about 2000 people. All of whom appear to be called Hagspiel, which was the name of the family who owned the place we were staying. It took three goes and a phone call to other family members to locate exactly which “Pension Hagspiel” we were booked into. Confusion was compounded when the people at the first place responded to Rita saying “Hello! We have a reservation for two days: we’re the couple with a dog.” with “Are you sure? The couple with the dog already arrived this morning!”. Obviously a different couple with a dog arriving the same day as us at a different pension owned by different members of the same family...
I was up at our experimental site at Puéchabon the other day, installing a new PC to collect images from our webcam. The previous one had been stolen by some local yokels, along with all the solar panels and the electronics to go with them, and a load of other hardware. The technicians who maintain the site have made a Fort Knox-like enclosure for the instrumentation in our little cabin, and I had a shelf in there to put the Eee PC that collects the images from the camera at the top of our flux tower. (We’re monitoring colour changes in the foliage to see if we can use these not-very-remotely sensed data to detect important phenological changes.)
I got the computer attached to the Ethernet cable and the PoE box going to the camera, connected to the power supply in the cabin, and checked that I could see images from the camera on the live web page view (we’re using a StarDot camera). That all worked fine, so I just needed to set up the regular FTPing of images from the camera to the laptop.
Oops. I’d forgotten to install an FTP server on the laptop. The laptop is running Ubuntu Linux, the camera runs an embedded Linux distribution, and there’s a cron job on the camera that takes an image and FTPs it to an archival server at regular intervals. Of course, the “archival server” in this case is a little laptop in a shed in the woods, and it needs an FTP server. Which I hadn’t installed. And I was off in a shed in the woods, far from an internet connection...
I swore at my stupidity for a couple of minutes, then had a rummage in my rucksack. Yes! USB cable. Two minutes later, I was connected to the internet via my smart phone and a USB tethered network interface on the laptop. Wow. I had to hold the phone up in the air to get decent reception to download an FTP server package, and it wasn’t the fastest, but it was a moment that made me think about how weird our constantly connected world has become. We may not have electricity at our experimental site, but we can haz internets!
I was originally planning to write a quick-and-dirty implementation of the simplex algorithm myself to demonstrate some of the gritty details, but I decided to leave that kind of thing for later, since there are going to be other, less well-known, constraint solving algorithms that I want to look at. (The Cassowary algorithm is also a tableau-based algorithm, like the simplex algorithm, so I might spend a bit of time thinking about that in more detail.) There is a Haskell implementation of a simple version of the simplex algorithm in the Matrix.Simplex module in the dsp package, although that comes with a comment that says “I only guarantee that this module wastes inodes”, so let’s view this more as a way to play with some interface issues than as a serious attempt to implement any kind of constraint solver! The code that goes with this article is available here.
Caterpillars!
We saw something cool on the way to work this morning. It turns out that they’re considered pests in this part of the world, but a five metre string of nose-to-tail caterpillars counts as cool in my books even if they are pests. Apparently, these little critters make nests high in pine trees (we think we saw the nest–it looked like a big ball of spiderweb), then troop down the tree and off into the world to make caterpillary mischief of one sort or another.
My second attempt at a PhD went off rather better than my first. It was all done, dusted and examined in the allotted three years, without too much psychic suffering. Even more remarkable was that, in a period of four years, both my partner Rita and myself finished, wrote up and graduated. Without any domestic bickering at all (well, more or less).
Here’s a rag-bag of entertaining things...
Whatever: John Scalzi’s blog
John Scalzi writes entertaining science fiction books and is also an all-round good egg. His blog, Whatever, has been around since forever (in internet terms). Scalzi has a magic touch when it comes to moderating comment threads: Whatever must be the politest place on the internet, all without threats or banning or rudeness from John. Scalzi’s books tend towards the light and funny, but his blog posts are sometimes more serious. Here are a couple that really stray dangerously close to thought-provoking: Being Poor, Things I Don’t Have To Think About Today.
Mathematical Fiction
Yes, everything is there to be found on the internet. Even a page detailing more or less every work of fiction that involves mathematics, even tangentially, scored for “Mathematical Content” and “Literary Quality” (although they do need some graphs to show whether there’s any sort of correlation between these scores!). There’s some good stuff on there, and it’s quite interesting just to browse along the “similar story” links to see where you end up.
Structured Procrastination
I’ve never been a fan of the Getting Things Done type of self-help/self-organisation book, but John Perry, a philosophy professor at Stanford, has what sounds like a perfect recipe for making progress on all those tasks that get pushed aside in favour of more interesting things. The key idea seems to be to make a task list with tasks at the top that you’re never really going to do (“Write novel.” “Learn Icelandic.” “Found religion.”) then use your other tasks as excuses to avoid these. That way you can trick yourself into actually doing things. It sounds like a great way to go, and it fits very nicely with the way that a lot of natural procrastinators already work. We don’t do nothing, we just do something more interesting and perhaps less important than what we’re supposed to be doing...
How Your Cat Is Making You Crazy
Finally, here is one of the freakiest things I have heard about for a long time. You think you have free will? You think you’re in control of the decisions you take? Maybe you should think again. Rats infected with the parasite Toxoplasma gondii, which they pick up from cat faeces, exhibit a strange mix of behavioural changes, the most striking of which is that they no longer fear cats. Male rats infected with T. gondii find the smell of cat urine sexually exciting. Result: infected rats get eaten by cats and the parasite goes on its way to explore the next part of its charming life cycle. Big deal, you say. Who cares about freaky rats? The big deal is that T. gondii is a zoonosis–it infects humans too. There’s not a perfect match between the parasite and human brains, since T. gondii evolved to jump between cats and rats, but the parasite has enough affinity for human grey matter to cause significant changes in behaviour in some carriers. It gets a little scary when you hear just what those changes are. One is increased recklessness (particularly in males), leading to detectable differences in car accident statistics for carriers compared to non-carriers. Another is schizophrenia, which shows strong correlations with T. gondii status. How weird is that? A parasite that lives in cats and rats may be at least partially responsible for one of the most mysterious and terrifying of mental illnesses. There is still a lot of work to be done on this, but there are some serious people involved in the research, and it sounds pretty solid.
For me, I guess there are two separate “whoah!” moments that come out of this. The first is about the power of evolution. The coevolution of parasites and their hosts is already strange, even before you get to mind-altering parasitic cysts that make one of the parasite’s carriers more likely to be eaten by another. I find it quite hard to get my head around this: it’s like the parasites are farming the rats and cats (albeit unintentionally). The second thing is the idea that schizophrenia in humans may just be collateral damage in the T. gondii/Felis catus/Rattus norvegicus arms race. It would be one thing for humans to suffer the by-blows of some cataclysmic war of the gods, but these are cats, rats and protozoa!
If you want to get even more scary, think about the fact that T. gondii is just one environmental parasite, one of the relatively well-studied ones. There might be dozens of other little suckers shaping you to their ineffable monocellular will. Still think you’re the one driving up there?
To compensate for the psychopathic driving habits of the French, the Mediterranean climate is often offered up as a benefit of life here. For those of us who are not all that keen on temperatures in the 30s for weeks on end in the summer, that’s not much of a compensation. However, today, I would offer another piece of evidence that this “Mediterranean climate” isn’t all it’s cracked up to be. Here’s the swimming pool in our residence, with a nice layer of ice all across it. And yes, Rita is wearing a down jacket, an article of clothing more common on the ski slopes and high mountains than the Mediterranean coast...
by Francis Spufford
How do you make a book about central economic planning in the Soviet Union into an entertaining page-turner? You do what Francis Spufford did with Red Plenty, a book that is almost impossible to classify, a mix of fact and fiction and dramatised might-have-beens, plausible if not quite verisimilitudinous (a little like that word I just invented) narratives populated by a melange of real historical personages and imagined characters.
I spent the period from October 1998 until November 2001 as a graduate student in the atmospheric physics department in Oxford. This all started out swimmingly.
Over the course of the month of January, I:
published 24 blog articles. Fewer than thirty, but not a crushing failure.
had 13 days when I didn’t write anything, compensated by two days when I wrote three articles and three days when I wrote two.
wrote about 17 articles that I would consider “proper” articles, i.e. involving some actual writing. The others were just links or photos.
was ill for a few days, which could be a viable excuse for around 5.5 missing articles, which means that I (morally) only missed my target by half an article. Compared to people who liveblog their gall bladder surgery, I feel kind of small. Maybe it’s not so much of a moral victory after all...
So, was it a useful thing to do? Did I learn anything? I certainly did:
It’s quite hard to come up with substantial and substantive articles every day. I could wiffle on about random subjects without too much difficulty, but things that involve thinking take a bit more time.
The writing, in terms of putting words down one after another, isn’t something I find too hard. (I think I knew that before, but it’s nice to have it confirmed.)
Writing reviews of other people’s work (books, papers, etc.) is an order of magnitude easier than generating original ideas yourself. There’s an in-between stage that’s quite good for blogging, I think, where you get some interesting ideas from reading or thinking about something, not substantial enough to turn into a longer piece of writing, but perhaps enough to blog about. That provides quite a nice incentive for working through my enormous literature backlog...
Ultimately, the most satisfying things for me to write are more meaty. I didn’t manage anything I would consider in this category this month, mostly because of the perceived pressure of at least thinking about producing something every day, but I have a few ideas I’m going to work on.
Given the other things that I want to do with my (limited) time, I think a more reasonable goal is 2-3 articles per week, and that’s what I’m going to try to stick with for the next couple of months. We’ll see how that goes.
What about other 30-day challenges? Well, I have one lined up already. Starting tomorrow, I’m going to try to do 20-30 minutes of German study per day. We’re going to be moving to Austria in a few months, and I’d like to be able to do a bit more than order beer and pretzels once we’re there!
I recently read an interesting article by John McCarthy that set me thinking. The paper is called The Well-Designed Child (J. McCarthy (2008). The well-designed child. Artificial Intelligence 172, 2003–2014; a free online version is available here) and talks a bit about the old nature versus nurture debate: to what extent is a newborn child a blank slate, and to what extent are human intellectual capabilities intrinsic and instinctual?
To me, the whole question of nature versus nurture has always seemed odd. It’s clear that both aspects are important. We are all limited to one extent or another in what we can do purely by the physical parameters of our existence: as much as I would like to teleport myself to Europa to frolic in the sunless sea beneath the ice, I can’t, and never will be able to. Equally though, the potential that exists in all of us at birth can be squandered, and without education and opportunity, none of us can make of ourselves all that we might. So, we need a bit of both.
Trident is the UK’s independent nuclear deterrent. Except that it isn’t independent and it doesn’t deter anyone. It is nuclear though, so we do have that. Apparently, Trident is our ticket to sit down with the big boys, to strut around as part of the Nuclear Club, to be on the right side of the Non-Proliferation Treaty (the side that does all the shouting about non-nuclear states and quietly does nothing about its own disarmament obligations). Narrow-minded political commentators on the right in the UK regularly pull out this argument as a reason for keeping or renewing Trident, despite its obscene cost and highly dubious (at best) morals. And that tells you more or less all you need to know about international politics at the highest levels. It’s just Playground Bullies Redux. He who has the biggest stick wins.
Trident isn’t independent (the missiles are leased from the US, the warheads, although assembled at Aldermaston, are based very firmly on US technology, the re-entry vehicle navigation software is American, and so on), but Scotland might soon be. Which might pose a tiny little problem for the Royal Navy, since the only places in the UK where they can harbour and replenish their missile boats are in Scotland. Oops.
From the Guardian this morning:
Asked during the referendum debate in the Scottish parliament last week whether the government of an independent Scotland would do a deal to keep Trident, the first minister Alex Salmond replied: “It is inconceivable that an independent nation of 5.25m people would tolerate the continued presence of weapons of mass destruction on its soil.”
I don’t know if it will really happen. People tend to get all serious about nuclear weapons, and there might be some sort of “deal” done, backed by threats, but I honestly love the idea of Alex Salmond wagging his finger at David Cameron yelling “Oot! Oot, ya wee eejit! An’ tek yon dam’ missals wi’ ye!”.
My Freude is quite quite schaddly tonight...
One of the things I’m responsible for in my day job is a phenology webcam at our experimental site at Puéchabon. The idea of this is to observe colour changes in the canopy of the forest up there, with a view, perhaps, to eventually replacing manual phenological observations with information drawn from digital photos. In this sense, phenology means things like when flowers come out, when fruit forms, and so on. The Quercus ilex (holm oak) forest at Puéchabon is evergreen, so you don’t get the spectacular seasonal changes in leaf colour that you see in deciduous forests, and we’re not sure whether this is going to work–the changes we’ll be looking at will be a bit more subtle.
After referring to a paper in Agricultural and Forest Meteorology in the last post, I remembered something I read on Crooked Timber the other day. Many people have a pretty good idea of how toxic the world of academic publishing is–if you don’t, take a look at this recent Guardian piece by Mike Taylor, who lays out the issues pretty clearly.
But what can individual researchers do to fight the power and toxic influence of the big publishers? Personally, I only submit papers to journals with an open-access policy (the EGU journals like Biogeosciences are good here, especially with their “open peer review” process), and try to encourage co-authors to do the same. But I am among the tiniest of tiny fish in a very big pond, so what I do has an influence barely measurable in pico-Seldons. (A Seldon being the commonly accepted unit of historical influence. Difficult to quantify precisely, but whatever the scale, I don’t have many of them...) Now some of the big fish are taking an interest, and have started a movement! You should sign up if you’re involved in publishing scientific results and care at all about the free dissemination of knowledge.
Winnie!
Today, we had an early morning outing to the Plage de l’Espiguette, a big beach about 30 km east of Montpellier. We got up at 6:15, picked up the Modulauto car at 7:00, persuaded Winnie into the back (she’s still a bit scared of cars...) and off we went. We arrived at the car park before sunrise and were the only people there! Miles and miles of dunes were ours alone! Much frolicking there was. And climbing up and down the dunes. And digging. And chasing of sticks. And rolling around pretending to be a dog (just Ian, since Winnie is a dog, and Rita isn’t quite as silly as me). It was a lovely morning, and the dunes were very pretty indeed. We’ll be going back there again, some time before the summertime beach dog ban comes in. And hopefully, with more doggie company next time, since I think Winnie could have done with a four-legged buddy to run around with. I do my best, but I don’t quite have what it takes for serious boisterous play, since I can’t run at 40 km/h for hours and hours at a time...
For a brief period some years ago, I worked for an engineering firm that did sonar work for the (UK) navy. Many of the people in the company were ex-submariners, who had served both on the Swiftsure/Trafalgar attack boats and on the Vanguard missile boats. These were all pretty solid people, and I remember one in particular who delighted in telling stories about his time working his way up from a “baby sailor” to naval attaché to the British embassy in Washington, D.C. His most amusing tales included a reenactment (with actions) of the joys of carrying soup tureens around crowded submarines, and the entertaining ability of US spy satellites to dip into the atmosphere to take a closer look at interesting sights, like Soviet mini-subs stuck on their motherships in their pens.
A slightly less amusing story revolved around the hubris of submarine commanders. He sent some of us a photograph showing £20 million of towed sonar array snarled up around a mooring buoy, all because the sub commander couldn’t be bothered to wait for the divers to help reel the thing in. A sad navy man with a big beard stood gazing at the pile of expensive spaghetti, looking like he might burst into tears at any moment. Not the navy’s finest moment.
Some interesting things that passed through my RSS reader recently:
Food pairing and flavour networks
This is kind of interesting. Why do cheese and bacon (if you eat that kind of thing) go together so well? Or asparagus and butter? Or caviar and chocolate? (Apparently. If you’re Heston Blumenthal.) The hypothesis was that these “paired” foods had many flavour compounds in common. The data presented in this preprint seem to confirm that idea for Western cookery, but contradict it for Asian cuisine. There’s a lot more in there about the evolution of recipes, clusters of ingredients, and other network analysis goodness.
Economists doing something right?
The people at Crooked Timber always have a lot of interesting things to say. This is a summary and some discussion of a meeting at the New Economics Foundation talking about the need to cut consumption and spread wealth around by redistributing working hours. From a personal point of view, the idea of earning a reasonable salary from a 21-hour working week and having time to work on personal projects, do some volunteering, spend time with Rita and Winnie, all sounds great. From a social point of view, a gradual redistribution of working hours to reduce unemployment and spread income around more fairly also sounds fine. It seems unlikely to happen, if only because NEF seems to be the only group of economists who can bear to think about the end of economic growth and a transition to a steady-state economy. When I listen to mainstream economists speak, I have this image in my mind of a train racing along a bridge which is being frantically cobbled together bit by bit as the train approaches. Sometimes the train gets closer to the edge, sometimes it backs off a little way. But in the long run, the bridge builders can’t win. They’re just going to run out of stuff to build more bridge.
Some good news
To leaven the economics misery, this is really good. Polio is on the way out in India: not one new case in the last year!
Carnedd Llywelyn, North Wales. January 2006.
Along with a lot of other people, I’m a big fan of cephalopods. I’ve dived with squid and cuttlefish and have watched octopus while snorkelling. I particularly remember following one cuttlefish across a reef in the Philippines, watching its mesmerising pattern display, until it got bored of the clumsy and noisy thing with too few appendages plodding along behind it and made off for deeper water.
Octopus, in particular, are smart little critters. One of the saddest things I’ve ever read was a section of a book about octopus physiology (I really like octopus, OK? I think the book was Octopus: physiology and behaviour of an advanced invertebrate by M. J. Wells, although it was a while ago that I read it) that talked about the effect of certain nervous system lesions on the behaviour of octopus–these were lesions induced by human experimenters, of course. The writer talked about how the cephalopod victims cowered at the back of their tanks and clearly were less than keen on being used as experimental subjects. It stuck in my mind as the only place in the book (otherwise a good and thorough treatment of octopus physiology) where the author seemed tempted in any way to anthropomorphise or to ascribe emotions or feelings to the octopus. That’s why Octopus vulgaris is the only invertebrate protected under the UK’s animal experimentation laws...
Anyway, octopus are very cool and a fascinating model for non-human (and non-vertebrate!) intelligence. A recent article in Orion magazine does a great job of getting across just how amazing these creatures are. Go and read it. If they lived a bit longer and could be trusted not to molest the dog, I would love to have an octopus to live with us. Alas, I don’t like to imagine what would be the result of Winnie versus a Pacific giant octopus. Messy, for sure.
Uh-oh. Getting a bit behind on this 30-day challenge. Time for some shorties...
Not really a “past life” as such, but something interesting I did a while ago that was brought back to mind by the Costa Concordia cock-up. I’ve not spent a lot of time on big ships; indeed, until 2007 I think the only larger vessels I’d been on were ferries, cross-Channel or around the Greek islands.
In April 2007 though, I went to Canada for a big trip, mostly to visit some potential future places to live and work and to attend a workshop at the Banff International Research Station. I flew out to Vancouver, took a ferry to Vancouver Island where I visited UVic (ended up working there for a couple of years afterwards), then travelled east by train. I made it as far east as Halifax, Nova Scotia, all by train. That was a pretty cool experience in itself, but I’d had a wacky idea for how to get home from Canada to the UK. It’s possible to book passage on container ships under some circumstances, which I decided to do. It’s a tricky process, not super cheap, and the logistics of making the rendezvous with the ship turned out to be a bit more “interesting” than I expected.
I’ve been thinking for a long time, mostly privately in my idle moments, about the sort of house I’d one day like to build for myself. I have some ideas that I think are pretty cool: multiple small wooden buildings, connected by walkways, with the main building on a slope with a stepped living area; a separate Japanese-style bathhouse; an octagonal library/study with windows all around; a cellar dug into the side of the hill.
Of course, all of this is nothing more than idle daydreaming. Recently though, Rita sent me a link to some pictures of cabins (sheds, huts, that sort of thing) that capture a lot of the spirit of what I’ve been thinking of. You can see them here (yes, the link is safe!). Some of these places look heavenly.
I’m a sucker for nature photos. One of our annual outings when we lived in Bristol was to the Nature Photographer Of The Year exhibition at the city museum. Going by the crowds there, we aren’t the only ones who like this stuff.
So, here’s some eye candy.
Mt. Cain, Vancouver Island. New Years, 2010/2011.
Speedy Dog goes where she likes!
Well, maybe not any more, little girl...
Yesterday evening, we had a meeting with Martine, the dog behaviourist who’s helping us with Winnie. Winnie has come a long way from the terribly frightened little dog that we picked up from the animal shelter four months ago. She’s no longer terrified of everything in sight, she runs around with her tail up, likes to play with other dogs, likes to sniff in the bushes, likes to run crazily through the fallen leaves in the forest. She’s still jumpy and easily gets scared of people in the street or sudden noises, but she’s made enormous progress.
I’ve been very lucky so far in my life, and have had many opportunities to work and study in interesting places. I’m not sure I’ve made the best use of those opportunities–there’s much that feels unfinished or that didn’t quite work out the way I’d hoped it would–but these “past lives” have given me a lot of experience working in different fields, living in different places, hanging out with different kinds of people.
Blogging is inherently self-indulgent: you have to assume that someone out there is interested in reading your maunderings. (Not really. Honestly, I do this for my own amusement.) The ultimate in self-indulgence has to be autobiographical blogging and reminiscing, where some old fart bends everyone’s ear about the “good old days”. Anyway, since this is for my own amusement and for a bit of writing practice, I don’t really care. I’m going to be brazenly self-indulgent and start writing “Past Lives” articles to see if I can dredge up any interesting memories.
In 1998, the mathematical physicist David Ruelle wrote an odd little article, entitled Conversations on Mathematics with a Visitor from Outer Space, which appeared as a chapter of the very interesting looking book Mathematics: Frontiers and Perspectives, published by the IMU in 2000. I’ve not read the whole volume, although I think I’ve read at least one of the other chapters as a preprint. Most of the 30 authors seem to have written more or less “straight” articles, but Ruelle’s is different. He wanted to explore the constraints imposed on human mathematics by the structure and capabilities of the human brain. Perhaps constraints is too strong a word–“predispositions” might be better: human mathematics is necessarily tied to human brains, and human brains have evolved to their current state to solve problems that are quite different from the problems encountered in mathematics. One might thus expect human mathematics to have followed the “fault lines” in the Platonic edifice of Mathematics that are most easily appreciated using the mental tools that evolution has given us.
To explore this idea, Ruelle introduces the conceit of a visitor from outer space, a “galactic mathematician” pursuing doctoral studies investigating the nature of human mathematics. She describes some of the characteristics of the human mental apparatus that are relevant: limited short-term memory, and hence an inability to execute complex algorithms directly; a predisposition to see symmetry and pattern; the fact that human brains operate very slowly, even compared to computers; the importance of geometrical and visual reasoning; and so on. Ruelle comments that despite our limitations, humans have managed to do some fairly complex mathematics, but the question is always: how much further could we go if we could transcend the limitations of our history?
A couple of more substantial (in length) books I read over Christmas were Neal Stephenson’s latest and the first of Tim Powers’ Fisher King series. I liked them both.
I was born in the UK in 1972. The “world leaders” who defined my youth were Margaret Thatcher, Ronald Reagan and Leonid Brezhnev. This may have something to do with why I am so bitter, cynical and generally misanthropic nowadays, although the star turns we’ve seen in the US, UK and former USSR since then may have helped too.
Before Reagan, there was Jimmy Carter. I remember him from the TV news at the time only very vaguely. Since losing his bid for re-election in 1980, Carter has worked on a vast range of issues, mostly to do with disease eradication in developing countries and election supervision in fragile democracies. That his name can send the US right wing into a frothing ball of fury seems to indicate that he’s been doing good things.
A recent interview in The Guardian brought home just what a great man he is. The description of his house and his manner make me think more of a Greek senator elected by lot than a modern American politician with all the schmaltz and money that goes with it. He’s a humble man who has lived his beliefs. That deserves a lot of respect.
The other thing that comes across in the interview is his intellectual engagement. I loved the story from his chief of staff that he only once was able to tell Carter something that he didn’t already know. If you have any sense, you don’t choose a leader you’d be happy to have a beer with (don’t know if Carter even drinks); you choose a leader who is intelligent, motivated and honest. Carter is all of those things.
Click here to read We never dropped a bomb. We never fired a bullet. We never went to war.
Door, Kairouan, Tunisia. April, 2005.
After the rather dreary spy novels, time for something a bit more fluffy.
Pathetically slow in starting work on my little constraints project as I am, here’s the first of what should be a long series of posts...
One of the constraint solvers I’m looking at starting from, Cassowary, uses what is essentially an extended version of one of the most venerable of optimisation algorithms, Dantzig’s simplex algorithm. I’m going to start off thinking a little bit about the general setup of the kind of linear programming problems that the simplex algorithm is designed to solve, just to get a geometrical feeling for how these algorithms work and to understand the issues that might arise from relaxing some of the assumptions used in them.
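To fix notation before getting geometrical: these problems can always be massaged into a standard form, along the lines of

$$\max_{x}\; c^{T} x \quad \text{subject to} \quad A x \le b, \quad x \ge 0,$$

where $x$ is the vector of decision variables, $c$ defines the linear objective, and each row of $A x \le b$ is a linear constraint. (Conventions differ: some presentations minimise, some use equality constraints plus slack variables, but the content is the same.) Geometrically, the feasible set is a convex polytope, and the optimum, if it exists, is attained at a vertex–which is exactly the structure the simplex algorithm exploits by walking from vertex to vertex.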
I did a spot of reading over the Christmas and New Year holidays. In fact, reading was more or less all I did. Apart from walking the dog, spending time with Rita and eating myself silly. I read enough to write a lot of book review posts, but I’ll restrain myself to three or so... First off, spy stories.
One thing that came to an end over the Christmas holidays was the Stanford Artificial Intelligence course. The final exam was on the weekend of 17/18 December and final grades were distributed a couple of days later.
Overall, it was an interesting experience, and I certainly learnt some things. The instructors (Peter Norvig and Sebastian Thrun) were enthusiastic and clearly extremely knowledgeable, and they took on a hugely challenging task in trying to run an online class for tens of thousands of students. For that, I’m very grateful. There are a few things that weren’t 100% ideal about the presentation of the class material, but that’s to be expected the first time out with something like this.
The highlight of the course was definitely Sebastian’s presentation of his work on self-driving cars. I’d heard a bit about the DARPA Grand Challenge before, but I hadn’t known about the progress that the Stanford team had made with self-driving cars in traffic. That’s really impressive work, and Sebastian did a nice job of relating the algorithms used for guiding cars to the simplified versions considered in the class.
Looking for a prod to jolt me out of an (involuntary) 45-day blogging hiatus, I’m going to follow a scheme that Rita is using quite successfully: the Thirty Day Challenge. The idea is simple. Just do that thing, whatever it is, every day for thirty days. The timescale is short enough not to be daunting, but long enough to be habit-forming. Or that’s the theory anyway.
I have a few things lined up, including a slew of book reviews of my holiday reading and the creakingly slow commencement of my constraints project, but we’ll see how it goes.
OK, one down, 29 to go. That wasn’t so hard after all!
I miss Canada.
I miss sea kayaking.
I miss mountain biking.
I miss CrossFit Taranis and all the awesome people there.
We moved to France. Dumbasses. We exchanged a fantastic life in Canada for what? Basically: cheese. Feel free to mock us. We deserve it.
I’ve recently started making some tiny contributions to Brent Yorgey’s Haskell diagrams library, mostly bug fixes. The bug fixes are primarily just a way of familiarising myself with the diagrams codebase though. I have my eye on a rather chunkier task in the open issues list. That’s writing a constraints solver for diagram layout. I’ve been interested in this sort of problem for some time, and have never had a good reason to get down to it, so I’m going to see what I can come up with.
And I’m going to try something I’ve not done before, which is to document experiments, design and development as blog articles. It may all turn out to be hideously embarrassing, it may end up that none of what I do actually makes it into the mainstream of the diagrams library, but having to write about what I’m doing will keep me honest, force me to be clear about what I’m doing, and will provide an incentive to work on this stuff.
Stanford CS240h: Functional Systems in Haskell
Stanford is really pushing e-learning at the moment, what with the AI, machine learning and database classes that the engineering department is running. This is another good course with content available online.
Reddit: What does your company use Haskell for?
What it says on the tin: quite a few people chime in. Seems like Haskell isn’t so much of an “academic-only” language any more!
Storage and Identification of Cabalized Packages
Very helpful guide to GHC package management from Albert Lai.
Unused constraints in GHC
This week, we spotted an interesting thing in some of the diagrams code. There are a bunch of places where an earlier implementation of a feature required certain type class constraints. With a more recent implementation, that requirement has now gone away, but the constraints remain in the code. That makes using these particular functions trickier than it needs to be. We’d like for the compiler to warn about these extra un-needed constraints, since otherwise they just hang around like a bad smell. Brent describes it as “a nice project for someone wanting to dig into hacking on GHC”...
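To make the problem concrete, here’s a tiny made-up illustration (not actual diagrams code) of the kind of leftover constraint we’re talking about:

```haskell
-- The Eq constraint was needed by an older implementation; the current
-- body only uses Show, but GHC accepts the stale signature silently,
-- and every caller now has to satisfy Eq for no reason.
describe :: (Eq a, Show a) => a -> String
describe x = "value: " ++ show x
```

(For what it’s worth, later GHC releases did grow a warning for exactly this situation, -Wredundant-constraints.)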
Money (via Crooked Timber)
Where does it come from? There’s a nice little Just-So story that money arose naturally out of barter economies as a natural consequence of Immutable Economic Laws. It’s thus slightly embarrassing that anthropologists haven’t found any evidence at all for barter economies of the required type. Cue immense academic pissing contest, of course. Read it all. It’s good.
Visual 6502
Remember the 6502? Come on, no need to be shy. The other 8-bit processor from the 1980s. I was a Z80 boy myself, but I did a bit of 6502 assembler programming back in the day (for my A-Level computer studies project, I wrote a little data capture package for an infra-red spectrometer in our school’s chemistry lab using the A/D ports on a BBC Micro). Any other fans of retro 8-bit should take a look at this site. They’re building sub-gate-level simulations and visualisations of old microprocessors. The visualisations are very neat, but the way they’re doing it is the kicker–working directly from dies, they’re photographing them, doing some image analysis to get the patterns of the chip layers out and using these to build transistor-level models of the chips. Lots of fun!
Circos
You may have seen those super sweet circular charts floating around in some biology papers in Nature or Science. This is where they’re made, and you can make them too.
Quasicrystal animations
This is “Haskell” too, but it lives in “Cool stuff”.
Great Mosque, Kairouan, Tunisia. April, 2005.
Science dog is tired
by John Bradshaw
Those of us who live with small furry friends have a vested interest in understanding what our beasties are thinking. They don’t talk, they won’t use SurveyMonkey, and they sometimes act like they’re not human at all, no matter how we might like to think of them as little people in dog costumes. But never fear. There is a whole industry out there that promises to help you interpret your dog’s slightest ear flick and nose twitch, that will allow you to correct all “undesirable” canine behaviours back to the acceptable human norms, that will turn you into a veritable Dr. Dolittle.
Just one little problem. Most of what you read is wrong, and quite a lot of it is harmful. Dominate your dog! Tame the wolf in your home! Show your pooch who’s boss! All complete crap, based on romantic story-telling and what makes good TV, rather than on decent science.
In In Defence Of Dogs, John Bradshaw, a researcher in canine behaviour at the University of Bristol (although he has appeared on TV, including The Colbert Report!) presents some of the recent thinking on dog behaviour, based on, of all things, scientific research.
via the MIT Technology Review Physics arXiv Blog
Everything’s better in slow-motion, right? And everything’s even more betterer in super-slow-slow-slow slow-mo, am I right? (You know I’m right.) It turns out that if you have a high-speed (10,000 fps!) camera and a pile of C4 explosive, you can make some pretty cool movies. You can read about it here.
On a less explodey note, two other vids from this year’s Gallery are also pretty outstanding:
Week four of the Stanford Introduction to AI class was about, well, what was it about? It was a bit of a mish-mash. Some stuff about logic, some stuff about planning. Nothing I can really pin some experiments on, so I’ve been thinking a bit about the format of the course instead.
I kept trying to tell Rita that a hippo would make a perfect pet. She didn’t believe me. But then she sent me this. Point proven, I think you will agree...
by Mark Kac & Stanislaw Ulam
This interesting little book, published in 1968, is the result of a collaboration between two of the great figures of 20th century mathematics. Although known mostly for his work on the Manhattan Project and later associated applied mathematics efforts (I first learnt of his work after being told to read about the Fermi-Pasta-Ulam problem by my PhD supervisor), Ulam began his career as a pure mathematician, working on problems in general topology. (His autobiography, Adventures of a Mathematician, is well worth a read.) Kac, best known for asking “Can one hear the shape of a drum?”, made major contributions to probability theory. Kac and Ulam got to know each other in Poland, then both made the move to the United States during the Second World War.
What does it mean when your dog yawns at you? She’s tired? She’s bored? She’s embarrassed by your choice of socks? If you don’t know, you probably want to read Turid Rugaas’s book about the body language signals that dogs use to display unease and to calm each other down.
Week 3 of the Stanford AI class was about machine learning. This was kind of handy for me, since my PhD thesis was mostly about applying some unsupervised learning techniques to dimensionality reduction for climate model output. That means that I’ve read hundreds of papers about this stuff, so the basic ideas and even quite a few of the details are already pretty familiar. However, most of what I’ve read has been aimed at applications in climate science and dynamical systems theory–not much from the huge literature on clustering methods, for instance.
Sebastian very briefly mentioned some of the nonlinear dimensionality reduction methods that have been developed for unsupervised learning applications, but it was nothing more than a mention (no time for anything else). I can’t resist the temptation to dust off a little of this stuff from the archives.
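As a reminder of the baseline those nonlinear methods generalise: linear PCA is nothing more than an SVD of the centred data matrix. Here’s a small sketch using the hmatrix library–my choice for illustration, nothing to do with the class materials:

```haskell
import Numeric.LinearAlgebra

-- Project each data row (observations in rows, variables in columns)
-- onto the top k right singular vectors of the centred data matrix.
pca :: Int -> Matrix Double -> Matrix Double
pca k xs = fromRows [ (r - mu) <# basis | r <- toRows xs ]
  where
    -- Mean row, computed as an equally weighted sum of the rows.
    mu = konst (1 / fromIntegral (rows xs)) (rows xs) <# xs
    (_, _, v) = svd (fromRows [ r - mu | r <- toRows xs ])
    basis = takeColumns k v
```

Nonlinear methods like Isomap or locally linear embedding replace this single global linear projection with something built from local neighbourhood structure, which is where things get interesting (and expensive).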
Morning paddling preparations on Wallace Island, Gulf Islands National Park. Thanksgiving weekend, 2009.
The second week of the online Stanford AI class was about probability, reasoning under uncertainty and, more specifically, Bayes networks. I don’t much want to talk about the theory behind these things here since that’s covered in a lot of detail on the linked Wikipedia page, in the lectures and in AIMA.
Instead, I want to present a mildly amusing, although not very efficient, Haskell implementation of Bayes networks. It’s often said that if you know the full joint probability distribution function (PDF) for a system, you can calculate any marginal or conditional probabilities you want. Well, Haskell is a functional programming language, a PDF is a function, so can we represent the PDF explicitly and use it for calculating? The answer, of course, is yes.
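Here’s the core of the idea in miniature–a toy two-variable joint distribution represented directly as a Haskell function, with marginals and conditionals computed by brute-force summation (the numbers are made up for illustration; the real thing below is more general):

```haskell
-- A joint PDF over two Bool-valued variables, as a plain function.
joint :: Bool -> Bool -> Double
joint a b = pA a * pBgivenA a b
  where
    pA True  = 0.3
    pA False = 0.7
    pBgivenA True  True  = 0.9
    pBgivenA True  False = 0.1
    pBgivenA False True  = 0.2
    pBgivenA False False = 0.8

-- Marginalise out A by summing over its values.
marginalB :: Bool -> Double
marginalB b = sum [ joint a b | a <- [False, True] ]

-- Bayes' rule: P(A | B) = P(A, B) / P(B).
condAgivenB :: Bool -> Bool -> Double
condAgivenB a b = joint a b / marginalB b
```

Brute-force summation over the joint scales exponentially in the number of variables, of course, which is exactly why Bayes networks and cleverer inference algorithms exist–hence the “not very efficient” above.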
“Pour avoir une voiture, sans avoir une voiture”
To have a car, without having a car
As blogging software, I use the really rather nice Hakyll, by Jasper Van der Jeugt. This is a static website generator written in Haskell, and it’s a great solution for smaller blogs or personal websites.
Hakyll works just fine out of the box, but the blog setup I wanted was a bit different from what I’ve seen other people do with it, so a bit of hacking was required. I wanted to share some of the things I’ve done, since they might be of interest to other Hakyllers (Hakyllites?). All the code is available on Github.
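For readers who haven’t seen Hakyll, the core of a site is a small Haskell program that pairs file patterns with routing and compilation rules. A minimal sketch (written against the current Hakyll API, which may differ in detail from the version these posts were built with):

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Hakyll

main :: IO ()
main = hakyll $ do
    -- Render each post through pandoc and a template, fixing up links.
    match "posts/*" $ do
        route $ setExtension "html"
        compile $ pandocCompiler
            >>= loadAndApplyTemplate "templates/post.html" defaultContext
            >>= relativizeUrls

    -- Templates are compiled for use above but produce no output files.
    match "templates/*" $ compile templateCompiler
```

The hacks described in the rest of this post are, in essence, just more elaborate rules of this shape.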
Sundown on Wallace Island, Gulf Islands National Park. Thanksgiving weekend, 2009.
In an earlier post, I presented some code for solving tile puzzles using A* search. In the lectures, Peter Norvig talked about two different heuristics for these puzzles, one simply counting the number of misplaced tiles and the other giving the total Manhattan distance of all misplaced tiles to their correct positions. It’s interesting to find out how much of a difference using these heuristics really makes.
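Here’s what the two heuristics look like concretely, under an assumed representation (a row-major list of tile labels with 0 for the blank, and tile t’s home square at index t)–an illustrative sketch, not the code from the earlier post:

```haskell
-- Number of tiles not on their home squares (the blank doesn't count).
misplaced :: [Int] -> Int
misplaced board = length [ () | (i, t) <- zip [0 ..] board, t /= 0, t /= i ]

-- Total Manhattan (grid) distance of every tile from its home square,
-- for an n x n board.
manhattan :: Int -> [Int] -> Int
manhattan n board = sum [ dist i t | (i, t) <- zip [0 ..] board, t /= 0 ]
  where
    dist i j = abs (row i - row j) + abs (col i - col j)
    row i = i `div` n
    col i = i `mod` n
```

Both are admissible (they never overestimate the true cost), but Manhattan distance dominates the misplaced-tile count, so A* with it never expands more nodes–and usually expands far fewer.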
Winnie is a dog, probably half beagle, half something else. She lives with us now, but her life wasn’t always wall-to-wall belly rubs, salmon for breakfast and as much love as she can stand. We adopted her from the animal shelter where we work as volunteers on September 4, and for 3-4 months before that, she’d been cowering at the back of her cage, frightened even to show her little nose to the people outside.
We think Winnie was raised as a hunting dog, locked up in a kennel with other dogs and no humans to bond with, since she has a bad case of kennel syndrome: she’s scared of more or less everything except other dogs, and now, us. Loud or just sudden noises, cars, bikes, trams, other people, big boxes (no idea why), and so on. All send her into a panic. We think that she probably wasn’t aggressive and confident enough to satisfy the assholes who had her before, so they abandoned her.
by Roger S. Bivand, Edzer J. Pebesma & Virgilio Gómez-Rubio
I recently had to do a bunch of geostatistical analysis on some climate data (to be specific, using universal kriging to interpolate a time series of solar radiation data covering the region I’m working on to a different grid). I started off trying to use the geostatistical analysis toolbox in ArcGIS, which works fine as far as it goes, but seems to be very difficult indeed to access via ArcGIS’s Python scripting interface. Since I had 36 years of daily data to process, doing it by hand was not an option.
by China Miéville
Can you unsee? Can you unhear? If I ask you not to think of a tiny green rhinoceros wearing a straw hat, can you do it?
The first week of the online Stanford AI class has gone by. The format of the presentation is pretty good, with lots of short lecture videos chained together with little quizzes embedded. It’s really pretty neat. There are some minor errors in some of the lectures, which is normal, though in this kind of online setting, it’s more difficult for the keener students to correct the lecturer as the lecture goes along, so some people are likely to spend a bit of time confused because of that. Some people have complained on Reddit about the relatively superficial level of the coverage of the lectures, but as far as I’m concerned, that’s what the book (and papers) are for. The course is called “Introduction to Artificial Intelligence”, after all.
The two things we covered this week were a very quick introduction to the field, then some stuff about search. Although there is a big pile of code available for download associated with Russell & Norvig’s AI book which would have made playing with some of this stuff easier, I decided to write some code of my own in Haskell, mostly for my own amusement. I wanted to implement A* search, with a clean interface for setting up problems.
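The heart of A* is small enough to show in full. This is a condensed sketch of the same idea rather than my actual code: states live in an ordered “open” set keyed by f = g + h, and we track the best g seen for each state so stale entries can be skipped.

```haskell
import           Data.List (foldl')
import qualified Data.Map.Strict as M
import qualified Data.Set as S

-- A* search returning the cost of a cheapest solution, if one exists.
astar :: Ord s
      => (s -> [(s, Int)])   -- successor states with step costs
      -> (s -> Int)          -- admissible heuristic
      -> (s -> Bool)         -- goal test
      -> s                   -- start state
      -> Maybe Int
astar succs h goal start = go (S.singleton (h start, start, 0)) M.empty
  where
    go open best = case S.minView open of
      Nothing -> Nothing
      Just ((_, s, g), open')
        | goal s -> Just g
        | M.findWithDefault maxBound s best <= g -> go open' best  -- stale entry
        | otherwise ->
            let best' = M.insert s g best
                push o (s', c) = S.insert (g + c + h s', s', g + c) o
            in  go (foldl' push open' (succs s)) best'
```

Setting up a problem then just means supplying the three functions, which is the “clean interface” part; returning the solution path rather than its cost means threading parent pointers through, which adds bookkeeping but no new ideas.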
In my day job, I work on the ecology of Mediterranean ecosystems in southern France, and if you work on a particular type of ecosystem, the first thing you need to know is where you can find your ecosystem! Seems like a simple problem, but it’s not.
Glanmore Lake, Beara Peninsula. On a holiday to Ireland some years ago.
Along with about 160,000 other people, I’ve signed up for the Stanford Introduction to Artificial Intelligence course. My interest in this is two-fold: first for the material, which should be pretty cool if Peter Norvig’s book is anything to go by, but second for just how they (the two instructors plus presumably an army of TAs) are going to do this.
Perhaps I’m not a very good teacher in a lecture setting, but I have real trouble connecting even to a class of 90 or so students. Most effective teaching occurs one-on-one or in small groups in office hours. And most learning occurs when students are sitting quietly thinking about things themselves. How you make contact with and retain the attention of 85,000 students is a problem that boggles the mind. I’m very interested to see how it goes!
The first couple of modules are up on the website already, and it looks like the format of most of the course will be pretty nice and accessible. How assignments and exams will work, we’ll have to see. I really hope Stanford are supplying some additional resources to help with this project. It’s a very cool thing to be doing, and very exciting, but it has the potential to be a pretty high stress adventure for the instructors!
If AI doesn’t float your boat, Stanford are also running Intro to Machine Learning and Intro to Databases courses online as well.
We moved from Vancouver Island on the west coast of Canada to Montpellier in March 2011. Montpellier is not so bad, but there are a lot of things I miss about Canada. Our friends, for one thing. But also the wide open spaces. One of the last sea kayaking trips I took in Canada was a solo outing to a small (very small!) island in the Gulf Islands National Park.
Whaleboat Island is a marine provincial park, but it’s unmanaged, which means you are completely on your own: no campsites, no hookups, no wardens, no helpful information boards, and yes, no toilets. There’s also the minor matter that the beach only exists at low tide and you have to climb up some cliffs to find a flat spot to pitch a tent. Apart from that, it’s a perfect destination. Perfect for me anyway. I’d landed on Whaleboat once before, as a possible lunch spot, but that time no-one else seemed interested in clambering over rocks to eat their sandwiches and we ended up on an admittedly very pretty beach on Pylades Island.
This time though, I was on a mission. I’d bought myself a kayak a couple of days before, and was out to try it out, as well as checking out the camping possibilities on Whaleboat.
I’m a relative beginner with Haskell, and like many people, to start with I was a little perplexed by the Haskell approach to I/O. A small worked example helped a lot. I was curious to see how easy it would be to do something like the webcomic scraper application implemented in Clojure here and here. This is a simple application, but it does do realistic I/O, downloading files from the web, writing them to disk, and also doing some computations on the file contents. Over the course of two articles, I’m going to build something comparable in Haskell. It turns out to be pretty easy!
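As a preview of how compact the I/O core ends up, here’s the download-and-save step in isolation–a sketch using the http-conduit package, which is an assumption of mine for illustration and not necessarily what the articles use; the URL and filename are placeholders:

```haskell
import qualified Data.ByteString.Lazy as BL
import           Network.HTTP.Conduit (simpleHttp)

-- Fetch a URL and write the response body to disk.
fetchImage :: String -> FilePath -> IO ()
fetchImage url path = BL.writeFile path =<< simpleHttp url

main :: IO ()
main = fetchImage "http://example.com/comic.png" "comic.png"
```

The interesting part, covered over the two articles, is mixing this kind of action with pure functions that pick apart page contents to find the image URLs in the first place.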
I’ve been meaning to start a blog for some time, but couldn’t settle on a platform. I’d used Wordpress before for a little blog we set up to let our friends and family know what we were up to when we moved to Montpellier, and I liked it (easy to install, easy to use). For a personal blog though, I wanted something a bit more hackable (yeah, I know, you can hack on Wordpress too, but if it’s going to be for fun, I want to be using something other than PHP!).