The simplex algorithm: warm-up
Pathetically slow in starting work on my little constraints project as I am, here’s the first of what should be a long series of posts...
One of the constraint solvers I’m looking at starting from, Cassowary, uses what is essentially an extended version of one of the most venerable of optimisation algorithms, Dantzig’s simplex algorithm. I’m going to start off thinking a little bit about the general setup of the kind of linear programming problems that the simplex algorithm is designed to solve, just to get a geometrical feeling for how these algorithms work and to understand the issues that might arise from relaxing some of the assumptions used in them.
Setup
The basic idea of linear programming problems is to find a vector $\mathbf{x} \in \mathbb{R}^n$ that maximises a linear objective function

$$f(\mathbf{x}) = \mathbf{c} \cdot \mathbf{x}$$

subject to a collection of linear equality and inequality constraints on $\mathbf{x}$, which we can write in matrix form as

$$A\mathbf{x} = \mathbf{b}, \qquad C\mathbf{x} \le \mathbf{d}.$$
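To make the notation concrete, here’s a tiny made-up instance with $\mathbf{x} = (x, y)$:

$$\text{maximise } 3x + 2y \quad \text{subject to} \quad x + y \le 4, \quad x \le 3, \quad x \ge 0, \quad y \ge 0.$$

We’ll come back to this little example below when we look at where the optimal solution ends up sitting.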
What’s the connection between this sort of problem and the problem of laying out diagrams? In a diagram, the variables that we are interested in determining, i.e. the components of the vector $\mathbf{x}$, are the geometric quantities describing the elements of the diagram: the coordinates of points, the widths and heights of shapes, and so on. The constraints and the objective function then express the relationships and preferences we have about how those elements should be arranged.
Most of the time, we hope that the constraints we impose are sufficient to precisely locate all the elements of the diagram, in which case there is only one possible solution to the constraint equations and our linear programming problem degenerates to the solution of a system of linear equations to determine the unique point satisfying the constraints. On the other hand, if we are building a system of constraints interactively, we will often find ourselves with an underconstrained system: the user has placed a number of shapes, has applied some constraints between them, but those constraints are not sufficient to fix the positions of all the elements of the diagram. We might then choose some optimisation criterion to decide what is a good layout for the underconstrained drawing. (In fact, things are trickier than this. Following the principle of least surprise, adding constraints to a drawing should only cause elements of the drawing to move around if they are in positions inconsistent with the constraints that have been applied so far. The Cassowary solver algorithm is constructed to allow this sort of update. I’ll talk about that another time, since it’s a bit more complicated than the basic simplex algorithm.)
What sort of constraints can we treat? Suppose we have three points $\mathbf{p}_1 = (x_1, y_1)$, $\mathbf{p}_2 = (x_2, y_2)$ and $\mathbf{p}_3 = (x_3, y_3)$ in our diagram. We can require that $\mathbf{p}_1$ lie to the left of $\mathbf{p}_2$ by writing

$$x_1 + s = x_2, \qquad s \ge 0,$$

where we introduce an auxiliary variable $s$ to turn the “to the left of” condition into a linear equality plus a sign constraint. Similarly, we can require that $\mathbf{p}_2$ lie below $\mathbf{p}_3$ by writing

$$y_2 + t = y_3, \qquad t \ge 0,$$

where again we introduce an auxiliary variable $t$. Constraints like “$\mathbf{p}_2$ lies midway between $\mathbf{p}_1$ and $\mathbf{p}_3$” are also fine, since $2x_2 = x_1 + x_3$ and $2y_2 = y_1 + y_3$ are linear.
However, we can’t demand that the distance between points $\mathbf{p}_1$ and $\mathbf{p}_2$ take some fixed value $d$, since that would require

$$\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} = d,$$

which is not a linear constraint. That seems like a bit of a shame, since distance-based constraints are very natural from a geometrical point of view. I’ll talk below about what it might mean to lift the restriction to linear constraints.
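Just to give a flavour of the kind of representation involved, here’s a rough sketch (the names are hypothetical ones of my own, not the interface I’ll end up building in later posts) of how linear constraints over named diagram variables might be written down in Haskell:

```haskell
-- A rough, hypothetical sketch of how linear constraints over named diagram
-- variables might be represented.  Not a final interface.
import qualified Data.Map as M

-- A linear expression: a constant plus a sum of coefficient * variable terms.
-- For example, x1 - x2 + 10 is  LinExpr 10 (M.fromList [("x1", 1), ("x2", -1)]).
data LinExpr = LinExpr Double (M.Map String Double)
  deriving Show

-- The relation between a linear expression and zero.
data Relation = Leq | Eql | Geq
  deriving Show

-- A constraint says that a linear expression is <= 0, = 0 or >= 0.
data LinConstraint = LinConstraint LinExpr Relation
  deriving Show

-- Evaluate a linear expression given values for the variables
-- (variables with no assigned value are treated as zero).
evalLinExpr :: M.Map String Double -> LinExpr -> Double
evalLinExpr vals (LinExpr c terms) =
  c + sum [ coeff * M.findWithDefault 0 v vals | (v, coeff) <- M.toList terms ]

-- Example: x1 + 10 - x2 <= 0, i.e. "p1 lies at least 10 units left of p2".
exampleConstraint :: LinConstraint
exampleConstraint =
  LinConstraint (LinExpr 10 (M.fromList [("x1", 1), ("x2", -1)])) Leq
```

Auxiliary variables like $s$ and $t$ above would simply be extra named entries in the variable map.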
Finding solutions
First of all, let’s think about how the constraints that we impose restrict the space of solutions to our optimisation problem. If our vector of unknowns $\mathbf{x}$ lives in $\mathbb{R}^n$, then each linear equality constraint $\mathbf{a} \cdot \mathbf{x} = b$ confines the solutions to an $(n-1)$-dimensional hyperplane, and imposing several such constraints confines them to the intersection of the corresponding hyperplanes.
In the end, after considering all of the linear equality constraints, we thus end up with a situation where we need to seek solutions in some linear subspace of $\mathbb{R}^n$ (strictly speaking an affine subspace, since it need not pass through the origin).
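For example, in $\mathbb{R}^3$ the single equality constraint $x + y + z = 1$ leaves a two-dimensional plane of candidate solutions, and adding the further constraint $z = 0$ cuts that plane down to a line: each independent equality constraint knocks one dimension off the space we have to search.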
Next, let’s think about the role of the inequality constraints in restricting the solution space. Each inequality constraint of the form $\mathbf{a} \cdot \mathbf{x} \le b$ restricts solutions to a half-space: the set of points lying on one side of the hyperplane $\mathbf{a} \cdot \mathbf{x} = b$ (together with the hyperplane itself). The permissible region for the whole problem is the intersection of all of these half-spaces.
Here’s an example in $\mathbb{R}^2$:
The figure above shows the constraints as blue boundaries, with the interior of the polygon bounded by the constraint lines being the set of permissible solutions. I’ve also shown the contours of a particular objective function (the red lines) and the resulting best solution (the point marked in green). It’s pretty clear that adding more linear inequality constraints can’t make the permissible region anything other than a convex polygon.
It’s also pretty clear from this image that the optimal point, i.e. the maximum of the objective function $f(\mathbf{x}) = \mathbf{c} \cdot \mathbf{x}$ over the permissible region, lies on the boundary of that region: since $f$ is linear, moving in the direction of $\mathbf{c}$ always increases it, so we can keep improving the objective until we run into one of the constraints.
In fact, the optimal solution, except in degenerate cases, is found at one of the vertices of this polygon. This is the key point that makes this type of optimisation problem more tractable than it might initially seem: even with large numbers of dimensions, we basically only need to solve a combinatorics problem over the vertices of our permissible polygon. In particular, we don’t need to think about what happens to our objective function in the interior of the polygon. This is a big deal, and we’ll see how lifting the linearity requirement on either the constraints or the objective function renders the problem much more difficult.
In essence, the simplex algorithm is a smart way of doing this combinatorial search, moving along the edges of our permissible polygon in a way that works for larger problems, with some special cases to deal with degeneracies and to detect insoluble problems.
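To make the “search over the vertices” idea tangible, here’s a deliberately naive throwaway sketch in Haskell (my own toy code, not the simplex algorithm, and it ignores the possibility of unbounded problems): for a two-variable problem it intersects every pair of constraint boundaries, keeps the intersection points that satisfy all the constraints, and picks the one with the largest objective value.

```haskell
-- Naive "try every vertex" solver for a two-variable LP: maximise
-- cx*x + cy*y subject to a list of constraints a1*x + a2*y <= b.
-- This is only an illustration of the combinatorial idea, not simplex.
import Data.List  (maximumBy, tails)
import Data.Maybe (mapMaybe)
import Data.Ord   (comparing)

type Constraint = (Double, Double, Double)   -- a1*x + a2*y <= b

-- Intersection of two constraint boundary lines, if they are not parallel.
corner :: Constraint -> Constraint -> Maybe (Double, Double)
corner (a1, a2, b) (c1, c2, d)
  | abs det < 1e-9 = Nothing
  | otherwise      = Just ((b * c2 - d * a2) / det, (a1 * d - c1 * b) / det)
  where det = a1 * c2 - c1 * a2

-- Does a point satisfy every constraint (up to a small tolerance)?
feasible :: [Constraint] -> (Double, Double) -> Bool
feasible cs (x, y) = all (\(a1, a2, b) -> a1 * x + a2 * y <= b + 1e-9) cs

-- Candidate vertices: feasible intersections of pairs of constraint boundaries.
vertices :: [Constraint] -> [(Double, Double)]
vertices cs = filter (feasible cs) (mapMaybe (uncurry corner) pairs)
  where pairs = [(p, q) | (p:rest) <- tails cs, q <- rest]

-- Brute-force optimum over the candidate vertices.
bruteForceLP :: (Double, Double) -> [Constraint] -> Maybe ((Double, Double), Double)
bruteForceLP (cx, cy) cs
  | null vs   = Nothing
  | otherwise = Just (maximumBy (comparing snd) [(v, obj v) | v <- vs])
  where
    vs         = vertices cs
    obj (x, y) = cx * x + cy * y

-- The toy instance from the setup section:
-- maximise 3x + 2y subject to x + y <= 4, x <= 3, x >= 0, y >= 0.
-- Prints the optimal vertex (3.0, 1.0) and objective value 11.0.
main :: IO ()
main = print (bruteForceLP (3, 2) [(1, 1, 4), (1, 0, 3), (-1, 0, 0), (0, -1, 0)])
```

Run on the little made-up instance from the setup section, this reports the optimum at the vertex $(3, 1)$ with objective value $11$. The simplex algorithm avoids examining every pair (or, in $n$ dimensions, every $n$-tuple) of constraints by walking from a vertex to a neighbouring vertex, only ever moving in a direction that improves the objective.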
Lifting assumptions
What happens if we lift the requirement that constraints be linear? A general nonlinear equation $g(\mathbf{x}) = 0$ defines a curved hypersurface rather than a hyperplane, so the set of points satisfying a collection of such constraints is no longer a nice linear subspace: it can be curved, disconnected or empty in ways that are hard to characterise, and even deciding whether any solutions exist at all can be difficult.
Nonlinear inequality constraints pose another sort of problem. If we have an inequality like $g(\mathbf{x}) \le 0$ for some nonlinear function $g$, the permissible region is bounded by a curved surface instead of by flat faces, so there is no finite set of vertices to search: the optimum can lie anywhere on a curved piece of the boundary, rather than at one of a finite collection of corner points, and the combinatorial approach described above no longer applies directly.
There is another problem with nonlinear inequalities. Recall that for the linear constraint case, the permissible region was always a convex polytope. In general, nonlinear constraints do not guarantee convex regions (think of a constraint like $x^2 + y^2 \ge 1$, which permits everything except the interior of the unit disc). Losing convexity matters: in a non-convex region, a solution that looks optimal on one piece of the boundary need not be the global optimum, so local improvement methods can get stuck.
Lifting the assumption that our objective function is linear means that we can no longer be sure that the optimal value lies on the boundary of our permissible region. An arbitrary nonlinear function can have all sorts of bumps and maxima within the permissible region. There are classes of nonlinear functions (in particular harmonic functions) where we can make definitive statements about extremal values within the permissible region compared to on the boundary, but these don’t seem like practical classes to use to restrict the choice of objective. All other things being equal, a linear objective is probably the best approach for now.
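As a trivial made-up illustration: over the square $-1 \le x \le 1$, $-1 \le y \le 1$, the nonlinear objective

$$f(x, y) = 1 - x^2 - y^2$$

has its maximum at the interior point $(0, 0)$, nowhere near any vertex of the permissible region, so a search that only ever visits vertices will simply never see it.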
Conclusions and what next?
Linear constraints and linear objective function are nice and easy to understand geometrically.
General nonlinear constraints are potentially nasty.
Convex nonlinear constraints might be easier to handle (there’s a whole field called convex optimisation about dealing with this situation) and it seems intuitively likely that most of the nonlinear constraints we’re interested in might be convex: an upper bound on the distance between two points essentially confines their difference to a ball in the coordinate space, for example.
Nonlinear objective functions break the whole “optimal solution on the boundary” idea. Best avoided if possible...
Next, I’m going to set things up to play with an implementation of the simplex algorithm in Haskell, along the way experimenting with an interface for specifying constraint systems.