Lecture 10: Linear and not Quite Linear Programming

Unbounded LP's
--------------
We began last time by assuming that we could always find two (or
generally d) initial halfplanes to provide us with an initial bounded
optimal vertex. We never dealt with this issue, but we claim that it
is always possible to determine either that the LP is unbounded, or to
determine a pair of halfplanes that bound the feasible region in O(n)
time. Our book presents an algorithm for doing this in the plane. This
algorithm has the nice feature that it generalizes to higher
dimensions (although they leave the generalization as an exercise).

In the planar case there is a much simpler algorithm than
this. Consider the outward point ing normal vectors for the various
halfplanes (imagine they are scaled to unit length). They partition
the unit circle into n angular sectors. Now consider the angular
sector that contains the objective vector c. Consider the two
halfplanes lying immediates clockwise and counter clockwise from c. If
the angle between these two normals is less than 180 degrees, then the
no matter how far apart they are, these two halfplanes will converge
at a point p and the feasible set (if it is nonempty) will lie in the
cone bounded by these halfplanes.

Figure 39: Boundedness test. Intersections with a hyperplane: Our book does not talk about how we go about intersecting given set of halfspaces with a hyperplane. (Suffice it to say that it is a geometric primitive that can be evaluated in O(d) time per hyperplane.) For completeness let's consider how this might be done. Suppose that the halfspace that we just added was given by the inequality: h(i) : a(i,1) x(1) + a(i,2) x(2) + . . . + a(i,d) x(d) <= b(i) The corresponding hyperplane is given by the equation: e(i) : a(i,1) x(1) + a(i,2) x(2) + . . . + a(i,d) x(d) = b(i) We can express this more succinctly in matrix notation. Let A(i) denote the (1 x d) vector consisting of the i-th row of the A matrix, A(i) = (a(i,1), a(i,2), ..., a(i,d) ). The inequality may be written A(i) ~x = b(i) , where ~x is a (d x 1) vector and b(i) is a scalar. We want to intersect the other halfspaces with this hyperplane. Furthermore, we would like to represent the result as a LP in d - 1 dimensional problem. (Observe that after intersection the hyperplane still resides in d-space.) The idea is to apply one step of Gauss elimination using the equation of e(i) to eliminate a variable from all the other inequalities. Let us assume that a(i,1) != 0 (if not, select a dimension k such that a(i,k) != 0 and swap with the first). Consider an arbitrary constraint h(j) that we wish to intersect with e(i): h(j) : A(j) x <= b(j) To eliminate the first dimension from h(j) we multiply A(i) by (a(j,1) / a(i,1)) and subtract from A(j), do the same for b(i) and b(j): A(j)' = A(j) - (a(j,1)/a(i,1)) A(i) b(j)' = b(j) - (a(j,1)/a(i,1)) b(i) To see that this works, suppose that x is a point on the hyperplane e(i), implying that A(i) x = b(i) . Suppose that x satisfied constraint h(j) . Then we have A(j)' x = ( A(j) - (a(j,1)/a(i,1)) A(i) ) x = ( A(j) x - (a(j,1)/a(i,1)) A(i) x ) <= b(j) - (a(j,1)/a(i,1)) b(i) = b(i)' Thus, every point on the hyperplane satisfies the modified constraint if and only if it satisfies the original constraint. A similar elimination can be performed to the objective vector c. It is also easy to show that (as is standard in Gauss elimination) the first term of each equation vanishes, so we are left with a d - 1 dimensional problem. Reversing the process allows us to project the d - 1 dimensional solution back into d-space. Low Dimensional Linear Programming: Last time we presented an O(n) time algorithm for linear programming in the plane. We begin by observing that the same algorithm can be generalized to any dimension d. In particular, let {h(1), h(2), ..., h(n)} be a set of n closed half spaces in dimension d (each h(i) is defined by one linear inequality of the form: a(i,1) x(1) + a(i,2) x(2) + ... + a(i,d) x(d) <= b(i): and ~c is a vector in d space, describing the objective function. We begin by selecting d half spaces whose feasible region is bounded with respect to c. This can be done by a generalization of the method used for the planar case. Next, we add half spaces one by one in random order. As before if the current optimal vertex is feasible with respect to the latest half space, then there is not change. This can be checked in O(d) time. Otherwise, let e(i) denote the hyperplane that supports h(i) (formed by changing the above inequality into an equality). Using Gauss elimination, we intersect all the half spaces with e(i) , and then solve the resulting d \Gamma 1 dimensional LP problem, involving i-1 half spaces recursively, and return the result. The running time is derived in exactly the same way as for the 2 dimensional case. Let T (d,n) be the expected running time of the algorithm for n half spaces in dimension d. Let's consider the ith stage. Assuming general position, there are d half spaces whose intersection (of the supporting hyperplanes) defines the optimum vertex. If any of these half spaces was the last to be added, then the running time of the last stage required a recursive call taking T (d - 1, i - 1) time. This occurs with probability d/i since each of the existing i half spaces is equally likely to be the last to be added. On the other hand, with probability (i-d)/i the last half space was not one that defines the optimal vertex, and hence the running time is just O(d) for the last stage. The total running time comes by summing over all the n stages. Ignoring the constant factor hidden in the O(d), the expected running time is given by the recurrence: T (1,n) = n T (d,n) = sum(i = 1...n) [ d * (i-d)/i + T(d-1,i-1) * d/i ] We claim that T (d, n) = O(d!n). The proof is by induction on d. This is clearly true in the basis case (d = 1). In general we have T (d,n) = T (d,n) = sum(i = 1...n) [ d * (i-d)/i + T(d-1,i-1) * d/i ] <= dn + sum(i = 1..n) (d/i) * ( (d-1)! * (i-1)) // by ind. hypoth. <= dn + d! * sum(i = 1..n) (i-1)/i // d*(d-1)! = d! <= dn + d!n Oops, this is not quite what we wanted. We wanted T (d,n) <= d!n. (why do we care? dn + d!n is the same order of magnitude as d!n, right? but we are using this bound within an inductive proof, if we don't provide a tight bound, then we have to replace T(d-1,i-1) with O((d-1)! * (i-1)), and then our final bound is the product of a huge number of "order" terms. so if each term has a constant factor k, you get a total time of k^d in front of your final result). However, a more careful (but messier) induction proof will do the job. We leave this as an exercise. Smallest Enclosing Disk: Although the vast majority of applications of linear programming are in relatively high dimensions, there are a number of interesting applications in low dimensions. We will present one very intersting example, called the smallest enclosing disk problem. We are given n points in the plane and we are asked to find the closed circular disk of minimum radius that encloses all of these points. (A disk is defined to be the set of points lying on within a circle.) We will present a randomized algorithm for this problem that runs in O(n) expected time. Before discussing algorithms, let us make some geometric observations. First observe that any circle is uniquely determined by three points (as the circumcenter of the triangle they define). Claim: For any finite set of points in general position (no four cocircular), the smallest enclosing disk either has at least three points on its boundary, or it has two points, and these points form the diameter of the circle. If there are three points then they subdivide the circle bounding the disk into arcs of angle at most Pi. Proof: Clearly if there are no points on the boundary the disk's radius could be decreased. If there is only one point on the boundary then this is also clearly true. If there are two points on the boundary, and they are separated by an arc of length strictly less than Pi, then observe that we can find a disk that passes through both points and has a slightly smaller radius. (By considering a disk whose center point is only the perpendicular bisector of the two points and lies a small distance closer to the line segment joining the points.) Thus, none of these configurations could be a candidate for the minimum enclosing disk. Also observe that if there are three points that define the smallest enclosing disk they subdivide the circle into three arcs each of angle at most Pi (for otherwise we could apply the same operation above). Because points are in general position we may assume there cannot be four or more cocircular points.
Figure 40: Contact points for a minimum enclosing disk. This immediately suggests a simple O(n^4) time algorithm. In O(n^3) time we can enumerate all triples of points and then for each we generate the resulting circle and test whether it encloses all the points in O(n) additional time, for an O(n^4) time algorithm. You might make a few observations to improve this a bit (e.g. by using only triples of points on the convex hull). But even so a reduction from O(n^4) to O(n) is quite dramatic. Linearization: We can ``almost'' reduce this problem to a linear programming problem in 3 space. Although the method does not work, it does illustrate the similarity between this problem and LP. Recall that a point p = (p(x), p(y)) lies within a circle with center point c = (c(x),c(y)) and radius r if (p(x) - c(x))^2 + (p(y) - c(y))^2 <= r(2) In our case we are given n such points p i and are asked to determine whether there exists c(x), c(y) and r satisfying the resulting n inequalities, with r as small as possible. The problem is that these inequalities clearly involve quantities like c(x)^2 and r(2) and so are not linear inequalities in the parameters of interest. We can apply a trick, called linearization to fix this though. First let us expand the inequality above and rearrange the terms p(x)^2 - 2 p(x) c(x) + c(x)^2 + p(y)^2 - 2 p(y) c(y) + c(y)^2 <= r^2 2 p(x) c(x) + 2 p(y) c(y) + (r^2 - c(x)^2 - c(y)^2) >= p(x)^2 + p(y)^2 Now, let us introduce a new parameter R = r^2 - c(x)^2 - c(y)^2 Now we have (2 p(x))c(x) + (2p(y))c(y) + R >= p(x)^2 + p(y)^2 Observe that this is a linear inequality in c(x), c(y) and R. If we let p(x) and p(y) range over all the coordinates of all the n points we generate n linear inequalities in 3 space, and so we can apply linear programming to find the solution, right? The only problem is that the previous objective function was to minimize r. However r is no longer a parameter in the new version of the problem. Since r^2 = R + c(x)^2 + c(y)^2, and minimizing r is equivalent to minimizing r^2 (since we are only interested in positive r), we could say that the objective is to minimize R + c(x)^2 + c(y)^2 Unfortunately, this is not a linear function of the parameters c(x), c(y) and R. Thus we are left with an optimization problem in 3 space with linear constraints and a nonlinear objective function. However this shows that LP is clearly relevant. Randomized Incremental Algorithm: Rather than reducing the problem to LP directly, we will consider how we can modify the randomized incremental algorithm for LP directly to solve this problem. The algorithm will mimic each step of the randomized LP algorithm. To start we randomly permute the points. We select any two points and compute the unique circle with these points as diameter. (We could have started with three just as easily.) Let D(i-1) denote the minimum disk after the insertion of the first i-1 points. For point p(i) we determine in constant time whether the point lies within D(i-1). If so, then we set D(i) = D(i-1) and go on to the next stage. If not, then we need to update the current disk to contain p(i), letting D i denote the result. When the last point is inserted we output Dn . How do we compute this updated disk? It might be tempting at first to say that we just need to compute the smallest disk that encloses p(i) and the three points that define the current disk. However, it is not hard to construct examples in which doing so will cause previously interior points to fall outside the current disk. As with the LP problem we need to take all the existing points into consideration. But as in the LP algorithm we want some way to reduce the ``dimensionality'' of the problem. How do we do this? The important claim is that if p(i) is not in the minimum disk of the first i - 1 points, then p(i) must be on the boundary of the minimum disk that contains the first i points. Claim: If p(i) is not in D(i-1) then p(i) is on the boundary of the minimum enclosing disk for the first i points, D(i). Proof: The proof makes use of the following geometric observation. Given a disk of radius r(1) and a circle of radius r(2) , where r(1) != r(2) , the intersection of the disk with the circle is an arc of angle less than Pi. This is because an arc of angle Pi or more contains two (diametrically opposite) points whose distance from each other is 2r(2) , but the disk of radius r(1) has diameter only 2r(1) and hence could not simultaneously cover two such points. Now, suppose to the contrary that p(i) is not on the boundary of D(i). It is easy to see that because D(i) covers a point not covered by D(i-1) that D(i) must have larger radius than D(i-1). If we let r(1) denote the radius of D(i-1) and r(2) denote the radius of D i , then by the above argument, the disk D(i-1) intersects the circle bounding D(i) in an arc of angle less than Pi. (Shown in a heavy line in the figure below.)
Figure 41: Why p(i) must lie on the boundary of D(i). Since p(i) is not on the boundary of D(i) , the points defining D(i) must be chosen from among the first i - 1 points, from which it follows that they all lie within this arc. However, this would imply that between two of the points is an arc of angle greater than Pi (the arc not shown with a heavy line) which, by the earlier claim could not be a minimum enclosing disk. The algorithm is identical in structure to the LP algorithm --- another example of the powerful technique called Randomized incremental construction. We will randomly permute the points and insert them one by one. For each new point p(i) , if it lies within the current disk then there is nothing to update. Otherwise, we need to update the disk. We do this by computing the smallest enclosing disk that contains all the points {p(1),... ,p(i-1)} and is constrained to have p(i) on its boundary. The requirement that p(i) be on the boundary is equivalent to the constraint used in linear programming that optimum vertex lie on the line supporting the current halfplane. This will involve a slightly different recursion. In this recursion, when we encounter a point that lies outside the current disk, we will then recurse on a subproblem in which two points are constrained to lie on the boundary of the disk. Finally, if this subproblem requires a recursion, we will have a problem in which there are three points constrained to lie on a the boundary of the disk. But this problem is trivial, since there is only one circle passing through three points. Minimum Enclosing Disk MinDisk(P ) : (1) If |P | <= 3, then return the disk passing through these points. (2) Otherwise, randomly permute the points in P . Let D(2) be the minimum disk enclosing {p1, p2} (3) for i = 3 to |P| do (a) if p(i) in D(i-1) then D(i) = D(i-1). (b) else D(i) = MinDiskWith1Pt(P [1..(i - 1)], p(i) ). MinDiskWith1Pt(P;q) : (1) Randomly permute the points in P . Let D(1) be the minimum disk enclosing {q, p(1)}. (2) for i = 2 to |P| do (a) if p(i) in D(i-1) then D(i) = D(i-1). (b) else D(i) = MinDiskWith2Pts(P [1..(i - 1)], q,p(i) ). MinDiskWith2Pts(P,q1,q2) : (1) Randomly permute the points in P . Let D(1) be the minimum disk enclosing {q1,q2}. (2) for i = 1 to |P| do (a) if p(i) in D(i-1) then D(i) = D(i-1). (b) else D(i) = disk(q1,q2,p(i)).
This course is modeled on the Computational Geometry Course taught by Dave Mount, at the University of Maryland. These notes are modifications of his Lecture Notes, which are copyrighted as follows: Copyright, David M. Mount, 2000, Dept. of Computer Science, University of Maryland, College Park, MD, 20742. These lecture notes were prepared by David Mount for the course CMSC 754, Computational Geometry, at the University of Maryland, College Park. Permission to use, copy, modify, and distribute these notes for educational purposes and without fee is hereby granted, provided that this copyright notice appear in all copies.