Lecture 15: Planar Point Location Today's material is not directly covered in our text, but is a data structure for the set of questions asked in Chapter 6. Point Location: Today we consider the point location problem. The problem (in 2 space) is: given a polygonal subdivision of the plane with n vertices, preprocess this subdivision so that given a query point q, we can efficiently determine which face of the subdivision contains q. We may assume that each face has some identifying label, which is to be returned. We also assume that the subdivision is represented in any ``reasonable'' form, e.g. as a DCEL. In general q may coincide with an edge or vertex. To simplify matters, we will assume that q does not lie on an edge or vertex, but these special cases are not hard to handle. It is remarkable that although this seems like such a simple and natural problem, it took quite a long time to discover a method that is optimal with respect to both query time and space. It has long been known that there are data structures that can perform these searches reasonably well (e.g. quad trees and kd trees), but for which no good theoretical bounds could be proved. There were data structures of with O(logn) query time but O(n log n) space, and O(n) space but O((log n)^2) query time. The first construction to achieve both O(n) space and O(logn) query time was a remarkably clever construction due to Kirkpatrick. It turns out that Kirkpatrick's idea has some large embedded constant factors that make it less attractive practically, but the idea is so clever that it is worth discussing, nonetheless. Later we will discuss a more practical randomized method that is presented in our text. Kirkpatrick's Algorithm: Kirkpatrick's idea starts with the assumption that the planar subdivision is a triangulation, and further that the outer face is a triangle. If this assumption is not met, then we begin by triangulating all the faces of the subdivision. The label associated with each triangular face is the same as a label for the original face that contained it. For the case when the outer face is not a triangle, first compute the convex hull of the polygonal subdivision, triangulate everything inside the convex hull. Then surround this convex polygon with a large triangle (call the vertices a, b, and c), and then add edges from the convex hull to the vertices of the convex hull. It may sound like we are adding a lot of new edges to the subdivision, but recall from earlier in the semester that the number of edges and faces in any straight line planar subdivision is proportional to n, the number of vertices. Thus the addition only increases the size of the structure by a constant factor. Note that once we find the triangle containing the query point in the augmented graph, then we will know the original face that contains the query point. The triangulation process can be performed in O(n log n) time by a plane sweep of the graph, or in O(n) time if you want to use sophisticated methods like the linear time polygon triangulation algorithm. In practice, many straight line subdivisions, may already have convex faces and these can be triangulated easily in O(n) time.Figure 48: Triangulation of a planar subdivision. Let T(0) denote the initial triangulation. What Kirkpatrick's method does is to produce a sequence of triangulations, T(0), T(1), T(2), ... T(k), where k = O(logn), such that T(k) consists only of a single triangle (the exterior face of T(0)), and each triangle in T(i+1) overlaps a constant number of triangles in T(i) . We will see how to use such a structure for point location queries later, but for now let us concentrate on how to build such a sequence of triangulations. Assuming that we have T(i), we wish to compute T(i+1). In order to guarantee that this process will terminate after O(log n) stages, we will want to make sure that the number of vertices in T(i+1) decreases by some constant factor from the number of vertices in T(i). In particular, this will be done by carefully selecting a subset of vertices of T(i) and deleting them (and along with them, all the edges attached to them). After these vertices have been deleted, we need retriangulate the resulting graph to form T(i+1). The question is: How do we select the vertices of T(i) to delete, so that each triangle of T(i+1) overlaps only a constant number of triangles in T(i)? There are two things that Kirkpatrick observed at this point, that make the whole scheme work. Constant degree: We will make sure that each of the vertices that we delete have constant (<= d) degree (that is, each is adjacent to at most d edges). Note that the when we delete such a vertex, the resulting hole will consist of at most d-2 triangles. When we retriangulate, each of the new triangles, can overlap at most d triangles in the previous triangulation. Independent set: We will make sure that no two of the vertices that are deleted are adjacent to each other, that is, the vertices to be deleted form an independent set in the current planar graph T(i). This will make retriangulation easier, because when we remove m independent vertices (and their incident edges), we create m independent holes (non triangular faces) in the subdivision, which we will have to retriangulate. However, each of these holes can be triangulated independently of one another. (Since each hole contains a constant number of vertices, we can use any triangulation algorithm, even brute force, since the running time will be O(1) in any case.) An important question to the success of this idea is whether we can always find a sufficiently large independent set of vertices with bounded degree. We want the size of this set to be at least a constant fraction of the current number of vertices. Fortunately, the answer is ``yes,'' and in fact it is quite easy to find such a subset. Part of the trick is to pick the value of d to be large enough (too small and there may not be enough of them). It turns out that d = 8 is good enough. Lemma: Given a planar graph with n vertices, there is an independent set consisting of vertices of degree at most 8, with at least n/18 vertices. This independent set can be constructed in O(n) time. We will present the proof of this lemma later. The number 18 seems rather large. The number is probably smaller in practice, but this is the best bound that this proof generates. However, the size of this constant is one of the reasons that Kirkpatrick's algorithm is not used in practice. But the construction is quite clever, nonetheless, and once a optimal solution is known to a problem it is often not long before a practical optimal solution follows. Kirkpatrick Structure: Assuming the above lemma, let us give the description of how the point location data structure, the Kirkpatrick structure, is constructed. We start with T(0), and repeatedly select an independent set of vertices of degree at most 8. We never include the three vertices a, b, and c (forming the outer face) in such an independent set. We delete the vertices from the independent set from the graph, and retriangulate the resulting holes. Observe that each triangle in the new triangulation can overlap at most 8 triangles in the previous triangulation. Since we can eliminate a constant fraction of vertices with each stage, after O(logn) stages, we will be down to the last 3 vertices. The constant factors here are not so great. With each stage, the number of vertices falls by a factor of 17/18. To reduce to the final three vertices, implies that (18/17)^k = n or that k = log (base 18/17 n) ~= 12 lg n It can be shown that by always selecting the vertex of smallest degree, this can be reduced to a more palatable 4.5 lg n. The data structure is based on this decomposition. The root of the structure corresponds to the single triangle of T(k). The nodes at the next lower level are the triangles of T(k-1), followed by T(k-2), until we reach the leaves, which are the triangles of our initial triangulation, T(0). Each node for a triangle in triangulation T(i+1), stores pointers to all the triangles it overlaps in T(i) (there are at most 8 of these). Note that this structure is a directed acyclic graph (DAG) and not a tree, because one triangle may have many parents in the data structure. This is shown in the following figure. To locate a point, we start with the root, T(k). If the query point does not lie within this single triangle, then we are done (it lies in the exterior face). Otherwise, we search each of the (at most 8) triangles in T(k-1) that overlap this triangle. When we find the correct one, we search each of the triangles in T(k-2) that overlap this triangles, and so forth. Eventually we will find the triangle containing the query point in the last triangulation, T(0), and this is the desired output. See the figure below for an example. Construction and Analysis: The structure has O(logn) levels (one for each triangulation), it takes a constant amount of time to move from one level to the next (at most 8 point in triangle tests), thus the total query time is O(log n). The size of the data structure is the sum of sizes of the triangulations. Since the number of triangles in a triangulation is proportional to the number of vertices, it follows that the size is proportional ton * (1 + 17/18 + (17/18)^2 + (17/18)^3 + ...) <= 18n (using standard formulas for geometric series). Thus the data structure size is O(n) (again, with a pretty hefty constant).Figure 49: Kirkpatrick's point location structure. Figure 50: Point location search. The last thing that remains is to show how to construct the independent set of the appropriate size. We first present the algorithm for finding the independent set, and then prove the bound on its size. (1) Mark all nodes of degree >= 9. (2) While there exists an unmarked node do the following: (a) Choose an unmarked vertex v. (b) Add v to the independent set. (c) Mark v and all of its neighbors. It is easy to see that the algorithm runs in O(n) time (e.g., by keeping unmarked vertices in a stack and representing the triangulation so that neighbors can be found quickly.) Intuitively, the argument that there exists a large independent set of low degree is based on the following simple observations. First, because the average degree in a planar graph is less than 6, there must be a lot of vertices of degree at most 8 (otherwise the average would be unattainable). Second, whenever we add one of these vertices to our independent set, only 8 other vertices become ineligible for inclusion in the independent set. Here is the rigorous argument. Recall from Euler's formula, that if a planar graph is fully triangulated, then the number of edges e satisfies e = 3n-6. If we sum the degrees of all the vertices, then each edge is counted twice. Thus the average degree of the graph is sum(over all v) [deg(v)] = 2e = 6n - 12 < 6n: Next, we claim that there must be at least n/2 vertices of degree 8 or less. To see why, suppose to the contrary that there were more than n/2 vertices of degree 9 or greater. The remaining vertices must have degree at least 3 (with the possible exception of the 3 vertices on the outer face), and thus the sum of all degrees in the graph would have to be at least as large as 9 (n/2) + 3 (n/2) = 6n which contradicts the equation above. Now, when the above algorithm starts execution, at least n/2 vertices are initially unmarked. Whenever we select such a vertex, because its degree is 8 or fewer, we mark at most 9 new vertices (this node and at most 8 of its neighbors). Thus, this step can be repeated at least (n/2)/9 = n/18 times before we run out of unmarked vertices. This completes the proof.
This course is modeled on the Computational Geometry Course taught by Dave Mount, at the University of Maryland. These notes are modifications of his Lecture Notes, which are copyrighted as follows: Copyright, David M. Mount, 2000, Dept. of Computer Science, University of Maryland, College Park, MD, 20742. These lecture notes were prepared by David Mount for the course CMSC 754, Computational Geometry, at the University of Maryland, College Park. Permission to use, copy, modify, and distribute these notes for educational purposes and without fee is hereby granted, provided that this copyright notice appear in all copies.