Lecture 15: Point Location in Trapezoidal Maps

Reading: Chapter 6 of the 4M's.

Point Location: Last time we presented a randomized incremental
algorithm for constructing a trapezoidal map in the plane. Today we
consider how to modify this algorithm to answer point location queries
for the resulting trapezoidal decomposition. The preprocessing time
will be O(n log n) in the expected case (as was the time to construct
the trapezoidal map), and the space and query time will be O(n) and
O(logn), respectively, in the expected case. Note that this may be
applied to any spatial subdivision, by treating it as a set of line
segments, and then building the resulting trapezoidal decomposition
and using this data structure.


Recall from last time that we are treating the input as a set of
segments S = {s1, ...  sn} (permuted randomly),
that Si denotes the subset consisting of the first i
segments of S, and Ti denotes the trapezoidal map of
Si. One important element of the analysis to remember from
last time is that each time we add a new line segment, it may result
in the creation of the collection of new trapezoids, which were said
to depend on this line segment. We presented a backwards analysis that
the number of new trapezoids that are created with each stage is
expected to be O(1). This will play an important role in today's
analysis.


Point Location Data Structure: As in Kirkpatrick's algorithm, the
point location data structure will be based on a rooted directed
acyclic graph. In particular, to the query processor it will look like
a binary tree, but there may be sharing of subtrees. There are two
types of nodes, x nodes and y nodes. Each x node contains the x
coordinate of an endpoint of one of the segments. Its two children
correspond to the points lying to the left and to the right of this
coordinate. Each y node contains a pointer to a segment. The left and
right children correspond to whether the query point is above or
below the line containing the segment, respectively.  (We will visit a y
node only if we already know that the x coordinate of the query point
lies between the left and right endpoints of the segment.)

Our construction of the point location data structure mirrors the
incremental construction of the trapezoidal map.   In particular, if we
freeze the construction just after the insertion of any segment, the
existing structure will be a point location structure for the existing
trapezoidal map. In the figure below we show a simple example of what
the data structure looks like for two line segments (from our
textbook). There is one leaf for each trapezoid. The y nodes are shown
as hexagons. For example, if the query point is in trapezoid D, we
would first detect that it is to the right of p1 , then
left of q1 , then below s1 (the right child),
then right of p2, then above s2 (the left
child).


The question is how do we build this data structure incrementally?
First observe that when a new line segment is added, we only need to
adjust the portion of the tree that involves the

Figure 54: Trapezoidal map point location data structure. trapezoids that have been deleted as a result of this new addition. This will be set of leaves in the current tree. Each such leaf will be replaced with a new subtree, which determines which one of the new trapezoids contains the query point. In Kirkpatrick's algorithm we just said ``check each of the new triangles that overlapped one of the old triangles'', and we will do essentially the same thing here. As in Kirkpatrick's algorithm, it will turn out that each old trapezoid is overlapped by a constant number of new trapezoids. However, let us consider the process in a little more detail here. Suppose that we add a line segment. This results in the replacement of an existing set of trapezoids with a set of new trapezoids (shaded in the figure below). If the segment passes entirely through an existing trapezoid, then there will be two overlapping trapezoids in the new trapezoidal map, and thus we just need to compare against the newly added segment (one y node). If the existing trapezoid contains one or both endpoints, then we need to test on which sides of these endpoints we lie (one or two x nodes) and if we lie between them, then we need to test whether we lie above or below the line segment (one y node). If we add a line segment to the example above, resulting in the replacement of C, D, E and G with new trapezoids, we will replace these leaves of the subtree as shown in the figure below. It is important to notice that (through sharing) each trapezoid appears exactly once as a leaf in the resulting structure.
Figure 55: Line segment insertion. Analysis: We claim that the size of the point location data structure is O(n) and the query time is O(log n), both in the expected case. As usual, the expectation depends only on the order of insertion, not on the line segments or the location of the query point. To prove the space bound of O(n), observe that the number of new nodes added to the structure with each new segment is proportional to the number of newly created trapezoids. Last time we showed that with each new insertion, the expected number of trapezoids that were created was O(1). Therefore, we add O(1) new nodes with each insertion in the expected case, implying that the total size of the data structure is O(n). Analyzing the query time is a little subtler. In a normal probabilistic analysis of data structures, we think of the data structure as being fixed, and then compute expectations over random queries. Here the approach will be to imagine that we have exactly one query point to handle. The query point can be chosen arbitrarily (imagine an adversary that tries to select the worst possible query point) but this choice is made without knowledge of the random choices the algorithm makes. We will show that for any query point, most random orderings of the line segments will lead to a search path of length O(log n) in the resulting tree. Let q denote the query point. Rather than consider the search path for q in the final search structure, we will consider how q moves incrementally through the structure with the addition of each new line segment. Let ti denote the trapezoid of the map that q lies in after the insertion of the first i segments. Observe that if ti-1 = ti, then insertion of the ith segment did not affect the trapezoid that q was in, and therefore q will stay where it is relative to the current search structure. (For example, if q was in trapezoid B prior to adding s 3 in the figure above, then the addition of s 3 does not incur any additional cost to locating q.) However, if ti-1 != ti, then the insertion of the ith segment caused q's trapezoid to be deleted. As a result, q must locate itself with respect to the newly created trapezoids that overlap ti-1. Since there are a constant number of such trapezoids (at most four), there will be O(1) work needed to locate q with respect to these. In particular, q may fall as much as three levels in the search tree. (For example, if q was in trapezoid C, before the addition of s3 in the figure above, and q was in trapezoid J afterwards, then q would have to pass through two new levels of the structure as a result of this insertion.) To compute the expected length of the search path, it suffices to compute the probability that the trapezoid that contains q changes as a result of the ith insertion. Let Pi denote this probability (where the probability is taken over random insertion orders, irrespective of the choice of q). Since q could fall through up to three levels in the search tree as a result of each the insertion, the expected length of q's search path in the final structure is at most Sum(i = 1..n) [3 * Pi] We will show that Pi <= 4/i. From this it will follow that the expected path length is at most Sum(i = 1..n) [3 * Pi] = Sum(i = 1..n) [3 * 4 / i] = 12 * Sum(i = 1..n) [1 / i] = which is roughly 12 ln n = O(logn) by the Harmonic series. To show that Pi <= 4/i, we apply a backwards analysis. In particular, consider the trapezoid that contains q after the ith insertion. Recall from last time that this trapezoid is dependent on at most four segments, which define the top and bottom edges, and the left and right sides of the trapezoid. Since each segment is equally likely to be the last segment to have been added, the probability that the last insertion caused q to belong to a new trapezoid is at most 4/i. This completes the proof. Guarantees on Search Time: The only problem with this result is that even though the search time is provably small in the expected case for a given query point, it might still be the case that once the data structure has been constructed there is a single very long path in the search structure, and the user repeatedly performs queries along this path. The result provides no guarantees on the running time of all queries. Although we will not prove it, the book presents a stronger result, namely that the length of the maximum search path is also O(logn) with high probability. In particular, they prove the following. Lemma: Given a set of n non crossing line segments in the plane, and a parameter L >= 0, the probability that the total depth of the randomized search structure exceeds 3L * ln(n + 1), is at most 2 ------------------------- [ (n+1)^(L * ln 1.25-3)] For example, for L = 20, the probability that the search path exceeds 60 ln(n + 1) is at most 2/[(n + 1)^1.5]. (The constant factors here are rather weak, but a more careful analysis leads to a better bound.) Nonetheless, this itself is enough to lead to variant of the algorithm for which O(log n) time is guaranteed. Rather than just running the algorithm once and taking what it gives, instead keep running it and checking the structure's depth. As soon as the depth is at most c log n for some suitably chosen c, then stop here. Depending on c and n, the above lemma indicates how long you may need to expect to repeat this process until the final structure has the desired depth. For sufficiently large c, the probability of finding a tree of the desired depth will be bounded away from 0 by some constant factor, and therefore after a constant number of trials (depending on this probability) you will eventually succeed in finding a point location structure of the desired depth. A similar argument can be applied to the space bounds. Theorem: Given a set of n non crossing line segments in the plane, in expected O(n log n) time, it is possible to construct a point location data structure of (worst case) size O(n) that can answer point location queries in (worst case) time O(log n).
This course is modeled on the Computational Geometry Course taught by Dave Mount, at the University of Maryland. These notes are modifications of his Lecture Notes, which are copyrighted as follows: Copyright, David M. Mount, 2000, Dept. of Computer Science, University of Maryland, College Park, MD, 20742. These lecture notes were prepared by David Mount for the course CMSC 754, Computational Geometry, at the University of Maryland, College Park. Permission to use, copy, modify, and distribute these notes for educational purposes and without fee is hereby granted, provided that this copyright notice appear in all copies.