Lecture 4: Line Segment Intersection

This lecture makes use of several important data structures:

  1. Balanced Binary Trees, which support operations:
  2. Skip Lists, which are easier to implement, and give probabilistic guarantees:
  3. Heaps (priority queues), which support

Geometric intersections: One of the most basic problems in computational geometry is that of computing intersections. Intersection computation in 2- and 3-dimensional space is basic to many different application areas.

Most complex intersection problems are broken down to successively simpler and simpler intersection problems. Today, we will discuss the most basic algorithm, upon which most complex algorithms are based.

The line segment intersection problem is: given n line segments in the plane, report all points where a pair of line segments intersect.

We assume that each line segment is represented by giving the coordinates of its two endpoints. Observe that n line segments can intersect in as few as 0 and as many as (n choose 2) = O(n^2) different intersection points. We could settle for an O(n^2) algorithm, claiming that it is asymptotically optimal in the worst case, but it would not be very useful in practice, since in many instances of intersection problems intersections are rare. It therefore seems reasonable to look for an output-sensitive algorithm, that is, one whose running time is efficient with respect to both input and output size.
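For comparison, the quadratic baseline mentioned above can be sketched as follows. This is only an illustrative sketch in Python: the names orient, properly_intersect and brute_force_pairs are our own, segments are assumed to be pairs of (x, y) tuples, and general position is assumed, so only proper crossings are detected.

```python
from itertools import combinations

def orient(a, b, c):
    """Sign of the signed area of triangle abc: +1 left turn, -1 right turn, 0 collinear."""
    d = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (d > 0) - (d < 0)

def properly_intersect(s, t):
    """True if segments s and t cross at a single point interior to both."""
    a, b = s
    c, d = t
    return (orient(a, b, c) * orient(a, b, d) < 0 and
            orient(c, d, a) * orient(c, d, b) < 0)

def brute_force_pairs(segments):
    """Test all (n choose 2) pairs; return the index pairs of crossing segments."""
    return [(i, j)
            for (i, s), (j, t) in combinations(enumerate(segments), 2)
            if properly_intersect(s, t)]
```

The loop always performs (n choose 2) tests, even when no two segments intersect, which is exactly the behavior an output-sensitive algorithm tries to avoid.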

Let I denote the number of intersections. We will assume that the line segments are in general position, and so we will not be concerned with the issue of whether three or more segments intersect in a single point. However, generalizing the algorithm to handle such degeneracies efficiently is an interesting exercise.

Complexity: We claim that the best worst-case running time that one might hope for is O(n log n + I). Clearly we need Omega(I) time to output the intersection points. What is not so obvious is that Omega(n log n) time is needed. This results from the fact that the following problem is known to require Omega(n log n) time in the algebraic decision tree model of computation.

Element uniqueness: Given a list of n real numbers, are all of these numbers distinct? (That is, are there no duplicates.)

REDUCTION:

Given a list of n numbers (x1, x2, ..., xn), in O(n) time we can construct a set of n vertical line segments, the i-th at x coordinate xi, all spanning the same range of y coordinates. Observe that if the numbers are distinct, then there are no intersections, and otherwise there is at least one intersection. Thus, if we could detect intersections in o(n log n) time (meaning strictly faster than Theta(n log n) time), then we could solve element uniqueness in o(n log n) time. However, this would contradict the lower bound on element uniqueness.
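The reduction can be sketched in a few lines of Python. The function names here are hypothetical, and any_intersection_vertical is only a stand-in detector that is valid for the special inputs the reduction produces (vertical segments sharing one y range); in the actual argument it would be replaced by whatever general intersection-detection algorithm is being considered.

```python
def uniqueness_to_segments(xs, height=1):
    """O(n) construction: number x becomes the vertical segment (x, 0)-(x, height)."""
    return [((x, 0), (x, height)) for x in xs]

def any_intersection_vertical(segments):
    """Stand-in detector (assumption): two vertical segments with the same
    y range intersect exactly when their x coordinates coincide."""
    xs = sorted(a[0] for a, _ in segments)
    return any(u == v for u, v in zip(xs, xs[1:]))

def all_distinct(xs):
    """Element uniqueness answered through intersection detection."""
    return not any_intersection_vertical(uniqueness_to_segments(xs))
```

For example, all_distinct([3, 1, 2]) holds, while all_distinct([3, 1, 3]) does not, because the first and third segments coincide.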

* Note: this lower bound assumes the algebraic decision tree model of computation, in which all decisions are comparisons based on exact algebraic operations (+, -, *, /) applied to numeric inputs. Although this includes most "normal" geometric operations, there are alternative models. For example, by taking mods, floors, or ceilings, you can implement a hashing function that checks for element uniqueness in expected O(n) time. Even in such a model, however, no one knows an algorithm better than O(n log n + I).
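To illustrate the hashing remark, here is a minimal Python sketch (the function name is our own). It relies on Python's built-in set, which is hash based, so the expected running time is O(n) under the usual assumptions about hashing; this is outside the algebraic decision tree model.

```python
def all_distinct_hashed(xs):
    """Element uniqueness in expected O(n) time using a hash set."""
    seen = set()
    for x in xs:
        if x in seen:       # expected O(1) membership test
            return False
        seen.add(x)
    return True
```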

We will present an O(n log n + I log n) time algorithm for the line segment intersection problem. This is not quite optimal; later in the semester we will discuss an optimal randomized O(n log n + I) time algorithm for this problem.

Line segment intersection: After this, we will ignore the issue of how to determine the intersection point of two line segments, but it is important to see that it isn't too difficult. Let ab and cd be two line segments, given by their endpoints. It is an easy exercise to determine whether these line segments intersect, simply by applying an appropriate combination of orientation tests (we did this in our first class!). But determining the coordinates of the intersection point involves solving a small system of equations. The most natural way to set up this computation is to introduce the notion of a parametric representation of the line segment. Recall that any point on the line segment ab can be written as a convex combination involving a real parameter s:

p(s) = (1 - s)a + sb for 0 <= s <= 1.
Similarly for cd we may introduce a parameter t:
q(t) = (1 - t)c + td for 0 <= t <= 1.
An intersection occurs if and only if we can find s and t in the desired ranges such that p(s) = q(t). Thus we get the two equations:
(1 - s)ax + s bx = (1 - t) cx + t dx
(1 - s)ay + s by = (1 - t) cy + t dy 
The coordinates of the points are all known, so it is just a simple exercise in linear algebra to solve for s and t. The computation of s and t will involve a division. If the divisor is 0, this corresponds to the case where the line segments are parallel (and possibly collinear). These special cases should be dealt with carefully. If the divisor is nonzero and the input coordinates are integers, then we get values for s and t as rational numbers (the ratio of two integers). We can approximate them as floating point numbers, or if we want to perform exact computations it is possible to simulate rational number algebra exactly using high precision integers (and multiplying through by least common multiples). Once the values of s and t have been computed, all that is needed is to check that both are in the interval [0, 1].
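As a sketch of this computation, the following Python function solves the two equations for s and t using cross products and exact rational arithmetic from the standard fractions module. The function names are our own, integer (or Fraction) input coordinates are assumed, and the parallel and collinear cases are simply reported as "no intersection" rather than handled carefully.

```python
from fractions import Fraction

def cross(u, v):
    """2D cross product u_x v_y - u_y v_x."""
    return u[0] * v[1] - u[1] * v[0]

def segment_intersection(a, b, c, d):
    """Intersection point of segments ab and cd, or None.

    Rewrites p(s) = q(t) as s(b - a) - t(d - c) = c - a and solves for s, t.
    """
    r = (b[0] - a[0], b[1] - a[1])      # direction of ab
    q = (d[0] - c[0], d[1] - c[1])      # direction of cd
    w = (c[0] - a[0], c[1] - a[1])
    denom = cross(r, q)                 # the divisor mentioned in the text
    if denom == 0:
        return None                     # parallel or collinear: not handled here
    s = Fraction(cross(w, q), denom)
    t = Fraction(cross(w, r), denom)
    if 0 <= s <= 1 and 0 <= t <= 1:
        return (a[0] + s * r[0], a[1] + s * r[1])   # p(s), exact
    return None
```

For instance, the diagonals (0,0)-(2,2) and (0,2)-(2,0) give s = t = 1/2 and the point (1, 1). Using Fraction keeps the result exact; replacing it with floating-point division would give the approximate variant.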

Plane Sweep Algorithm: Let S = (s1, s2, ..., sn) denote the line segments whose intersections we wish to compute. The method is called plane sweep. Here are the main elements of any plane sweep algorithm, and how we will apply them to this problem:

For this application, event points will correspond to the following:

Event updates: When an event is encountered, we must update the data structures associated with the event. In well-designed plane-sweep algorithms, these updates ensure that the data structures maintain some invariant. It is a good idea to specify exactly which invariants you intend to maintain. For this plane-sweep algorithm, our data structures maintain: (1) a sorted list of all segments that intersect the current sweep line, and (2) the event queue contains the intersection point of every pair of segments that are neighbors along the sweep line (when that point lies to the right of the sweep line). Thus, to maintain these invariants, when we encounter an intersection point, we must interchange the order of the intersecting line segments along the sweep line, and check whether any new events need to be added.

There are a great number of nasty special cases that complicate the algorithm and obscure the main points. We will make a number of simplifying assumptions. They can be overcome through a more careful handling of these cases.

Detecting intersections: We mentioned that endpoint events are all known in advance. But how do we detect intersection events? It is important that each event be detected before the actual intersection event occurs. Our strategy will be as follows. Whenever two line segments become adjacent along the sweep line, we will check whether they have an intersection occurring to the right of the sweep line. If so, we will add this new event (assuming that it has not already been added). A natural question is whether this is sufficient. In particular, if two line segments do intersect, is there necessarily some prior placement of the sweep line such that they are adjacent? Happily, this is the case, but it requires a proof.

Figure 17: Plane sweep.

Lemma: Given two segments si and sj, which intersect in a single point p (and assuming no other line segment passes through this point), there is a placement of the sweep line prior to this event such that si and sj are adjacent along the sweep line (and hence will be tested for intersection).

Proof: From our general position assumption it follows that no three segments intersect in a common point. Therefore, if we consider a placement of the sweep line that is infinitesimally to the left of the intersection point, segments si and sj will be adjacent along this sweep line. Consider the event point q with the largest x coordinate that is strictly less than px. The order of segments along the sweep line after processing q will be identical to the order of the segments along the sweep line just prior to p, and hence si and sj will be adjacent at this point.


Figure 18: Correctness of plane sweep.

Data structures: In order to perform the sweep we will need two data structures.

The Complete Algorithm: We can now present the complete plane sweep algorithm.

Plane Sweep Algorithm for Line Segment Intersection

Intersect(S) :

  1. Initially, we insert all of the endpoints of the line segments of S into the event queue. The initial sweep status is empty.
  2. While the event queue is nonempty, extract the next event in the queue. There are three cases, depending on the type of event:
    1. Segment left endpoint: Insert this line segment into the sweep line status, based on the y coordinate of this endpoint and the y coordinates of the other segments currently along the sweep line. Test for intersections with the segment immediately above and below.
    2. Segment right endpoint: Delete this line segment from the sweep line status. For the entries immediately preceding and succeeding this entry, test them for intersections.
    3. Intersection point: Swap the two line segments in order along the sweep line. For the new upper segment, test it against its predecessor for an intersection. For the new lower segment, test it against its successor for an intersection.
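The steps above can be sketched in Python as follows. This is a simplified illustration, not a tuned implementation, and it rests on several assumptions of our own: coordinates are integers, no segment is vertical, all event points have distinct x coordinates, and no three segments meet in a point. The event queue is a binary heap (heapq), and the sweep-line status is a plain Python list ordered bottom to top, so each status operation costs O(n) here; a balanced binary tree or skip list would be needed to obtain O(log n) per operation. All names (intersect, y_at, crossing, test) are hypothetical.

```python
import heapq
from fractions import Fraction

LEFT, RIGHT, CROSS = 0, 1, 2

def y_at(seg, x):
    """Exact y coordinate of a non-vertical segment at abscissa x."""
    (x1, y1), (x2, y2) = seg
    return y1 + Fraction(y2 - y1, x2 - x1) * (x - x1)

def crossing(s, t):
    """Intersection point of segments s and t, or None (parallel pairs skipped)."""
    (a, b), (c, d) = s, t
    rx, ry = b[0] - a[0], b[1] - a[1]
    qx, qy = d[0] - c[0], d[1] - c[1]
    wx, wy = c[0] - a[0], c[1] - a[1]
    denom = rx * qy - ry * qx
    if denom == 0:
        return None
    u = Fraction(wx * qy - wy * qx, denom)
    v = Fraction(wx * ry - wy * rx, denom)
    if 0 <= u <= 1 and 0 <= v <= 1:
        return (a[0] + u * rx, a[1] + u * ry)
    return None

def intersect(segments):
    """Return a list of (point, (i, j)) for each pair of crossing segments."""
    segs = [tuple(sorted(s)) for s in segments]       # left endpoint first
    events = []                                       # heap ordered by x
    for i, (p, q) in enumerate(segs):
        heapq.heappush(events, (p[0], LEFT, i, -1, None))
        heapq.heappush(events, (q[0], RIGHT, i, -1, None))
    status, scheduled, found = [], set(), []          # status: bottom to top

    def test(k, x):
        """Schedule the crossing of status[k], status[k+1] if it lies right of x."""
        if k < 0 or k + 1 >= len(status):
            return
        i, j = sorted((status[k], status[k + 1]))
        p = crossing(segs[i], segs[j])
        if p is not None and p[0] > x and (i, j) not in scheduled:
            scheduled.add((i, j))                     # no duplicate events
            heapq.heappush(events, (p[0], CROSS, i, j, p))

    while events:
        x, kind, i, j, p = heapq.heappop(events)
        if kind == LEFT:                              # insert, test both neighbors
            y = segs[i][0][1]
            k = sum(1 for m in status if y_at(segs[m], x) < y)
            status.insert(k, i)
            test(k - 1, x)
            test(k, x)
        elif kind == RIGHT:                           # delete, test new neighbors
            k = status.index(i)
            del status[k]
            test(k - 1, x)
        else:                                         # swap, test outer neighbors
            found.append((p, (i, j)))
            ki, kj = status.index(i), status.index(j)
            status[ki], status[kj] = status[kj], status[ki]
            lo = min(ki, kj)
            test(lo - 1, x)
            test(lo + 1, x)
    return found
```

The set named scheduled plays the role of the "has not already been added" check: a pair of segments can become adjacent more than once before its intersection point is reached, but its event is enqueued only the first time.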

Types of Events

Analysis: The work done by the algorithm is dominated by the time spent updating the various data structures (since otherwise we spend only constant time per sweep event). We need to count two things: the number of operations applied to each data structure and the amount of time needed to process each operation. For the sweep line status, there are at most n elements intersecting the sweep line at any time, and therefore the time needed to perform any single operation is O(log n), from standard results on balanced binary trees. Since we do not allow duplicate events to exist in the event queue, the total number of elements in the queue at any time is at most 2n + I. Since we use a balanced binary tree to store the event queue, each operation takes time at most logarithmic in the size of the queue, which is O(log(2n + I)). Since I <= n^2, this is at most O(log(n^2)) = O(2 log n) = O(log n) time.

Each event involves a constant number of accesses or operations to the sweep status or the event queue, and since each such operation takes O(log n) time from the previous paragraph, it follows that the total time spent processing all the events from the sweep line is

O((2n + I) log n) = O((n + I) log n) = O(n log n + I log n).

Thus, this is the total running time of the plane sweep algorithm.

A very nice applet visualizing this algorithm was developed by Tristan Carvelho (2005) and Hang Thi Anh Pham (2001).


This course is modeled on the Computational Geometry Course taught by Dave Mount, at the University of Maryland. These notes are modifications of his Lecture Notes, which are copyrighted as follows: Copyright, David M. Mount, 2000, Dept. of Computer Science, University of Maryland, College Park, MD, 20742. These lecture notes were prepared by David Mount for the course CMSC 754, Computational Geometry, at the University of Maryland, College Park. Permission to use, copy, modify, and distribute these notes for educational purposes and without fee is hereby granted, provided that this copyright notice appear in all copies.