Lecture 25 (includes summary of 24)

  1. Graph Reprise. A graph G is defined by giving a set of nodes V and a set of edges E (a relation on the nodes). The following notation is the way you define a graph G:
    G(V,E)

    Graphs are perhaps the most important data structure in computer science. For example, the diagrams we drew of relations from a set to itself are one variant of a graph. But so are social networks, dependency graphs and so on.
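One way to make the G(V,E) notation concrete is to store V as a set and E as a set of ordered pairs, then derive an adjacency map from them. This is only a minimal sketch; the helper name `make_graph` and the sample vertices are illustrative assumptions, not part of the lecture.

```python
def make_graph(vertices, edges, directed=True):
    """Build an adjacency map {node: set of neighbours} from V and E.

    `make_graph` is a hypothetical helper name used for illustration.
    """
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        if not directed:
            # An undirected edge is stored in both directions,
            # which makes the edge relation symmetric.
            adj[v].add(u)
    return adj


V = {"a", "b", "c", "d"}
E = {("a", "b"), ("b", "c"), ("c", "a")}
G = make_graph(V, E)                          # directed reading of E
G_sym = make_graph(V, E, directed=False)      # symmetric reading of E
```

Node "d" has no edges at all, which is legal: V and E are given independently.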

  2. Properties of graphs that come up in a large number of problems:
    1. Paths: A path is a list of vertices v_0 ... v_k where every consecutive pair of vertices is in the edge set of the graph.
      • A simple path is one that never uses the same node twice.
      • A cycle is a path whose first and last vertices are the same.
      • A "simple cycle" is, as you can probably guess, a cycle that repeats no node other than the shared first and last one.
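The path definitions above translate almost word for word into checks on a vertex list. The sketch below assumes the edge set is a Python set of ordered pairs and the candidate path is a list; the function names are made up for illustration.

```python
def is_path(edges, vs):
    """Every consecutive pair (v_i, v_{i+1}) must be in the edge set."""
    return all((u, v) in edges for u, v in zip(vs, vs[1:]))


def is_simple_path(edges, vs):
    """A path that never uses the same node twice."""
    return is_path(edges, vs) and len(set(vs)) == len(vs)


def is_cycle(edges, vs):
    """A path (with at least one edge) whose first and last vertices agree."""
    return len(vs) > 1 and is_path(edges, vs) and vs[0] == vs[-1]


def is_simple_cycle(edges, vs):
    """A cycle with no repeated node apart from the shared endpoint.

    Note: with a symmetric edge set this accepts a-b-a; many texts
    additionally require at least three distinct nodes in that case.
    """
    return is_cycle(edges, vs) and len(set(vs[:-1])) == len(vs) - 1


E = {(1, 2), (2, 3), (3, 1)}
```

With this E, the list [1, 2, 3, 1] is a cycle and a simple cycle, but not a simple path, because node 1 appears twice.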
    2. An undirected graph is a graph whose edge set E is symmetric (so that any vertices that are connected are connected both ways).
    3. A connected component is a collection of nodes that are mutually reachable.
      • If you consider the edge set as a relation and compute the reflexive, symmetric and transitive closure of that set, the equivalence classes of the resulting relation are the connected components of the graph.
      • A graph is connected if it has exactly one connected component.
      • For directed graphs, the equivalent concept is a "strongly connected component". Two nodes are in the same "strongly connected component" if there is a directed path in each direction between them.
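The "closure, then equivalence classes" view of connected components can be sketched with a union-find structure: merging the endpoints of each edge builds up the equivalence relation, and the final classes are the components. Union-find is a swapped-in standard technique, not something the lecture prescribes, and the function name is an assumption.

```python
def connected_components(vertices, edges):
    """Return the connected components (edge direction is ignored)."""
    parent = {v: v for v in vertices}   # every node starts in its own class

    def find(v):
        # Follow parent links to the class representative,
        # shortening the chain as we go (path halving).
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)       # merge the two classes

    classes = {}
    for v in vertices:
        classes.setdefault(find(v), set()).add(v)
    return list(classes.values())


comps = connected_components({1, 2, 3, 4, 5}, {(1, 2), (2, 3), (4, 5)})
```

The graph is connected exactly when the returned list has one element. Strongly connected components of a directed graph need a different algorithm, since direction matters there.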
    4. The "degree" of a node is the number of edges that are incident on that node. If the graph is directed, then each node has an "in-degree" and an "out-degree", either or both of which may be zero.
    5. A "tree" is a connected, acyclic graph.
      • There are rooted and unrooted trees. Usually unrooted trees are undirected graphs, and rooted trees are directed graphs, with edges pointing away from the root.
      • Nodes with just one edge are called "leaves".
      • In an undirected tree, there is a unique path between every pair of vertices. (could prove this).
      • Adding any edge creates a cycle.
      • Removing any edge disconnects the graph (creates more than one connected component).
      • If |V| >= 2, then the number of leaves >= 2.
      • |V| = |E| + 1.
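Several of the tree facts above can be checked mechanically. The sketch below relies on the standard fact that a connected undirected graph with |V| - 1 edges is a tree; it assumes each undirected edge is listed once as a pair, and all function names are illustrative.

```python
def degrees(vertices, edges):
    """Number of edges incident on each node (undirected)."""
    deg = {v: 0 for v in vertices}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def is_connected(vertices, edges):
    """Depth-first search from an arbitrary node must reach every node."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(vertices)


def is_tree(vertices, edges):
    """Connected with exactly |V| - 1 edges (hence acyclic)."""
    return (len(vertices) >= 1
            and len(edges) == len(vertices) - 1
            and is_connected(vertices, edges))


def leaves(vertices, edges):
    """Nodes with just one incident edge."""
    return {v for v, d in degrees(vertices, edges).items() if d == 1}


V = [1, 2, 3, 4, 5]
E = [(1, 2), (1, 3), (3, 4), (3, 5)]
```

Adding the edge (2, 4) to E or dropping any edge from E makes `is_tree` fail, matching the bullets on cycles and disconnection.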
    6. The diameter of a graph. If we define the distance between two nodes to be the number of edges on the shortest path between those nodes, then the diameter of a graph is the largest distance between any two nodes. (The term "girth" refers to something different: the length of the shortest cycle.) The diameter of many social networks is much smaller than most people expect.
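The "largest distance between any two nodes" (commonly called the diameter) can be computed for small graphs by running a breadth-first search from every node. This is a rough sketch under the assumption that the graph is given as an adjacency map; the names `bfs_distances` and `largest_distance` are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path distance (in edges) from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def largest_distance(adj):
    """Largest distance between any two nodes; infinite if some node is unreachable."""
    best = 0
    for s in adj:
        dist = bfs_distances(adj, s)
        if len(dist) < len(adj):
            return float("inf")
        best = max(best, max(dist.values()))
    return best


path4 = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}          # 1 - 2 - 3 - 4
cycle4 = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {3, 1}}   # a square
```

One BFS per node costs O(|V| * (|V| + |E|)) overall, which is fine for classroom examples but too slow for a large social network.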
    7. Many problems can be expressed by adding a function w(e): E -> R that is a "weight" or "distance" of an edge, or a function c(v): V -> R that is a "cost" or "price" of a vertex.
  3. Today we are going to talk about 3 special properties.
    1. Graph Isomorphism. Two graphs are isomorphic if you can find a permutation of the nodes of one graph so that the edge set of that graph is the same as the edge set of the other graph. (You can see another version of this definition here).
      • Example problems: Which of the following graphs are isomorphic? What do you need to do to prove two graphs isomorphic? Is it possible for more than one permutation to make something isomorphic?
      • Necessary conditions for isomorphism: same number of nodes, same number of edges, same list of degrees (counting repeats). These conditions are necessary but not sufficient.
      • In practice, it is usually pretty easy to find the isomorphism or prove to yourself that it doesn't exist. But there seem to be pathological examples that make it hard. This is one of a very small set of problems for which we don't really know the complexity.
      • The relation "isomorphic" on the set of graphs is an equivalence relation. How would you prove this?
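For very small undirected graphs, the definition can be tested directly: first apply the cheap necessary conditions, then try every permutation of the nodes. The sketch below is a brute-force illustration only (it takes factorial time) and assumes each undirected edge is listed once; the function names are made up.

```python
from itertools import permutations


def degree_sequence(vertices, edges):
    """Sorted list of node degrees."""
    deg = {v: 0 for v in vertices}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return sorted(deg.values())


def find_isomorphism(v1, e1, v2, e2):
    """Return a node mapping from graph 1 to graph 2, or None if none exists."""
    v1, v2 = list(v1), list(v2)
    # Necessary conditions: cheap to check, rule out many pairs early.
    if len(v1) != len(v2) or len(e1) != len(e2):
        return None
    if degree_sequence(v1, e1) != degree_sequence(v2, e2):
        return None
    target = {frozenset(e) for e in e2}
    # Exhaustive search over all relabelings of the nodes.
    for perm in permutations(v2):
        mapping = dict(zip(v1, perm))
        if {frozenset((mapping[a], mapping[b])) for a, b in e1} == target:
            return mapping
    return None


# A 6-cycle and two disjoint triangles pass every necessary condition
# (6 nodes, 6 edges, all degrees 2) but are not isomorphic.
hexagon = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
triangles = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)]
```

Returning the mapping itself, rather than just True, matches what a proof of isomorphism requires: you exhibit the permutation.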
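    2. Euler Paths and Euler Tours. An Euler path uses every edge of the graph exactly once; an Euler tour is an Euler path that ends at the node where it started.
      • Which of the following graphs have an Euler Path?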
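      • The "Find the Best Approximation to the Euler Path" problem is known as the "Chinese Postman Problem": what is the shortest route that visits every edge at least once? For purely undirected or purely directed graphs this can be solved in polynomial time; the variant on mixed graphs (some edges directed, some not) is NP-hard.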
      • Suppose that you create a graph representing a large software system, where each piece of code is a node, and each potential call is an edge. Then finding a set of test data that is as short as possible and that forces the program to traverse every edge is an important problem.
      • Classic example of an Euler tour. (draw graph)
      • What are necessary conditions for a graph to have an Euler tour? (Undirected: every node has even degree. Directed: every node has in-degree == out-degree. In both cases, all the edges must lie in a single connected component.)
      • Proof that these conditions are sufficient.
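The sufficiency argument is constructive, and one standard way to turn it into code is Hierholzer's algorithm (a named technique brought in here for illustration; the lecture does not specify one). The sketch below handles undirected graphs given as an edge list, possibly with repeated edges, and the function names are assumptions.

```python
def has_euler_tour(vertices, edges):
    """Undirected test: all degrees even, all edges in one component."""
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    if any(len(ns) % 2 for ns in adj.values()):
        return False
    touched = [v for v in vertices if adj[v]]   # nodes with at least one edge
    if not touched:
        return True
    seen, stack = {touched[0]}, [touched[0]]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return all(v in seen for v in touched)


def euler_tour(vertices, edges):
    """Return a closed walk using every edge exactly once, or None."""
    edges = list(edges)
    if not edges or not has_euler_tour(vertices, edges):
        return None
    incident = {v: [] for v in vertices}        # node -> [(edge index, other end)]
    for i, (u, v) in enumerate(edges):
        incident[u].append((i, v))
        incident[v].append((i, u))
    used = [False] * len(edges)
    stack, tour = [edges[0][0]], []
    while stack:
        u = stack[-1]
        # Discard edges already walked from the other endpoint.
        while incident[u] and used[incident[u][-1][0]]:
            incident[u].pop()
        if incident[u]:
            i, w = incident[u].pop()
            used[i] = True
            stack.append(w)                     # keep walking
        else:
            tour.append(stack.pop())            # dead end: emit and back up
    return tour


bowtie = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (5, 3)]
tour = euler_tour([1, 2, 3, 4, 5], bowtie)

# Seven bridges of Koenigsberg: every land mass has odd degree.
bridges = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
           ("A", "D"), ("B", "D"), ("C", "D")]
```

For a directed graph the same idea works with the in-degree == out-degree test and one-directional incidence lists.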
    3. Hamiltonian Paths. A Hamiltonian path is one that visits every node exactly once. This is related to the "Travelling Salesman Problem": what is the shortest total distance that you have to travel in order to visit every node in a graph? Here we assume every edge has a weight of 1 and ask whether the shortest possible path (n-1 edges) exists.
      • Which of the following graphs have a Hamiltonian Path?
      • It turns out that there is no known good algorithm to solve this in general. The problem belongs to a class called NP-complete, which means we don't expect to find an efficient solution. In CSE441, you'll find that you can sometimes get a good approximate solution.
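To see why the problem is hard in practice, here is a minimal backtracking search for a Hamiltonian path. It simply tries every way of extending a path, so its worst-case running time is exponential in the number of nodes. The adjacency-map input and the function name are illustrative assumptions.

```python
def hamiltonian_path(adj):
    """Return a list visiting every node exactly once, or None if none exists."""
    n = len(adj)

    def extend(path, visited):
        if len(path) == n:
            return path                       # every node is on the path
        for w in adj[path[-1]]:
            if w not in visited:
                visited.add(w)
                path.append(w)
                result = extend(path, visited)
                if result is not None:
                    return result
                path.pop()                    # undo the choice and try the next
                visited.remove(w)
        return None

    for start in adj:                         # the path may start anywhere
        found = extend([start], {start})
        if found is not None:
            return found
    return None


path4 = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}       # 1 - 2 - 3 - 4
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}        # centre 0 with three leaves
```

The star has no Hamiltonian path: any path through the centre can touch at most two of the three leaves.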