Assignment One. Due Tuesday, January 31.

This assignment has two parts. The first part is a problem set involving a little bit of linear algebra, and is to be done by yourself. The second part is a programming assignment, which you can do in whatever language you want, by yourself or in pairs. There is a 10% per day penalty for anything not turned in on time.

Problem Set, to be turned in during class on Tuesday, January 31.

  1. Suppose you have a normalized camera, with the "nodal point" at (0,0,0), and the virtual image plane as the z=1 plane. You are given two points on an image, p1=(x1,y1) and p2=(x2,y2).
    1. Write down an equation that describes all 3D points that could project to p1.
    2. Write down an equation for the line on the image that connects p1 and p2.
    3. Write down an equation for the set of all 3D points that project onto the image on the line between p1 and p2.
    4. Using homogeneous coordinates to represent points on the image plane, solve part 2 again with the line in the form "all p such that l.p = 0", where you express l in terms of p1 and p2.
  2. One of the key problems that we will face occurs when coordinate systems change (for instance, when the center of the camera moves). A coordinate system change is a "rigid transformation": the coordinates may both rotate and translate. Here we are going to explore how the order of operations matters. For this problem, in each case, you can *either* solve the problem algebraically, *or* find specific values for R, T, P (that is, give me an example 3 by 3 matrix R with real numbers in it, and real 3 by 1 vectors T, P) that satisfy the conditions.
    1. Prove that sometimes the order matters. Give a rotation matrix R, a translation vector T, and a 3D point P, such that:
      R P + T != R(P + T).
    2. Prove that for some points the order doesn't matter. For a given rotation matrix R and translation T, find a set of points such that
      R P + T = R(P + T).
      You may do this in general, i.e., for an arbitrary R, T, or for a specific R, T.
    3. Prove that for some motions, the order doesn't matter. Find a non-identity rotation matrix R and a non-zero translation vector T such that the order doesn't matter. That is, for these cases, the following holds for all P:
      R P + T = R(P + T).
      You may do this in general, i.e., for an arbitrary R, T, or for a specific R, T.
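
If you take the "specific values" route, a small checker can save arithmetic mistakes. The following is a minimal sketch in plain Python, not a required tool: the function names (`project`, `same_result`, `rotation_about_z`, and the helpers) are made up for illustration, and the tolerance `tol` is an assumed value for comparing floating-point results. `project` simply applies the normalized-camera setup from problem 1 (nodal point at the origin, image plane at z=1).

```python
import math

def project(point):
    """Project a 3D point (X, Y, Z) onto the z = 1 image plane."""
    X, Y, Z = point
    if Z == 0:
        raise ValueError("points with Z = 0 have no image on the z = 1 plane")
    return (X / Z, Y / Z)

def matvec(R, v):
    """Multiply a 3x3 matrix (given as a list of rows) by a 3-vector."""
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]

def add(u, v):
    """Add two 3-vectors component by component."""
    return [a + b for a, b in zip(u, v)]

def rotate_then_translate(R, T, P):
    """Compute R P + T."""
    return add(matvec(R, P), T)

def translate_then_rotate(R, T, P):
    """Compute R (P + T)."""
    return matvec(R, add(P, T))

def same_result(R, T, P, tol=1e-9):
    """Return True if both orders agree to within tol (an assumed tolerance)."""
    a = rotate_then_translate(R, T, P)
    b = translate_then_rotate(R, T, P)
    return all(abs(x - y) <= tol for x, y in zip(a, b))

def rotation_about_z(theta):
    """One convenient family of rotation matrices: rotation by theta about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
```

Calling `same_result(R, T, P)` with your candidate R, T, and P tells you whether the two orders agree for that one choice. A numerical check like this only confirms an example; it does not replace the algebraic argument where you choose to give one.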

Programming. The goal of this part is to write a vision "hello world" program that gets you past the mechanics of opening an image, getting access to the pixels, changing them, and writing out an image. You may find a very brief 'tutorial' on MATLAB by following this link.

    Our hello world program will be an "eye detector". You are to write a program that reads in an image, makes a determination about the location of the eyes, and writes out an image with the position of the eyes somehow marked. This mark should be visible to a casual observer of the image (i.e., don't just change one pixel to be white; that might be hard to see). Drawing a small circle or box around each eye, or shading a region of pixels to be noticeably redder, might be appropriate. For this project, I don't care very much how well your program works or how general it is; it is fine if it works on just the one or two images that you present. You need to turn in a web page with:
  1. Your name(s). You may work in groups of up to two on this project.
  2. Three sample original images, and your detection results. You should find your own interesting images, perhaps with an images.google.com search for "someone famous". At least one detection result must be close to correct, and at least one must be wrong.
  3. A pseudo-code, but explicit, description of your algorithm, starting from the point where you have access to your array of pixels (about 10 steps?).
  4. A paragraph describing the types of images (in terms of their content) that your algorithm might work on, what images it won't work on, and why you couldn't have fixed your algorithm to work on such images in less than 15 minutes (i.e., what is fundamentally hard about that problem).
This assignment will be graded largely on the explanation given for item 4, not on how successful you are at detecting eyes in images.
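
As a rough illustration of the mechanics only (read pixels, change them, write them back), here is a minimal sketch in plain Python. It rests on several assumptions that are not part of the assignment: images are stored as 8-bit binary PPM files (chosen only because that format can be handled without any library, so JPEG or PNG images would first need converting or an image library), the function names are made up, and the eye positions passed to `draw_box` are typed in by hand. The detection step itself is deliberately left out.

```python
def read_ppm(path):
    """Read a binary PPM (P6, maxval 255, no header comments) as rows of [r, g, b]."""
    with open(path, "rb") as f:
        data = f.read()
    fields, pos = [], 0
    while len(fields) < 4:  # magic number, width, height, maxval
        while data[pos:pos + 1].isspace():
            pos += 1
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        fields.append(data[start:pos])
    pos += 1  # a single whitespace byte ends the header
    if fields[0] != b"P6" or fields[3] != b"255":
        raise ValueError("only 8-bit binary PPM (P6) is handled here")
    width, height = int(fields[1]), int(fields[2])
    return [[list(data[pos + 3 * (y * width + x): pos + 3 * (y * width + x) + 3])
             for x in range(width)] for y in range(height)]

def write_ppm(path, pixels):
    """Write rows of [r, g, b] values (0..255) as a binary PPM file."""
    height, width = len(pixels), len(pixels[0])
    with open(path, "wb") as f:
        f.write(f"P6\n{width} {height}\n255\n".encode("ascii"))
        f.write(bytes(v for row in pixels for px in row for v in px))

def draw_box(pixels, cx, cy, half=6, color=(255, 0, 0)):
    """Draw a square outline centered at column cx, row cy, clipped to the image."""
    height, width = len(pixels), len(pixels[0])
    for y in range(cy - half, cy + half + 1):
        for x in range(cx - half, cx + half + 1):
            on_edge = abs(y - cy) == half or abs(x - cx) == half
            if on_edge and 0 <= y < height and 0 <= x < width:
                pixels[y][x] = list(color)
```

A typical use would be `pixels = read_ppm("face.ppm")`, then one `draw_box(pixels, cx, cy)` call per detected eye, then `write_ppm("face_marked.ppm", pixels)`, where the file names are placeholders and `cx`, `cy` come from whatever your own algorithm decides.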