CS363-U Lab 2

As always, if you can't figure out how to do something, raise your hand. When you are done, call over the instructor and show what you have before logging off.

Using Common Files to Support a Database

  1. The example we didn't have time to show on Friday is called grepdat.cgi. It is a basic search on a file which lists some lenses for 35mm cameras along with their test results from a Swedish testing site called photodo.com. Have a look at the file, data. Then try the script with something like 'Nikon' or '@24mm.*(Nikon|Sigma).*prime'. Don't forget those little dots -- the period is a "meta-character" which matches any single character, so ".*" (dot-star) matches a run of zero or more characters of any kind. Those numbers at the end of the line are called MTF numbers, and the higher the number, the better. The reason there are multiple numbers is that each lens is (usually) tested at three distinct aperture values, whatever that means. (If there are four numbers, let's just look at the last three.)
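
    To get a feel for the two sample queries before going through the browser, here is a small command-line sketch. The three data lines are made up in a guessed layout (the scores are placeholders, not real photodo results), and the function name search is only for illustration; your copy of the data file may look different.

```shell
# Three made-up lines in a guessed layout; the real data file may differ.
sample='@24mm Nikon AF 24/2,8 prime 1.0 2.0 3.0
@24mm Sigma 24/2,8 prime 1.5 2.5 3.5
@50mm Canon EF 50/1,4 prime 2.0 3.0 4.0'

# search PATTERN: run an extended-regexp (egrep-style) search over the sample.
search() {
    printf '%s\n' "$sample" | grep -E "$1"
}

search 'Nikon'                          # one matching line
search '@24mm.*(Nikon|Sigma).*prime'    # both 24mm primes
```

    With plain grep instead of grep -E (egrep), the parentheses and the bar in the second pattern are taken literally, so it matches nothing in this sample.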

  2. If you have not read about regular expressions in UNIX, now is a good time to have a look. You can look at the gawk manual, and if you use my subman tool, search for egrep. While you are at it, you might as well take a look at the difference between grep and egrep.

  3. To prove that you know how to use regular expressions, I want you to answer the following questions: (a) which tends to be better, a manufacturer's 50/1,4 prime lens or their 85/1,4 prime lens? (b) which tends to be better, a 28-80, a 35-70, or a 35-80 zoom? (c) does a 70-210, 100-300, or 70-300 zoom tend to be better on the short end (@70mm or @100mm) or the long end (210mm+)? (d) if you buy a hyperzoom (a 28-200 or 28-300 lens), at what focal length does it tend to perform best?

    I've deliberately phrased these questions in terms of what you want to know, not in terms of what you have to type in, in order to answer them. For those who can't figure out what I am asking, you must now THINK. It will actually take quite a few regexp queries to get decent answers to these questions. If you just don't get all this lens lingo, and YOU ARE NOT an engineering student, we'll be happy to explain this in more detail. If you are an engineering student, I think you should be able to intuit most of this.

    HOWEVER, I've also deliberately scuttled the translation of many of the special characters you will need in order to pose the regular-expression queries through cgi. You will have to COPY the data file to your own folder and add translation lines to grepdat.cgi so that you can pose your queries. DO NOT write a routine that will translate ALL special characters, as you know that this is unsafe. Just add the translation of characters as you need them.
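
    As a rough sketch of what "translation lines" can look like, the following decodes a short allow-list of percent-escapes and nothing else. It assumes the query arrives percent-encoded the way a browser form sends it (in a gawk cgi script the string would typically come from ENVIRON["QUERY_STRING"]); the function name decode_some and the particular characters chosen are illustrative, not taken from grepdat.cgi.

```shell
# decode_some STRING: translate only a short allow-list of percent-escapes.
# Anything not listed here is deliberately left encoded.
decode_some() {
    printf '%s\n' "$1" | awk '{
        gsub(/\+/, " ")        # form encoding uses + for a space
        gsub(/%40/, "@")
        gsub(/%28/, "(")
        gsub(/%29/, ")")
        gsub(/%7[Cc]/, "|")    # hex digits may arrive in either case
        print
    }'
}

decode_some '%4024mm.*%28Nikon%7CSigma%29.*prime'
```

    Inside the cgi script itself only the gsub lines are needed; add one per character as your queries require it.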

  4. Now, as you might have noticed, the data were hard to interpret since there were three numbers per line and usually a lot of lines to read. I want you to change the cgi script so that it (1) sorts the results by the first MTF score on each line (this is the score the lens received at its maximum aperture); (2) sorts the results by the second MTF score on each line; and (3) sorts by the third MTF score on each line. This means that the result of each query will be three separate blocks of output, and you might as well separate these blocks with an <hr> tag.

    You may want to have a quick look at the sort command in UNIX. You may also want to try a test script where you print a bunch of lines to a sort command, such as 'print x | "sort -nr > outfile"'. It's a pretty nifty trick, but now you have to really try to understand UNIX. Try doing things from the command line first. Then try executing the program from the shell, under your cs363 permissions. Then try executing the program through the browser so that it runs as apache.

    You will have to read in the sorted data, 'while (getline < "outfile")'. You could equally well have piped the results of the grep to a file, called a sort on that file, then read in the results of the sort, perhaps 'while("sort -nr outfile" | getline)'. You do have a few choices here. Note that if you create a file from a cgi program, the directory must be writeable! It's safer to create the file in advance, make that file writeable but not the whole directory, and then rewrite the file from the cgi program. But do you know how to make UNIX overwrite a file from the command line? It's the difference between ">" and ">!" when you direct the output to a file (in csh or tcsh with noclobber set; in sh or bash the equivalent is ">|"). cgi programmers often have to fuss with this kind of thing to get it right because the UNIX environment might be different from server to server.
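
    Here is one way the write-then-read-back round trip might look when tried from the command line. It is only a sketch: the function name roundtrip is made up, and a temporary file from mktemp stands in for the pre-created, writeable outfile.

```shell
# roundtrip: sort stdin in descending numeric order via a file, then read it back.
roundtrip() {
    out=$(mktemp)    # stand-in for the pre-created, writeable "outfile"
    awk -v out="$out" '
        { print | ("sort -nr > " out) }        # stream every input line into sort
        END {
            close("sort -nr > " out)           # wait until sort has written the file
            while ((getline line < out) > 0)   # "> 0" stops at end of file and on error
                print "<br>" line
            close(out)
        }'
    rm -f "$out"
}

printf '3\n10\n7\n' | roundtrip
```

    Testing the return value of getline against 0 matters in a cgi setting: if the file cannot be opened, getline returns -1, and a bare 'while (getline < "outfile")' would then loop forever.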

    Another problem is that you can't just say 'sort -nr +8' (or 'sort -nr -k 9' in the newer syntax) or something like that, because the lines all have different numbers of words on them. So you will have to be creative here. There are at least a dozen distinct solutions. Surprise me.
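
    One of the many possible approaches, sketched under the assumption that the three MTF scores are always the last three whitespace-separated fields: copy the wanted score to the front of each line, sort numerically on it, then cut it off again. The function name sort_by_score is made up for illustration.

```shell
# sort_by_score K: sort stdin by the K-th of the last three fields (K = 1, 2 or 3),
# highest score first.
sort_by_score() {
    awk -v k="$1" '{ print $(NF - 3 + k) "\t" $0 }' |   # prepend the chosen score as a key
        LC_ALL=C sort -nr |                             # numeric, descending, "." as decimal point
        cut -f2-                                        # drop the key again
}

# Possible use for the three output blocks (matches.txt is a hypothetical file of grep results):
#   for k in 1 2 3; do sort_by_score "$k" < matches.txt; echo '<hr>'; done
```

    Counting from the end of the line (NF) rather than from the start is what makes the differing number of words per line harmless.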

    Let's make this lab a bit more fun. Those numbers are hard to visualize. How about making a little table with a bgcolor varying from 000000 to 999999 (or ffffff if you are ambitious) in proportion to the MTF score you are trying to sort by. If you need some help seeing how this should look, you can see my http://www.cs.wustl.edu/~loui/makephotodo.cgi, which yes, was yet another 3-hour cgi program I am proud of!
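
    For the colour mapping, a minimal sketch of the "ambitious" 000000-to-ffffff version follows. The maximum possible score is passed in as a parameter because the scale is an assumption here, and the function name grey is made up.

```shell
# grey SCORE MAX: print a six-digit hex grey level proportional to SCORE / MAX.
grey() {
    awk -v s="$1" -v max="$2" 'BEGIN {
        v = int(255 * s / max)     # scale the score to 0..255
        if (v < 0) v = 0           # clamp out-of-range scores
        if (v > 255) v = 255
        printf "%02x%02x%02x\n", v, v, v
    }'
}

grey 2.5 5    # a mid-range grey
```

    The result can be dropped straight into a table cell, for example '<td bgcolor="#' followed by the six digits and '">'. Remember to pick a readable text colour for the dark cells.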

  5. It is now time to insert data into your data file. So you must make it writeable. Write a simple HTML page which posts the name of the lens, the focal length it was tested at (e.g., 50mm), and the three MTF scores. Have the receiving script append these scores to the file. This won't take very long to get right, so don't freak out.

    I bet in the time that you have, half of you can make sure it doesn't add the new data if scores are already recorded for that name and focal length. And I bet you can put in a password so that you have to know the password in order to modify the data file.
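
    The duplicate check could be sketched like this. The line layout ('@' plus focal length, then the lens name, then three scores) is a guess rather than the real format of the data file, and the function name add_lens is made up.

```shell
# add_lens FILE NAME FOCAL S1 S2 S3: append a record unless one already exists
# for the same focal length and lens name. The record layout is an assumption.
add_lens() {
    file=$1 name=$2 focal=$3
    key="@$focal $name "
    # index() compares literally, so regexp characters in the name are harmless.
    # (awk -v still interprets backslash escapes in the value.)
    if awk -v key="$key" 'index($0, key) == 1 { found = 1 } END { exit !found }' "$file"; then
        echo "already recorded"
        return 1
    fi
    printf '%s%s %s %s\n' "$key" "$4" "$5" "$6" >> "$file"
}
```

    Appending with ">>" only needs the data file itself to be writeable, not the directory it lives in.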

    That's it! Come and get yourself recorded as a lab2star.