CS363-U Lab 1

This lab will indeed be ready by noon Monday. If you have undue difficulty getting to campus because of the snow and ice, you can complete the lab online, post your code by midnight, and get full credit. Since this is a monkey-see-monkey-do sort of lab, I am not too worried about independence of work. I think you all know that these first skills are essential to acquire in order to go on in the course.

As always, if you can't figure out how to do something, raise your hand. Also, when you are done, call over the instructor and show what you have before logging off. You may want to be reading "man gawk" while you are doing this.

IN ORDER TO CREATE FILES ON UNIX, you have a few options:

  1. Use pico directly on the server. This is ok for desperate students, but can't be allowed to persist as the normal work method. If you are overwhelmed by unix and gawk at the moment, go ahead and rely on pico. But later, try #3 below when you are feeling more secure.
  2. Use a Windows editor, then ssh file transfer the file to the server. This works for html files, since the extra control-M (carriage return) that Windows leaves at the end of each line is ignored by the browser. But it won't work for cgi programs interpreted in many unix languages. One way to fix this is to read the file in pico, then write it back out. Normally you could also run the file through dos2unix after doing the file transfer, which is a pain, or go into vi and execute a :g/./ s/.$// command followed by a :wq. That will kill the last character of each non-empty line, then write and quit the file. But these won't work right now and we are trying to find out why.
  3. The BEST METHOD for the growing cs/unix student is to build the file on Windows in Notepad or whatever, then select-all, copy, and paste into a vi window on the server. "vi foofile" at the command line will get you into the editor in the first place, so that it knows what filename to give when you write and quit. You will want to say "i" before pasting, and "<escape>" after pasting. Then :wq will write the file. When you want to delete what is in the file and paste anew, try 1GdG (dG alone deletes only from the current line to the end), which should delete the whole file in anticipation of your i<paste><escape>:wq. Like UNIX itself, you'll learn a few pieces at a time, and everything will start to make sense.
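
If a transferred file still carries the control-M characters described in option 2, one more thing to try is a short awk filter. This is only a sketch: strip_cr is a made-up helper name, it has not been tried on wolf or k9, and it assumes awk is on your path (type gawk instead if that is what the server calls it).

```shell
# Hypothetical helper: remove one trailing carriage return (control-M)
# from every line of the named files, or of stdin if no file is given.
strip_cr() {
  awk '{ sub(/\r$/, ""); print }' "$@"
}
```

For example, "strip_cr foo.cgi > foo.fixed" writes a cleaned copy; look at foo.fixed before renaming it over the original.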

I know we didn't get through the difference between get and post, and the code needed to parse the input. However, if you look at the m*.html and m*.cgi examples, we can proceed with this lab and the homework.

Getting Started



  1. Verify that you can see your cs363@wolf.cs directory from the browser by putting a file in the directory, making sure it is readable, and viewing it in a browser.

  2. Do the same with your directory at cs363@k9.cs.

  3. On either machine, write a gawk/cgi script that prints all the environment variables. I showed such a script in class.
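
    As a reminder of the general shape, here is a minimal sketch. It may differ from the script shown in class. The wrapper name print_env is hypothetical and is there only so the awk program can be tried at the command line; in a real cgi file the awk program would be the file's contents.

```shell
# Hypothetical helper: run the awk program from the shell for a quick check.
print_env() {
  awk 'BEGIN {
    # A CGI response starts with a header line, then a blank line.
    printf "Content-type: text/plain\n\n"
    # ENVIRON is the built-in array of environment variables.
    for (name in ENVIRON)
      print name "=" ENVIRON[name]
  }'
}
```

    Typing print_env at the shell prompt shows your login environment; run by the web server, the same program would show the variables the server sets for the request.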

  4. Change your script so that it just prints the IP address of the client.
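
    Under the usual CGI convention the client's address arrives in the REMOTE_ADDR environment variable, so one possible version looks like the sketch below. It assumes the web servers on wolf and k9 follow that convention; client_ip_page is a hypothetical wrapper name.

```shell
# Hypothetical helper: print only the client address.
client_ip_page() {
  awk 'BEGIN {
    printf "Content-type: text/plain\n\n"
    # REMOTE_ADDR is set by the web server, not by the login shell.
    print ENVIRON["REMOTE_ADDR"]
  }'
}
```

    At the command line REMOTE_ADDR is normally unset, so the last line comes out empty unless you export a value yourself to test.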

  5. Make an html file with a form, an input field, and a submit button. Verify that you can see it in the browser.

  6. Write a script which takes the input from this form and returns a web page with the word "hello" on it, where the color of the font is the query string. So if you type "green", the script returns a green hello. You will need something like "font color=", and depending on what name you give the input field, you will need something like "substr(x,5)". Don't forget to print the Content-type. You may want to consult this unix manual by typing in "gawk" and "substr": http://www.cs.wustl.edu/~loui/513f03/subman.html.
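
    A rough sketch of the idea, under some assumptions: the form uses the get method, the input field is named foo (so the query string looks like "foo=green" and substr(x,5) skips "foo="), and hello_page is a hypothetical wrapper for trying it at the command line.

```shell
# Hypothetical helper: build the page from the query string.
hello_page() {
  awk 'BEGIN {
    x = ENVIRON["QUERY_STRING"]   # e.g. "foo=green" if the field is named foo
    color = substr(x, 5)          # drop the first 4 characters, "foo="
    printf "Content-type: text/html\n\n"
    print "<html><body><font color=\"" color "\">hello</font></body></html>"
  }'
}
```

    To try it without a browser, export QUERY_STRING with a value such as foo=green and then run hello_page.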

  7. What happens if you invoke the cgi script with the suffix "?green" in the URL? e.g., http://wolf.cs/~cs363/loui/foo.cgi?green.

  8. Let's write a gawk script that is not intended to be a cgi program, just so we can understand the language better. Write a script which prints the numbers 1 through 10. Then a script which prints all non-empty substrings of "hello", including "hell", "llo", and "e". I am sure you will want to do something like:
    for (i=1; i<=10; i++) {
    }
    
    and
    for (i=1; i<=length(x); i++) 
      for (j=1; j<=length(x)-i+1; j++) {
      }
    
    Note that the gawk syntax requires grouping braces only if you are planning on controlling more than one statement with your "for", "while", "if", etc.
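
    Filled in, the two loop skeletons could look like the sketch below. It is one reading of them, with i taken as the substring length and j as the starting position; the wrapper names count_to_ten and substrings are hypothetical.

```shell
# Hypothetical helpers for running the two loops at the command line.
count_to_ten() {
  awk 'BEGIN { for (i = 1; i <= 10; i++) print i }'
}

substrings() {
  # The word to take apart is passed in as the awk variable x.
  awk -v x="$1" 'BEGIN {
    for (i = 1; i <= length(x); i++)            # i = substring length
      for (j = 1; j <= length(x) - i + 1; j++)  # j = starting position
        print substr(x, j, i)
  }'
}
```

    "substrings hello" prints the one-letter substrings first and "hello" last. Neither inner loop needs braces, since each controls a single statement.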

    Finally, write a program which generates the strings "hello1" up to "hello10000". You will find that most scripting languages have automatic coercion of types, so that an integer can simply be appended onto a string as if it were a string. How long does it take to do this? How about if you store all of the strings in an array, e.g., a[1] = "hello1", or even a[n] = "hello" n ... Does it take longer?
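
    A minimal sketch of the array version, for comparison with your own. The name make_strings is hypothetical, and no timings are claimed here; measure them yourself on wolf or k9.

```shell
# Hypothetical helper: store hello1 .. hello10000 in an array.
make_strings() {
  awk 'BEGIN {
    for (n = 1; n <= 10000; n++)
      a[n] = "hello" n        # the number n is coerced to a string
    print a[1], a[10000]      # show the first and last entries
  }'
}
```

    In bash, "time make_strings" reports how long the run took.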

  9. Write a script which takes a word as input from a form, such as "hello", and generates html as output, where each letter is double the font size of the one before it (or 1.5x). You must use CSS font control (style=font-size:10pt). You might keep track of the size by initializing and doubling a variable. Note that most scripting languages do not require any variable declarations -- they pop into existence when you mention them, and their types and initial values are determined by the context in which they are used. Here is a start:
    # if you use the post method
    getline x
    # suppose x = "foo=hello"
    x = substr(x, 5)
    n = 10
    for (i=1; i<=length(x); i++) {
      print "<span style=font-size:" n "pt>"
      # another print statement from you 
      n = n*2
    }
    
    The code above is one way to start, but you might want to do it another way. For example, you can try the get method, where the input comes to the script through the ENVIRON array.
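
    With the get method, the form data would be read from ENVIRON["QUERY_STRING"] instead of from stdin. The sketch below shows only that input step and the doubling variable, printing each letter with its size as plain text; the html output is still yours to write. It assumes a field named foo, and letter_sizes is a hypothetical name.

```shell
# Hypothetical helper: list each letter with the font size it would get.
letter_sizes() {
  awk 'BEGIN {
    x = substr(ENVIRON["QUERY_STRING"], 5)   # assumes the data is "foo=..."
    n = 10                                   # starting size in points
    for (i = 1; i <= length(x); i++) {
      print substr(x, i, 1), n "pt"          # the i-th letter and its size
      n = n * 2
    }
  }'
}
```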

  10. Write a gawk program which takes input from stdin (the "terminal") and stores each word from each line in an associative array. (Note that input from stdin is terminated with a control-D, if you are typing it at the terminal.) You might want to use $i and a[$i]. I'll show you below, but you should go to the gawk manual and have a look at arrays, getline, and split. (You might also want to look at where it talks about $1, $2, FS, and NF.) At the end of the input, you should write all of the words in sorted order. I'll give you that part:
    BEGIN {
      while ((getline x) > 0) {
        # getline x fills x but does not set $1, $2, or NF, so split x yourself
        # do your thing (for each word, set thisword; words[thisword] = 1)
      }
    }
    END {
      for (i in words) print i | "sort"
    }
    
    You may want to say something like n = split(x, temparray, " "), and iterate over temparray[i].

    If you want to use all the built-in gawk parsing stuff, you can get away with something like:
    { i = 0; while ($(++i) != "") words[$i] }
    END { for (i in words) print i | "sort" }
    
    (try it), but I want you to do it in the most straightforward way that makes sense to you. Scripting languages all have "poetic" ways for experts to be very brief, but that's not good for readability even if you are releasing code to a community of experts. In fact, part of what stunted gawk's growth is the belief that many unix people have that it should be used just for very short, unreadable scripts. Part of what is killing perl is the fact that it is very difficult for the author to make the code readable no matter how much s/he tries!

    Note that the manual that subman accesses refers to gawk 3.1.0, whereas wolf and k9 have gawk 3.0.4 installed. So you can't use the built-in asort function, because it doesn't exist in your version.
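
    For step 10, the split-based approach could come together roughly as in the sketch below. It is one straightforward arrangement and not the only acceptable one: it lets awk read the lines itself instead of calling getline, and sorted_words is a hypothetical wrapper name.

```shell
# Hypothetical helper: print each distinct word from stdin, sorted.
sorted_words() {
  awk '{
    # Break the current line into words on runs of blanks.
    n = split($0, temparray, " ")
    for (i = 1; i <= n; i++)
      words[temparray[i]] = 1     # the array index is the word itself
  }
  END {
    # Array order is arbitrary, so hand the words to sort.
    for (w in words) print w | "sort"
  }'
}
```

    Type sorted_words, enter a few lines, and finish with control-D to see the sorted list.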