CS363-U Lab 1
This lab will indeed be ready by noon Monday. If you have undue
difficulty getting to campus because of the snow and ice, you can complete
the lab online, post your code by midnight, and get full credit. Since
this is a monkey-see-monkey-do sort of lab, I am not too worried about
independence of work. I think you all know that these first skills are
essential to acquire in order to go on in the course.
As always, if you can't figure out how to do something, raise your hand.
Also, when you are done, call over the instructor and show what you have
before logging off. You may want to be reading "man gawk" while you are
doing this.
IN ORDER TO CREATE FILES ON UNIX, you have a few options:
- Use pico directly on the server. This is ok for desperate students,
but can't be allowed to persist as the normal work method. If you are
overwhelmed by unix and gawk at the moment, go ahead and rely on pico.
But later, try the third option below when you are feeling more secure.
- Use a windows editor, then move the file to the server with ssh file transfer.
This works for html files, since the extra control-M that is generated
at the end of each line is ignored by the browser. But it won't work for
cgi programs interpreted in many unix languages. One way to fix this is
to read the file in pico, then write it back out. Normally you could also
run the file through dos2unix after doing the file transfer, which is a
pain, or go into vi and execute a :g/./ s/.$// command followed by a :wq.
That will kill the last character of each non-empty line, then write and quit
the file. But these two won't work right now and we are trying to find out why.
- The BEST METHOD for the growing cs/unix student is to build the file
on windows in notepad or whatever, open vi directly on the server, and
then select-all, copy, and paste into the vi window. You will want
to say "i" before pasting, and "<escape>" after pasting. Then :wq
will write the file. "vi foofile" at the command line will get you into
the editor in the first place, so that it knows what filename to give when
you write and quit. When you want to delete what is in the file and
paste anew, try 1GdG (go to line 1, then delete through the last line),
which should delete the whole file in anticipation of your
i<paste><escape>:wq. A bare dG deletes only from the current line to the
end of the file. As with UNIX itself, you'll learn vi a few pieces at a
time, and everything will start to make sense.
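While dos2unix and the vi command are misbehaving, one more fallback for the control-M problem is to strip it with gawk itself. This is only a sketch of the idea, not something we have tested on wolf or k9; the script name strip_cr.awk and the file names in the comment are made up for illustration.

```awk
# strip_cr.awk: remove a trailing carriage return (control-M) from each line.
# Assumed usage (file names are hypothetical):
#   gawk -f strip_cr.awk foo_dos.cgi > foo.cgi

# Return s without a trailing "\r", if it has one.
function strip_cr(s) {
    sub(/\r$/, "", s)
    return s
}

# Main rule: runs once per input line and prints the cleaned line.
{ print strip_cr($0) }
```

Unlike the vi command, this only removes a character when it really is a control-M, so running it twice on the same file does no harm.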
I know we didn't get through the difference between get and post,
and the code needed to parse the input. However, if you look at the
m*.html and m*.cgi examples, we can proceed with this lab and the homework.
Getting Started
- Verify that you can see your cs363@wolf.cs directory from the
browser by putting a file in the directory, making sure it is readable,
and viewing it in a browser.
- Do the same with your directory at cs363@k9.cs.
- On either machine, write a gawk/cgi script that prints all
the environment variables. I showed such a script in class.
- Change your script so that it just prints the IP address of the
client.
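For the two scripts above, something along these lines should be close to what was shown in class. Treat it as a sketch: the path to gawk on the first line is an assumption (check it with "which gawk"), and client_ip is just a helper name chosen here. REMOTE_ADDR is the standard CGI variable that carries the client's address.

```awk
#!/usr/bin/gawk -f
# The gawk path above is an assumption; adjust it for wolf or k9.

# Return the client's IP address as passed in by the web server.
function client_ip() {
    return ENVIRON["REMOTE_ADDR"]
}

BEGIN {
    print "Content-type: text/plain"
    print ""
    # Every environment variable the server handed to the script.
    for (name in ENVIRON)
        print name "=" ENVIRON[name]
    # For the second task, delete the loop above and keep only this line.
    print "client: " client_ip()
}
```

Remember to make the file executable as well as readable, or the server will not run it.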
- Make an html file with a form, an input field, and a submit button.
Verify that you can see it in the browser.
- Write a script which takes the input from this form and returns
a web page with the word "hello" on it, where the color of the font
is taken from the query string. So if you type "green", the script returns a
green hello. You will need something like "font color=", and depending
on what name you give the input field, you will need something
like "substr(x,5)" to skip past the "name=" prefix. Don't forget to print
the Content-type.
You may want to consult this unix manual by typing in "gawk" and "substr":
http://www.cs.wustl.edu/~loui/513f03/subman.html
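Here is one way the colored hello might look, as a sketch rather than the official answer. It assumes the form uses the get method and an input field named "foo" (a made-up name), so that QUERY_STRING looks like "foo=green". The helper value_of is also invented here; it uses index() instead of a fixed substr(x,5) so that it does not depend on how long the field name is.

```awk
#!/usr/bin/gawk -f
# Assumes method=get and a field named "foo" (hypothetical),
# so that QUERY_STRING looks like "foo=green".

# Return everything after the first "=" in a "name=value" string.
# If there is no "=", index() returns 0 and the whole string comes back.
function value_of(pair) {
    return substr(pair, index(pair, "=") + 1)
}

BEGIN {
    color = value_of(ENVIRON["QUERY_STRING"])
    print "Content-type: text/html"
    print ""
    print "<html><body><font color=\"" color "\">hello</font></body></html>"
}
```

Note that this does no decoding or checking of what the user typed; it simply drops the text into the page.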
- What happens if you invoke the cgi script with the suffix
"?green" in the URL? e.g., http://wolf.cs/~cs363/loui/foo.cgi?green
- Let's write a gawk script that is not intended to be a cgi program,
just so we can understand the language better. Write a script which
prints the numbers 1 through 10. Then a script which prints all non-empty
substrings of "hello", including "hell", "llo", and "e". I am sure you
will want to do something like:
    for (i=1; i<=10; i++) {
    }
and
    for (i=1; i<=length(x); i++)
        for (j=1; j<=length(x)-i+1; j++) {
        }
Note that the gawk syntax requires grouping braces only if you are
planning on controlling more than one statement with your "for", "while",
"if", etc.
Finally, write a program which generates the strings "hello1" up to
"hello10000". You will find that most scripting languages have automatic
coercion of types, so that an integer can simply be appended onto a string
as if it were a string. How long does it take to do this? How about if
you store all of the strings in an array, e.g., a[1] = "hello1", or even
a[n] = "hello" n ... Does it take longer?
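If you get stuck, the three pieces above could be filled in roughly as follows. This is a sketch of one reading of the loops, in which i is the substring length and j is the starting position; you may well organize it differently. Everything sits in a BEGIN rule because the script reads no input.

```awk
BEGIN {
    # Numbers 1 through 10.
    for (i = 1; i <= 10; i++)
        print i

    # All non-empty substrings of x: i is the length, j the start position.
    x = "hello"
    for (i = 1; i <= length(x); i++)
        for (j = 1; j <= length(x) - i + 1; j++)
            print substr(x, j, i)

    # "hello1" up to "hello10000", stored in an array.
    # The number n is coerced to a string when it is appended.
    for (n = 1; n <= 10000; n++)
        a[n] = "hello" n
    print a[1], a[10000]
}
```

To compare timings, save each variant in its own file and run it under the shell's time command, e.g. "time gawk -f foofile".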
- Write a script which takes a word as input from a form, such as
"hello", and generates html as output, where each letter is double the
font size of the one before it (or 1.5x larger). You must use CSS font control
(style=font-size:10pt). You might keep track of the size by initializing
and doubling a variable. Note that most scripting languages do not
require any variable declarations -- they pop into existence by mentioning
them, and their types and initial values are determined by the context in
which they are used.
BEGIN {
    print "Content-type: text/html"
    print ""
    # if you use the post method, the form data arrives on stdin
    getline x
    # suppose x = "foo=hello"
    x = substr(x, 5)
    n = 10
    for (i=1; i<=length(x); i++) {
        print "<span style=font-size:" n "pt>"
        # another print statement from you
        n = n*2
    }
}
but you might want to do it another way. For example, you can try the
get method, where the input comes to the script through the ENVIRON array.
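A get-method version might look something like the sketch below. It assumes a field named "foo" (a made-up name) and an assumed gawk path, and growing_word is a helper name invented here. The extra parameters after the gap in the function header are gawk's usual idiom for local variables.

```awk
#!/usr/bin/gawk -f
# Assumes method=get and an input field named "foo" (hypothetical),
# so that QUERY_STRING looks like "foo=hello".

# Build one <span> per letter, starting at 10pt and doubling each time.
function growing_word(word,    i, n, html) {
    n = 10
    html = ""
    for (i = 1; i <= length(word); i++) {
        html = html "<span style=\"font-size:" n "pt\">" substr(word, i, 1) "</span>"
        n = n * 2
    }
    return html
}

BEGIN {
    query = ENVIRON["QUERY_STRING"]
    word = substr(query, index(query, "=") + 1)
    print "Content-type: text/html"
    print ""
    print "<html><body>" growing_word(word) "</body></html>"
}
```

Changing "n = n * 2" to "n = n * 1.5" gives the gentler growth; gawk will happily print a fractional point size.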
- Write a gawk program which takes input from stdin (the "terminal")
and stores each word from each line in an associative array. (Note that
input from stdin is terminated with a control-D, if you are typing it at
the terminal.) You might want to use $i and a[$i]. I'll show you
below, but you should go to the gawk manual and have a look at arrays,
getline, and split. (You might also want to look at where it talks about
$1, $2, FS, and NF.) At the end of the input, you should write all of the
words in sorted order. I'll give you that part:
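BEGIN {
    # getline returns 1 for a line, 0 at end of input, and -1 on error
    while ((getline x) > 0) {
        # do your thing (split x into words, setting thisword; words[thisword] = 1)
        # note that "getline x" fills in x only; it does not set $1, $2, or NF
    }
}
END {
    for (i in words) print i | "sort"
}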
You may want to say something like n = split(x, temparray, " "), and iterate
over temparray[i].
If you want to use all the built-in gawk parsing stuff, you can get
away with something like:
{ i = 0; while ($(++i) != "") words[$i] }
END { for (i in words) print i | "sort" }
(try it), but I want you to do it in the most straightforward way that
makes sense to you. Scripting languages all have "poetic" ways for experts
to be very brief, but that's not good for readability even if you are
releasing code to a community of experts. In fact, part of what stunted
gawk's growth is the belief that many unix people have that it should
be used just for very short, unreadable scripts. Part of what is killing
perl is the fact that it is very difficult for the author to make the
code readable no matter how much s/he tries!
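For comparison, here is a longer-winded sketch that uses split, as suggested above. It is one possible shape, not the required one: it swaps the BEGIN/getline loop for an ordinary main rule, and add_words is a helper name invented here.

```awk
# Add each whitespace-separated word of line as a key of the array set.
# Returns the number of words found on the line.
function add_words(line, set,    n, i, temparray) {
    n = split(line, temparray, " ")
    for (i = 1; i <= n; i++)
        set[temparray[i]] = 1
    return n
}

# Main rule: runs once per input line (until control-D at the terminal).
{ add_words($0, words) }

# After the last line, hand the distinct words to the unix sort program.
END {
    for (w in words)
        print w | "sort"
}
```

Because the words are array keys, a word that appears many times is stored, and printed, only once.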
Note that the manual that subman accesses refers to gawk 3.1.0, whereas wolf
and k9 have gawk 3.0.4 installed. So you can't use the built-in asort
function, because it doesn't exist in your version.