CS 527A Homework 3


You are expected to complete 40 points worth of homework problems. For those selecting 10 and 20 point problems, you must select problems from at least two of the chapters. Also, no more than one paper critique can be selected.

If you are doing a 20 or 40 point problem, be sure to attach the appropriate cover sheet and review the guidelines given there and in the course information handout.

If you are interested in doing a group project, talk to Dr. Goldman.

Due on Wednesday April 4th. Late penalty will not apply until April 11th. (So basically, there is a one week extension.)


Cover Sheets:


  1. (10 pts) Read any of the chapters from the lecture notes from my computational learning theory class (except for Chapter 1). If you would also like to work a problem related to the chapter you read, let me know and I can provide a question worth an additional 10 points.

  2. (20 pts) Use the provided AdaBoost Applet to experiment with boosting. Try some different things to help you understand how (and when) it works. Then write a 20 point project report.

  3. (40 pts) Do some experimentation with boosting. You can use the decision tree algorithm code (from HW 1) or anything of your choice as the base algorithm for boosting. Alternatively, you can use the naive Bayes algorithm designed for text categorization (provided with Problem 10) as the base algorithm. BoosTexter, a variation of AdaBoost designed for text categorization, can be found at http://www.research.att.com/~schapire/BoosTexter/.
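
As a starting point for this kind of experiment, here is a minimal sketch of the standard AdaBoost loop. It assumes labels in {+1, -1} and uses one-feature threshold "stumps" as the base learner purely for brevity; the function names (`train_stump`, `adaboost`, `predict`) are illustrative and are not part of the HW 1 code or of BoosTexter.

```python
import math

def stump_predict(x, j, thr, pol):
    # Predict pol when feature j is at or above the threshold, -pol otherwise.
    return pol if x[j] >= thr else -pol

def train_stump(X, y, w):
    # Exhaustively pick the (feature, threshold, polarity) with lowest weighted error.
    best = None
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for pol in (1, -1):
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(xi, j, thr, pol) != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best

def adaboost(X, y, rounds):
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform example weights
    ensemble = []
    for _ in range(rounds):
        err, j, thr, pol = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)   # keep the log below finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, j, thr, pol))
        # Increase weight on mistakes, decrease it on correct predictions.
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, j, thr, pol))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    score = sum(a * stump_predict(x, j, thr, pol) for a, j, thr, pol in ensemble)
    return 1 if score >= 0 else -1

# Toy data: no single stump separates a +,-,+ pattern on the line.
X = [[0], [1], [2], [3], [4], [5]]
y = [1, 1, -1, -1, 1, 1]
one_round = [predict(adaboost(X, y, 1), x) for x in X]
three_rounds = [predict(adaboost(X, y, 3), x) for x in X]
```

On this toy set a single stump cannot fit the labels, while the weighted vote of three stumps can, which is the sort of behavior worth probing with different base learners and round counts.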

  4. (20 pts) Use the provided SVM Applet to experiment with Support Vector Machines (and kernel functions). Try some different things to help you understand how (and when) they work. Write a 20 point project report.

  5. (40 pts) Use the SVM light software package to experiment with support vector machines. I recommend that you try the first application discussed on the page under "Getting Started: An Example Problem", which is a text classification problem. Once you understand how it works, try some other things out. Part of this homework will be reading some of the provided material and describing it in your report.

  6. (10 pts) In this problem you'll study different aspects related to Bayesian Learning.

  7. (10 pts) Consider the concept learning algorithm FindG, which outputs a maximally general consistent hypothesis (e.g. some maximally general member of the version space).

  8. (20 pts) In the analysis of concept learning in Section 6.3 we assumed that the sequence of instances (x1, ..., xm) was held fixed. Therefore, in deriving an expression for P(D|h) we needed only consider the probability of observing the sequence of target values (d1, ..., dm) for this fixed instance sequence. Consider the more general setting in which the instances are not held fixed, but drawn independently from some probability distribution defined over the instance space X. The data D must now be described as the set of ordered pairs {(xi,di)}, and P(D|h) must now reflect the probability of encountering the specific instance xi as well as the probability of the observed target value di. Show that Equation (6.5) holds even under this more general setting. Hint: Consider the analysis of Section 6.5.

  9. (10 pts) Consider the Minimum Description Length (MDL) principle applied to the hypothesis space H consisting of conjunctions of up to n boolean attributes (i.e. monotone monomials). Assume that each hypothesis is encoded simply by listing the attributes present in the hypothesis, where the number of bits needed to encode any one of the n boolean attributes is log2 n. Suppose the encoding of an example given the hypothesis uses zero bits if the example is consistent with the hypothesis and uses log2 m bits otherwise (to indicate which of the m examples was misclassified--the correct classification can be inferred to be the opposite of that predicted by the hypothesis).
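
To make the encoding concrete, the sketch below computes the total description length of one hypothesis on a small data set: log2 n bits per listed attribute plus log2 m bits per misclassified example. The function name `description_length` and the eight toy examples are made up for illustration; a hypothesis is represented as a set of attribute indices.

```python
import math

def description_length(hypothesis, examples, n):
    """Bits to encode the hypothesis plus bits to encode its exceptions.

    hypothesis: set of attribute indices whose conjunction is the hypothesis
    examples:   list of (attribute_tuple, label) pairs with boolean labels
    n:          total number of boolean attributes
    """
    m = len(examples)
    hyp_bits = len(hypothesis) * math.log2(n)
    # A monotone monomial predicts True exactly when all its attributes are 1.
    errors = sum(1 for x, label in examples
                 if all(x[i] for i in hypothesis) != label)
    data_bits = errors * math.log2(m)
    return hyp_bits + data_bits

# Toy data (hypothetical): n = 4 attributes, m = 8 examples.
examples = [
    ((1, 1, 0, 0), True),
    ((1, 1, 1, 0), True),
    ((1, 1, 0, 1), True),
    ((1, 0, 1, 1), False),
    ((0, 1, 1, 1), False),
    ((0, 0, 0, 0), False),
    ((1, 0, 0, 0), False),
    ((1, 1, 1, 1), False),   # not consistent with "x0 and x1"
]

bits_x0_x1 = description_length({0, 1}, examples, n=4)   # 2*2 + 1*3 = 7.0
bits_x0 = description_length({0}, examples, n=4)         # 1*2 + 3*3 = 11.0
bits_empty = description_length(set(), examples, n=4)    # 0 + 5*3 = 15.0
```

Here the two-attribute hypothesis costs more bits to state than the shorter ones but fewer bits overall, because it leaves fewer exceptions to encode.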

  10. (40 pts) In this problem you will use some provided code to explore how the naive Bayes learning algorithm can be applied to text categorization. Here's the provided code. Here's the assignment from Tom Mitchell to help guide you.

    After running the provided install program you will need to edit the Makefile to modify the line that begins with "CC =" to be

    CC = /pkg/gnu/bin/gcc
    
    Then to compile it use the command
    /pkg/gnu/bin/make
    
    Also, in svm_base.c you may need to change the call to sqrtf to sqrt.
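
For orientation before working with the provided code, here is a minimal sketch of naive Bayes text classification with add-one smoothing of the word counts. It is not the provided package; the function names (`train`, `classify`) and the four toy documents are hypothetical.

```python
import math
from collections import Counter

def train(docs):
    """docs: list of (list_of_words, label) pairs."""
    vocab = {w for words, _ in docs for w in words}
    label_counts = Counter(label for _, label in docs)
    word_counts = {label: Counter() for label in label_counts}
    for words, label in docs:
        word_counts[label].update(words)
    return vocab, label_counts, word_counts

def classify(model, words):
    vocab, label_counts, word_counts = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        total_words = sum(word_counts[label].values())
        # Work in log space so products of small probabilities do not underflow.
        score = math.log(n_docs / total_docs)
        for w in words:
            if w in vocab:   # words never seen in training are skipped
                score += math.log((word_counts[label][w] + 1)
                                  / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

docs = [
    ("the team won the game".split(), "sports"),
    ("the election vote count".split(), "politics"),
    ("coach praised the team".split(), "sports"),
    ("senate vote on the bill".split(), "politics"),
]
model = train(docs)
label_a = classify(model, "team game".split())
label_b = classify(model, "vote bill".split())
```

The add-one term keeps a word that never appeared with a class from driving that class's probability to zero.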

  11. (10 pts) In this problem we look at instance-based learning.

  12. (10 pts) Suggest a lazy version of the eager decision tree learning algorithm ID3. Be sure to give a very clear description of your lazy algorithm. What are the advantages and disadvantages of your lazy learning algorithm as compared to the original eager algorithm? I expect a well thought out discussion on this.

  13. (20 pts) Read one of the following papers and write a paper critique. Please write the summary of the paper so that someone in this class who has not read the paper would understand what it was about at a high level and would understand one part at a deeper level.

  14. (10 pts) In this problem you will compute the posterior probabilities based on a given Bayesian belief network and some partial observations.

    Consider the Fire Alarm example from the following applet, but with the attribute "reporting" removed. For each of the following three sets of observations, show your computation for obtaining the posterior probabilities of all variables.
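
One way to check hand computations is to sum the joint distribution over all unobserved variables (inference by enumeration). The sketch below does this for a five-variable network. Both the structure (tampering and fire influence alarm, fire influences smoke, alarm influences leaving) and every probability value are placeholder assumptions chosen for illustration; substitute the structure and tables shown in the applet.

```python
from itertools import product

VARS = ["tampering", "fire", "alarm", "smoke", "leaving"]

# Placeholder probabilities (assumptions, NOT the applet's values).
P_TAMPERING = 0.02
P_FIRE = 0.01
P_ALARM = {  # P(alarm=True | tampering, fire)
    (True, True): 0.5, (True, False): 0.85,
    (False, True): 0.99, (False, False): 0.0001,
}
P_SMOKE = {True: 0.9, False: 0.01}      # P(smoke=True | fire)
P_LEAVING = {True: 0.88, False: 0.001}  # P(leaving=True | alarm)

def bern(p_true, value):
    # Probability of a boolean value given the probability that it is True.
    return p_true if value else 1.0 - p_true

def joint(a):
    # Joint probability of one full assignment, as a product of the local tables.
    return (bern(P_TAMPERING, a["tampering"])
            * bern(P_FIRE, a["fire"])
            * bern(P_ALARM[(a["tampering"], a["fire"])], a["alarm"])
            * bern(P_SMOKE[a["fire"]], a["smoke"])
            * bern(P_LEAVING[a["alarm"]], a["leaving"]))

def posterior(query, evidence):
    """P(query=True | evidence), summing over every consistent assignment."""
    num = den = 0.0
    for values in product([True, False], repeat=len(VARS)):
        a = dict(zip(VARS, values))
        if any(a[v] != val for v, val in evidence.items()):
            continue
        p = joint(a)
        den += p
        if a[query]:
            num += p
    return num / den

p_fire_given_smoke = posterior("fire", {"smoke": True})
p_fire_prior = posterior("fire", {})
```

Enumeration is exponential in the number of variables, but for a network this small it is a convenient cross-check on a hand derivation.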

  15. CHOOSE YOUR OWN ADVENTURE. You can propose any additional homework options (or variations of those given above) to Dr. Goldman. If approved, a point value will be given. If you are interested in doing a problem from HW 2, you can do that under this option. Just send email to Dr. Goldman to confirm that the particular problem you want to do (from HW 2) is acceptable for HW 3.