Lab 3: Linear Regression and Functions with Multiple Outputs
In class on Tuesday we talked about matrix operations. Today we are
going to practice with some of these, then apply them to the problem
of linear regression. Regression (or fitting, or least squares) is a
method of fitting a line (or other function) through a set of points.
In our problems today, we will make sample data and go through the
steps of fitting a line to them. Then you will be asked to write
a function that fits a line to a collection of sample points, and to
plot the best fit line through them.
Problems
- First, let's create a collection of points where we know how the x
and y values are related. You should keep
track of these points with 2 arrays (called x and y), each with the
same number of elements.
I would like you to create a list of points
(x,y) where x varies from 1 to 20, and y is defined to be 2*x.
- Now, to make it interesting, let's add a little bit of random noise
to our points. Make an array n that is the same size as y
using the "rand" command.
- Use "help rand" to look up what kind of random numbers the rand
function creates. We want to make sure that the noise has a mean of
approximately zero (otherwise the noise would be called biased).
Now change the command you used so that it makes unbiased noise.
- Now make an array yPrime by adding your y array to your noise
array, and make a plot of your x values vs. your yPrime values.
- Now, given the points x,yPrime, we'd like to solve for the linear
relationship between these values. Solve for the "best fit" value a
such that yPrime is approximately a*x, using the MATLAB "/" operator.
- Now we'd like to draw the line y = a*x. Create values "yLine"
using the a value and the x vector, then plot these values on the same
plot as the (noisy) points.
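
To see what the "/" operator is doing in the fitting step, here is a small sketch on made-up data (the x values, the slope of 3, and the noise values are all invented for illustration and are not the lab's data). It assumes x and y are row vectors, in which case y/x gives the least-squares slope, the same number as the formula (y*x')/(x*x').

```matlab
% Toy data: hypothetical values chosen only for this illustration
x = 1:5;                                % row vector of x values
noise = [0.2 -0.1 0.3 -0.3 -0.1];       % fixed, made-up noise values
y = 3*x + noise;                        % noisy points near the line y = 3*x

% For row vectors, y/x solves a*x = y in the least-squares sense
aSlash = y / x;

% The same slope computed from the closed-form least-squares formula
aFormula = (y * x') / (x * x');

fprintf('slash: %.6f   formula: %.6f\n', aSlash, aFormula);
```

Both numbers should agree and land close to 3, the slope used to build the toy data.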
Check with the TA and show your plot.
Part II
Now we are going to write a function that computes the first order fit, and returns both the best fit slope and the mean error of the fit.
The syntax for writing a function that has *two* outputs is:
function [a, err] = regression(x,y);
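
As a sketch of how the two-output syntax works, here is a tiny function that has nothing to do with regression. The name rangeOf and its variables are made up for this example; it assumes the code is saved in its own file called rangeOf.m.

```matlab
function [lo, hi] = rangeOf(v)
% rangeOf  Return the smallest and largest entries of the vector v.
% (Hypothetical example function; save it in a file named rangeOf.m.)
    lo = min(v);   % first output: assigned before the function ends
    hi = max(v);   % second output: assigned before the function ends
end
```

Typing [lo, hi] = rangeOf([3 1 4 1 5]) at the command line sets lo to 1 and hi to 5. If you ask for only one output, you get just the first one.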
Using steps like you did above, write a function that:
- computes the best fit a (so that y is approximately a*x),
- uses that best fit to create "predicted y values" for each x value,
- computes the "error" for each point, the absolute value of the
difference between the y value and the predicted y value,
- computes the "mean error", which is the mean of all the
errors.
- When your function finishes, it will "return" whatever values are
in the variables "a" and "err", so make sure you have assigned
something to these variables.
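
The error steps above can be tried out on a few made-up numbers first. This is only a sketch: the values of y and yPred below are invented, and the variable names errs and meanErr are not required by the lab.

```matlab
y     = [2 4 6];          % made-up "measured" y values
yPred = [2.5 3.5 6];      % made-up predicted y values

errs    = abs(y - yPred); % error for each point: [0.5 0.5 0]
meanErr = mean(errs);     % mean error: (0.5 + 0.5 + 0)/3 = 1/3

disp(meanErr);
```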
Check your function with the values of x,y that you created in the first part of this lab.
Now, we are going to use this function to check the "small angle"
approximation. This is a common approximation used in physics
classes: for small angles theta (when theta is close to zero),
sin(theta) is approximately theta.
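
To get a feel for the approximation before fitting anything, here is a quick sketch that compares sin(theta) with theta at a few angles (in radians). The particular angles and the name relErr are choices made for this example only.

```matlab
theta  = [0.01 0.1 0.5 1.0];                        % angles in radians
relErr = abs(sin(theta) - theta) ./ abs(sin(theta)); % relative difference

% First row: the angles. Second row: how far off theta is from sin(theta).
disp([theta; relErr]);
```

The relative difference is tiny for the smallest angle and grows as theta moves away from zero.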
You might want to put the code below into a script so that you can
test, change, and re-run parts of it (to open a script, you might type
"edit testSmallAngle"), and then write the lines below there. For a
script you don't put any of the "function .... " business at the
beginning.
for ix = 1:10
    maxAngle = ix./10; % largest angle tested on this pass
    x = "make 100 points between -maxAngle and maxAngle";
    y = "compute the sin of those points";
    [a, err] = regression(x,y); % compute the regression line and the error
    plot(x,y,'b.'); % draw the points on the figure
    hold on;
    plot(x,a*x,'r-'); % draw the best fit line
    hold off;
    disp([a, err]); % show the fitted slope and the fitting error
end
Check with the TA and show your plot.