EXCEL HA HA HA HA

You think you know EXCEL, but have you used it the way I do yet?

1.  Here are some data I care a great deal about.  I have laundered the names
to protect the innocent.  But if you must know, and I am apparently happy to 
spill the beans, they are citation counts for professors in the engineering
school.

	937	564	405	189	175	
	307	259	241	120	83	
	2021	465	213	213	194	
	674	220	144	161	109	
	346	247	133	124		
	144	135	122	113		
	734	255	112	69	69	
	191	176	99	97	67	
	196	153	94	82	70	
	135	90	90	53	50	
	213	213	82	67	66	
	104	90	76	67	63	
	104	102	73			
	71	61	57	38	33	
	109	73	50	36		
	72	57	49	39	38	
	83	61	47	43		
	179	107	45	30	29	
	116	50	44	43	32	
	58	58	43	42		
	69	55	41	31	13	
	66	48	34	28	27	
	82	46	33	33		
	46	38	33	27	15	
	77	40	32	33	26	
	143	39	32	31	24	
	126	50	32	14		
	92	49	31	27	24	
	59	36	31	15	12	
	32	30	31	30	26	
	42	41	29	27		
	56	28	28	10	6	
	47	30	28	22	19	
	40	29	28	27	27	
	34	27	26	25		
	39	26	25	9	8	
	69	26	24	18	17	
	38	26	23	22		
	97	32	22	21	15	
	76	49	21	21	19	
	57	38	21	14	14	
	64	38	20	20	12	
	37	33	20	18		
	24	24	20	19	16	
	86	30	19	16	12	
	45	22	19	15	9	
	34	27	19	14	12	
	93	61	18	3		
	63	20	18	11	9	
	23	20	18	16	15	
	68	25	17	13	9	
	42	40	17	15	14	
	28	20	17	16	13	
	27	23	17	16	14	
	22	20	17	15	15	
	21	18	17	16		
	90	17	15	15	10	
	23	20	15	13	9	
	75	43	14	12	12	
	57	57	14	13	10	
	29	26	14	11		
	96	58	13	13	11	
	60	31	13	7	6	
	78	42	12	12	9	
	47	18	12	12		
	41	21	12	9	9	
	14	12	11	8	7	
	12	12	11	10	9	
	16	10	10		7	
	43	10	9	8	7	
	30	15	9	8	8	
	19	11	9	6	6	
	166	55	9	6	5	
	9	9	8	5	5	
	22	13	8	7	6	
	8	8	7	6		
	103	12	7	5		
	10	8	7	7		
	17	16	6	6	2	
	13	8	6	3		
	77	11	5			
	13	10	5	4	4	
	9	5	4	4	1	
	8	7	4			
	6	6	4	3	3	
	5	4	4	3	2	
	20	5	4	3		
	17	5	4			
	9		5	3	3	
	4	4	2		1	
	25	3	2	1		
	45	24				
	3	3				
	27	5				
	120	32				
	1					

	Import these data into excel.

2.  Sort each column separately.  Plot each column as a separate line
	in a single graph.  Your graph should have five lines,
	each decreasing as you go from left to right.  Some of the lines
	have fewer data points than the others.

	This step may require you to search around a bit to figure out how
	to sort, and to fumble around a bit with the chart wizard.  You
	might even learn, inadvertently, how to add a series to a chart.

3.  I believe that these data, when transformed logarithmically, are 
	normally distributed.  Don't you?  WHAT?

	Create an empty column to the right of each column with data.

	Use a log function (doesn't matter whether base-2 or base-e or
	base-10, does it?) to create a new column which contains entries
	that are the logs of the original numbers.

	Plot the logs as you did before, with five lines in one chart.

4.  Calculate the mean and standard deviation of each column and place
	the result below each column, after an empty row.

	How to do this?  Click on the Greek Sigma (the thing that looks
	like a W turned clockwise), and you can figure it out.

5.  I want to know how many values in your column are more than 3 standard
	deviations above the mean.

	You can use excel to calcluate the mean (m) plus three times the standard
	deviation (s), i.e., m+3s.

6.	Then you can take this calculated value and mark a 1 next to every cell
	that is above that calculated value.  How?  Just insert an empty
	column, place a 1 in the highest value if it is greater than your
	calcluated value, and copy the 1 all the way down until you reach
	a row where the data is NOT greater than the calculated value, m+3s.
	You can copy a value by clicking on the cell and dragging the lower 
	right corner down as far as you want to copy.

7.	Now you can use the SUM to count how many rows have a 1.

	Compute how many rows have values between m+2s and m+3s.
	How many rows have values between m and m+2s.

8.	How many rows have values between m and m-s.
	How many rows have values between m-s and m-2s.
	How many rows have values between m-2s and m-3s.

9.	Let your TA have a look at your numbers and see if the two of you can
	decide whether these counts, are normally distributed.

10.	In fact, you might want to chart them as a columnar bar chart (you will have
	six cells and it should look like a Bell curve).

	If you can validate my hypothesis for all three of the data columns, you
	can go.