Finding an Average from a Histogram

The Raw Data

A test worth 50 points was given, and the 83 students in the class attained the following scores.

Test Scores

If we histogram these scores to see what the distribution of grades looks like, we see the following:

By eye, the average looks 30ish and the width is about 5, though there's a group of students at the low end that may alter this first guess.

Test Scores as histogram data (x, y), where x = bin center and y = contents of the bin.
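For anyone who would rather build the (x, y) pairs in code than by hand, here is a minimal Python sketch of the binning. The function name `histogram`, the bin width of 5, and the example scores are all assumptions made up for illustration; they are not the class data or necessarily the binning used for the plot.

```python
from collections import Counter


def histogram(scores, bin_width=5, low=0):
    """Return sorted (bin_center, count) pairs for a list of scores.

    bin_width and low are illustrative choices, not the ones
    used for the plot in the text.
    """
    counts = Counter()
    for s in scores:
        index = int((s - low) // bin_width)          # which bin the score falls in
        center = low + (index + 0.5) * bin_width     # x = center of that bin
        counts[center] += 1                          # y = contents of the bin
    return sorted(counts.items())


# Hypothetical scores for illustration only; NOT the real test scores.
example_scores = [12, 27, 28, 31, 33, 34, 36, 41]
example_hist = histogram(example_scores)
print(example_hist)
```

Each pair that comes out is one row of histogram data: the bin center and how many scores landed in that bin.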

We are going to apply the classic technique to determine the average, the width of the distribution (standard deviation), and the error in the average (standard deviation of the mean). A good introductory text is John Taylor's "An Introduction to Error Analysis". It has a cool picture of a train crash on the cover, and John Taylor is a really nice guy to boot.

The way I like to remember this stuff is that we're trying to see how the data behaves quantitatively, as if it were distributed according to a Gaussian distribution. A Gaussian distribution goes like $y(x) = K e^{-(x - \bar{x})^2 / (2\sigma^2)}$, where K is some normalization constant, $\bar{x}$ is the average, and $\sigma$ is the width. Our job is to try and find the average and the sigma.

And to find the sigma, we find the standard deviation.
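In histogram terms, the average is the sum of x*y over all bins divided by N (the sum of the y's), and the sigma is the square root of the sum of y*(x - average)^2 divided by N - 1. The sketch below is one way to do that in Python. The function name `histogram_mean_sigma`, the `ddof` switch (1 divides by N - 1, 0 divides by N; the text does not say which convention the spreadsheet uses), and the toy bins are assumptions, not the real test data.

```python
import math


def histogram_mean_sigma(bins, ddof=1):
    """Return (N, average, sigma) for histogram data.

    bins is a list of (x, y) pairs: x = bin center, y = contents of the bin.
    ddof=1 divides by N - 1 (an assumed convention); ddof=0 divides by N.
    """
    n = sum(y for _, y in bins)                      # total number of entries
    mean = sum(x * y for x, y in bins) / n           # weighted average of bin centers
    variance = sum(y * (x - mean) ** 2 for x, y in bins) / (n - ddof)
    return n, mean, math.sqrt(variance)


# Toy bins for illustration only; NOT the real histogram.
toy_bins = [(10, 1), (20, 2), (30, 1)]
toy_n, toy_mean, toy_sigma = histogram_mean_sigma(toy_bins)
print(toy_n, toy_mean, toy_sigma)
```

Because every entry in a bin is treated as sitting at the bin center, the result is only as fine-grained as the binning.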

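Now, we can actually determine the average much better than to within one sigma. Sigma is what we typically mean when we quote an error on a single measurement. Let's take this to its logical limit: with N measurements, the error in the average (the standard deviation of the mean) is $\sigma / \sqrt{N}$.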

So, I made an Excel spreadsheet with the histogram data and worked out the average, etc. It can be located here. I found an average of 28.6 +/- 0.9 with a sigma of 7.9. If this stuff is still too weird, I have some lectures that start at the beginning of all this stuff, and cover error propagation too, here. Click on the notes and stuff under lectures 1 and 2.
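As a quick sanity check on the quoted error, here is a small Python sketch. It assumes the error on the average is the standard deviation of the mean, sigma / sqrt(N), with N = 83 students and sigma = 7.9 as quoted above. The function name is just a label picked for this example.

```python
import math


def standard_deviation_of_mean(sigma, n):
    """Error on the average of n measurements that each have spread sigma."""
    return sigma / math.sqrt(n)


# Numbers quoted in the text: sigma = 7.9, N = 83 students.
sdom = standard_deviation_of_mean(7.9, 83)
print(f"{sdom:.2f}")  # prints 0.87, which rounds to the quoted 0.9
```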