Explanation: Variance (σ²) is the average of the squared differences from the mean. For the data 2, 4, 6, 8, 10, the mean is M = (2 + 4 + 6 + 8 + 10)/5 = 6, so the variance is σ² = (16 + 4 + 0 + 4 + 16)/5 = 8.
What is the range of the following group of numbers: 10, 2, 5, 6, 7, 3, 4?
The variance is defined as the average squared difference of the scores from the mean.
Data set 1: 3, 4, 4, 5, 6, 8. Data set 2: 1, 2, 4, 5, 7, What are the variance and standard deviation of each data set? We'll construct a table to calculate the.
Now one way, this is kind of the most simple way, is the range. And you won't see it used too often, but it's kind of a very simple way of understanding how far is the spread between the largest and the smallest number. You literally take the largest number, which is 30 in our example, and from that, you subtract the smallest number.
So 30 minus negative 10, which is equal to 40, which tells us that the difference between the largest and the smallest number is 40, so we have a range of 40 for this data set.
Here, the range is the largest number, 12, minus the smallest number, which is 8, which is equal to 4. So here range is actually a pretty good measure of dispersion. We say, OK, both of these guys have a mean of 10. But when I look at the range, this guy has a much larger range, so that tells me this is a more dispersed set. But range is not always going to tell you the whole picture. You might have two data sets with the exact same range where still, based on how things are bunched up, they could have very different distributions of where the numbers lie.
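As a quick check of the two ranges above, here is a minimal Python sketch. It assumes the two data sets are -10, 0, 10, 20, 30 and 8, 9, 10, 11, 12 (the values used in the variance calculations), and the names `spread_out`, `bunched_up`, and `value_range` are made up for illustration.

```python
# Illustrative names; the values are assumed to be the two data sets
# discussed in the transcript.
spread_out = [-10, 0, 10, 20, 30]
bunched_up = [8, 9, 10, 11, 12]


def value_range(data):
    """Return the largest value minus the smallest value."""
    return max(data) - min(data)


range_spread_out = value_range(spread_out)  # 30 - (-10)
range_bunched_up = value_range(bunched_up)  # 12 - 8

print(range_spread_out, range_bunched_up)
```

Both sets have the same mean, so the difference in range is what signals that the first set is more dispersed.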
Now, the one that you'll see used most often is called the variance. Actually, we're going to see the standard deviation in this video. That's probably what's used most often, but it has a very close relationship to the variance. So the symbol for the variance-- and we're going to deal with the population variance.
Once again, we're assuming that this is all of the data for our whole population, that we're not just sampling, taking a subset, of the data. So the variance, its symbol is literally this sigma, this Greek letter, squared. That is the symbol for variance. And we'll see that the sigma letter actually is the symbol for standard deviation. And that is for a reason.
But anyway, the definition of a variance is you literally take each of these data points, find the difference between those data points and your mean, square them, and then take the average of those squares. I know that sounds very complicated, but when I actually calculate it, you're going to see it's not too bad.
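Written out as a formula, the verbal definition above looks roughly like this. The symbols N (number of data points), x_i (the i-th data point), and μ (the population mean) are notation assumed here for illustration; the transcript itself only names σ² and σ.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i,
\qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2,
\qquad
\sigma = \sqrt{\sigma^2}
```

The last relation is why the same Greek letter is used for both: the standard deviation is the square root of the variance.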
So remember, the mean here is 10. So I take the first data point, negative 10. Let me do it over here. Let me scroll down a little bit.
From that, I'm going to subtract our mean and I'm going to square that. So I just found the difference from that first data point to the mean and squared it. And that's essentially to make it positive.
Plus the second data point, 0, minus the mean-- this is the mean; this is that 10 right there-- squared, plus 10 minus 10 squared-- that's the middle 10 right there-- plus 20 minus 10 squared, plus 30 minus 10 squared. So these are the squared differences between each number and the mean. This is the mean right there.
I'm finding the difference between every data point and the mean, squaring them, summing them up, and then dividing by that number of data points. So I'm taking the average of these numbers, of the squared distances.
So when you say it kind of verbally, it sounds very complicated. But you're taking each number, finding the difference between that and the mean, squaring it, and taking the average of those. So I have 1, 2, 3, 4, 5 terms, divided by 5. So what is this going to be equal to? Negative 10 minus 10 is negative 20. Negative 20 squared is 400. Plus 0 minus 10 is negative 10, squared is 100. Plus 10 minus 10 is 0, squared is 0. Plus 20 minus 10 is 10, squared is 100. Plus 30 minus 10, which is 20, squared is 400. All of that over 5.
And what do we have here? 400 plus 100 plus 0 plus 100 plus 400 is 1,000, and 1,000 divided by 5 is 200. So in this situation, our variance is going to be 200. That's our measure of dispersion there. And let's compare it to this data set over here. Let's compare it to the variance of this less-dispersed data set. So let me scroll over a little bit so we have some real estate, although I'm running out. Maybe I could scroll up here. Let me calculate the variance of this data set.
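The calculation just walked through can be sketched step by step in Python. This is only an illustration, assuming the data set is -10, 0, 10, 20, 30 and treating it as the whole population; the variable names are hypothetical.

```python
# Assumed data set from the walkthrough, treated as a full population.
data = [-10, 0, 10, 20, 30]

# Step 1: the mean has to be computed first.
mean = sum(data) / len(data)

# Step 2: difference of each data point from the mean, squared.
squared_diffs = [(x - mean) ** 2 for x in data]

# Step 3: the average of those squared differences is the variance.
variance = sum(squared_diffs) / len(data)

print(mean)           # mean of the data
print(squared_diffs)  # one squared difference per data point
print(variance)       # population variance
```

Dividing by the number of data points, rather than by one less, matches the population variance the transcript says it is using.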
So we already know its mean. So the variance of this data set is going to be equal to 8 minus 10 squared plus 9 minus 10 squared plus 10 minus 10 squared plus 11 minus 10-- let me scroll up a little bit-- squared plus 12 minus 10 squared.
Remember, that 10 is just the mean that we calculated. You have to calculate the mean first.
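Divided by-- we have 1, 2, 3, 4, 5 squared differences. So this is going to be equal to-- 8 minus 10 is negative 2, squared is positive 4. 9 minus 10 is negative 1, squared is positive 1. 10 minus 10 is 0; square it, you still get 0. 11 minus 10 is 1; square it, you get 1. 12 minus 10 is 2; square it, you get 4. Add those up and you get 10, and 10 divided by 5 is 2.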
So the variance here-- let me make sure I got that right. So the variance of this less-dispersed data set is a lot smaller. The variance of this data set right here is only 2.
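To compare the two data sets side by side, one option is Python's standard `statistics` module, whose `pvariance` and `pstdev` functions compute the population variance and population standard deviation. The sketch below assumes the two data sets are -10, 0, 10, 20, 30 and 8, 9, 10, 11, 12; the variable names are made up for illustration.

```python
import statistics

# Assumed data sets from the walkthrough; both have a mean of 10.
spread_out = [-10, 0, 10, 20, 30]
bunched_up = [8, 9, 10, 11, 12]

# Population variance: average squared difference from the mean.
var_spread_out = statistics.pvariance(spread_out)
var_bunched_up = statistics.pvariance(bunched_up)

# Population standard deviation: square root of the population variance.
sd_spread_out = statistics.pstdev(spread_out)
sd_bunched_up = statistics.pstdev(bunched_up)

print(var_spread_out, var_bunched_up)
print(sd_spread_out, sd_bunched_up)
```

The means are identical, so the much larger variance of the first set reflects only how far its values sit from that mean.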
So that gave you a sense.
There are four frequently used measures of variability: the range, interquartile range, variance, and standard deviation. In the next few paragraphs, we will look at each of these four measures of variability in more detail.
Range
The range is the simplest measure of variability to calculate, and one you have probably encountered many times in your life. The range is simply the highest score minus the lowest score. What is the range of the following group of numbers: 10, 2, 5, 6, 7, 3, 4? The highest number is 10 and the lowest number is 2, so 10 - 2 = 8. The range is 8.
What is the range? The highest number is 99 and the lowest number is 23, so 99 - 23 equals 76; the range is 76. Now consider the two quizzes shown in Figure 1. On Quiz 1, the lowest score is 5 and the highest score is 9.
Therefore, the range is 4. The range on Quiz 2 was larger, at 6.
The interquartile range is computed as follows: IQR = 75th percentile - 25th percentile. For Quiz 1, the interquartile range is 2. For Quiz 2, which has greater spread, the 75th percentile is 9, the 25th percentile is 5, and the interquartile range is 4. Recall that in the discussion of box plots, the 75th percentile was called the upper hinge and the 25th percentile was called the lower hinge. Using this terminology, the interquartile range is referred to as the H-spread.
A related measure of variability is called the semi-interquartile range. The semi-interquartile range is defined simply as the interquartile range divided by 2.
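As a small illustration of these two definitions, the sketch below takes the 75th and 25th percentiles as given, using the Quiz 2 values of 9 and 5 from the text, rather than computing percentiles from raw scores, since percentile conventions differ between tools. The function names are hypothetical.

```python
def interquartile_range(p75, p25):
    """Interquartile range: 75th percentile minus 25th percentile."""
    return p75 - p25


def semi_interquartile_range(p75, p25):
    """Semi-interquartile range: the interquartile range divided by 2."""
    return interquartile_range(p75, p25) / 2


# Quiz 2 percentiles as stated in the text.
quiz2_iqr = interquartile_range(9, 5)
quiz2_semi_iqr = semi_interquartile_range(9, 5)

print(quiz2_iqr, quiz2_semi_iqr)
```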