The relationship between variables - Draw the correct conclusions
When examining the relationship between two continuous variables, always look at the scatterplot first to see whether the relationship is linear or non-linear. Before drawing lines of best fit, it is useful to consider the possible kinds of relationship the data points for two variables can show. Key words: line graph, variable, linear, linear relationship, non-linear.
So, for example, in this one here, in the horizontal axis, we might have something like age, and then here it could be accident frequency. And I'm just making this up. And I could just show these data points, maybe for some kind of statistical survey, that, when the age is this, whatever number this is, maybe this is 20 years old, this is the accident frequency.
And it could be a number of accidents per hundred. And that, when the age is 21 years old, this is the frequency. And so, these data scientists, or statisticians, went and plotted all of these in this scatter plot. This is often known as bivariate data, which is a very fancy way of saying, hey, you're plotting things that take two variables into consideration, and you're trying to see whether there's a pattern with how they relate.
Bivariate relationship linearity, strength and direction (video) | Khan Academy
And what we're going to do in this video is think about, well, can we try to fit a line, does it look like there's a linear or non-linear relationship between the variables on the different axes? How strong is that relationship? Is it a positive, is it a negative relationship? And then, we'll think about this idea of outliers. So let's just first think about whether there's a linear or non-linear relationship. And I'll get my little ruler tool out here. So, this data right over here, it looks like I could put a line through it that gets pretty close to the data.
You're not gonna, it's very unlikely you're gonna be able to go through all of the data points, but you can try to get a line, and I'm just doing this. There's more numerical, more precise ways of doing this, but I'm just eyeballing it right over here.
And it looks like I could plot a line that looks something like that, that goes roughly through the data. So this looks pretty linear. And so I would call this a linear relationship. And since, as we increase one variable, it looks like the other variable decreases, this is a downward-sloping line. I would say this is a negative linear relationship. And this one looks pretty strong.
So, because the dots aren't that far from my line. This one gets a little bit further, but it's not, there's not some dots way out there. And so, most of 'em are pretty close to the line. So I would call this a negative, reasonably strong linear relationship. Negative, strong, I'll call it reasonably, I'll just say strong, but reasonably strong, linear, linear relationship between these two variables.
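The eyeballing above can be made a bit more numerical. As a minimal sketch, assuming made-up age and accident-frequency numbers (not taken from the video) and an arbitrary cutoff of 0.7 for calling a relationship "strong", the following Python fits a least-squares line and computes the Pearson correlation coefficient. The sign of the coefficient gives the direction, and its size gives a rough sense of strength. The function names are hypothetical.

```python
from math import sqrt


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient, a value between -1 and 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


def describe(xs, ys, strong_cutoff=0.7):
    """Label direction and strength; the cutoff is an assumption, not a standard."""
    r = pearson_r(xs, ys)
    direction = "positive" if r > 0 else "negative"
    strength = "strong" if abs(r) >= strong_cutoff else "weak"
    return f"{direction}, {strength}"


# Made-up data: age in years, accidents per hundred drivers.
age = [20, 21, 22, 23, 24, 25]
accidents = [9.0, 8.4, 7.9, 7.1, 6.8, 6.0]

slope, intercept = fit_line(age, accidents)
print("slope:", round(slope, 3))
print("r:", round(pearson_r(age, accidents), 3))
print(describe(age, accidents))
```

A correlation coefficient only measures how close the points are to a straight line, so it is still worth looking at the scatter plot before trusting the label.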
Now, let's look at this one. And pause this video and think about what this one would be for you. I'll get my ruler tool out again. And it looks like I can try to put a line, it looks like, generally speaking, as one variable increases, the other variable increases as well, so something like this goes through the data and approximates the direction. And this looks positive. As one variable increases, the other variable increases, roughly.
So this is a positive relationship.
But this is weak. A lot of the data is off, well off of the line. But I'd say this is still linear. It seems that, as we increase one, the other one increases at roughly the same rate, although these data points are all over the place.
So, I would still call this linear. Now, there's also this notion of outliers. If I said, hey, this line is trying to describe the data, well, we have some data that is fairly off the line. So, for example, even though we're saying it's a positive, weak, linear relationship, this one over here is reasonably high on the vertical variable, but it's low on the horizontal variable. And so, this one right over here is an outlier.
It's quite far away from the line. You could view that as an outlier. And this is a little bit subjective. Outliers, well, what looks pretty far from the rest of the data? This could also be an outlier. Let me label these. Now, pause the video and see if you can think about this one. Is this positive or negative, is it linear, non-linear, is it strong or weak? I'll get my ruler tool out here. So, this goes here.
This information is plotted in Panel (b). This is a nonlinear relationship; the curve connecting these points in Panel (c) (loaves of bread produced) has a changing slope.
Inspecting the curve for loaves of bread produced, we see that it is upward sloping, suggesting a positive relationship between the number of bakers and the output of bread. But we also see that the curve becomes flatter as we travel up and to the right along it; it is nonlinear and describes a nonlinear relationship.
How can we estimate the slope of a nonlinear curve? After all, the slope of such a curve changes as we travel along it. We can deal with this problem in two ways.
One is to consider two points on the curve and to compute the slope between those two points. Another is to compute the slope of the curve at a single point. When we compute the slope of a curve between two points, we are really computing the slope of a straight line drawn between those two points.
They are the slopes of the dashed-line segments shown. These dashed segments lie close to the curve, but they clearly are not on the curve. After all, the dashed segments are straight lines. When we compute the slope of a nonlinear curve between two points, we are computing the slope of a straight line between those two points. Here the lines whose slopes are computed are the dashed lines between the pairs of points. Every point on a nonlinear curve has a different slope.
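The slope between two points on a curve is just rise over run for the straight line joining them. As a minimal sketch, assuming hypothetical (bakers, loaves) points that are not taken from the figure, the calculation could look like this in Python:

```python
def slope_between(p, q):
    """Slope of the straight line drawn between points p and q."""
    (x1, y1), (x2, y2) = p, q
    return (y2 - y1) / (x2 - x1)


# Hypothetical points on a bread production curve: (bakers, loaves per day).
points = [(0, 0), (1, 400), (2, 700), (3, 900), (4, 1000)]

# Slope of each straight segment joining neighbouring points.
segment_slopes = [slope_between(p, q) for p, q in zip(points, points[1:])]
print(segment_slopes)  # [400.0, 300.0, 200.0, 100.0]
```

With these assumed numbers each segment is flatter than the one before it, which is what a curve that flattens as it moves up and to the right looks like.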
To get a precise measure of the slope of such a curve, we need to consider its slope at a single point. To do that, we draw a line tangent to the curve at that point. A tangent line is a straight line that touches, but does not intersect, a nonlinear curve at only one point. The slope of a tangent line equals the slope of the curve at the point at which the tangent line touches the curve. Consider point D in Panel (a) of the figure. We have drawn a tangent line that just touches the curve showing bread production at this point.
It passes through points labeled M and N. The vertical change between these points equals loaves of bread; the horizontal change equals two bakers. The slope of our bread production curve at point D equals the slope of the line tangent to the curve at this point.
In Panel (b), we have sketched lines tangent to the curve for loaves of bread produced at points B, D, and F. Notice that these tangent lines get successively flatter, suggesting again that the slope of the curve is falling as we travel up and to the right along it. In Panel (a), the slope of the tangent line is computed for us. Generally, we will not have the information to compute slopes of tangent lines.
We will use them as in Panel (b), to observe what happens to the slope of a nonlinear curve as we travel along it. We see here that the slope falls (the tangent lines become flatter) as the number of bakers rises. Notice that we have not been given the information we need to compute the slopes of the tangent lines that touch the curve for loaves of bread produced at points B and F.
In this text, we will not have occasion to compute the slopes of tangent lines. Either they will be given or we will use them as we did here—to see what is happening to the slopes of nonlinear curves.
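For readers who do want to see a tangent slope computed, here is a minimal sketch. It assumes a hypothetical smooth production curve, `loaves(bakers) = 450 * bakers - 50 * bakers ** 2`, which is not the curve in the figure, and approximates the slope at a point with a central difference, that is, the slope between two points placed very close together on either side of it.

```python
def loaves(bakers):
    """Hypothetical production curve; an assumption for illustration only."""
    return 450 * bakers - 50 * bakers ** 2


def tangent_slope(f, x, h=1e-4):
    """Approximate the slope of f at x using two points a tiny distance h either side."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Approximate tangent slopes at three points along the assumed curve.
slopes = [tangent_slope(loaves, b) for b in (1, 2, 3)]
for b, s in zip((1, 2, 3), slopes):
    print(b, "bakers:", round(s, 3), "loaves per baker")
```

On this assumed curve the tangent slope falls as the number of bakers rises, matching the pattern of successively flatter tangent lines described above.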
Nonlinear Relationships and Graphs without Numbers
In the case of our curve for loaves of bread produced, the fact that the slope of the curve falls as we increase the number of bakers suggests a phenomenon that plays a central role in both microeconomic and macroeconomic analysis. As we add workers (in this case bakers), output (in this case loaves of bread) rises, but by smaller and smaller amounts.
Another way to describe the relationship between the number of workers and the quantity of bread produced is to say that as the number of workers increases, the output increases at a decreasing rate.
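"Increases at a decreasing rate" can be checked directly on a table of outputs: each added worker must raise output, and each gain must be smaller than the one before. The following is a minimal sketch, assuming hypothetical output figures and hypothetical function names:

```python
def marginal_changes(outputs):
    """Change in output from each additional worker."""
    return [later - earlier for earlier, later in zip(outputs, outputs[1:])]


def increases_at_decreasing_rate(outputs):
    """True if output keeps rising while each successive gain is smaller than the last."""
    gains = marginal_changes(outputs)
    rising = all(g > 0 for g in gains)
    shrinking = all(b < a for a, b in zip(gains, gains[1:]))
    return rising and shrinking


# Hypothetical loaves produced with 0, 1, 2, 3 and 4 bakers.
loaves_by_bakers = [0, 400, 700, 900, 1000]

print(marginal_changes(loaves_by_bakers))              # [400, 300, 200, 100]
print(increases_at_decreasing_rate(loaves_by_bakers))  # True
```

A linear relationship fails this check because every gain is the same size, so the gains never shrink.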
Indeed, much of our work with graphs will not require numbers at all. We turn next to look at how we can use graphs to express ideas even when we do not have specific numbers.
Graphs Without Numbers
We know that a positive relationship between two variables can be shown with an upward-sloping curve in a graph. A negative or inverse relationship can be shown with a downward-sloping curve. Some relationships are linear and some are nonlinear. We illustrate a linear relationship with a curve whose slope is constant; a nonlinear relationship is illustrated with a curve whose slope changes.
Using these basic ideas, we can illustrate hypotheses graphically even in cases in which we do not have numbers with which to locate specific points. Consider first a hypothesis suggested by recent medical research that relates daily fruit and vegetable consumption to life expectancy. We can show this idea graphically. Daily fruit and vegetable consumption (measured, say, in grams per day) is the independent variable; life expectancy (measured in years) is the dependent variable.