Hello everyone, it's Mr Millar here.
In this lesson, we're going to be looking at correlation.
So first of all, I hope that you are all doing well. In the last lesson, we looked at scatter graphs as a way of representing bivariate data.
And we saw that scatter graphs were really useful for seeing the relationship between two variables.
In this lesson, we're going to take it a step further, by having a look at what we call correlation, which is essentially how we describe the relationship that we see between the two variables.
And correlation is a really important word. If you ever read the newspapers, or any kind of scientific journal, or anything like that, you will see this word, correlation, come up a lot, so we're going to find out what it means. But first of all, let's have a look at the try this task together.
So, here we've got a scatter graph, showing the relationship between the number of photos and battery life, and what I want you to do, is think about how could you describe the connection between the variables shown on the scatter graph? And also, how many ways can you complete the sentence frame below, so pause the video now, to have a go at thinking about these two questions.
Okay, great, so we actually saw this scatter graph at some point last lesson, and we can see that as we take more photos, the battery life decreases, which makes sense, because taking photos on a camera uses up battery life.
So, that is what is going on here, and in terms of the sentence frame, we can either say, "As the number of photos increases, the battery life decreases." Or we could say it the other way round.
So, "As the number of photos decreases, the battery life increases." So, that is the try this slide.
Now, let's have a look at this idea of correlation.
Okay, so, for this connect slide, we have got three different scatter graphs, that each show a different type of correlation.
Positive correlation, negative correlation, and no correlation.
Now, the first one, positive correlation, says, "As one variable increases, so does the other." So the example here is that, as cinema attendance increases, so does the amount of popcorn sold.
The next one, negative correlation, well, that says that, "As one variable increases, the other one decreases." So, the more days you're absent from school, the lower your average test score will be, for example.
And the final one, no correlation, is also really important.
And that is when there is no apparent pattern between the two variables.
So, for example, if we look at the distance that a pupil lives from school, and their score in the test, well, there shouldn't be any pattern, any relationship between those two things, so as we can see, the data points are kind of all over the place, there is no clear pattern.
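As an aside, here is a minimal sketch of how the three types could be told apart with a number rather than by eye. It assumes we use the Pearson correlation coefficient, which this lesson does not cover, and a cutoff of 0.3 that is simply an illustrative choice. The function names and all of the data values are made up for the example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def describe(xs, ys, cutoff=0.3):
    """Label the correlation; the cutoff is an assumed, illustrative threshold."""
    r = pearson_r(xs, ys)
    if r > cutoff:
        return "positive correlation"
    if r < -cutoff:
        return "negative correlation"
    return "no correlation"

# Made-up data loosely matching the three examples on the slide.
attendance = [100, 150, 200, 250, 300]
popcorn_sold = [40, 65, 80, 105, 120]

days_absent = [0, 2, 4, 6, 8]
average_score = [90, 82, 75, 66, 60]

distance_km = [1, 2, 3, 4, 5, 6]
test_score = [60, 80, 65, 75, 80, 60]

print(describe(attendance, popcorn_sold))
print(describe(days_absent, average_score))
print(describe(distance_km, test_score))
```

A coefficient near 1 goes with points rising together, near -1 with one variable falling as the other rises, and near 0 with points that are all over the place.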
And it's also worth pointing out that when we see a positive correlation, we can see that there is a positive gradient.
If we imagined a kind of a straight line, sort of connecting these points together, there'd be a positive gradient, and the same is true for a negative correlation, there would be a negative gradient.
The other thing that I wanted to point out here is that we can have a different strength of correlation.
So, for example, in the first one, we would say that there is a strong positive correlation, because, if we imagine a straight line connecting all of these points, something like this, then all of the data points are very, very close to that line, so there is a strong positive correlation.
If, however, we had a graph that looks a little bit more like this.
Well, we would say that there is still a positive correlation, because as one variable increases, so does the other, but this one is a much weaker positive correlation, because the data points are not so close together.
So, positive correlation is when, as one variable increases, so does the other, but it can either be stronger or weaker, depending on how close the data points are together, or how close the data points are to a line of best fit, going through them.
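To make the idea of strength a bit more concrete, here is a rough sketch that fits a straight line to two made-up sets of points and measures how far, on average, the points sit from that line. It assumes a least-squares line stands in for the line we imagined drawing, and the helper names and data values are invented for illustration.

```python
def fit_line(xs, ys):
    """Return (gradient, intercept) of the least-squares straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    return gradient, intercept

def average_distance_from_line(xs, ys):
    """Mean vertical distance between the points and the fitted line."""
    gradient, intercept = fit_line(xs, ys)
    distances = [abs(y - (gradient * x + intercept)) for x, y in zip(xs, ys)]
    return sum(distances) / len(distances)

# Made-up data: both sets rise overall, but one hugs the line much more closely.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
tight_ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0, 13.9, 16.1]
scattered_ys = [5, 1, 9, 4, 12, 8, 17, 11]

for label, ys in [("tight", tight_ys), ("scattered", scattered_ys)]:
    gradient, _ = fit_line(xs, ys)
    spread = average_distance_from_line(xs, ys)
    print(label, "gradient positive:", gradient > 0, "average distance:", round(spread, 2))
```

Both sets give a positive gradient, so both show a positive correlation, but the smaller average distance for the first set is what we mean by a stronger correlation. This comparison is only fair because both sets use the same scale on each axis.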
Anyway, make sure that you have got these three definitions down, before we move on to the independent task.
Okay, so, two questions for you to have a think about here.
The first one says, "Write a sentence explaining what each graph is showing, and the type of correlation." So, two graphs for you to have a think about, and the other one says, "On each set of axes, plot eight to ten points, based on the correlation that you would expect to find." So, for example, the first one, the hair length and maths test results, well, you wouldn't expect any kind of relationship between the two of them, so you'd expect there to be no correlation, and you would plot your points kind of like that, all over the place.
There's no pattern.
Hope this is clear, pause the video now, four or five minutes, to have a go at these questions.
Great, well done, let's go through it.
So question one, first of all, well, clearly there is a positive correlation here, because the more hours the heating is on, the higher the heating bill would be, which is pretty straightforward.
And we can see that there is a very strong positive correlation here, because the data points are very close together, they're very close to the line of best fit.
And the next one, the test score versus distance, well, there doesn't seem to be any kind of pattern here.
We would say that there is no correlation for that one there.
And for question two, well, we've done the first one already.
There'd be no correlation between the maths test results and the hair length, but for the other ones, we would expect to see a correlation, so, ice cream sales versus temperature, well, clearly the hotter it gets, the more ice cream we'd expect to sell, so we'd expect, say, a positive correlation like that.
You could argue that it might be a very strong positive correlation, it might be slightly weaker, so it really depends on what you think, because obviously the number of ice cream sales might depend on other things, not just temperature.
The next one, time spent brushing teeth versus number of cavities.
Well, clearly the longer you spend brushing your teeth, the fewer cavities you would expect to have, so make sure that you brush your teeth. So we'd expect a negative correlation, which would look a bit like that.
The final one, height versus shoe size, excuse me.
Well, clearly, the taller you are, the bigger you would expect your shoe size to be, so we'd expect a kind of positive relationship here as well.
Anyway, that is that, hope that you found that nice and straightforward.
Now let's have a look at the explore task.
Okay, so, explore task, we have got eight different graphs here.
So first of all, you need to sort these graphs into groups, based on the correlation shown.
So, let's call these graphs A, B, C, and D.
E, F, G, and H.
So sort these graphs into different groups, based on the correlation, and then place them in order by strength of correlation.
So remember what we said.
If the correlation is strong, then the data points are close together.
If the correlation is weak, then they're further apart, and finally, think of variables that might match each graph.
So, for example, for the second graph, you could have the one that we had already, about the number of hours that the heating is on, compared to the cost of the heating bill.
Okay, pause the video now, five or six minutes to have a go at this explore task.
Okay, great, let's go through these.
So, first of all, let's have a think about the ones that have a positive correlation.
So, a positive correlation, where, as one variable increases, so does the other, well, A is definitely positive, B is definitely positive, and so is E, I would say.
But I would definitely say that, looking at E, compared to A and B, E has a much weaker positive correlation, because the data points are further apart.
What about a negative correlation? Well, definitely D, definitely G, and definitely H.
And I don't think any of the other ones have a negative correlation.
Now clearly D and G have a very strong negative correlation, and H, well, H is a bit of an interesting one, because if you imagine plotting a line, you couldn't really plot a straight line through those points, so you might need a little bit of a curved line, which is interesting, but they definitely all have a negative correlation.
And then finally, no correlation, well, that is going to be C, that is going to be F, as well.
And in terms of the variables that might match each graph, well, that is really up to you.
For example, for G, you could have temperature here, on the X axis, so, as it gets hotter, what goes down? Well, what might go down is your heating bill, because you don't need to put the heating on, if it's hot, whereas if it's cold, you need to put the heating on, so your heating bill would be bigger.
That might be one example, but I hope that you found lots of interesting examples yourself.
Okay, that is it for today's lesson, where we have looked at different kinds of correlation.
Next time we're going to be extending our learning by looking at lines of best fit.
So that is it for today, thanks very much for watching, and see you next time, bye-bye.