Welcome to lesson one of data science.

I'm Ben, and I'm really excited that you've chosen to study this unit.

'Cause in this unit, we're going to look at how data is used to change the world around you.

We'll look at global datasets, as well as local data sets that are perhaps more relevant to you in your local area.

You'll learn how to visualise data, spot patterns, identify trends.

So hopefully you'll feel empowered to use that data to make decisions and solve problems. So all you'll need for this lesson is a computer and a web browser, and of course a whole bunch of enthusiasm.

So once you've found that, let's get started.

Okay, so in this lesson we're actually going to explore what is meant by data science and what it actually is.

We'll also explain how visualising data can help us to identify patterns and trends in order to gain some insights.

Okay, so what actually is data science? Well, let's say data science is extracting meaning from large data sets in order to gain insights to support decision-making.

So I think a key there is, large datasets.

So we're not just talking about some data that we might see in a spreadsheet with a few rows.

We're talking about large amounts of data that, as a human being, if we were to look at it, might look a bit confusing, and it might take us a long time to read through all of it to try and make some kind of sense of it.

So data science allows us to maybe extract some meaning out of that and provide some insights or represent the data in a different way.

So once those insights have been created, they'll help us to learn from the data and maybe make some decisions.

So let's get started and actually give you a task to do, which is to look at some data, as you can see on the screen here.

So for task one on your worksheet, I've given you this data to look at, and what I'd like you to do is spend a little bit of time really exploring the data, looking at it in a bit of depth, and seeing if you can work out what the data is actually showing you.

Can you extract any information from it, and maybe is there some kind of story that appears when you look at the data a little bit more, okay? So what I'd like you to do is head over to task one on your worksheet now, pause the video, and see if you can explore the data and answer the questions.

Don't spend more than five minutes on this activity, so if you're struggling after five minutes, then unpause the video and we'll just move on.

Okay, so what did you actually learn from the data? Were you able to extract any meaning from it? Or get some information, or maybe even tell a story from the data? Well, actually this data comes from a really famous example from a man called Charles Joseph Minard.

Now Joseph Minard used these numbers in 1869 to find meaning, to tell a story with the data.

So I wonder whether or not you were able to tell a story with it.

Well, actually the story emerges if I give you a bit of context, because obviously what you didn't have there, and Joseph Minard did have, was a bit of context, which obviously helped him with this.

So the data you looked at before relates to Napoleon's march on Russia in 1812.

So Joseph Minard then represented this data in a different way to create what is widely regarded as the best statistical graph of all time.

So, let's have a look at that now.

So this is what Joseph Minard actually produced.

He actually produced this visualisation, a graphical representation of the data that you were looking at.

Now, again, we're looking at Napoleon's march on Russia.

So with just a little bit of help here, we can work out what this graph's showing us.

Now the width of the line represents the number of troops that were marching from France to Russia.

So on the left-hand side is their starting point and the width of that beige looking line represents the number of troops they had.

And then as we go towards Moscow, you might notice that the line is narrowing slightly, so what do you think that tells us? Well, that tells us that the number of troops that the French army had was diminishing.

So obviously, along the way, things were happening that were, you know, killing off the soldiers, essentially.

And then the black line represents their journey back home.

We can also see some geographical information on here and we can also see the temperature as well.

So this line at the bottom, the mini line graph at the bottom, looks like it's going up, but that actually represents the temperature dropping.

So, starting at the end point, when we get to Moscow, the temperature is actually -30 degrees.

So I wanted to highlight some key points on this graph to really draw out a couple of those things.

So let's have a look at this one here.

So what I've done on this slide is I've highlighted a couple of things that I wanted to point out to you, the first being when they left for their destination.

You can see from the circle and arrow on the top left-hand side there that there were 422,000 troops.

Now, by the time they reached Moscow they only had a hundred thousand troops, which means that at least 322,000 troops had died before they even reached Moscow.

And then the black line coming back we can see that by the end of that campaign they were left with only 10,000 troops.

And that's all highlighted on this graph, right? You know, it's not only represented in numbers, but we can also see the thickness of the line, which helps.

So the other thing I wanted to point out was this.

So these are the geographical elements that you can see on the map and what I've done is highlighted a particular area of that.

So you can see, with the green circle, I've zoomed in so you can see it on the right-hand side.

Now what I find interesting about that is the fact that you can see the thickness of the line, which is on the right-hand side, shows you that there were 50,000 troops.

Now that wiggly line shows you that that's actually a river, and that river, I think my pronunciation is not amazing, but I think it's the Berezina River, which is near Minsk.

So they had 50,000 troops, but by the time they had crossed the river they had 28,000 troops, which suggests that 22,000 troops had died crossing the river.

Now, of course, that's not the only place where we can see geography on that map but that's just a highlighted part to it where we can, again, extract some meaning out of that that might have been difficult for us to determine by just looking at that initial data.

So what all of this is showing us is that when we looked at that initial data, we found it very difficult to extract that meaning and find the story.

But as soon as we visualise it, it becomes much easier for us to interpret the data and, you know, get some kind of insight as to what actually happened during Napoleon's march on Moscow.
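To see how even a very crude picture helps, the troop numbers quoted in this lesson can be turned into text bars, a bit like the changing width of Minard's line. This is only a minimal sketch: the stage labels and the `bar` and `losses` helpers are made up for illustration, and the figures are just the ones mentioned above.

```python
# Troop counts quoted in this lesson, in the order the army reached each point.
# The stage labels are informal descriptions added for this sketch.
stages = [
    ("Start of the march", 422_000),
    ("Arrival in Moscow", 100_000),
    ("Before the river crossing", 50_000),
    ("After the river crossing", 28_000),
    ("End of the campaign", 10_000),
]


def bar(troops, troops_per_symbol=10_000):
    """Return a text bar whose length is proportional to the troop count."""
    return "#" * max(1, round(troops / troops_per_symbol))


def losses(stages):
    """Return the drop in troop numbers between each pair of consecutive stages."""
    return [earlier[1] - later[1] for earlier, later in zip(stages, stages[1:])]


# One row per stage: label, number of troops, and a bar showing the size.
for label, troops in stages:
    print(f"{label:<26} {troops:>7,} {bar(troops)}")

print("Losses between stages:", losses(stages))
```

The shrinking bars tell the same story as the numbers, but you can take it in at a glance.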

Now another really famous example was John Snow's visualisation.

Now in 1854, there was an outbreak of cholera in the Soho area of London.

Now, at the time it was widely believed that cholera was caused by pollution in the air.

Now John Snow didn't quite believe the theory, but he needed a way to disprove it.

Now he made some observations that gave him some evidence for discounting this belief, but he couldn't prove this to other people.

So your job now is to help John Snow prove his theory.

Now, John Snow made a dot map of Soho.

Now, essentially, making a dot map means taking a map and placing a dot on it every time something occurs.

Now, the dots that you can see on the shaded parts of the map here each represent a cholera-related death and where it occurred.

And what I'd like you to do is look at the map in a bit more detail and answer the questions on the worksheet to help John Snow prove his theory.

So I'd like you to pause the video now, head over to task 2 on your worksheet and see if you can answer the questions and help John Snow disprove that theory.

Okay, so when you've done that unpause the video and we'll continue.

Okay, so how did you get on with that? Were you able to help John Snow in trying to work out what was going on with the cholera outbreak in Soho? Now what John Snow did was highlight on the map the position of the water pump on Broad Street.

Now that was really important because he had gone and done some research and he'd worked out that so many people were dying around the Broad Street area, so what was different about Broad Street compared to the other areas of Soho? And he worked out that they were all getting their water from the same source, and that was the pump that he highlighted on Broad Street.

Now that visualisation helped him to support his theory that the people who had died had used this water pump for drinking water.
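The idea behind a dot map, and the question of which water pump each dot is closest to, can be sketched in a few lines of code. This is a toy illustration only: the grid positions, the pump names and the helper functions below are all invented, and they are not John Snow's real data or his actual method.

```python
from collections import Counter
from math import dist

# Made-up grid positions for illustration only; these are NOT John Snow's real data.
pumps = {"Pump A": (2, 2), "Pump B": (8, 7)}
deaths = [(1, 2), (2, 3), (3, 2), (2, 1), (3, 3), (7, 7)]


def nearest_pump(location, pumps):
    """Return the name of the pump closest to a location (straight-line distance)."""
    return min(pumps, key=lambda name: dist(location, pumps[name]))


def deaths_per_pump(deaths, pumps):
    """Count how many recorded deaths fall nearest to each pump."""
    return Counter(nearest_pump(location, pumps) for location in deaths)


def dot_map(deaths, pumps, size=10):
    """Draw a text dot map: '.' is empty, 'o' is a death, 'P' is a pump."""
    grid = [["." for _ in range(size)] for _ in range(size)]
    for x, y in deaths:
        grid[y][x] = "o"
    for x, y in pumps.values():
        grid[y][x] = "P"
    return "\n".join(" ".join(row) for row in grid)


print(dot_map(deaths, pumps))
print(deaths_per_pump(deaths, pumps))
```

In this invented example most of the dots cluster around one pump, which is the kind of pattern that is hard to spot in a list of addresses but easy to see on a map.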

So John Snow created this visualisation and used it to help convince the local council to immediately remove the water pump handle.

And as soon as they did that many lives were saved because people all of a sudden stopped drinking from that water source and they weren't getting cholera anymore.

So that visualisation really helped him to prove the theory he was trying to tell them about, that the water pump was the actual issue.

Okay, so you might have heard of an infographic before or at least might have seen one before.

Now, I've used the term visualisations quite a lot already in this lesson.

So let's explore the differences between them and what each one is.

So data visualisations are visual representations of the data, such as what we looked at before, that John Snow had done and Joseph Minard.

So they are charts and graphs intended to help an audience to process the information more easily, and maybe get a clear idea about what the data's showing just at a glance.

Whereas infographics are visual representations of data, often involving pictures, that reflect patterns and help tell a story.

Now infographics can include visualisations, so if we look at the differences, data visualisations tend to be a single thing, about a single set of data.

Whereas infographics are a collection of those single visualisations, put together in order to tell more of a story.

So if we look at the example on the right-hand side, this is participation in code clubs around the world.

So you can see it's made up of a whole bunch of visualisations.

Put together, they help us form maybe a bigger story, or something that's easier to interpret and make decisions about.

So I thought I'd show you some visualisations that I think are really great and really creative.

So here are some examples of maybe more unusual ones.

We've already looked at John Snow's dot map, and we've looked at Joseph Minard doing that kind of visualisation with different thicknesses of line.

So here are some that also show different ways in which we can represent data, so let's look at the first one.

So this first one is showing us female astronauts in space.

So you can see it's very simply a bar chart.

And at the bottom we can tell what years it goes across.

So it starts at 1963 and goes all the way to 2019.

But if we also look at this data in a bit more detail,

we can work out the number of female astronauts in space,

but also what nations they come from as well.

And you can see maybe if it's increasing for a country, or whether it's a bit more random, and we can test that out by just looking at it in a bit more detail.

Now, I also really like this one, and this was done by a child at school.

And what this is showing us is they recorded their use of digital technology outside of school.

So you can see on the graph at the bottom we have Monday to Sunday, so it's recording their usage throughout the whole week.

But actually, it's not very clear on your screen, but the key at the top shows us what kind of activities they were doing.

So whether they were listening to music or playing a game or looking at social media.

And the colours actually represent how long they were spending on it as well.

So again, visually, we can see that on Saturday and Sunday, and is that a Thursday? They're spending the most time using an electronic device.

But you might also be able to tell that they're maybe playing more music on a Sunday than they do on other days.

So from that visualisation, we can pull out that kind of information.

And finally, I wanted to show you this one so this one's very similar.

So this shows us what liquids were being consumed by a student at school throughout a whole week.

So the shapes of the icons inside the bottles represent what day of the week it is.

But also we can see straight away from here that water is the most consumed drink.

And if we look a bit further, you might even be able to tell by the shape of the icons and the patterns inside what day of the week that was.

So again, we can pick out information and we can ask questions such as: are they drinking water at home, or is it just at school? Are they drinking healthily, inside and outside of school? So those are the kinds of insights that we might be able to gain from looking at that visualisation.

And what decisions could we make from that? We could probably decide that we need to drink more water at home maybe, or in school we may need to drink more water, whatever it might be, but either way, we're able to extract information and provide an insight so that maybe we could make a change or make some kind of decision.

So I've got an activity for you to do now which involves you looking at some data and I'd like you to be really creative and create a visualisation based on this raw data that I'm going to provide you with, okay? So you're going to create, go to the worksheet and create a visualisation, but let me show you the data first of all.

So you can see I've got all the planets in our solar system and I provided you with a whole bunch of data such as the mass, the time it takes for light to travel to that planet from the sun, the time it takes to orbit the sun, in earth days, the length of the day in hours, distance from the sun and the average temperature as well.

Okay, so I'd like you to pick out the data that you would like from here.

I'm certainly not asking you to represent all of this data in a single visualisation but what I'd like you to do is, pick out something that you think might be interesting to learn about.

For example, you might just want to represent their mass.

And that would be absolutely fine; if you want to do that, just have the planet and the mass.

But you also might want to compare two of the variables that we're looking at there. For example, is there a relationship between the time it takes for light to travel from the sun and the average temperature, maybe? And that might be something that you could visualise.
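If you'd like to explore that comparison with a bit of code, here is a minimal sketch. The distances and temperatures below are approximate, rounded values assumed for illustration, so the figures on your worksheet may differ slightly, and the light travel time is worked out from the distance and the speed of light rather than taken from the worksheet. The `light_minutes` helper is a made-up name for this example.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458

# Approximate, rounded values assumed for this sketch; the worksheet may differ.
# (planet, distance from the sun in millions of km, average temperature in °C)
planets = [
    ("Mercury", 58, 167),
    ("Venus", 108, 464),
    ("Earth", 150, 15),
    ("Mars", 228, -65),
    ("Jupiter", 779, -110),
    ("Saturn", 1430, -140),
    ("Uranus", 2870, -195),
    ("Neptune", 4500, -200),
]


def light_minutes(distance_million_km):
    """Time in minutes for light to travel the given distance from the sun."""
    seconds = distance_million_km * 1_000_000 / SPEED_OF_LIGHT_KM_PER_S
    return seconds / 60


# Print the two variables side by side so they can be compared.
for name, distance, temperature in planets:
    print(f"{name:<8} {light_minutes(distance):6.1f} min {temperature:>5} °C")

# Find the planet with the highest average temperature.
hottest = max(planets, key=lambda planet: planet[2])[0]
print("Hottest planet:", hottest)
```

With these values the temperature mostly falls as the light travel time grows, but Venus comes out hotter than Mercury, which is exactly the kind of surprise a visualisation can help you spot and then ask questions about.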

Now I'm going to leave it completely up to you how you visualise this data.

You might want to do this electronically, and you can see the link at the bottom there. If you click on that link, it will take you to this data as a spreadsheet.

So you can use that electronically and if you want to use spreadsheet tools to create that graph, you're more than welcome to.

But equally, if you want to be creative, like those students that you saw at school, and you just want to get a pen and paper and draw something for me, and you think you might want to represent the size of the planets in different ways, with different widths or different colours, that's up to you. Let's see if you can do that, okay? So I'm really looking forward to seeing what you might create with that.

So I'd like you to head over to your worksheet now, pause the video, have a go at that, and once you're done, you can restart again.

Okay, so how did you get on with that? Really, really well done if you were able to make a visualisation.

I'm really looking forward to seeing what you created as well so please do share them with us.

If you'd like to share them, then please ask your parent or carer to share your work on Instagram, Facebook or Twitter, tagging @OakNational and using the #LearnwithOak.

So you've done a really great job there, and we've taken our first steps into data science.

We've looked at maybe some more historic examples of how data has been taken from its raw form to actually make decisions and have an impact.

But also we've looked at some space data where you visualised it yourself.

So in the next lesson we're going to look at global data sets.

We're going to look at how data is used to describe and compare different countries in the world.

So I'm really looking forward to it already, and I'll see you then.