Hello, and welcome to lesson four of our data science unit, I'm Ben, and this lesson is called data for action.
And that's because we're going to put into practice all those steps of the investigative cycle that we learnt about last lesson, but we're going to put them into practice using a problem that you're going to define.
And hopefully by the end of that, you're going to feel empowered to make a decision or make a change in your local area.
So all you'll need for this lesson is a computer and web browser.
If you can, clear away any distractions that you might have, so turn off your mobile phone.
And again, if you've got a nice quiet place to work, that'd be absolutely brilliant.
And when you're ready, let's get started.
Okay, so in this lesson, we're going to look at the steps of the investigative cycle again.
We're also going to work out what data we need to solve a problem that you're going to define.
And then finally, we're not going to have a chance to go through all the steps in this lesson, that's something we'll move on to in lesson five and a little bit of lesson six too, but we're going to get to a point where you identify the data that you need.
And then we need to find the way in which we're going to collect that data.
And as part of that, we need to make a data capture form.
So we'll come to that later on in this lesson.
But to get started, I wanted us to have a recap of the investigative cycle.
So to do that, last lesson, so lesson three, we learnt about the framework that we can follow when posing and solving real world problems using data.
So task one of your worksheet requires you to put those steps in the correct order and match the descriptions with the steps.
So, I would like you to pause the video now, head over to your worksheet and open up task one. You'll see the words for all the steps, and all you've got to do is put them in the right order, then look at all the descriptions and see if you can match each step with the appropriate description.
Okay, so go ahead and do that.
And then once you've finished that, un-pause the video and I'll be here when you get back.
Okay, so hopefully now you've been able to order the steps in the appropriate order and also link up a definition with each step.
So part of the answer is on the screen for you there already because that is the PPDAC cycle that you're looking at.
So you should have had the order of problem, plan, data, analysis and then conclusions.
Now don't worry if you didn't quite get that order, you can always rearrange that later on to make sure you've got a correct answer.
So let's go through what the definition is for each individual step.
So starting off with the problem.
Now, the problem stage is where we define the problem that needs to be solved and we pose questions that can be investigated.
So remember, when we start off with a broader problem, what we then have to do is think about questions that we could answer using data, so that often involves being specific about what variables we can use.
Okay, so the next stage was the plan.
So once we've got our questions defined in the problem stage, the first part of the plan stage is to make predictions about what we think the answers would be to those questions.
Then what we need to do is work out what data we need to collect.
And where are we going to get that data from? Is there somewhere where that data is already provided for us or do we need to collect that ourselves? So then we move on to the data step.
Now the data step's where we're going to gather the data.
So either we get the data from that data source or we collect the data ourselves, but most importantly, no matter which method you've chosen, the next step will be to cleanse the data, to make sure that data's ready for analysis, ready for the next step.
So the analysis stage is where we visualise the data.
So once we've visualised it, that should help us be able to spot any patterns, any trends, correlations or any outliers that might be interesting.
And then what we do as part of the analysis stage, as we demonstrated when we looked at the roller coasters in lesson three, was to write down our observations about what the data is showing us.
Now, once we've done all that analysis and we got our observations, we're then in a place to move on to the next step, which is the conclusions.
So part of the conclusions stage is to answer the question and explain what the data reveals.
So if you're answering the questions that you posed right at the problem stage, then you should feel confident with your answer being based on some data that you've looked at and analysed, but if you don't feel confident with it and you feel that there are more questions to be asked, that also needs to be drawn out in the conclusions stage, because it may well be that further action is needed and you need to go through the cycle again before you feel really confident to make any suggestions or make decisions based on the data that you've looked at.
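If it helps to see the cycle written down, here is a minimal sketch in Python. The stage names and descriptions come from the recap above; the `next_stage` helper is a hypothetical name made up for this illustration, showing that after conclusions you may go round the cycle again.

```python
# The five PPDAC stages, in order, with the descriptions from the recap.
PPDAC = [
    ("Problem", "Define the problem and pose questions that can be investigated"),
    ("Plan", "Predict the answers, then work out what data is needed and where to get it"),
    ("Data", "Gather the data and cleanse it so it is ready for analysis"),
    ("Analysis", "Visualise the data and write down observations"),
    ("Conclusions", "Answer the question and explain what the data reveals"),
]


def next_stage(current):
    """Return the stage after `current`; after Conclusions the cycle starts again."""
    names = [name for name, _ in PPDAC]
    return names[(names.index(current) + 1) % len(names)]


for number, (name, description) in enumerate(PPDAC, start=1):
    print(f"{number}. {name}: {description}")
```

Calling `next_stage("Conclusions")` gives back `"Problem"`, which is the "go through the cycle again" idea.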
Okay, so let's move on.
Now this is the stage where we're going to look at our own problem to solve.
Okay, so although we're going to frame the context for you, what I wanted to make you aware of is that this is very much going to be based on you and your local area, and hopefully by the end of it you'll feel empowered to make a decision or maybe make some change in your local area.
Okay, so as you can tell, I'm quite excited about that and quite passionate about it.
So hopefully by the end of this lesson, you will be too.
So the context that I want us to work around here, is a problem with litter.
Now this for me is a big deal.
And I'm going to explain to you why I think litter is a big deal.
And it's something that I feel that we can all make a change about, not just on an individual level but you're going to collect some data that maybe is really going to highlight an issue that maybe you can solve.
So let me show you these statistics that I found really interesting.
So, almost 48% of people admit to dropping litter, 48%.
That's nearly half of us.
And that's only the people that have admitted to it.
So probably we're looking at about over 50%, including the people that maybe aren't prepared to admit to it.
That's half of us drop litter.
Okay, now the amount of litter dropped each year in the UK has increased by a massive 500% since the 1960s.
So clearly not only is it a problem that lots of people drop litter, but obviously this is not a problem that's going away.
It's actually increased hugely since the 1960s, and seven out of every 10 items of litter are food packaging.
It's interesting in itself and keep that in your mind because that might be something we want to frame a question around maybe.
Now around 122 tonnes of cigarette butts and cigarette related litter are dropped every day across the UK.
And 1.3 million pieces of rubbish are dropped on UK roads every weekend.
And a third of motorists admit to littering whilst they're driving.
Now, you all live in the local area and it's your community and we're going to explore that a little bit more later on in this lesson but just think about that for a second and think, would you like motorists dropping litter in your local road, maybe the road that you live on? Do you want to walk around the streets and be surrounded by litter? I'd imagine all of you are thinking the answer's no to that, so let's see what we can do about this problem.
And before we get to that stage, I found a really interesting story that hopefully we can all relate to.
Now, I'm not sure whether any of you have been lucky enough to go to Disneyland.
I haven't, but I found this to be a really fascinating story.
So I'm just going to read it to you.
Then I've got some questions for you to think about around this scenario.
So it said, Walt Disney wanted to know just how long a park patron would go with trash in their hand before just letting it drop to the ground.
So, what he did about that was he sat on a bench and watched the visitors of his park, counting the steps of those looking for a place to throw their garbage.
And he counted 30 steps on average.
And interestingly, that is still the distance between each trashcan in Disney, further ensuring a clean experience.
Now that for me, is a great story because it's perfect for what data science is all about.
There's a problem that needed solving and Walt Disney collected data, analysed that data and then felt empowered to make a change.
And that's perfect, and that's exactly what we're going to set out to achieve.
Okay, but I've got some questions that I want you to think about.
So pause the video in a second and think about these questions.
How did Walt Disney narrow down a larger problem into a more specific question, and what was that question? And what was the outcome of that? And did it solve the problem? Okay, so just pause the video, have a think about that.
And then un-pause when you think you've got some answers formulated in your head.
Okay, so how did Walt Disney narrow down the problem? What was the larger problem? The larger problem was, there is litter at Disneyland and Walt Disney didn't want there to be litter at Disneyland.
He wanted to improve the situation, but that was too much of a broad, unfocused question.
So what Walt Disney did is he narrowed it down.
He went, how long does it take before somebody drops litter in one of my parks? And therefore, we can see that's a much better question, it's focused and we're using variables there.
He's using distance, for a start.
So he's measured that distance, which I think's really interesting.
Now, what was the outcome? Well, what I'm trying to highlight there is he's used this data that he's collected to make a change, and the outcome was, he made a change and the change was to put, as they call it trash cans, 30 steps between each trash can, so you can't go more than 30 steps in Disneyland before finding a trash can.
And that's still there today, which I think's really great.
Now the next question, did it solve the problem? I don't know, did it, we don't know that.
I mean, it may have solved the problem, but from this text alone, from the data that we've been given, we don't know if that solved the problem.
So how would we go about working that out? Well, we'd go through the steps of the PPDAC cycle again and maybe Walt Disney's already done this.
This data might already exist or we all might need to take a trip to Disneyland.
That'd be great, wouldn't it? And go and investigate this ourselves and see, you know, has it improved the litter? Okay, are people still dropping litter? So yeah, either way, we don't know that but it may have done, it may not have done, but for others to be confident in knowing that, we would need to do the cycle again.
Okay, so now let's look at something that relates to us.
Okay, now I wanted to point out this concept of community.
We all live in a community, your local area is your community.
So how can we use data to help us improve our community by reducing our waste and recycling as much as we can? Okay, so this is part of the problem stage now.
So the problem for us is the broader issue, is how are we going to improve litter in our community? That's the broader picture, but like Walt Disney, we need to narrow down that into something much more specific.
So your task is going to be to think of two focus questions that use variables that are going to help us or allow us to pose questions that we can use data to help us find the answers to, to solve our problem, okay.
So keep that in your mind 'cause I've got two tasks for you to do.
So that's the first task.
Now the second task is to then quickly move on to the first part of the planning stage.
And that's simply predict what you think the answers to your questions will be.
So, I'd like you to go to your worksheet and complete tasks two and three.
So here you're going to pose the questions and then predict what you think the answers will be before we investigate any further.
Now to help you with this, it might be helpful just to start thinking about what those variables might be, what data could you collect on litter or rubbish around your local area? So have a think about that.
See if you can pose two questions and then see if you can then work out what you think the answers are going to be before we investigate further.
Okay, so pause the video, go ahead and do that once you've done that, you can un-pause and we'll move on to the next stages.
Okay, so how did you get on with that? Were you able to define your problem and come up with two really good questions? Now, if you weren't, then don't worry too much about that because after this next task that you're going to do, it may well be that you can go back and feel a bit more comfortable with asking a slightly different question.
So don't worry too much if you found that part difficult because this part now is all about really thinking a lot harder about the variables.
And I asked you before you posed the questions to think about what categories of data you might want to collect, but this task is really thinking about that in more depth.
Okay, so I'll read out what it says on the slide.
It says the next part of your plan, it involves thinking about the data.
Now we don't have a data set to analyse here.
So we don't have a pre-made set of data about litter in your local area.
So we need to collect that data ourselves.
So what data could we collect about litter that we find? Okay, so that involves going back to your task sheet and I'd like you to come up with different categories.
Now I provided you with a mind map.
So to do this on your worksheet, you'll need to create little text boxes and fill in the mind map.
But if you find that really difficult and a bit fiddly then don't be afraid to just delete the mind map and just list out the categories.
Some people work in different ways.
Some people might prefer seeing it categorised in a mind map.
Others might prefer just to list it.
Okay, so I've got some suggestions for you but don't feel limited to this list because it's really up to you, what you go and collect.
Okay, so the categories of data that I suggest are, you might look at, you know, since we're on the mind map, I got types of litter.
Now I might fill that in with different categories: food waste would be a type of litter, food packaging, maybe stationery, or, thinking about the other stats that we looked at earlier on, it might be a cigarette butt or something like that.
You might want to record the time of day or even the day itself.
You might want to record the material it's made out of, maybe the quantity of litter at the location.
So it might be that there's a lot of food waste.
So you might want to include that set of data there.
Maybe, it might be important to include the location.
I think it might be.
I think that's a really important one, the location of where the litter was dropped, maybe the distance from a bin, and maybe, was it recyclable? And that in itself could be broken down into further categories.
So what category is it? Because you may notice and don't do this on litter that we're eventually going to pick up but if you've got items of plastic, for example, around the house, have a look at them and if they've got that little recyclable logo on them, normally they got a number inside there as well.
Now that number is going to help you determine what category and how it is recycled and that might not actually be recycled locally to you.
So it might be quite interesting to find out that a lot of the litter being dropped, for example, is recyclable, but you can't recycle it in your local area.
'Cause that might be something that you can take action about.
You might be able to go to your council and write to them and say, well, lots of this litter has been dropped and it's not recycled.
We need to recycle it.
Okay, so you get the idea anyway.
So don't feel limited to my list because there's other categories you can do there, you can expand on some of these categories, but it's really up to you what data you collect to help you solve the problems. And like I said, if you want to go back to the problem stage and redefine your questions once you've thought about the data a little bit more and the variables, then that's okay, too.
Okay, so if you can go ahead and plan out that data, don't be afraid to have a conversation with someone in your house about it if that's possible and see if they can help you with some ideas as well.
Okay, so pause the video and when you've done that, you can un-pause.
Okay, so you're doing a really great job.
I know this lesson is quite a lot of thinking, but we're really doing the foundation work for something that hopefully is going to make a difference in your local community.
So stick with it if you can.
Okay, so now we're going to move on to the point of data collection.
So we've got our problem that we've defined, and we've worked out what we think the answers are going to be.
We've also worked out what variables and categories of data we need to collect.
So the next point is to work out how we're going to collect that.
Okay, so the next step is to create our data capture form.
So now we've decided what data to collect, we need to consider how we're going to collect it and store it.
So as we're going to collect this data ourselves, we should think about how we're going to go about it.
So I've put a hint there: to help with this, you should also consider what you might want to do with the data after collection.
Now we kind of want to store this data electronically because we're going to use the CODAP platform that we used in lesson three, to get our data and then analyse and visualise it.
So I'd quite like to have this data electronically.
Now, if you're not able to collect the data electronically, it's not the end of the world 'cause we can always type up the data later on.
But what I'd like to do is demonstrate to you how to create an electronic data capture form.
Okay, now I'm going to use Google forms for this.
So if you have a Google account, then you can use that. If you or your school have a Microsoft account, there's also Microsoft Forms, which works in a really, really similar way.
So if I demonstrate Google Forms, you'll be able to use what I've shown you and apply it to a Microsoft form.
Alternatively, if you don't have access to either of those things, then don't worry about it, a spreadsheet would do.
Okay, so that means laying out the headings in the spreadsheet, with each heading being one of the variables, and then each time you've got a new bit of litter or data that you want to collect, you just type it in under the correct heading.
Okay, so that'd be absolutely fine as well.
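If you'd rather build that spreadsheet with a little bit of code, the idea can be sketched in Python using the built-in `csv` module. This is only an illustration: the heading names are hypothetical ones based on the categories suggested earlier, the example row is made up, and an in-memory file stands in for a real spreadsheet file.

```python
import csv
import io

# Hypothetical headings, one per variable, based on the suggested categories.
HEADINGS = [
    "type_of_litter",
    "material",
    "location",
    "time_of_day",
    "distance_from_bin_m",
    "recyclable",
]

# An in-memory file stands in for a real spreadsheet file in this sketch.
sheet = io.StringIO()
writer = csv.DictWriter(sheet, fieldnames=HEADINGS)
writer.writeheader()

# One row per item of litter; this entry is made up purely for illustration.
writer.writerow({
    "type_of_litter": "Drinks can",
    "material": "Metal",
    "location": "Park entrance",
    "time_of_day": "Morning",
    "distance_from_bin_m": 15,
    "recyclable": "Yes",
})

print(sheet.getvalue())
```

To keep a real file instead, you could swap `io.StringIO()` for `open("litter.csv", "w", newline="")`, where the file name is again just an example.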
So I'm going to go ahead now and show you how to create a Google form.
So I've got a folder.
Anyway, so I've got my drive now and this is a folder I'm going to create.
And you can see I've actually already made a data capture form, which I'll show you in a minute but let's start off going through the process of creating this from scratch.
So on the left hand side, I'm going to click on new.
Okay, then it shows me that I can create a Google Doc, Sheets or Slides, but I'm going to click on more and you can see I can go to Google Forms. So I'm just going to select a Google form.
Yeah, that's fine.
Okay, and it's going to make a form for me.
So this is going to open up a template that I can start adding to, okay.
Now hopefully it should be fairly straightforward what to do.
We don't need to use any particularly fancy tools here.
We don't need to include any video or images.
It's just going to be a place where we can log the data that we're going to collect, okay.
So give it a nice sensible name.
So untitled form's not a great name 'cause we're never going to remember what that is if we create more than one form.
So I'm going to call mine litter collection.
I always put it in the wrong place.
Here we go, I'll select that again.
Litter collection form, okay.
Right now, instantly it puts in a question for you.
Okay because obviously there's no point having a form without some kind of like initial question to be asked.
Okay, so this is where you're going to go through all the questions that you think you need to ask, all the categories of data that you need to collect.
Okay, so I'm going to say, for example, the type of litter, not spelt litter correctly.
Type of litter, okay.
Now, if I use that, there are different options.
We can make it multiple choice, which means that whatever I put in here, I can only select one of them.
Okay, I can make it a checkbox, which means that if I put two options, it allows me to select more than one.
And let me demonstrate that.
So I'll start off with the multiple choice.
So I'm going to put, I don't know, plastic bottle.
Okay, and then the other one's going to be a drinks can.
Capital letter there.
Okay, so I've got two options there.
Now, at any point, you can always preview your forms and see what it's actually going to look like and what it actually does.
So that's that little eye icon there.
So I click on the eye icon and that's going to load up in a way that I can see.
So you notice it because I've selected the multiple choice, it means that I can only select one at a time.
Okay, so let me just select that again.
Okay, you can see that's selected but if I select drinks can, it then moves it and it de-selects plastic bottle.
But if I was to just get rid of that one, yeah.
Leave that, don't save my changes, but change that to a checkbox.
That means that now, I can select both of them.
Okay, now for this question, type of litter, it's not appropriate to say it could be both, because you're only going to record one item of litter at a time.
So if that's the case, then you need to make sure that you select a multiple choice.
Okay, so keep going through that.
And the other options that you've got are you can have a drop down list if you want to, you can also allow for a short answer and that allows for free text to be entered.
Okay, so you might want to make notes about something if you want to.
However, just be a bit wary about this because to really analyse data, we need to make sure the data is uniform.
That way, we can use the CODAP platform to compare the data.
Now, if we're using short answers then that kind of muddies the water a little bit because it might not be data you can compare; you might've made spelling mistakes or written something in a slightly different way.
So that's sometimes something to avoid, but don't be afraid to add it in if there's something where you literally just want to make notes to remind yourself, as and when you collect the data.
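To see why free text can muddy the water, here is a small sketch in Python. The answers are made up for illustration: the same plastic bottle is typed four slightly different ways as free text, and then recorded four times using one fixed multiple-choice option.

```python
from collections import Counter

# Made-up free-text answers: the same kind of item typed four different ways.
free_text = ["Plastic bottle", "plastic bottle ", "plastc bottle", "bottle (plastic)"]

# The same four items recorded using a fixed multiple-choice option.
multiple_choice = ["Plastic bottle"] * 4

# Counting the free text treats every spelling as its own category.
print(Counter(free_text))

# Counting the multiple-choice answers gives one category with all four items.
print(Counter(multiple_choice))
```

The free-text answers end up as four separate categories with one item each, whereas the fixed option gives a single category that can be compared with the others.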
Okay, so once you've made your data capture form, I'm going to show you one that I've made earlier.
Okay, so you can see there, this is the one that I've made.
I've got lots of different questions.
Okay, now there's one I did allow free text for, So approximate distance from nearest bin, so I did allow a short text there, but I know myself that I only want to enter a number in there.
Okay, I couldn't really give a multiple choice because you could end up with lots of numbers there.
'Cause it might be one metre.
It might be 100 metres.
It might be 200 metres, it might be more, so you wouldn't want a drop down list of maybe all those metres itemised by number, all the way up to like 1000.
That would take quite a long time to do.
But I know myself that the best thing for me to do there is just to put a number in, okay.
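Because that short-answer box will accept anything, it can be worth checking afterwards that every distance really is a number. A minimal sketch in Python, assuming the distances come back as text and are measured in metres; `parse_distance` is a hypothetical helper name, not part of Google Forms or CODAP.

```python
import math


def parse_distance(answer):
    """Return the distance in metres as a float, or None if it is not a plain number."""
    try:
        distance = float(answer.strip())
    except ValueError:
        return None
    # Reject negative values and special values such as "nan" or "inf".
    if not math.isfinite(distance) or distance < 0:
        return None
    return distance


# Made-up answers for illustration.
for answer in ["100", " 1.5 ", "about 20", "-3"]:
    print(repr(answer), "->", parse_distance(answer))
```

Any answer that comes back as `None` is one to fix by hand before analysing the data.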
Once you've done that, you'll want to have a version of it that you can actually enter data into.
You can click on preview and the preview will just give you that view of the form where you can start entering data.
Alternatively, you can email it to yourself maybe or just get a link to it.
So if I clicked send there, it's just taking its time, If I click on send, it should.
There we go, you've now got an option to email it.
Alternatively, you can click on this one here and it gives you the direct link to it.
Okay, so you can maybe store that link somewhere so you can enter data when you're ready to, okay.
Perfect, so that's how you create a data capture form.
So go through all the categories of data that you want to collect, pose the question around that.
It doesn't need to be a deep thinking question.
That's just, was it purchased from there? How far is it to the nearest bin? Is it recyclable? What category of recyclability is it, is that a word? I don't know.
Anyway, okay, so what I'd like you to do now is pause the video, go ahead and create that data capture form.
And like I said, you don't need to use Google forms. You might use something else.
You might use Microsoft forms or you might use a spreadsheet.
Okay, so pause the video, go ahead and do that.
And then un-pause when you're done.
Okay, so now you've got your data capture form or your spreadsheet or your written piece of paper where you've put down all the categories of data that you're going to collect.
Now we're ready to actually start collecting the data.
Okay, so your task is to now go ahead and go and collect that data and add it into your data capture form or spreadsheet.
Now, if you've got time to do it now, then go ahead and do it now.
But really, what we need you to do is to do this before you start the next lesson.
So it may well be, you want to just go ahead and do this in one session.
So you just go and walk around your local area and collect the data.
You may decide that you want to do this at different times of the day, or maybe even different days of the week.
That's completely up to you.
But either way, before you start lesson five of this unit then please make sure that you've got a series of data to work with.
Okay, now, we ask that you try and include a minimum of 20 entries.
Now, the reason we say that is, the more data, the better.
Ultimately, the more data that you have, the stronger your case is going to be or the more likely you're going to get a more definitive answer to your question.
So 20 will be okay, but if you're able to collect more than that, then that's absolutely great.
Okay, now the only thing we ask is to make sure that you don't pick up litter without appropriate protection.
So don't go around and pick up dirty litter without any kind of gloves on or anything like that and make sure if you're crossing any roads, you're doing it as safely as possible.
Okay, you might not ever need to pick up the litter anyway, you should just be able to look at it and note down what it is.
Okay, so just be careful about doing that.
Okay, and wash hands always.
So now that's all for this lesson.
So I really hope you've enjoyed that lesson.
I mean, like I said before, there's been a lot of great thinking going on in this lesson, but it is putting in place the foundation building blocks for solving a problem that, you know, you have defined yourself, and you're going to collect the data.
And hopefully by the end of this, you're going to feel that you've really investigated something and hopefully got to a position where you can make a difference to your local community, okay.
Now, if you'd like to share your work with us, we'd love to see it as always.
So please ask your parent or carer to share your work on Instagram, Facebook or Twitter, tagging @OakNational and using the hashtag #LearnwithOak.
Okay, so I'm looking forward to seeing you next lesson where hopefully you're going to come equipped with all your data and we'll get ready to start moving through the next stages of the cycle and really analyse and work out what the problem is.
Okay, so I'll see you then.