Hello everyone, and welcome to this maths lesson on sampling.

Thank you so much for joining me today.

In Lesson 1, we will look at identifying and categorising different types of data.

There are a lot of keywords that we'll be using today, including primary data, secondary data, qualitative data, quantitative data, and two types of quantitative data, quantitative discrete and quantitative continuous data.

We will look at the definitions for all of these across the whole lesson.

But to begin with, let's focus on qualitative and quantitative data.

Let's have a look.

An investigation aims to collect data to answer a question.

For example, is the average mass of a wild squirrel increasing or decreasing in a given area? Answering this question will involve collecting numerical data on the mass of many squirrels.

A different investigation may look into the popularity of a TV show.

This will involve collecting more descriptive data, this time from surveying many different people.

Laura sensibly asks, "Is there a way of identifying the different types of data that could be collected during an investigation?" There are many ways that we could identify and categorise different types of data.

One of the most essential is: Is our data qualitative or quantitative? Quali means quality.

When collecting qualitative data, each datapoint will be a word, image, or sound that describes a quality of whatever you are trying to observe.

On the other hand, quanti means quantity. When collecting quantitative data, each datapoint will be a number or multiple numbers based on what you are counting or measuring.

Examples of qualitative data include the names of the months or the name of someone's favourite drink, whilst examples of quantitative data include age, described in months, years, or even days, depending on the type of animal that you're collecting data on.

All of these are numerical and measure how long something has been alive for.

Similarly, the mass of something can be measured using a range of different units, such as kilogrammes, grammes, or ounces.

Okay, quick check.

Pause here to categorise these six datapoints into either qualitative or quantitative data.

And the answers are as follows: A, E, and F are qualitative, whilst B, C, and D are quantitative.

Qualitative and quantitative data go beyond just some basic examples.

For example, qualitative data can also be images or sounds.

The use of images in data collection has seen a massive rise over the past few years due to the development of machine learning and AI, where computers are fed data as images in order to be trained to recognise what's on that image and then create their own version of that image.

Similarly, sound data is incredibly useful in voice recognition and music.

Furthermore, quantitative data can also be a coordinate or a calculation.

Many maps and aircraft will describe the location of something not as the name of a place, but rather as a numerical coordinate.

Other types of quantitative measurement also include the rate of change of something, such as profit and loss or acceleration.

For this check, pause here to categorise these four datapoints into either qualitative or quantitative data.

And the answers are as follows: qualitative is A, C, and D, and quantitative is B and also D.

Location can either be qualitative or quantitative.

A town or city name is qualitative, whilst the coordinates of a location are quantitative.

The processing stage is an important part of the statistical inquiry cycle.

During the processing stage, we may be able to calculate a summary statistic, such as the mean, the median, the mode, or the range.

But can all four summary statistics be applied to both qualitative and quantitative data? Quantitative data are numerical, and so all four of the mean, median, mode, and range can be applied, such as these data showing the volume in decibels of a monkey's mating call.

Because the volume is a numerical measurement, all four of mean, median, mode, and range can be found.

However, qualitative data are always non-numerical; therefore, it makes no sense to add up qualitative data for a mean.

The only one of these four summary statistics that can be used is the mode, which shows which datapoint is the most frequent, such as with these penguin toys.

The most frequently occurring one is this little fellow, who appears three times, not the one or two times of the other penguins.
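
As a rough sketch of this idea in Python, the function below only attempts the mean, median, and range when every datapoint is a number, and otherwise reports just the mode. The function name `summarise` and both datasets are made up for illustration; they are not the values shown on screen.

```python
from statistics import mean, median, mode

# Hypothetical quantitative data: call volumes in decibels (made-up values).
volumes_db = [62, 70, 70, 75, 83]

# Hypothetical qualitative data: names of penguin toys (made-up labels).
penguin_toys = ["emperor", "rockhopper", "emperor", "king", "emperor", "king"]


def summarise(data):
    """Return only the summary statistics that make sense for this dataset."""
    # The mode just counts how often each datapoint appears,
    # so it works for words as well as numbers.
    summary = {"mode": mode(data)}
    # Mean, median, and range need numbers that can be added or ordered.
    if all(isinstance(x, (int, float)) for x in data):
        summary["mean"] = mean(data)
        summary["median"] = median(data)
        summary["range"] = max(data) - min(data)
    return summary


print(summarise(volumes_db))
print(summarise(penguin_toys))
```

With these made-up values, the decibel data give a mode of 70, a mean of 72, a median of 70, and a range of 21, whilst the penguin data give only a mode, "emperor".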

Okay, for the final check of this cycle, match the value or statement that completes each summary statistic for both datasets.

Pause it to look at the options at the bottom of the screen.

The mean length of a guinea pig is 25 centimetres, whilst the modal length of a guinea pig is 28 centimetres.

The range of locations of an ancient cat doesn't really make sense, so it doesn't exist, whilst the modal location is Africa because Africa appears most frequently in the table on screen.

And for these practice tasks, question one.

Pause here to put a tick in each box to show whether each datum is an example of qualitative or quantitative data.

And for question two, state whether each dataset is qualitative or quantitative, and if possible, find its mean and modal value.

If a summary statistic cannot be found, put a cross in the box.

Pause now to complete question two.

Okay, great work so far.

For the answers to question one, distances are quantitative.

Types of flowers are names, and so are qualitative.

The type of sound is also qualitative.

However, the volume of that sound is quantitative as it can be measured in decibels.

Mass is also quantitative, but the shape is qualitative.

And pause here to compare your answers to question two to the ones on screen.

Okay, next up, let's look in more detail at quantitative data as there are several types of this more numerical type of data.

Let's have a look at two of them.

So as a recap, we've already seen two types of data.

They are qualitative and quantitative.

But as Laura says, there are different types of number, and so shouldn't there also be different types of quantitative data? There are many different types of quantitative data, and two of the more common ones are continuous data and discrete data.

Quantitative continuous data are numerical data that can take on any value on a scale, any appropriate number, whether that is an integer, a terminating decimal, a recurring decimal, or anything else.

These data have as much precision as we can measure them with.

However, quantitative discrete data are numerical data that can take on only specific numerical values.

They are limited to certain numbers that follow a certain rule or pattern.

However, these values do not need to be integers.

The mass of a rabbit is continuous as a rabbit can be of any mass.

One rabbit might be the tiniest little bit more heavy than another rabbit.

Quantitative continuous data can be measured as precisely as the equipment will allow.

Some equipment will allow more precise measurement of the same rabbit's mass than another bit of equipment.

For example, on that rabbit on the right of the scale, one bit of equipment might say that rabbit is 2,400 grammes, whilst a more precise bit of equipment will say its mass is actually 2,439.15 grammes instead.
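
To make this concrete, here is a minimal Python sketch of the same mass being read on scales of different precision. The function name `reading` and the step sizes are assumptions for illustration; only the 2,439.15 grammes value comes from the rabbit example.

```python
# The more precise reading from the rabbit example, in grammes.
precise_mass_g = 2439.15


def reading(mass_g, step_g):
    """Simulate a scale that can only display multiples of step_g grammes."""
    return round(mass_g / step_g) * step_g


# The same rabbit on three hypothetical scales of increasing precision.
for step in (100, 10, 1):
    print(f"Scale with {step} g steps shows {reading(precise_mass_g, step)} g")
```

The scale with 100-gramme steps shows 2400, the one with 10-gramme steps shows 2440, and the one with 1-gramme steps shows 2439. The rabbit's mass has not changed, only how precisely we can measure it.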

Temperature is also continuous.

This temperature is clearly 20 degrees.

But this temperature, I guess it's 12 degrees, or is it 11.8 degrees or 11.786 degrees? It is hard to tell precisely.

Quantitative continuous data can be measured as precisely as someone can read the equipment being used.

This is why digital displays are preferred nowadays.

There is less ambiguity compared to reading an analogue display, like this thermostat.

On the other hand, discrete data are numerical data that can only take on specific numerical values, such as the number of people on a bus or the number of seats on a bus.

The number of people or seats can only be integer values because you can't have half a person or one third of a seat.

However, quantitative discrete data do not need to be integers.

The price of something in pounds can only take on values up to two decimal places.

Cash can only take on a specific set of values in pounds and pence.

Furthermore, UK shoe sizes can either be integers or .5 decimals.

Whilst the length of someone's foot can be of any length, which makes the length of the foot continuous, standard shoe sizes for those feet will never be, for example, 6.221.

It will either be size 6 or size 6.5.
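
One way to picture this rule is as a short Python check. This is only a sketch, assuming that standard sizes are exactly the whole and half sizes described here; the function name `is_standard_shoe_size` is made up for illustration.

```python
def is_standard_shoe_size(size):
    """True if size is a whole number or a half size, such as 6 or 6.5."""
    # Doubling a whole or half size always gives a whole number.
    return (size * 2) % 1 == 0


for size in (6, 6.5, 6.221):
    print(size, is_standard_shoe_size(size))
```

Sizes 6 and 6.5 pass the check, but 6.221 does not, even though a foot could well need exactly that size.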

Okay, for this check, pause here to categorise these datapoints into quantitative discrete or continuous data.

And the answers are as follows.

Okay, for question one of this second practice task, pause here to put a tick in each box to show whether each datum is an example of qualitative data, quantitative discrete data, or quantitative continuous data.

And for question two, an exam has a maximum score of 200.

Each pupil is given a percentage.

Laura thinks these percentages are continuous, whilst Andeep thinks that they are discrete.

Pause here to explain who you agree with.

Okay, on to the answers.

Pause here to compare your answers to question one with the ones on screen.

For question two, Andeep is correct.

The percentages can only take on specific values, depending on the discrete scores that they are based on.

The score of 77 results in a 38.5% score.

However, there is no score that results in a percentage of 38.41, meaning that the percentage scores are limited to only certain values, the definition of discrete data.
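
Andeep's reasoning can be checked with a short Python sketch that lists every percentage a pupil could get. It assumes that scores are whole numbers of marks from 0 to 200, which the question does not state explicitly; the names `MAX_SCORE` and `possible_percentages` are chosen for illustration.

```python
MAX_SCORE = 200

# Every possible percentage, assuming scores are whole numbers from 0 to 200.
possible_percentages = {score * 100 / MAX_SCORE for score in range(MAX_SCORE + 1)}

print(len(possible_percentages))      # how many different percentages exist
print(38.5 in possible_percentages)   # the percentage for a score of 77
print(38.41 in possible_percentages)  # a percentage that no score produces
```

Under that assumption there are only 201 possible percentages. 38.5 is one of them and 38.41 is not, which is what makes these data discrete.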

And finally, when collecting data, should you choose primary or secondary data? Well, that will depend on the investigation.

Let's see why.

Laura fairly asks whether she always has to collect new data since doing so is pretty time consuming.

Well, actually not always.

Primary and secondary data are two very different data collection approaches.

When an investigation is conducted, if data are collected for the primary purpose of being used for that investigation, then it is primary data.

However, if data are collected for the primary purpose of being used in a different investigation or for any other unknown purpose, then it is secondary data instead.

Examples of primary data include designing a survey to collect the specific data required for a particular investigation.

If you collect data for your investigation, it is always primary data.

However, here is a different example of primary data: asking marine biologists to collect new data on the number of marine animals on a nearby coast and then analysing the data yourself.

If someone else collects data, but for your investigation, then it is still primary data.

On the other hand, if you use data that you collected yourself but for a different investigation, it is secondary data, as you collected it for some other purpose beforehand.

Furthermore, using data on the number of marine animals that marine biologists have collected as part of an annual data collection process is also secondary data, as those data were not collected with their primary use being your current investigation.

It does not matter whether the marine data were intended for use in someone else's investigation or were just part of an annual data collection process with no specific purpose.

It will still be secondary data for your investigation.
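
The rule can be summarised in a small Python sketch. The function name `data_source_type`, its parameters, and the investigation names are all hypothetical, and the sketch assumes each investigation can be identified by a simple label. Notice that who collected the data is not an input at all, only what the data were collected for.

```python
def data_source_type(collected_for, used_in):
    """Classify data as primary or secondary for the investigation using it.

    collected_for: the investigation the data were originally collected for,
                   or None if there was no specific investigation.
    used_in:       the investigation that is now using the data.
    """
    return "primary" if collected_for == used_in else "secondary"


# Marine biologists collect new data specifically for your study.
print(data_source_type("my coastal study", "my coastal study"))
# Your own data, but collected for an earlier project.
print(data_source_type("my old project", "my coastal study"))
# An annual count with no specific investigation in mind.
print(data_source_type(None, "my coastal study"))
```

The first case is primary data, and the other two are secondary data.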

Okay, for this check, Aisha uses data collected by Sam for an investigation.

Sam collected this data yesterday for a different school project.

Pause it to consider if this is an example of primary or secondary data.

Sam collected it for a different school project, and so Sam collected it not for Aisha's investigation.

Therefore, it is secondary data.

But this time, Sam collected this data yesterday as part of an investigation that both Aisha and Sam are a part of.

Pause here to consider if this second example is an example of primary or secondary data.

This is an example of primary data, as Sam is using the data they collected directly in their current investigation.

And lastly, Sam collected this data yesterday because Aisha asked Sam to collect data for an investigation that Aisha was busy planning.

Pause here to consider if this example is an example of primary or secondary data.

And this is also primary data.

Sam may not be part of Aisha's investigation, but Aisha directly asked Sam to collect data specifically for this investigation.

And for this different check, pause here to categorise these methods into primary or secondary data.

Both A and D were primary data collection methods, whilst B and C were secondary.

Each investigation is different.

Some investigations will benefit more from collecting primary data, but for other investigations, collecting primary data is unrealistic and so investigators might rely on collecting secondary data instead.

Okay, let's look at some advantages for both primary and secondary data.

If data are collected specifically for your investigation, you're more likely to have control over how you collected it.

You can try to minimise unwanted biases, ensure that the sample size is large, and that you collect exactly the type of data that you need for your investigation.

And secondly, you're more likely to have control over how relevant the data are to the specific details of your investigation, such as location or targeting a specific population within a specific area.

And also, the data you collect are more likely to be up to date when compared to secondary data that might have been collected a long time ago.

Historical data may no longer be relevant to your investigation.

On the other hand, what are some advantages of secondary data, data that have already been collected? Well, it's a time saver.

You save a lot of time by using data that already exists rather than spending a lot of time designing the data collection process and collecting it yourself.

It is also potentially free.

For example, a lot of Met Office weather data and UK census data can be accessed without paying.

Furthermore, some people who collect secondary data are experts in that field of investigation.

You may not be able to collect that data with your current resources or skills.

For example, it is pretty unlikely you'll be able to collect primary data on volcanic activity.

You wouldn't have the training to do so safely nor have the equipment to measure such activity.

But remember that asking or employing other people to collect data specifically for your investigation is still primary data even if you are physically not collecting it yourself.

This is a great way of collecting primary data whilst also having the right equipment and experience on your side.

And lastly, collecting secondary data may just be a starting point.

As we said before, secondary data is cheap and collecting informative secondary data might help justify spending time and money on your own primary data in a way that's more relevant to your investigation.

And on the flip side, most of the advantages of using primary data are the disadvantages of using secondary data and vice versa.

In addition to these, we also have the risk of bias in secondary data.

When using data collected by someone else for a different purpose, there is a risk of the data collected being biassed in a way that would benefit the person collecting the data, either in terms of money or in terms of getting a specific point across.

Pause here to check the summary of all of the advantages and disadvantages of primary and secondary data.

As a summary of all of the data categories we've looked at today, we have primary and secondary data.

Primary data can be qualitative or quantitative, and this quantitative data can also be continuous or discrete.

All of this is also true for secondary data as well.

And the final check, Sam needs to complete an investigation by tomorrow into how the temperature in their hometown changes over one week.

Pause here to decide.

Should Sam use primary or secondary data? It seems more sensible for them to use secondary data.

But why? Pause here to consider which statements justify choosing secondary data over primary.

Time is a factor.

Sam needs to collect a week's worth of information in one day.

This is not possible for Sam to collect; therefore, Sam has to rely on secondary data that has already been collected.

Furthermore, weather data is free and easily accessible from the Met Office, so it saves Sam time and money.

Okay, on to the checks.

For question one, pause here to put a tick in each box to show whether each set of data are primary or secondary and whether they're qualitative or quantitative data.

And for question two, the Met Office collect data on the number of sunshine hours across four different UK locations.

Sofia is the first person to use this data and compares it to data that she had collected from her hometown.

Pause here to explain how you know Sofia has collected both primary and secondary data during this investigation.

And finally, question three.

Brian isn't sure what sort of food to cook in a restaurant that he wants to open.

Brian looks on the internet and finds research from 10 years ago about popular food choices in a small town in the USA.

Pause here to explain why Brian's data is unlikely to be of help to him and suggest other more beneficial data collection methods that he could employ.

And here are the answers for question one.

You are unlikely to collect your own data on global hurricanes for a school project, so this is secondary data.

Furthermore, the number of hurricanes makes it quantitative data.

However, a local newspaper collecting local opinions is likely primary data and opinions are qualitative.

A trained company collecting data on rival companies is primary data as they're collecting information about rival companies themselves.

The cost of something is quantitative data.

And for question two, primary data was data Sofia collected about Oakfield.

The secondary data was data collected by the Met Office.

It's still secondary data even though it has never been used for any other investigation because its primary purpose was not for use in Sofia's investigation.

And for question three, the data Brian found was outdated and unrepresentative.

It's about a town in the USA, not in Oakfield.

There's also a risk of bias since Brian doesn't know the methodology used in the data collection process.

Brian should conduct primary research from the people who live in his hometown of Oakfield.

This data will be up to date and relevant for the location that Brian wants to open the restaurant in.

And great work, everyone, in considering all of the different ways that we can categorise data and thinking deeply about all sorts of real world contexts for these data.

In this lesson, we've looked at numerical or quantitative data and non-numerical or qualitative data, where qualitative data can either be given as images or sounds as well as words.

Two of the main types of quantitative data are discrete, meaning that they are data that can only take on specific numerical values, and continuous, meaning that they are data that can take on any value within a given scale.

If data was collected with a primary intent for use in a given investigation, then for that given investigation, it is primary data.

However, if that data was collected for use in a different investigation from your current one, it is secondary data instead.

Thank you so much for joining me today.

Until our next maths lesson together, take care and have an amazing rest of your day.