Hello and welcome to this numerical summaries lesson with me, Mr. Grattan.

Today we'll be looking at calculating an estimate for the mean of a dataset shown as a grouped frequency table.

Pause here to have a quick look at some of the keywords that we'll be using today.

A grouped frequency table shows different totals in different ways, many of which will be used when estimating the mean of a dataset.

Let's check them out.

Andeep says that he can interpret information from this frequency table that shows data about the number of pets that pupils at Oakfield Academy had.

Interpreting this information will help when trying to calculate totals from this dataset.

For example, this row means that there were 12 pupils who had two pets each.

That means that there are a total of 12 multiplied by two equals 24 pets across all of the pupils in this row of data.

Using that same logic, this row means that there were three pupils who each had five pets.

That means that there are a total of five times three equals 15 pets represented by this row.

So pause here to think about or discuss in as much detail as possible what this row means.

This row means that there are two pupils who each had six pets.

Therefore, this row of data means that there are 12 pets across two different pupils.

To show this information for all rows of data, a third column can be created.

This column shows the total number of pets represented in that row of the table through a multiplication of the number of pupils in that row, the frequency, and the number of pets that each of those pupils has.

Finding the sum of all the totals in this column gives you the total number of pets across every pupil in the sample.

This is exactly the same as saying this total finds the sum of all the values of all the data points in a dataset that is displayed as a frequency table.
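As a rough sketch of that third column, here is how the multiplication could look in Python. Only the three rows read out above are included, so the sum covers just those rows rather than the whole table, and the names `rows`, `row_totals` and `partial_total` are made up for illustration.

```python
# Each pair is (number of pets, frequency), i.e. how many pupils had that many pets.
# Only the three rows mentioned in the lesson are included here, not the full table.
rows = [(2, 12), (5, 3), (6, 2)]

# Third column: pets per pupil multiplied by the number of pupils in that row.
row_totals = [pets * frequency for pets, frequency in rows]

# Sum of the third column for these three rows only.
partial_total = sum(row_totals)

print(row_totals)     # [24, 15, 12]
print(partial_total)  # 51
```

With every row of the table included, the same sum would give the total number of pets across the whole sample.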

Andeep seems happy enough with interpreting a frequency table that shows the frequencies of individual data values.

However, Andeep is struggling to understand what the information means in a frequency table that shows the frequencies of groups or intervals of data rather than individual data values.

So what does this bottom row of data in this grouped frequency table even mean? Well, this row means that there were 36 people due to the frequency being 36, where each person spent between 60 pounds and 100 pounds, which includes 100 pounds but excludes the 60 pounds due to the different inequality symbols used.

However, we do not know exactly how much each person spent.

All we know is an interval of possible amounts that each person spent.

Furthermore, each person may have spent the same or a different amount as each other.

For this check, pause here to identify which of these statements is correct for this row of the grouped frequency table.

This row means that 22 people spent between 10 pounds and 20 pounds each.

Andeep is fairly happy that this bottom row means that 36 people each spent between 60 pounds and 100 pounds.

But he asks, how can we use this information to find the exact total amount spent by these 36 people? Well, the answer is we can't.

We cannot find the exact amount, that is.

This is because we do not know how much each individual person spent, so we can't possibly find the sum total of all of the amounts of all 36 people.

However, we can find an estimate.

Pause here to think about or discuss.

If we know everyone in this group spent between 60 pounds and 100 pounds, what would be a sensible estimate for the amount that each individual person spends? We can consider the midpoint of 60 pounds and 100 pounds to be a sensible estimate for how much each person spent.

The midpoint of 60 pounds and 100 pounds is 80 pounds.

We can then use this estimate, the estimate that everyone in this group spent around 80 pounds each to also find an estimate for the total amount spent by the whole group.

This is approximately 80 pounds spent by each person times by the 36 people equals 2,880 pounds spent in total by all 36 of these people.

Okay, for this quick check, pause here to find the midpoint of this interval.

Well done if you spotted that the midpoint was 15 pounds.

We can find the midpoint of any two numbers by adding together the two numbers, then dividing the result by two.

This is actually just finding the mean of those two numbers for this question.

10 plus 20 is 30 and then 30 divided by two is 15.
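If it helps to see that rule written down, here is a minimal sketch in Python. The function name `midpoint` is just a label chosen for this example.

```python
def midpoint(lower, upper):
    """Return the midpoint of an interval: the mean of its two bounds."""
    return (lower + upper) / 2


print(midpoint(10, 20))   # 15.0
print(midpoint(60, 100))  # 80.0
```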

For this check, knowing that the midpoint of this interval is 15, pause here to calculate an estimate for the total amount that these 22 people spent at the supermarket by also showing your calculation.

Each person spent an estimate of 15 pounds and there are 22 people.

So 15 times 22 is 330.

Therefore, all 22 people spent a total of 330 pounds.

Okay, let's take all of this information about midpoints and the product of a midpoint and a frequency and formalise estimates for totals by considering two extra columns of information, a midpoint column and an estimated total column.

For example, 44 people spent between 40 pounds and 60 pounds.

An estimate for how much each person spent is 50 pounds each.

An estimate for how much all 44 people spent is 50 pounds times by 44 people, which is 2,200 pounds in total.

This value represents the 44 people who each spent between 40 pounds and 60 pounds.

They spent a combined total of approximately 2,200 pounds.

However, we do not know the exact amount, the exact total that they all spent.

Continuing with the check from before, pause here to write down the values of A, B, C, and D.

And well done if you've got these four answers.

For this next check, pause here to think about or discuss what does this value mean in this context? Write down a sentence explaining its meaning.

The 29 people, due to the frequency being 29, who each spent between 20 pounds and 40 pounds due to that interval spent an estimated total of 870 pounds combined where 870 is the midpoint of 30 times by the frequency of 29.
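The midpoint and estimated total columns for the rows we have talked through so far can be sketched in Python like this. The variable names are illustrative, and the rows of the table that have not been read out are left out rather than guessed.

```python
# (lower bound, upper bound, frequency) for the three rows discussed in the lesson.
# The remaining rows of the table are not read out, so they are not included.
known_rows = [(20, 40, 29), (40, 60, 44), (60, 100, 36)]

estimated_totals = []
for lower, upper, frequency in known_rows:
    mid = (lower + upper) / 2          # midpoint column
    estimated_total = mid * frequency  # estimated total column
    estimated_totals.append(estimated_total)
    print(f"{lower}-{upper}: midpoint {mid:g}, estimated total {estimated_total:g}")

# 20-40: midpoint 30, estimated total 870
# 40-60: midpoint 50, estimated total 2200
# 60-100: midpoint 80, estimated total 2880
```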

We can then take the sum of all of the estimated totals in the fourth column.

This gives an estimate for the total amount spent by everyone who took the survey, the total value of all data points in a dataset.

In this case, our estimate for the sum of all values in the dataset is 6,320 pounds.

This means that the total amount spent by everybody in the survey is approximately 6,320 pounds.

For this check, give yourself some time to look at this dataset.

Pause here to find the values A to F.

And pause here to match all of your answers to the ones on screen.

And pause here once more to write down a sentence that explains the value of this number in context.

Like all other values in the fourth column, this is an estimate, this time for the total amount spent due to this being the sum of all of the values above it.

It is the total amount spent by all customers who took the survey.

This represents the total value of all of the data points in the whole dataset.

Okay, great stuff.

Let's use what we've learned for this practise task.

Pause here to interpret and explain in as much detail as you can the many features of this dataset shown as a grouped frequency table.

And for question two, pause here to find all of the midpoints and the products of midpoints and frequencies.

And then use the information that you found to explain whether the statement that an estimate for the total mass of the cats is 435 kilogrammes is correct or not.

And for question three, pause here to fill in all missing values of this dataset.

Brilliant effort on all of your interpretations.

The answer to question 1a is that 48 pupils threw the javelin some distance between 10 and 20 metres, but we are unsure exactly what distance.

For part b, a is the midpoint of that interval.

The midpoint of 20 and 30 is 25.

And for part c, 525 metres is an estimate for the total distance travelled by all 21 javelins that each travelled between 20 and 30 metres.

So all 21 of the javelins in this row of data travelled a combined distance of approximately 525 metres.

And for part d, to find the value of b, we would multiply the frequency of 62 by the midpoint of five, giving 310.

For question 2a, pause here to compare your answers to the ones on screen.

And for part b, the sum of all values in the estimated total column is 435.

Since this total represents the total mass of all cats that the charity saved, this statement is correct.

And finally, pause here to check that all of your answers to question three match the ones on screen.

Finding estimated totals for either groups from a dataset or for the whole dataset is already fairly informative.

But can we use these totals to help us in other ways, such as when estimating the mean from a dataset? Well, let's find out.

So here's the data from the previous cycle with the estimated total amount of money spent for each row of data, as well as the total amount spent from everyone who took part in the survey.

But as Andeep asks, exactly how many people actually took part in the survey? Pause now to think about or discuss where on the grouped frequency table you need to look to find out this information.

Well, this row tells us that there were 36 people who each spent somewhere between 60 pounds and 100 pounds.

So 36 represents the number of people in just the 60 pounds to 100 pounds group.

How many people were there in each other group, in each interval? Well, the answers are all in this, the frequencies column, because the frequencies column represents all the people in all the groups.

The total number of people in the sample is therefore the sum of all of the values in this column, the frequencies column.

The sum of the frequencies column will always give you the total number of data points in any dataset.

So for this dataset, adding these frequencies gives a total of 139 people who participated in this investigation.

For this check, pause here to calculate how many people in total were a part of this sample.

By adding all the values in the frequencies column, we find out that 57 people were taken as the sample for this investigation.

Great, now we know both the total amount spent by everyone in the sample, as well as the total number of people in the sample.

But can we use any of this information to calculate the mean amount of money spent per person? Pause here to think about or discuss what information could be used to find the mean amount of money spent per person in this sample? As with all of our totals, the mean that we're able to calculate is only an estimate.

The estimate may be close to the correct amount spent per person, but there is no guarantee that it'll be a precise exact amount.

But how do we find an estimate for the mean amount spent per person? Well, remember that the mean of any dataset, whether it is shown as raw data or in a frequency table, is the sum of the values of all data points divided by the number of data points.

Where can we find these two pieces of information in this grouped frequency table? Well, for this dataset, the sum of all data points is the sum of all of the amounts that every person spent in the survey.

Everyone in the survey spent a total of 6,320 pounds, and the number of data points can also be seen as the total frequency of all the data points in the dataset where each data point in this example is a person.

There are 139 people in this survey.

And so there are a total of 139 data points in this dataset.

Therefore, our estimate for the mean amount of money spent per person is the total amount spent by everybody divided by the 139 people, which is this, an estimated mean of £45.47 spent per person.
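That division can be checked with a couple of lines of Python. This is only a sketch using the two totals from the table, with variable names chosen for illustration.

```python
estimated_grand_total = 6320  # pounds, the sum of the estimated total column
total_frequency = 139         # people, the sum of the frequencies column

# Estimated mean = estimated grand total divided by total frequency.
estimated_mean = estimated_grand_total / total_frequency

print(f"£{estimated_mean:.2f}")  # £45.47
```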

I love Andeep's check here.

We know that £45.47 is a sensible estimate since it's a value that looks somewhere close to the middle of the data.

£45.47 lies within one of the intervals of our dataset.

Jun, on the other hand, calculated an estimate for the mean as 1,264 pounds.

Pause here to think about or discuss why we know for certain that this value is incorrect even without doing any calculations to check for ourselves.

1,264 pounds lies way outside any of our intervals, well above the maximum possible data point of 100.

The mean, whether estimated or exact, can never be greater than the maximum data point or less than the minimum data point.

Pause here to think about or discuss.

What might Jun have done incorrectly to get this very peculiar value of 1,264 pounds? Jun divided 6,320 pounds by five because there were five rows.

This is incorrect as each row represents multiple pieces of data.

And to find the mean, you must divide by the total number of data points.

This is a common misconception.

When finding the mean from any frequency table, you should never divide by the number of rows, especially in a grouped frequency table.

The number of rows is fairly meaningless as the number of rows just represents the number of groups we have chosen to separate our data into.

It reflects nothing on the number of data points in our dataset.

We should always divide by the sum of all the values in the frequency column as that always represents the total number of data points in a dataset.
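To see the difference between the two divisions side by side, here is a small Python sketch using the figures from this survey. The variable names are made up for this example, and the check against 100 pounds uses the largest upper bound mentioned for this table.

```python
estimated_grand_total = 6320  # pounds
number_of_rows = 5            # how many groups the table happens to use
total_frequency = 139         # how many people were actually surveyed

wrong_mean = estimated_grand_total / number_of_rows    # Jun's method
better_mean = estimated_grand_total / total_frequency  # divide by total frequency

# No one spent more than 100 pounds, so a mean above that cannot be right.
largest_upper_bound = 100
print(wrong_mean, wrong_mean <= largest_upper_bound)              # 1264.0 False
print(round(better_mean, 2), better_mean <= largest_upper_bound)  # 45.47 True
```

A quick range check like this cannot prove that an estimate is correct, but it does catch answers that are certainly wrong.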

Okay, for this check, without doing any calculations, pause here to consider which of these estimated means is sensible for this dataset shown as a grouped frequency table? Okay, only £18.99 and £68.25 are within any interval in this dataset.

And £18.99 seems really, really low when considering the data could go anywhere between zero pounds and 300 pounds.

Therefore, the only sensible estimate for the mean is £68.25, which is actually the correctly calculated estimated mean.

But how did I calculate the estimated mean to be £68.25? Pause here to write down the calculation required to calculate the estimated mean for this dataset.

The estimated mean is the total amount spent by everybody of 3,890 pounds divided by the total number of people or the total frequency of 57.

And now for this last check, there is a lot of information missing from this grouped frequency table.

Pause now to perform as many calculations as necessary to find an estimate for the mean amount spent per person from this sample.

We first need to calculate an estimate for the total amount spent from the five people in the bottom row at 600 pounds.

We then need to calculate the total amount spent by everybody and the total number of people in the sample, which are these two numbers.

Then we need to perform a division of the total amount spent divided by the total number of people to give a sensible estimate of the mean at £36.64 per person.

Okay, great stuff.

Here are the final few practise questions.

For question one, find the total number of rabbits in the sample and explain why 1,053 kilometres per hour is an extraordinarily incorrect estimate for the mean speed of a rabbit.

And then calculate a sensible estimate for the mean speed of a rabbit yourself.

Pause now to do this.

For question two, pause here to fill in all of the gaps in this frequency table and hence, calculate an estimate for the mean of this dataset.

And finally, question three, the same dataset is shown in two different frequency tables.

Pause now to calculate an estimate for the mean from both tables and calculate the difference between the two estimates.

Great effort on attempting all of these questions.

The answer to question 1a is the total number of rabbits is 210.

For part 1b, 1,053 kilometres per hour is well beyond the maximum possible speed in this table of 50 kilometres per hour.

Not even some of the fastest trains in the world can travel over 1,000 kilometres per hour yet.

So this speed is very much impossible for a small rabbit to achieve.

This incorrect result may have occurred because Aisha divided the sum of all speeds, 5,265, by the five rows, not the total frequency of 210 rabbits.

And for part 1c, a suitable estimate for the mean is just over 25 kilometres per hour at 25.07 kilometres per hour.

This seems a lot more sensible for a rabbit to achieve.

For question two, pause here to compare all the calculations in your table to the one on screen where your estimate for the mean should be 1,454 hours.

And for question 2c, our mean was only an estimate because we do not know the exact values of all the data points.

We used the midpoint of each interval as an estimate for these unknown values.

And finally, question three, pause here to check that all of your calculations across both tables match the ones on screen, with the difference between both estimated means at 1.99 metres.

Amazing work on extending your knowledge of the mean in a lesson where we have estimated the values of a data point in an interval to be the midpoint between the two bounds of that interval.

The midpoint of an interval multiplied by the number of data points in that interval gives an estimate for the total value of all data points in that interval where the sum of all these totals gives an estimate for the total value of all data points in a dataset.

We can then divide this grand total by the number of data points in the dataset, the total frequency of the dataset in order to find an estimate for the mean value of one data point in that dataset.
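Putting those three steps together, the whole method might be sketched in Python as below. The function name `estimate_mean` and the small table are invented purely for illustration. The table is hypothetical and is not one of the datasets from this lesson.

```python
def estimate_mean(table):
    """Estimate the mean of grouped data.

    `table` is a list of (lower bound, upper bound, frequency) tuples.
    """
    grand_total = 0
    total_frequency = 0
    for lower, upper, frequency in table:
        midpoint = (lower + upper) / 2        # estimate for each data point
        grand_total += midpoint * frequency   # estimated total for the interval
        total_frequency += frequency          # number of data points so far
    return grand_total / total_frequency


# Hypothetical table, made up for this example only.
example_table = [(0, 10, 4), (10, 20, 6), (20, 40, 10)]
print(estimate_mean(example_table))  # 20.5
```

Notice that the division is by the total frequency of 20, not by the three rows of the table.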

Thank you all so much for all of your attention and effort during this lesson.

Until our next math lesson together with me, Mr. Grattan, take care and enjoy the rest of your day.
