Hi everyone.

My name is Ms. Ku and I'm really excited to be learning with you today as we're looking at histograms. Histograms are fantastic graphical representations of data.

I hope you enjoy the lesson.

Let's make a start.

Hi everyone and welcome to this lesson on problem solving with cumulative frequency and histograms, under the unit graphical representations of data: cumulative frequency and histograms. By the end of the lesson you'll be able to use your knowledge of cumulative frequency and histograms to solve problems. Today's lesson will consist of lots of keywords, starting with box plot. A box plot is a diagram that clearly shows the minimum and maximum value of a data set along with the three quartiles.

We'll also be looking at the term lower quartile. It's the value under which 25% of the data points are found when they are arranged in increasing order, also known as the first quartile.

We'll also look at the upper quartile. It's the value under which 75% of the data points are found when they are arranged in increasing order, also known as the third quartile.

Lastly, a histogram.

A histogram is a diagram consisting of rectangles, whose area is proportional to the frequency in each class and whose width is equal to the class interval.

Today's lesson will be broken into two parts.

We'll be looking at problem solving with cumulative frequency and boxplots first, and then problem solving with histograms. So let's make a start.

Problem solving with cumulative frequency and boxplots.

Here are some heights of people from a local sports centre, and Andeep says 75% of the people are between 1.68 metres and 1.48 metres because they're between the upper and lower quartile.

Is Andeep correct? Have a little think, press pause if you need.

Well done.

Well, hopefully you spotted that this is the lower quartile. The lower quartile is 1.48 metres, the value under which 25% of the data points are found when they are arranged in increasing order.

Now this is the upper quartile. The upper quartile here is 1.68 metres, the value under which 75% of the data points are found when they're arranged in increasing order.

Therefore, we know the interquartile range represents the middle 50% of the data.

So Andeep is incorrect because 50% of the people are in between 1.68 metres and 1.48 metres inclusive.
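To see the "middle 50%" idea with numbers, here is a minimal sketch. The heights are hypothetical (one person at each centimetre from 138 cm to 178 cm), chosen only so that the quartiles land on the 1.48 metres and 1.68 metres from the example; they are not the sports centre's data.

```python
from statistics import quantiles

# Hypothetical heights in centimetres, one person per centimetre,
# chosen so the quartiles match 1.48 m and 1.68 m.
heights_cm = list(range(138, 179))

# n=4 splits the data into quarters and returns the three quartiles.
lower_q, median, upper_q = quantiles(heights_cm, n=4, method="inclusive")

# Count the people whose height is between the quartiles, inclusive.
between = [h for h in heights_cm if lower_q <= h <= upper_q]
share_between = len(between) / len(heights_cm)

print(lower_q, upper_q)         # 148.0 168.0
print(round(share_between, 2))  # 0.51
```

About half of the people sit between the two quartiles, not three quarters, which is why Andeep's 75% cannot be right.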

Now let's have a look at a check.

Two classes took a test and Laura says both groups performed exactly the same, they have the same range and the same average.

Explain why Laura is partially wrong and state the difference between the two classes.

See if you can give it a go.

Press pause if you need more time.

Well done.

Well, hopefully you've spotted both classes do have the same median.

It's 22, and both classes do have the same range.

38 subtract 6 is 32.

However, the interquartile range is different.

Class A has an interquartile range of 26, subtract 18, which is eight, and class B has an interquartile range of 28 subtract 14, which is 14.

This means the middle 50% of class A's scores were more consistent, sitting closer to the median, than class B's, as the interquartile range is smaller.
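The comparison can be sketched in a few lines. The medians and quartiles are the values read out in this check; the sketch assumes both classes share the minimum of 6 and the maximum of 38, since both are said to have the same range.

```python
# Five-number summaries for the two classes.
# Assumption: both classes have a minimum of 6 and a maximum of 38.
classes = {
    "A": {"min": 6, "lower_q": 18, "median": 22, "upper_q": 26, "max": 38},
    "B": {"min": 6, "lower_q": 14, "median": 22, "upper_q": 28, "max": 38},
}

def spread(summary):
    """Return (range, interquartile range) for one five-number summary."""
    data_range = summary["max"] - summary["min"]
    iqr = summary["upper_q"] - summary["lower_q"]
    return data_range, iqr

for name, summary in classes.items():
    data_range, iqr = spread(summary)
    print(f"Class {name}: median {summary['median']}, range {data_range}, IQR {iqr}")
# Class A: median 22, range 32, IQR 8
# Class B: median 22, range 32, IQR 14
```

Same median, same range, but a different interquartile range, so Laura is only partly right.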

Well done if you got this.

So we can tell a lot about data sets even if no values are given.

For example, here we have three box plots showing the height of pupils in different classes.

Class A, class B, and class C, and we have no scale.

So although no values are given, we don't need to know the scale as we know it's the same for all three box plots.

So what we're going to do now is identify and explain which class has the highest average height, which class has pupils with very similar heights, and which class is on average nearly three quarters the height of another class.

And we need to explain why as well.

Well let's have a look at part A.

Class C has the highest average height as the median is further along the scale than the other classes.

Even though we don't have any numerical values, we don't need to because we know that the scale is the same for all three classes.

And the further along the scale, we know the bigger the height.

Now for B, we know class C is the class with pupils of similar heights, given the interquartile range is the smallest and the range is the smallest compared to the other classes.

Lastly, C.

Class B's median is three quarters the median of class A. Regardless of the numerical scale, the median of class A is 12 units and the median of class B is nine units.

So now let's have a look at another question where once again, we are not given a scale.

The average house prices in different regions across the UK were plotted as were the average house prices in different regions of London.

You can see the rest of England and Wales box plot is in green and London box plot is given in blue.

The first question asks, are London house prices on average more than the rest of England and Wales? And using the data, justify how you know.

See if you can give it a go.

Press pause if you need more time.

The next question asks, are London house prices more consistent than the rest of England and Wales? And I want you to use the data to justify how you know.

See if you can give it a go.

Press pause if you need more time.

Well done.

Let's see how you got on.

Well, for the first part, yes, the house prices on average are more than the rest of England and Wales, and this is because the median house price for London is further to the right than the rest of England and Wales.

The second question says, are London house prices more consistent than the rest of England and Wales? And we have to justify our answer.

Well, the interquartile range is approximately the same.

It's approximately three unit squares, but the range for London is much larger.

So therefore the London prices are less consistent.

Really well done if you've got this.

When the numerical scale is missing from a box plot, it is still possible to see the quartiles and ranges. As long as the scale is the same for multiple box plots, we can compare them even when the values on the scale are unknown.

So will the same be true for cumulative frequency graphs? Well, here we have three cumulative frequency graphs plotted against the same scaled axis, and each graph shows test results from a different class.

Which graph shows the class that performed best on average? And I want you to explain how you know.

See if you can give it a go.

Press pause if you need more time.

Well done.

The median is the most common average we use when interpreting cumulative frequency diagrams.

Given the median position will be the same for all graphs, we can see where the median falls as we move along each graph.

Now the median is further along for graph B than any other graph, so therefore the average test percentage for B is the highest.

Well done if you got this.

Now, which class do you think performed the worst on average? And I want you to explain how you know.

See if you can give it a go.

Press pause if you need more time.

And which class were the least consistent with their percentage scores? And once again, explain how you know.

See if you can give it a go.

Press pause if you need more time.

Well done.

Well, first part, it's C.

C has the lowest median.

Well done if you got this.

And for part B, where we can see the consistency with the interquartile range, knowing the relative position of quartile one and quartile three, lower quartile and upper quartile, we can see the size of the interquartile range.

Class A has the least consistent results, given it has the largest interquartile range.

Well done if you've got this.

Fantastic work everybody.

So now it's time for your task.

Here's some data on the number of bounces different balls made after they were dropped.

I want you to identify the mistake with this boxplot.

See if you can give it a go.

Press pause if you need more time.

Well done.

For question two, the following cumulative frequency graphs and boxplots are shown and all have the same scaled axes.

I want you to match the graph with the correct boxplot.

See if you can give it a go.

Press pause if you need more time.

Well done.

Move on to question three.

An Oak teacher conducted an experiment on his pupils.

He gave them a university maths multiple choice paper.

Bit cruel, but he did it.

There were four options to choose from, with only one answer being correct.

Each pupil had no idea what to do and just guessed.

And there were 40 questions in total.

Which boxplot is a likely box plot to show how the Oak pupils performed and explain your answer.

See if you can give it a go.

Press pause if you need more time.

Well done.

Let's have a look at these answers.

Well, for question one, the interquartile range was plotted as the upper quartile. Adding the interquartile range to the lower quartile will give the correct position of the upper quartile.

Here's the correct boxplot.

Next, we should have matched these boxplots with these cumulative frequency curves.

Really well done if you got this.

And lastly, it is B because on average, pupils have a one in four chance of getting a question correct.

This is a quarter.

So on average most pupils should get a quarter of the marks correct through guessing.
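The one-in-four reasoning can be checked with a short sketch. The expected mark is straight from the question (40 questions, four options each); the simulation of 1000 guessing pupils is a hypothetical illustration, and the fixed seed is only there to make it repeatable.

```python
import random
from fractions import Fraction
from statistics import median

QUESTIONS = 40
P_CORRECT = Fraction(1, 4)  # four options, one correct, pure guessing

# Expected number of marks from guessing: 40 * 1/4.
expected_marks = QUESTIONS * P_CORRECT

def guess_paper(rng):
    """Marks for one pupil who guesses every question."""
    return sum(rng.randrange(4) == 0 for _ in range(QUESTIONS))

# Hypothetical simulation of 1000 pupils who all guess.
rng = random.Random(0)
scores = [guess_paper(rng) for _ in range(1000)]
typical = median(scores)

print(expected_marks)  # 10
print(typical)         # the simulated median, which should be close to 10
```

So the likely box plot is the one whose median sits at about a quarter of the way along the 0 to 40 scale.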

Really well done.

That was a tough one.

Fantastic work everybody.

So let's have a look at problem solving with histograms. Is it possible to determine information from a histogram without a known scale for the Y axis? Well, here we have a histogram showing the masses of apples, in grammes, from a farm shop.

And we know 4% are greater than 180 grammes.

The question asks approximately what percentage of the apples are less than or equal to a hundred grammes? Well, one method is to keep using proportions.

We don't need to know the frequency density or the frequency in order to work out the proportion of apples which are less than or equal to a hundred grammes.

So we know this area of the bar represents 4%, 4% of our apples.

Given that the area is eight squares, this means eight squares is 4%.

So that means we know one square is 0.5%.

From here we can look at the area which represents less than or equal to a hundred grammes and we can simply count the squares.

Counting the squares, there are 72 squares, so that means 36% of the apples are less than or equal to a hundred grammes.

Notice how we've worked this proportion out without identifying a frequency or a frequency density.
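The counting-squares method is just scaling a known proportion, and can be sketched like this. The function name `percent_from_squares` is made up for illustration; the numbers are the ones from the apples example.

```python
from fractions import Fraction

def percent_from_squares(known_squares, known_percent, target_squares):
    """Scale a known (squares, percent) pair to another area on the same grid."""
    percent_per_square = Fraction(known_percent) / known_squares
    return percent_per_square * target_squares

# 8 squares represent 4% of the apples, so each square is 0.5%.
per_square = percent_from_squares(8, 4, 1)

# 72 squares cover the apples that are less than or equal to 100 grammes.
at_most_100g = percent_from_squares(8, 4, 72)

print(float(per_square))    # 0.5
print(float(at_most_100g))  # 36.0
```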

Another method is to simply assign a scale to the frequency density, and this might not give the correct frequencies from our farm shop, but the frequencies will be in proportion.

For example, I'm going to assign these frequency densities and then from these frequency densities work out frequencies based on this scale.

Based on this scale, I've come up with these frequencies.

So let's work out the proportion of this bar needed.

We know we need to identify the proportion of the apples, which are less than or equal to a hundred grammes.

Well, for less than or equal to a hundred grammes, looking at this bar, we can split this bar into 60 and 60.

So that means the proportion which are less than or equal to a hundred grammes is 72 out of 200, which once again is 36%.
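Why does any assigned scale work? Here is a minimal sketch under stated assumptions. The class boundaries and bar heights below are invented, picked only so the totals agree with the figures in this example (72 out of 200 at or below 100 grammes, a bar that splits into 60 and 60, and 4% above 180 grammes). The real histogram's classes are not given here.

```python
from fractions import Fraction

# Hypothetical bars: (lower bound, upper bound, bar height in grid units).
# The boundaries are invented; the heights are picked to match the
# lesson's totals.
bars = [
    (40, 80, Fraction(3, 10)),
    (80, 120, Fraction(3)),
    (120, 180, Fraction(1)),
    (180, 220, Fraction(1, 5)),
]

def share_at_most(bars, limit, scale=1):
    """Proportion of the data at or below `limit` when one grid unit of
    height is read as `scale` units of frequency density."""
    total = 0
    below = 0
    for lower, upper, height in bars:
        density = height * scale
        total += density * (upper - lower)  # frequency = density * class width
        # Width of this bar at or below the limit, assuming the data is
        # spread evenly within the class.
        covered = max(0, min(upper, limit) - lower)
        below += density * covered
    return below / total

# Try three different assigned scales for the frequency density axis.
for scale in (1, 5, Fraction(1, 3)):
    print(scale, float(share_at_most(bars, 100, scale)))  # 0.36 each time
```

The scale multiplies every frequency by the same amount, so it cancels when we take the proportion. That is why the assigned frequencies might not be the farm shop's real ones but still give 36%.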

So now let's have a look at a check.

Oak pupils measure the length of different pieces of wood.

32% of the data lies between 20 centimetres and 30 centimetres.

And the question wants us to estimate the percentage of data that is less than or equal to 28 centimetres.

See if you can give it a go.

Press pause if you need more time.

Well done.

Let's see how you got on.

Well, we know 40 squares represent this 32%.

So that means one square is 0.8%.

If we know 112 squares represent less than or equal to 28 centimetres, this works out to be 89.6%.

Massive well done if you got this.

Alternatively, you may have assigned those frequency densities again and calculated possible frequencies to see the proportion, using a scaled axis increasing equally, and then identified these frequencies.

Then using this, we know the total frequency which is less than 28 centimetres means that this bar must have a frequency of 12.

From here, 44.8 out of 50 is 89.6% again.
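As a quick check that the two methods agree, here is a small sketch using only the numbers from this check. Exact fractions are used so that no rounding creeps in.

```python
from fractions import Fraction

# Method 1: counting squares. 40 squares represent 32%, and 112 squares
# lie at or below 28 centimetres.
by_squares = Fraction(32, 40) * 112

# Method 2: assigned frequencies. 44.8 out of a total of 50 lie at or
# below 28 centimetres.
by_frequencies = Fraction("44.8") / 50 * 100

print(float(by_squares), float(by_frequencies))  # 89.6 89.6
```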

Fantastic work everybody.

So now it's time for your task.

Here we have a histogram and it shows the mass of sweets in grammes.

We know 10% of the sweets are less than or equal to 10 grammes.

What percentage are more than 18 grammes? You've got a couple of options here.

How are you going to work this out? Are you going to count the squares and identify the percentage or are you going to assign a scale to the frequency density? See if you can give it a go, press pause if you need more time.

Well done.

Let's have a look at question two.

Here are the masses of different sweets in grammes.

Each bar represents a third of the data.

What percentage of the sweets are less than or equal to 18 grammes? And I want you to give your answer to one decimal place.

And again, how are you going to work this out? Are you going to count the squares and identify the percentage or are you going to assign a scale to the frequency density? See if you can give it a go, press pause if you need more time.

Well done.

Now let's go through our answers.

Here we have a histogram and it shows the masses of sweets in grammes.

We know 10% of the sweets are less than or equal to 10 grammes.

What percentage are more than 18 grammes? Well, four squares represent 1%.

So that means, answering the question, 62% are more than 18 grammes.

Really well done if you got this.

And this was a great question.

Here are the masses of different sweets in grammes.

Each bar represents a third of the data.

What percentage of the sweets are less than or equal to 18 grammes? And I want you to give your answer to one decimal place.

Knowing that 160 squares is 33.3% recurring, that means we have 58.3% that are less than or equal to 18 grammes.

Great question.

So in summary, information can be deduced from a graph even when the scale for an axis is unknown.

And we can even compare graphs, as long as we know the scale is the same.

Our understanding of proportion can help with this.

Massive well done everybody.

It was really hard today, but I hope you've enjoyed using cumulative frequency curves, boxplots and histograms in problem solving questions and deepening that understanding of those key words.

Very well done.

It was great learning with you.
