Hi everyone.

My name is Ms. Ku and I'm really excited to be learning with you today because today we're looking at histograms under the unit of graphical representations of data.

Hope you enjoy the lesson, let's make a start.

Hi everyone and welcome to this lesson on summary statistics from histograms and it's under the unit graphical representations of data, cumulative frequency and histograms. And by the end of the lesson you'll be able to estimate the mean and median from a histogram.

We've got quite a few keywords here, so let's have a look.

The arithmetic mean for a set of numerical data is the sum of the values divided by the number of values and it's a measure of central tendency representing the average of the values.

We'll also be looking at frequency density, and frequency density is proportional to the frequency per unit for the data in each class.

Often the multiplier is one, meaning that frequency density is equal to frequency divided by class width.

And a histogram.

A histogram is a diagram consisting of rectangles, whose area is proportional to the frequency in each class and whose width is equal to the class interval.

Today's lesson will be broken into two parts.

We'll be looking at the mean from a histogram and then the median from a histogram.

So let's make a start looking at the mean from a histogram.

A group of pupils played a game with difficulty levels easy and hard, and the times show how long pupils played until they lost their first life.

Explain which histogram shows the times for the harder level.

So you can give it a go.

Press pause if you need more time.

Let's see how you got on.

Well this is the harder level.

Although we don't know the numerical values of the frequency density, we do not need to: as the frequency density is proportional to the frequency per unit for the data in each class, we can see that most pupils lost their first life within 15 seconds of playing the game.

This must be the easier level as most pupils lost their first life much later, meaning 30 seconds onwards.

Histograms are just fantastic as they show the distribution of the data.

And we're not even given the frequency densities here, but you can see and identify which set of data represents which group.

Fantastic.

So here are three histograms of three groups of people typing the alphabet as quickly as they can.

We have primary school pupils, we have teachers and we have pensioners.

Now the frequency density scale is the same for all the graphs.

Which graph do you think represents which group? And I want you to explain how you know, so you can give it a go.

Press pause if you need more time.

Well done.

Well the first graph are the primary school pupils.

The second graph are the teachers and the last graph are the pensioners.

So let's have a look at why. Well, this middle graph has to be the teachers, as the teachers are the quickest, given most typed the alphabet in less than 15 seconds.

This indicates that they are pensioners as most people have typed the alphabet after 25 seconds.

And lastly, this has to be the primary school pupils as they are generally quick with most being less than 20 seconds.

Really well done if you've got this.

We can also approximate averages to provide statistical values for comparison.

And this is much better than saying most people as we can assign a statistical value.

So let's look at the data from primary school pupils typing the alphabet as quickly as they can and we can work out an estimate mean from this histogram.

One option is to construct a frequency table given the information in the histogram.

So we can draw up the table with the time intervals given.

Here you can see my frequency table between zero and 10 seconds, between 10 and 15 seconds, between 15 and 20 seconds and between 20 and 25 seconds.

Pay careful attention to my notation and inequalities.

Clearly we don't know the frequencies just yet, so how do we work out those frequencies? Well, firstly, let's insert our class intervals.

We know the class widths are 10, five, five, and five.

Next we can extract the frequency densities from our histogram.

We know the first bar has a frequency density of 0.3 and the following bars have frequency densities of 1.6, 6, and 1.8.

So from here we can calculate the frequencies using the table or the area of the bar.

So 0.3 multiplied by 10 is 3, 1.6 multiplied by 5 is 8, 6 multiplied by 5 is 30 and 1.8 multiplied by 5 is 9.

This is a really nice way of centralising all our information so we can identify our frequencies.
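The frequency calculation above can be sketched in a few lines of Python. This is only an illustration: the function name `frequencies_from_histogram` is made up for this example, and the boundaries and frequency densities are the ones read out for the primary school pupils' histogram.

```python
# Class boundaries (seconds) and frequency densities as read out in the lesson
# for the primary school pupils' histogram.
boundaries = [0, 10, 15, 20, 25]
frequency_densities = [0.3, 1.6, 6, 1.8]


def frequencies_from_histogram(boundaries, densities):
    """Return the frequency of each bar: class width multiplied by frequency density."""
    widths = [upper - lower for lower, upper in zip(boundaries, boundaries[1:])]
    # round() only tidies possible floating-point noise in the products
    return [round(width * density, 10) for width, density in zip(widths, densities)]


frequencies = frequencies_from_histogram(boundaries, frequency_densities)
print(frequencies)       # frequency (area) of each bar
print(sum(frequencies))  # total number of pupils
```

Each frequency is simply the area of its bar, so the same function works for any histogram once the boundaries and densities are read off the axes.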

Alternatively, we can simply annotate the frequencies on the graph because we know the frequency is found using the area.

Here we know this has a frequency of three.

Here we know this has a frequency of eight.

Here we know this has a frequency of 30 and here we know this has a frequency of nine.

The second method is more concise and easier and a little bit less work too.

So now we've completed our frequency table, we can find our estimate mean.

Work out the midpoint of our class intervals first.

So I've added this extra column identifying the midpoint.

The midpoint between naught and 10 is five.

The midpoint between 10 and 15 is 12.5.

The midpoint between 15 and 20 is 17.5 and the midpoint between 20 and 25 is 22.5.

Now let's work out the estimate total of each interval.

We simply multiply the frequency by the midpoint giving me these values.

Now let's work out our estimate mean.

And to find this we have to sum the total frequency and we'd sum the total estimate times.

The total frequency in this case is 50 and the total estimate times is 842.5.

So we can work out our estimate mean by simply doing 842.5 divided by our 50, which gives me 16.85 seconds.

In other words, on average it takes primary school pupils 16.85 seconds to type the alphabet.
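As a rough check of the working above, here is a minimal Python sketch of the estimate mean. The function name `estimated_mean` is an assumption made for this example. The boundaries and frequencies are the ones found for the primary school pupils.

```python
def estimated_mean(boundaries, frequencies):
    """Estimate the mean of grouped data using class midpoints."""
    # Midpoint of each class interval, e.g. 0 to 10 gives 5
    midpoints = [(lower + upper) / 2 for lower, upper in zip(boundaries, boundaries[1:])]
    # Estimate total for each class: frequency multiplied by midpoint
    estimate_total = sum(f * m for f, m in zip(frequencies, midpoints))
    return estimate_total / sum(frequencies)


# Primary school pupils: class boundaries in seconds and the frequencies found earlier
pupil_boundaries = [0, 10, 15, 20, 25]
pupil_frequencies = [3, 8, 30, 9]

pupil_mean = estimated_mean(pupil_boundaries, pupil_frequencies)
print(pupil_mean)
```

The result is only an estimate because every value in a class is treated as if it sat at the midpoint of that class.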

Now let's use this histogram to work out the estimate mean for the time taken for our teachers to type the alphabet.

What I'd like you to do is use this histogram, construct any tables you need and work out that estimate mean.

Great work.

So let's see how you got on.

Here all I've done is extract my frequencies using the class width and the frequency density and I identify my frequencies.

I can work out the midpoint and then multiply the frequency by the midpoint.

So from here I can work out the total frequency to be 50 and the estimate sum of our times to be 687.5.

Working out our estimate mean, it's simply 687.5 divided by 50, which is 13.75.

In other words, on average it took our teachers 13.75 seconds to type the alphabet.

Really well done if you've got this.

Press pause if you want to copy this working out down.

Well done everybody.

Using the histogram I want you to work out the estimate mean time to one decimal place for the time taken for the pensioners to type the alphabet.

See if you can give it a go.

Press pause if you need more time.

Well done.

Let's move on to question two.

Here is a histogram showing the times to complete a test in minutes, and Lucas says the modal class is 20 < t ≤ 22 because it's the highest bar in the histogram.

Is Lucas correct? And part B says, work out the estimate mean to one decimal place.

See if you can give it a go.

Press pause if you need more time.

Well done.

Let's move on to these answers.

Well, for question one, you should have had 27.9 seconds.

In other words, it took our pensioners on average 27.9 seconds to type the alphabet.

Next, is Lucas correct? No, he's incorrect.

The modal class is the class with the highest frequency, not the highest frequency density.

The frequency is the area of the bar.
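To see why the tallest bar is not always the modal class, here is a small Python sketch. The numbers are invented purely for illustration and are not read from the histogram in the question. Only the idea that frequency equals class width multiplied by frequency density comes from the lesson.

```python
# Hypothetical histogram: these values are made up for illustration only.
boundaries = [0, 10, 20, 22, 30]            # class boundaries (minutes)
frequency_densities = [1.5, 2.0, 4.0, 0.5]  # height of each bar

widths = [upper - lower for lower, upper in zip(boundaries, boundaries[1:])]
# Frequency is the area of each bar
frequencies = [width * density for width, density in zip(widths, frequency_densities)]

tallest_bar = frequency_densities.index(max(frequency_densities))
modal_class = frequencies.index(max(frequencies))

print("Tallest bar:", boundaries[tallest_bar], "to", boundaries[tallest_bar + 1])
print("Modal class:", boundaries[modal_class], "to", boundaries[modal_class + 1])
```

In this made-up data the tallest bar is narrow, so its area is small, and a wider, shorter bar holds more of the data.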

Next, showing my working out, we should have had an estimate mean of 15.4.

Really well done if you've got this.

Press pause if you want to copy this working out down.

Excellent work everybody.

So let's move on to the second part of our lesson.

Median from a histogram.

An estimate for the median can also be calculated from a histogram, but firstly, if the data was raw, how would you find the median? Well, it's the middle value of N pieces of data and we can find it by crossing out numbers to find the middle.

For example, here you can see a set of numbers and all I've done is put them in ascending order.

I'm crossing out these numbers to identify the middle value to be 17.

So that means the median is 17.

But a more efficient approach is to use this formula: N add one, all over two.

Where N is the number of data values.

Here we have 13 data values.

So the position of the median is 13 add one, all divided by two, which is the seventh value.

So let's count one, two, three, four, five, six, seventh value.

Here is our seventh value and we already knew this from before.

The median is 17.

This is a more efficient way to work out the median, where we're simply using the formula N add one, divided by two, and it'll tell me which value is the median.
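The position formula can be sketched in Python as follows. The function name `median_by_position` and both lists of numbers are hypothetical, chosen only so that the first list has 13 values with 17 in seventh place.

```python
def median_by_position(values):
    """Find the median using the position (N + 1) / 2 in the ordered data."""
    ordered = sorted(values)  # the data must be in order first
    position = (len(ordered) + 1) / 2  # 1-based position of the median
    lower = ordered[int(position) - 1]
    if position == int(position):
        # Whole-number position: the median is that value
        return lower
    # A position such as 4.5 means halfway between the 4th and 5th values
    upper = ordered[int(position)]
    return (lower + upper) / 2


# Hypothetical data: 13 values, so the median is the 7th value
odd_example = [3, 5, 8, 10, 12, 15, 17, 20, 21, 24, 26, 28, 30]
print(median_by_position(odd_example))

# Hypothetical data: 4 values, so the position is 2.5
even_example = [1, 2, 3, 4]
print(median_by_position(even_example))
```

When the position is not a whole number, the sketch takes the mean of the two values either side of it.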

So using N plus one divided by two to find the position of the median, I want you to identify the median from these lists of data.

See if you can give it a go.

Press pause if you need more time.

Well done.

Let's see how you got on.

Well for part A, hopefully you spotted we have nine pieces of data.

So nine, add one, divide by two, means we are looking at the fifth value.

The fifth value here is our six, and that means the median is six.

Next we have eight values here.

So eight, add one, divide by two is the 4.5th value.

In other words, it's in between our 7.7 and our 7.8.

So that means we find the mean of 7.7 and 7.8, which is 7.75.

Next, this was a tricky one, you needed to put them in order first.

Ascending order or descending order doesn't make a difference.

We can still identify the median.

Well we know there are seven data values.

Seven add one divided by two means the fourth value is our median.

So our median is 19.0.

Massive well done if you've got this.

So let's use this information with our histogram.

The histogram shows the time it takes to draw a sketch of a bike.

How would you find the number of data values? Well, we calculate the area of each bar and then sum.

So we know the area of this bar is two.

In other words, the frequency of this bar is two.

In other words, it took two people between zero and 10 seconds to draw a bike.

Next, it took eight people between 10 and 15 seconds to draw this bike as the area is our frequency.

Next, let's work out the area of this bar.

Well, the class width is 12, and multiplying by our frequency density of 2.5 gives me a frequency of 30.

And lastly, the frequency of this bar is three times three, which is nine.

So now we know our frequencies.

We know that there are two, add eight, add 30, add nine pieces of data.

So there are 49 data values in total.

Given we know there are 49 data values, N add one, divide by two means the 25th value will be the estimate median.

But how do we find the 25th value if it's in a histogram? Well, to do this, we know this represents two pieces of our data.

Two pieces of our data lie in this class.

Here we have eight pieces of data.

So that means within these two classes we have 10 pieces of data.

We still have not found the 25th value.

Now if we sum these two, add eight, add 30, we have 40 pieces of data.

So we have gone way past that 25th value.

So that means we need 15 more pieces of data from this bar in order to get the 25th value.

So what proportion is 15 out of that frequency of 30? Well, 15 over 30 is a half.

We know the class width of this bar is 27 subtract 15, which is 12, so we simply multiply the proportion of this bar we want by the class width.

So it's a half times 12, which is six.

So this means the 15 pieces of data that we want lie in the first six seconds of this class.

From here we can work out the estimate median, as 25 pieces of data lie in the interval between zero and 21 seconds.

That means the 26th piece of data lies just outside of this interval, so therefore the estimate median, in other words the 25th value of our 49 pieces of data, is 21 seconds.

In summary, to work out the estimate median, we find which bar the median lies in.

We work out the proportion of the frequency we want.

We multiply this proportion by the class width of the bar, and then add it to the lower bound of the interval to give us that estimate median.

In this case the estimate median was 21 seconds.
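The steps in this summary can be sketched in Python. The function name `estimated_median` is made up for this example. The frequencies 2, 8, 30 and 9 and the boundaries 0, 10, 15 and 27 come from the bike histogram. The upper boundary of 30 for the last class is an assumption and does not affect the answer.

```python
def estimated_median(boundaries, frequencies):
    """Estimate the median from grouped data by taking a proportion of one bar."""
    total = sum(frequencies)
    position = (total + 1) / 2  # position of the median, e.g. the 25th of 49 values
    cumulative = 0              # how many values lie in the bars before this one
    for lower, upper, frequency in zip(boundaries, boundaries[1:], frequencies):
        if cumulative + frequency >= position:
            needed = position - cumulative   # values still needed from this bar
            proportion = needed / frequency  # proportion of the bar we want
            return lower + proportion * (upper - lower)
        cumulative += frequency
    raise ValueError("the histogram contains no data")


# Bike sketch histogram; the final boundary of 30 is an assumption
bike_boundaries = [0, 10, 15, 27, 30]
bike_frequencies = [2, 8, 30, 9]

bike_median = estimated_median(bike_boundaries, bike_frequencies)
print(bike_median)
```

Here the first two bars hold 10 values, so 15 more are needed from the bar of 30, which is half of its width of 12 seconds added on to 15 seconds.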

So let's see if we can put this in a check.

The histogram shows the time taken by a group of pupils to draw a bike, and Izzy started to work out the median time. She found the frequencies and knows the median is the 20th value.

I want you to finish her working out and identify what is the estimate median.

Well done.

Let's see how you got on.

Well, we know the median is the 20th value and these bars show a total of 34 pieces of data.

So we've gone past the 20th value, so therefore we only need six pieces of data from this third bar.

So what's six out of 20? Well, six out of 20 is three tenths.

So I need to work out three tenths of this class width.

Well, three tenths of this class width is three.

So that means I simply add three seconds.

So therefore these three seconds have the six values we need.

So the estimate median is 12 seconds add three seconds, which gives us our 15 seconds.

So the estimate median is 15 seconds.

Great work everybody.

It was really tough.

So now let's work out the estimate median from this histogram showing the masses of parcels in kilogrammes.

See if you can give it a go.

Press pause for more time.

Well done.

Let's move on to question two.

Work out the estimate median from this histogram showing lengths of wood.

See if you can give it a go.

Press pause for more time.

Well done.

Let's move on to question three.

Work out the estimate median from this histogram showing the length of string.

See if you can give it a go.

Press pause for more time.

Great work.

Let's move on to these answers.

Here I've identified my frequencies and I know the total values are 23 data values.

23, add one, divided by two means the 12th data value needs to be found.

That 12th data value is 12 kilogrammes.

Press pause if you want to copy this working out down.

Next, working out my frequencies.

I know there are 21 data values.

So the estimate median is the 11th value.

So working this out, the estimate median is 37.5 centimetres.

Press pause if you want to copy this working out down.

For question three, working out the sum of those frequencies.

I have 19.

19, add one, divide by two is the 10th value, which is where the estimate median lies and it's 16 centimetres.

Press pause if you want to copy this working out down.

Massive well done everybody.

That was a tough one.

So in summary, the estimate for the mean and median can be calculated from a histogram, and the frequency table can be helpful when calculating the mean and/or the median.

And to work out the estimate median from a histogram, the position of the estimate median can be found using N plus one all divided by two.

And once the position is found, a proportion of a bar is sometimes required to identify the value of that estimate median.

Massive well done everybody.

It was wonderful learning with you.