Good day to you all.

I'm Mr. Gratton and thank you so much for joining me for another maths lesson.

In today's lesson, we will be looking at how the averages, the mean, the median, and the mode will change if a data point has to be modified, added, or removed from a dataset.

Pause here to familiarise yourself with the definitions of bimodal, mean, median, and mode.

We will look through the three averages one at a time, starting with the mode, but before we look at what happens to any of the averages, it is important to understand what we are doing.

We want to understand what small changes may occur to an average if we wanted to modify the value of a data point, add an extra data point, or remove one from a pre-collected set.

We cannot just randomly change data.

Tampering with it to suit your needs is very much frowned upon.

However, there are valid reasons to need to tweak a dataset, such as typos and errors being corrected, outdated data being removed, extra data points being added as further data is collected, or to simulate and predict changes in data over time.

Okay, let's have a look at influences on the mode.

This dataset can be represented as this dot plot.

Currently, the mode is four.

We can see that because the column of four has more dots in it than any other column.

What would happen if the value of one of the data points were modified because an error had been noticed? Visualise it as a dot on the dot plot moving location as the value of that data point changes.

Can you move one of the dots to make the mode of four change? Let's take this top dot on the four column and move it here.

Nope, four is still the most frequent value, so four is still the mode.

Four will remain the mode.

It will be invariant, meaning it doesn't change, unless this happens.

We take one of the fours and make it a five.

The mode will have changed because the dataset is now bimodal at both four and five.

Okay, what happens if, rather than modify a data point, we added a new data point to the dataset because we collected more data? Could we change the mode of this dataset now? Visualise it as one extra dot appearing on the dot plot in addition to all of the dots that are already there.

Can you add a dot to make the mode of four change? Well, let's have a look.

The mode of four will remain invariant.

The only time the mode will change, in a different dataset that is, is if the second most frequent data point has a frequency one less than the current mode.

In this situation, the dataset will become bimodal.

So here is a different dataset.

Four has a frequency one greater than five, and so if we add one extra data point to the five, again, we will have bimodality at both four and five.

In this dataset, if we remove one of the fours, four is still the mode.

The mode remains invariant.

Using that alternative dataset as before, if we take one of the fours and remove it, again, we have bimodality at four and five.

This process stays exactly the same if you are given a list of raw data rather than a dot plot.

The current mode of this dataset is six.

If one extra data point was collected and added to the dataset, what value would that extra data point have to have in order for the mode to change? Notice how, if we added a seven or an eight, the dataset would become bimodal?

That is because the frequency of seven or eight would become four, the same frequency as six, the current mode.

If the data point had any other value, so five, nine, whatever else, the mode would not change.
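As a rough sketch of the same check in Python: the list below is a hypothetical dataset (the lesson's actual list is only shown on screen), chosen so that six appears four times and seven and eight appear three times each. The helper name `modes` is also just an illustration.

```python
from collections import Counter

def modes(data):
    """Return every value that shares the highest frequency, in ascending order."""
    counts = Counter(data)
    top = max(counts.values())
    return sorted(value for value, freq in counts.items() if freq == top)

# Hypothetical raw data: six appears 4 times, seven and eight 3 times each.
data = [5, 6, 6, 6, 6, 7, 7, 7, 8, 8, 8, 9]

print(modes(data))        # [6]
print(modes(data + [7]))  # [6, 7]  (bimodal)
print(modes(data + [8]))  # [6, 8]  (bimodal)
print(modes(data + [5]))  # [6]     (mode unchanged)
```

Returning a list rather than a single number means a bimodal result shows up naturally as two values.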

Let's look at an example in context for when we would have to modify the value of a data point.

Izzy has taken part in a go-kart competition consisting of 12 races.

Below are her finishing positions for each of these 12 races.

However, the winner of race one got disqualified, promoting Izzy to third place.

Will this change her most common finishing position or her modal finishing position? Whilst not necessary, I advise you to always put a dataset into order of size, smallest to largest, so that all of the data points with the same value are grouped together.

Currently, this dataset is bimodal, with third position and fifth position being equally frequent.

However, we are going to change this fourth position to another third position, and therefore, the dataset is no longer going to be bimodal.

It will have one single mode of third position.

Izzy's most frequent finishing position is now third place.

Time for a check for understanding.

If one extra data point had to be added to this dot plot, what value or values could it have in order to change the mode and what would the mode change into? Pause now and provide at least one of A, B, C, and at least one of one, two, three to describe your answer fully.

If a six or an eight were added, the dataset would become bimodal.

Okay, imagine we had to modify the value of a data point instead because an error has been spotted and corrected.

What value must that data point currently have, and what value must it turn into so that the mode changes from one single mode to a different single mode? Pause now to think through those options.

And the answer is that the value of six needs to be modified into a value of eight.

This is the only way where the mode can remain a single mode but change value.

If the six was modified into a seven, the dataset would become trimodal instead, and if the six moved to one, then the dataset would become bimodal at both six and eight.
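The three outcomes just described can be tried out in a few lines of Python. This is only a sketch: the dataset is an assumption (six four times, eight three times, seven twice), picked to be consistent with the answers above, and `modes` and `modify` are illustrative helper names.

```python
from collections import Counter

def modes(data):
    """Return every value that shares the highest frequency, in ascending order."""
    counts = Counter(data)
    top = max(counts.values())
    return sorted(value for value, freq in counts.items() if freq == top)

def modify(data, old, new):
    """Return a copy of data with one occurrence of `old` changed to `new`."""
    changed = list(data)
    changed[changed.index(old)] = new
    return changed

# Assumed dataset: six x4, seven x2, eight x3.
data = [6, 6, 6, 6, 7, 7, 8, 8, 8]

print(modes(data))                # [6]
print(modes(modify(data, 6, 8)))  # [8]        a different single mode
print(modes(modify(data, 6, 7)))  # [6, 7, 8]  trimodal
print(modes(modify(data, 6, 1)))  # [6, 8]     bimodal
```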

On the left, we have three possible changes that could happen to a dataset.

On the right, we have three possible outcomes to those changes.

Match the possible change to its correct outcome.

Pause now to create those pairs.

The outcomes are as follows.

Time for some independent practice.

For question one, we have a dot plot.

Think about how the mode would change or not change if a data point had to be added from this dataset.

Pause now to think through those options.

For question number two, we've got a different dot plot.

What would happen to the mode of this dataset if a data point had to be modified in its value? Pause now to have a think through those options.

Okay, onto the answers.

Next up, we're gonna look at the mean and how that changes as we modify the dataset in some way.

Here is the same dataset as before.

The mean of that dataset is currently 4.55.

We can visualise this by either putting an arrow in between the four and the five or by putting the arrow approximately here on the dot plot.

However, what would happen to the mean if a newly collected data point was added to the dataset? Well, this comes in three possibilities.

If the data point added has a value above the mean, the mean will increase.

For example, if an eight was added, the mean would increase from 4.55 to 4.63.

The mean would move in the direction of the location of that added data point.

The converse is also true.

If I were to add a value less than the mean, then the mean would decrease.

So, if I added a one, the mean would decrease from 4.55 to 4.46.

The inverse happens when we are removing a data point from the dataset.

Removing a data point smaller than the mean would actually make the mean increase.

Here's an example.

If we were to remove the number two, then the mean would increase to 4.61.

Now, onto the third option.

What would happen if we had to modify the value of a data point? Well, it's actually relatively straightforward.

If you make the value of a data point larger, the mean will increase, and if you make the value of a data point smaller, the mean will decrease.

For example, if I wanted to modify the value of two to a value of nine, then the mean will increase from 4.55 to 4.725.
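Here is a minimal Python sketch of those three cases. The dot plot itself is not reproduced here, so the code assumes a dataset of 40 points totalling 182. That is an inference that matches the mean of 4.55 and the figures quoted above up to rounding, not something stated in the lesson. The function names are illustrative.

```python
# Assumed summary of the dot plot: 40 data points with a total of 182.
n, total = 40, 182

def mean_after_add(total, n, value):
    """Mean once one extra data point is added."""
    return (total + value) / (n + 1)

def mean_after_remove(total, n, value):
    """Mean once one existing data point is removed."""
    return (total - value) / (n - 1)

def mean_after_modify(total, n, old, new):
    """Mean once one data point changes from `old` to `new`."""
    return (total - old + new) / n

print(round(total / n, 3))                           # 4.55
print(round(mean_after_add(total, n, 8), 3))         # 4.634
print(round(mean_after_add(total, n, 1), 3))         # 4.463
print(round(mean_after_remove(total, n, 2), 3))      # 4.615
print(round(mean_after_modify(total, n, 2, 9), 3))   # 4.725
```

Working from the total and the count, rather than the full list, is enough because the mean only depends on those two numbers.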

Okay, onto a check for understanding.

The mean of this dataset is 5.58.

A new piece of data is collected and added to this dataset, but the mean has then decreased.

Which of these are possible values of this newly collected data point? Pause now to consider your answer.

The answers are one, three, and five.

This is because they are the only options less than the current mean, less than 5.58.

What is the mean of this dataset? The mean is currently 6.5.

If a new data point had to be added, what value would it have to have in order for the mean to increase, decrease, or remain invariant, in other words, remain the same? If the data point added is greater than 6.5, the mean would increase.

For example, adding the data point of 13 would make the mean increase from 6.5 to 7, but here's a special case.

If the data point added had a value of exactly 6.5, 6.5 being the current mean, then the mean would not change.

It would remain invariant.

This is always true.

If you introduce an extra data point to a dataset that has the same value as the current mean, the mean will always stay the same.
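One way to see why this always holds is a short piece of algebra. The symbols are labels introduced here for illustration, not taken from the lesson: n is the number of data points, m is the current mean (so the total is nm), and x is the value being added.

```latex
\[
\text{new mean} = \frac{nm + x}{n + 1},
\qquad
\frac{nm + x}{n + 1} - m = \frac{nm + x - nm - m}{n + 1} = \frac{x - m}{n + 1}.
\]
```

So the change in the mean has the same sign as x - m: positive if the added value is above the mean, negative if it is below, and exactly zero when x = m.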

Another quick check for understanding.

In this dataset, the mean is currently 15.27.

Match the statements about modifying, adding, or removing a data point to its effect on the mean of the dataset on the right.

Pause now to pair up the A, B, C to the one, two, three.

If a data point of 15.27 was added, the mean would remain invariant because 15.27 is the current mean.

Removing a data point below the mean makes the mean increase, and modifying a 17 to an 11 will decrease the value of the mean.

Let's look at more properties of the mean as we change a dataset.

The mean of 10 numbers is 50.

What is the total value of this dataset? Well, we take the mean and multiply it by the number of data points to get the total value of the dataset.

So 10 times 50 equals 500.

The total value of all 10 data points is 500.

If I had to modify one data point from 45 to 55, by how much would the mean increase? Do not look at the number 50 in order to solve this problem.

Look at the total value of the dataset, 500.

Because 45 has increased to 55, that data point has increased in value by 10, and so the total value of the whole dataset has also increased by 10, this time to 510.

Using the usual property of the mean, we take the total value of the dataset and divide by the 10 values still in that dataset, so the new mean would be 510 divided by 10, which is 51.

But what if we added an extra data point to the dataset because new data was collected? At the moment, we do not know the value of this data point that's been added, but we know that the mean has increased from 50 to 55.

What is the value of this extra data point? We know that there are now 11 points in the dataset because we had 10 and introduced one extra point in.

Let's make a comparison between the original dataset and the dataset with that extra 11th point.

The mean was 50 and is now 55 and the total value of the dataset has increased from 500 to 605.

The total value is now 605 because the mean of 55 has been multiplied by all 11 data points.

The total of the original 10 data points is 500, but the total value of the dataset is now 105 higher and that 105 extra comes from that one extra data point and so that extra data point has a value of 105.

Back to Izzy, who's now taking part in the regional championship, consisting of nine races.

Below are the points she scored on the first eight races.

In order for her to advance to the national championship, she needs to have a mean score of at least seven.

What is the minimum number of points she must score in that last race in order to advance to the national championship? After nine races, she needs to have a mean of at least seven.

Therefore, she must have a total of at least 63 points across all those races.

At the moment, she has 55 points, adding up the point score from all eight of those races.

Therefore, she needs to score eight points in that final race in order to advance to the national championship.
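Both the 105 example and Izzy's final race use the same idea: compare the total before with the total needed afterwards. A small Python sketch of that calculation follows. The function and variable names are made up for illustration.

```python
def value_of_added_point(old_mean, n, new_mean):
    """Value the extra data point must have to move the mean of n points
    from old_mean to new_mean once it is included (n + 1 points in total)."""
    old_total = old_mean * n
    new_total = new_mean * (n + 1)
    return new_total - old_total

# Ten numbers with mean 50; the mean rises to 55 after one more is added.
print(value_of_added_point(50, 10, 55))  # 105

# Izzy: 55 points after 8 races, needs a mean of at least 7 after 9 races.
points_so_far, races_so_far, target_mean = 55, 8, 7
needed = target_mean * (races_so_far + 1) - points_so_far
print(needed)  # 8
```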

Okay, onto a check for understanding.

The mean of 5 numbers is 20.

However, an error has been spotted and a data point has to be modified in its value.

Match the statements on the left to the calculations on the right that complete each sentence.

Pause now to make those matches.

The answers are A with two, B with three, and C with one.

Onto some independent practice.

We've looked at both the mode and the mean.

For the last bit, we're gonna look at the median but only with regards to raw datasets and not for dot plots.

Let's have a look.

When a new piece of data is added to a dataset, the location of the median shifts half a position left or right, in whichever direction the new data point is added.

For example, the median is positioned at the third data point with a value of 138.

When the data point of 187 is added, the position of the median will shift right to 143, which is the midpoint of 138 and 148.

This is now the 3.5th data point.

Conversely, for this set of data, we've got the median at 27, the midpoint of 25 and 29.

The median is currently positioned at the 2.5th data point.

If the data point of 10, a small value, is added, the median would shift left: it is no longer in between the 25 and 29, but strictly at the 25.

This is now the third data point in that dataset.

This rule is true no matter how big or small the size of our dataset is.

In this dataset of 104 data points, well, let's zoom in a little bit so we can actually see some of the numbers, the current median is 211.

If the data point of 100, a value smaller than the median, is added to the dataset, the median would shift left or down by half a data point.

So, the median was 211 and is now 210.

It was in the midpoint between these two data points and is now strictly at the point 210.
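The half-position shift can be checked with Python's `statistics.median`. In this sketch only 138, 148, 187, 25, 29 and 10 come from the examples above. The other values in each list are made up so that the medians match.

```python
from statistics import median

def median_position(data):
    """Position of the median in the ordered list, counting from 1."""
    return (len(data) + 1) / 2

# First example: median at the 3rd data point, 138 (other values assumed).
first = [120, 131, 138, 148, 160]
print(median_position(first), median(first))                    # 3.0 138
print(median_position(first + [187]), median(first + [187]))    # 3.5 143.0

# Second example: median 27, midway between 25 and 29 (other values assumed).
second = [18, 25, 29, 40]
print(median_position(second), median(second))                  # 2.5 27.0
print(median_position(second + [10]), median(second + [10]))    # 3.0 25
```

Note that the position goes up from 2.5 to 3 in the second example even though the median moves left, because the new data point of 10 takes a place at the start of the ordered list.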

The exact opposite is true when we are removing a data point.

If we were to remove 5.1 from this dataset, rather than the median moving right towards that removed data point, it would move left, away from that removed data point.

However, notice how the value of the median hasn't changed? If the data points to the left or the right of the current median have the same value as the median, it is very likely that the median will not change in its value.

However, the position of the median will change even if the position changes to a data point with the same value as the previous median.
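A short Python sketch of this situation follows. The dataset is hypothetical (only the removed value of 5.1 comes from the example), built so that the data point just to the left of the median has the same value as the median.

```python
from statistics import median

def median_position(data):
    """Position of the median in the ordered list, counting from 1."""
    return (len(data) + 1) / 2

# Assumed dataset: the 3rd and 4th values are both 4.8.
before = [4.2, 4.6, 4.8, 4.8, 5.1, 5.3, 5.6]
after = [x for x in before if x != 5.1]  # remove the single 5.1

print(median_position(before), median(before))  # 4.0 4.8
print(median_position(after), median(after))    # 3.5 4.8
```

The position moves left by half a data point, from the 4th to the 3.5th, but both neighbouring values are 4.8, so the value of the median stays the same.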

If a value had to be modified, it only impacts the location and therefore, the value of the median if that modified data point goes from either smaller than the median to greater or greater than the median to smaller.

In this example, the median is currently 68.

If this 59 changed in value to a 66, if we reordered the data, the median doesn't change because this change is happening all on one side and the median's position does not move.

However, if we take the value of 82 right at the end of the dataset and make it 60 instead, we would have to reorder the dataset so that the 60 would move past the median all the way to the left-hand side of the dataset.

As a consequence, have a look at where the median arrow is pointing now.

It is no longer pointing in between the 67 and the 69.

It is now pointing in between the 65 and the 67.

The new median is now 66.

The median has decreased because our data point has gone from greater than the median to below the median.

Unlike with adding or removing a data point, which shifts the position by half a data point, when modifying a data point, the position of the median either changes by zero if it doesn't go from below to above or vice versa or by one whole data point if that modified data point goes from above the median to below or vice versa.
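The two modifications from this example can be sketched in Python. The list below is an assumed dataset: only 59, 65, 67, 69 and 82 are mentioned in the lesson, and the remaining values are invented so that the median starts at 68. `modify` is an illustrative helper.

```python
from statistics import median

def modify(data, old, new):
    """Return a sorted copy of data with one `old` changed to `new`."""
    changed = list(data)
    changed[changed.index(old)] = new
    return sorted(changed)

# Assumed dataset with median 68, midway between 67 and 69.
data = [59, 62, 65, 67, 69, 74, 78, 82]

print(median(data))                  # 68.0
print(median(modify(data, 59, 66)))  # 68.0  stays on the same side, no shift
print(median(modify(data, 82, 60)))  # 66.0  crosses the median, shifts by one
```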

Back to Izzy, who's in the second round of the national championship.

In this round, only her most recent 10 races will be counted towards her final score.

In the next race, she scores six points.

Will her median score change? Now that she's done an extra race, her score from the first race gets replaced with this new value.

Therefore, that nine gets replaced with a six.

In order to do any calculations with the median, we have to put the data into order first.

Therefore, we take the points and we put them into order of size.

That nine is the score that will change after she's done the next race.

The current median is 7.5.

Will that change if the nine gets updated to a score of six instead? Because we are taking a data point above the median and changing it to a data point below the median, the median will shift by one full data point.

Therefore, her median decreases from 7.5 down to 6.

Time for the last few checks for understanding.

In this question, the median of this dataset is currently 44.

The data point of 65 is added to the dataset.

Which of these sentences best describes how the median will change? Pause here to look through each of those sentences and choose the correct one.

The answer is C.

The median shifts position by half a data point because 65 is greater than the current median of 44.

The median will end up in between the 44 and the 60 and the midpoint of 44 and 60 is 52.

And next up, this dataset has a median of 707.

The data point of 709 has been removed from this dataset.

Which of these statements best describes how the median will change after the 709 has been removed? Pause now to think of your answer.

And the answer is A.

Whilst the location of the median will shift left by half a data point, the value of the median will not change.

This is because there is a second data point with a value of 707 to the left of where the median previously was.

And in this check for understanding, a data point with a value of 12.6 had to be modified to become 15.

Which of these statements best describes how the median will change now? Pause the video to choose one of those options.

The answer is C.

The median does not change because the data point hasn't changed from a value greater than the median to one smaller or vice versa.

Therefore, the position and the value of the median would have no reason to change.

Onto the final practice tasks.

On the left, you have three datasets with a description of how that dataset will be changed.

On the right, you have how the median has changed as a result of these modifications.

Pause here to match the correct dataset with the correct description of how the median has changed.

For question number two, fill in the blanks for both of these datasets to fully describe how these medians have changed.

Pause the video to give yourself time to fill in all of those blanks.

And with question three, it is very similar to question two, but now you've been given space to construct the modified dataset for each of the two questions.

Fill in the blanks to fully describe how the median has changed.

Pause now to give those a go.

Last question, question number four.

Two products have been reviewed nine times each, each rated on a scale from one to five stars.

Each of the stars below shows one review for that product.

What could the 10th rating of each product be in order for their median rating to increase? Pause now to give both products a go.

Onto the answers.

For the first dataset of question one, adding the data point of 17 will make the median increase, removing a data point of 16 will make the median decrease, whereas for the last one, modifying a data point from 14 to 16 would make the median become 15 rather than the 14 that it previously was.

For question number two, the missing values are 97, 62, left, 1, and 53.

For question number three, here are the modified datasets, and the missing values in the sentences are 99, 96, 62, and 58.

Finally, for question number four, the 10th rating would have to be 5 stars in order to change the median rating from 4 stars to 4.5 stars.

For product B, the median rating was two, so the 10th rating would need to be 3 or above to change that median rating.
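If you want to check answers like these by trying every possible 10th rating, a short Python sketch could look like this. The two lists of nine ratings are hypothetical stand-ins for the star diagrams, chosen so that product A has a median of 4 and product B a median of 2. The function name is illustrative.

```python
from statistics import median

def ratings_that_raise_median(ratings, scale=range(1, 6)):
    """List every rating on the scale that would raise the median if added."""
    current = median(ratings)
    return [r for r in scale if median(ratings + [r]) > current]

# Assumed reviews (nine each); only the medians are taken from the lesson.
product_a = [1, 2, 3, 4, 4, 5, 5, 5, 5]  # median 4
product_b = [1, 1, 2, 2, 2, 3, 3, 4, 5]  # median 2

print(ratings_that_raise_median(product_a))  # [5]
print(ratings_that_raise_median(product_b))  # [3, 4, 5]
```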

That is all for today's lesson.

Thank you for joining me in today's lesson where we have covered the mean, the median, and the mode and how they change as parts of a dataset are modified, where data points are added, and where data points have been removed.

Thank you for joining me, and I hope to see you soon for some more maths.

Have a good day.
