Year 10
Higher

Outliers in scatter graphs

I can recognise outliers and understand whether they should be included in the data set.

Lesson details

Key learning points

  1. Outliers can be identified visually on a scatter graph.
  2. Outliers do not necessarily occur at the start/end of the data set.
  3. Outliers must be carefully considered and only removed if they were recorded in error.

Common misconception

An outlier always has values that are outside the range of the dataset (i.e. at the start or end of a dataset).

For bivariate data, a data point may be distinct from the rest of the data because the value of one of its variables is much smaller or greater than that of other points with a similar value of the other variable.
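
The idea that an outlier can sit inside the range of both variables can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data set is invented, and the rule used to flag a point (more than twice the typical distance from a least-squares line of best fit) is one possible rule of thumb, not a definition taken from this lesson.

```python
import math

# Hypothetical data, invented for illustration: hours of revision (x)
# and test score (y) for ten pupils.
points = [(1, 20), (2, 31), (3, 39), (4, 50), (5, 61),
          (6, 25), (7, 80), (8, 89), (9, 101), (10, 110)]

xs = [x for x, _ in points]
ys = [y for _, y in points]
n = len(points)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least-squares line of best fit: y = slope * x + intercept.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in points)
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

# Residual = vertical distance of each point from the line.
residuals = [y - (slope * x + intercept) for x, y in points]

# Typical size of a residual (root mean square).
spread = math.sqrt(sum(r ** 2 for r in residuals) / n)

# Assumed rule of thumb: flag points more than 2 spreads from the line.
flagged = [p for p, r in zip(points, residuals) if abs(r) > 2 * spread]

print(flagged)  # [(6, 25)]
```

The flagged point (6, 25) is neither the smallest nor the largest value of either variable: the hours run from 1 to 10 and the scores from 20 to 110. It stands out only because its score is far lower than the scores of pupils with a similar number of hours. Flagging it is just a first step; it should still be investigated before deciding whether it was recorded in error.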

Keywords

  • Outlier - An outlier is a data point that is extremely large or small compared to the rest of the dataset. Visually, outliers lie far away from where the majority of the results are clustered.

Teacher tip

Pupils could explore any previous scatter graphs they have seen (which may not necessarily have any outliers). They could discuss where a hypothetical outlier could be plotted on the graph, why it would be an outlier and how it would be interpreted in the context of its data.

Licence

This content is © Oak National Academy Limited (2024), licensed on Open Government Licence version 3.0 except where otherwise stated. See Oak's terms & conditions (Collection 2).

6 Questions

Q1.
Extrapolation is when you estimate values that are ___ the range of available data.
Correct Answer: outside, beyond
Q2.
Which values are obtained by interpolation on this scatter diagram?
An image in a quiz
Correct answer: A
B
Correct answer: C
D
Q3.
Select the regions where extrapolation can be used to make estimations on this scatter graph.
An image in a quiz
A only
B only
C only
Correct answer: A and C
B and C
Q4.
Alex says, “I can always use interpolation to make a valid prediction when my bivariate data has correlation.” Is Alex definitely correct?
No, interpolation is always completely unreliable.
No, it has to be used with caution as the trend might not continue.
Yes, interpolation is always completely reliable.
Correct answer: Yes, but you can only make accurate predictions about similar situations.
Q5.
The scatter graph shows data taken from the ONS, where each point represents a region in England and Wales in 2023. Select which of these Oak pupils can use this graph to make an accurate prediction.
An image in a quiz
Laura: I want the house price in Oakford where the median income is £19 000
Correct answer: Alex: I want the house price in Ashfield where the median income is £35 000
Jacob: I want the house price in Spain where the median income is £35 000
Sam: I want the house price in Elmway where the median income is £92 000
Q6.
The scatter graph shows information about house prices and income in different regions. Estimate the difference between the house prices in the North and South for a median income of £25 000.
An image in a quiz
£100 000
£125 000
Correct answer: £175 000
£250 000
£375 000

6 Questions

Q1.
An ___ is a data point that is extremely large or small compared to the rest of the dataset.
Correct Answer: outlier
Q2.
Which of these points are outliers?
An image in a quiz
Correct answer: A
Correct answer: B
C
Correct answer: D
E
Q3.
Sam says that it is possible to have a dataset with no outliers. Is Sam correct?
No, all datasets will have a highest value.
No, there will always be a datapoint that is lower than the rest.
No, not all the data points will cluster.
Correct answer: Yes, there may be no points that are far away from the main cluster of data.
Yes, not every dataset has errors in it.
Q4.
The scatter graph shows information about number of children and number of elderly people in a set of towns. Each point represents a town. Match each outlier to the correct statement.
An image in a quiz
Correct Answer: A, town with a low number of children and many elderly people
Correct Answer: B, town with a high number of children and many elderly people
Correct Answer: C, town with a high number of children and few elderly people
Correct Answer: D, town with a low number of children and few elderly people
Q5.
Sofia says you should never ignore an outlier. Is Sofia correct?
Correct answer: It depends, you should investigate each outlier carefully.
No, if a data point is extremely large then it must be an error.
Yes, an outlier is always just an unusual result in the data.
Q6.
Jacob says you should show the outlier on a scatter graph. Is Jacob correct?
No, outliers are always too far from the rest of the data to show.
Correct answer: Yes, you may need to zoom in on the graph to use interpolation.
Correct answer: No, removing it can make the trend of the data easier to see.
Yes, otherwise the scatter graph is wrong.
Correct answer: Yes, the outlier may help highlight useful information.