Standard Deviation after removing outlier

Izabella Ponce

Answered question

2022-06-10

I am totally new to statistics. I'm learning the basics.
I came upon this question while solving one of Erwin Kreyszig's exercises on statistics. The problem is simple: it asks for the standard deviation after removing outliers from the dataset.
The dataset is as follows: 1, 2, 3, 4, 10. What I did is, I found the median $q_M = 3$. Then $q_L = \frac{1 + 2}{2} = 1.5$ and $q_U = \frac{4 + 10}{2} = 7$.
Now, $IQR = 7 - 1.5 = 5.5$ and $1.5 \times IQR = 8.25$.
So, we can say numbers beyond $1.5 - 5.5 = -4$ and $7 + 5.5 = 12.5$ will be outliers.
Since there is no outlier, I computed the standard deviation of the whole set, which is 3.53.
But the answer provided is 1.29, which differs from the standard deviation of the set.
Can anyone help me see what I missed?
Also, I have another question: we can see with the naked eye that 10 is an outlier, but it is not detected here. Why?

Answer & Explanation

Hadley Cunningham

Beginner · 2022-06-11 · Added 20 answers

Well, deciding what counts as an outlier is somewhat of an art, so there's only a fuzzy line here. Still, it seems like a good procedure might detect 10 here, and based on your book's answer, the procedure they intended you to use should delete 10.
So let's think about what could have gone wrong. My guess is that you should have picked 2 and 4 as your first and third quartiles instead of 1.5 and 7. The procedure for deciding quartiles is also not set in stone, and results can vary wildly for small sets of discrete data. It's also an established procedure to take the lower and upper quartiles to be the medians of (1,2,3) and (3,4,10) rather than (1,2) and (4,10) as you've done. The first method (including the median in both halves) is a little more robust (but tends to be biased inwards), as this example demonstrates. With quartiles 2 and 4, the IQR is 2, so the fences are $2 - 1.5 \times 2 = -1$ and $4 + 1.5 \times 2 = 7$, and 10 falls outside them.
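To make the difference between the two quartile conventions concrete, here is a minimal Python sketch. The function names `quartiles` and `iqr_outliers` and the `include_median` flag are made up for this illustration, and the fences use the usual 1.5 × IQR rule; your book may define things slightly differently.

```python
from statistics import median

def quartiles(data, include_median):
    """Return (lower, upper) quartiles as medians of the two halves.

    include_median only matters for odd-sized data: it decides whether
    the middle value is counted in both halves or in neither.
    """
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    if n % 2 == 1 and include_median:
        lower, upper = xs[:half + 1], xs[half:]
    else:
        lower, upper = xs[:half], xs[n - half:]
    return median(lower), median(upper)

def iqr_outliers(data, include_median, k=1.5):
    """Return the values lying outside [q_L - k*IQR, q_U + k*IQR]."""
    q_l, q_u = quartiles(data, include_median)
    iqr = q_u - q_l
    low, high = q_l - k * iqr, q_u + k * iqr
    return [x for x in data if x < low or x > high]

data = [1, 2, 3, 4, 10]
print(quartiles(data, include_median=False))     # (1.5, 7.0)
print(iqr_outliers(data, include_median=False))  # []
print(quartiles(data, include_median=True))      # (2, 4)
print(iqr_outliers(data, include_median=True))   # [10]
```

With the median left out of both halves, the upper quartile is dragged up to 7 by the 10 itself, so the fences become too wide to catch it. Including the median gives quartiles 2 and 4, and 10 is flagged.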
Carolyn Beck

Beginner · 2022-06-12 · Added 8 answers

Removing 10 gives the set $\{1, 2, 3, 4\}$ with $\bar{x} = \frac{1 + 2 + 3 + 4}{4} = 2.5$. So the standard deviation is
$$s = \sqrt{\frac{(1 - 2.5)^2 + (2 - 2.5)^2 + (3 - 2.5)^2 + (4 - 2.5)^2}{4 - 1}} = \sqrt{\frac{2 \cdot 1.5^2 + 2 \cdot 0.5^2}{3}} = \sqrt{\frac{5}{3}} = 1.29\ldots$$
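If you want to check the arithmetic yourself, a short sketch with Python's standard `statistics` module reproduces both numbers. `stdev` is the sample standard deviation (divisor n − 1), which is the convention used in the calculation above; the variable names are just for illustration.

```python
from statistics import mean, stdev

full = [1, 2, 3, 4, 10]
trimmed = [x for x in full if x != 10]  # drop the suspected outlier

print(mean(trimmed))             # 2.5
print(round(stdev(trimmed), 2))  # 1.29
print(round(stdev(full), 2))     # 3.54
```

The value for the full set is 3.5355..., which is the 3.53 quoted in the question when truncated rather than rounded.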
