ttyme411gl

2022-07-07

Boxplot: whiskers and outliers doubt
I have a question about boxplots.
I'll lay out what I know first, then my doubt.
$x = \{x_1, x_2, \dots, x_n\}$: the set of samples
$q_1, q_3$: the first and third quartiles
$w_l, w_u$: the lower and upper whiskers
$IQR = q_3 - q_1$
box extends from $q_1$ to $q_3$
$w_l = \max(\min(x),\ q_1 - 1.5 \cdot IQR)$
$w_u = \min(\max(x),\ q_3 + 1.5 \cdot IQR)$
$outliers = \{x_i \in x \mid x_i < w_l \lor x_i > w_u\}$
Observations:
the whiskers' distances from the box are not symmetric ($w_l = \min(x) \lor w_u = \max(x)$)
$w_u - q_3 < q_1 - w_l \Rightarrow \forall x_i : x_i \in outliers \rightarrow x_i > w_u$
$w_u - q_3 > q_1 - w_l \Rightarrow \forall x_i : x_i \in outliers \rightarrow x_i < w_l$
My doubt: if everything I stated above is correct, how do you explain the presence of outliers in this speed of light boxplot (third experiment, lower outliers) and in this plot (see Wednesday, lower outliers)?
If my reasoning is wrong, please provide a simple numeric counterexample.

Answer & Explanation

thatuglygirlyu

2022-07-08

Consider the data
$\{0, 4, 5, 5, 5, 6, 6, 6, 6, 7, 20\}$.
The median is 6, the first quartile is 5, and the third quartile is 6, so the IQR is 1. The 1.5 IQR rule then puts the cutoffs at $5 - 1.5 = 3.5$ and $6 + 1.5 = 7.5$, so 0 is a lower outlier and 20 is an upper outlier. What you need to take into account is that the box shows where the middle 50% of the data lies, so if it is particularly narrow, the IQR is small, and any values outside the range determined by the 1.5 IQR rule are outliers. There can be many outliers, or none at all.
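To check the arithmetic, here is a minimal Python sketch. The function name `tukey_summary` and the parameter `k` are made up for illustration, and it assumes the common convention that whiskers end at the most extreme data points still inside the 1.5 IQR cutoffs. Quartile conventions differ between tools; the default method of `statistics.quantiles` happens to give the quartiles quoted above for this data.

```python
from statistics import quantiles


def tukey_summary(data, k=1.5):
    """Quartiles, k*IQR cutoffs, whiskers and outliers for one sample."""
    xs = sorted(data)
    # Default ('exclusive') quartile method; other tools may differ slightly.
    q1, median, q3 = quantiles(xs, n=4)
    iqr = q3 - q1
    lower_cutoff = q1 - k * iqr
    upper_cutoff = q3 + k * iqr
    inside = [x for x in xs if lower_cutoff <= x <= upper_cutoff]
    outliers = [x for x in xs if x < lower_cutoff or x > upper_cutoff]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "cutoffs": (lower_cutoff, upper_cutoff),
        # Assumed convention: whiskers stop at the most extreme
        # data points that are still inside the cutoffs.
        "whiskers": (inside[0], inside[-1]),
        "outliers": outliers,
    }


summary = tukey_summary([0, 4, 5, 5, 5, 6, 6, 6, 6, 7, 20])
print(summary["cutoffs"])   # (3.5, 7.5)
print(summary["whiskers"])  # (4, 7)
print(summary["outliers"])  # [0, 20]
```

Under that convention the whiskers sit at 4 and 7, both closer to the box than the cutoffs, and both 0 and 20 are still drawn as outliers.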
ttyme411gl

2022-07-09

Ok I got the answer:
The definitions of $w_l$ and $w_u$ in my question were wrong. Referring to Wikipedia:
"whiskers can represent several possible alternative values" such as "the minimum and maximum of all of the data" or "the lowest datum still within 1.5 IQR of the lower quartile, and the highest datum still within 1.5 IQR of the upper quartile", or even "one standard deviation above and below the mean of the data" and finally "the 9th percentile and the 91st percentile" or "the 2nd percentile and the 98th percentile".

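A small Python sketch of the difference, under stated assumptions: the function names are made up for illustration, the sample is invented to make the point, and quartiles come from the default method of `statistics.quantiles`. It compares the clamped formula from the question with the "lowest/highest datum still within 1.5 IQR" convention quoted from Wikipedia.

```python
from statistics import quantiles


def whiskers_clamped(xs, k=1.5):
    """Whiskers as defined in the question: the k*IQR limits clamped to min/max."""
    q1, _, q3 = quantiles(xs, n=4)
    iqr = q3 - q1
    return max(min(xs), q1 - k * iqr), min(max(xs), q3 + k * iqr)


def whiskers_extreme_datum(xs, k=1.5):
    """Whiskers at the lowest/highest datum still within k*IQR of the quartiles."""
    q1, _, q3 = quantiles(xs, n=4)
    iqr = q3 - q1
    inside = [x for x in xs if q1 - k * iqr <= x <= q3 + k * iqr]
    return min(inside), max(inside)


# Invented sample: q1 = 10, q3 = 16, IQR = 6, so the limits are 1 and 25.
data = [0, 9, 10, 11, 12, 13, 14, 15, 16, 17, 22]

print(whiskers_clamped(data))        # (1.0, 22)
print(whiskers_extreme_datum(data))  # (9, 22)
```

With the datum-based convention the lower whisker is only 1 unit long while the upper one is 6 units long, yet 0 still lies below the lower limit of 1 and is drawn as a lower outlier. The clamped formula cannot produce that picture, because it places the lower whisker at the limit itself whenever a lower outlier exists.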