From d1c3519c01d1b0a32464d9c0f721c13f6444be97 Mon Sep 17 00:00:00 2001
From: RafaJPSantos
Date: Mon, 27 Jan 2020 15:45:13 +0000
Subject: [PATCH] Correction to Tukey's method formula.

---
 02_data_preparation.Rmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/02_data_preparation.Rmd b/02_data_preparation.Rmd
index 67d92d7..5f6aee5 100644
--- a/02_data_preparation.Rmd
+++ b/02_data_preparation.Rmd
@@ -1450,7 +1450,7 @@ The IQR (Inter-quartile range) comes from Q3 − Q1.
 The formula:
 
 * The bottom threshold is: Q1 − 3*IQR. All below are considered as outliers.
-* The top threshold is: Q1 + 3*IQR. All above are considered as outliers.
+* The top threshold is: Q3 + 3*IQR. All above are considered as outliers.
 
 The value 3 is to consider the "extreme" boundary detection.
 This method comes from the box plot, where the multiplier is 1.5 (not 3). This causes a lot more values to be flagged as shown in the next image.