Computes lower and upper bounds for a numeric vector using one of several methods:
- "iqr": Tukey's interquartile range (IQR) fences
- "mad": Median absolute deviation (MAD)
- "sd": Standard deviations from the mean
- "quantile": Fixed percentile cutoffs
- "z_normal": Standard Z-score using mean and SD
- "z_robust": Modified Z-score using median and MAD
- "fold_change": Median ± log10(k); assumes log-transformed data
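To make the first two methods concrete, the raw fences can be sketched in base R. This is only an illustration, not the package's implementation: `sketch_fences` is a hypothetical helper, and it assumes the base R defaults (`stats::quantile()` type 7 and `stats::mad()` with its scaling constant of 1.4826).

```r
# Hypothetical helper: raw fences for the "iqr" and "mad" methods.
# Assumes base R defaults (quantile type 7; mad() scaled by 1.4826).
sketch_fences <- function(x, method = c("iqr", "mad"), k = NULL) {
  method <- match.arg(method)
  if (method == "iqr") {
    if (is.null(k)) k <- 1.5                       # documented default for "iqr"
    q <- stats::quantile(x, c(0.25, 0.75), names = FALSE)
    spread <- q[2] - q[1]                          # interquartile range
    c(q[1] - k * spread, q[2] + k * spread)
  } else {
    if (is.null(k)) k <- 3                         # documented default for "mad"
    centre <- stats::median(x)
    spread <- stats::mad(x)                        # scaled median absolute deviation
    c(centre - k * spread, centre + k * spread)
  }
}

x <- c(1, 2, 3, 4, 100)
iqr_fences <- sketch_fences(x, "iqr")  # quartiles 2 and 4, so fences are -1 and 7
mad_fences <- sketch_fences(x, "mad")
```

Under these assumptions, the value 100 lies outside both pairs of fences.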
Usage
get_outlier_bounds(
x,
method = c("iqr", "mad", "sd", "quantile", "z_normal", "z_robust", "fold_change"),
k = NULL,
outlier_log = FALSE,
na.rm = FALSE
)
Arguments
- x
A numeric vector.
- method
Character string: one of "iqr", "mad", "sd", "quantile", "z_normal", "z_robust", or "fold_change".
- k
Numeric multiplier or threshold. Defaults depend on method:
  - "iqr": 1.5 (multiplier of IQR)
  - "mad": 3 (multiplier of MAD)
  - "sd": 3 (multiplier of SD)
  - "z_normal": 3 (threshold in SD units)
  - "z_robust": 3.5 (threshold in robust Z units)
  - "fold_change": 2 (fold-change multiplier, assumes log-transformed data)
  - "quantile": 0.01 (lower/upper quantiles, e.g., 1% and 99%)
- outlier_log
Logical. If TRUE, applies a log10 transformation to x. Default is FALSE.
- na.rm
Logical. Should missing values be removed? Default is FALSE.
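The method-dependent defaults for k described above could be resolved along these lines. This is a sketch only: `resolve_k` is a hypothetical helper introduced for illustration, and the actual function may handle `k = NULL` differently.

```r
# Hypothetical helper: return the documented default k when none is supplied.
resolve_k <- function(method, k = NULL) {
  if (!is.null(k)) return(k)          # an explicit k always wins
  switch(method,
    iqr         = 1.5,
    mad         = 3,
    sd          = 3,
    z_normal    = 3,
    z_robust    = 3.5,
    fold_change = 2,
    quantile    = 0.01,
    stop("unknown method: ", method)  # fallthrough for unrecognised names
  )
}

k_default  <- resolve_k("z_robust")            # 3.5
k_explicit <- resolve_k("quantile", k = 0.05)  # 0.05
```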
Value
A numeric vector of length 2, c(lower_bound, upper_bound), giving the smallest and largest observed values that fall within the computed fences.
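The difference between the fences themselves and the returned observed values can be illustrated with a small sketch. `clip_to_observed` is a hypothetical helper, not part of the package, and the fences passed to it are assumed to have been computed elsewhere.

```r
# Hypothetical helper: reduce raw fences to the range of observed values inside them.
clip_to_observed <- function(x, lower, upper, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]             # drop missing values on request
  inside <- x[x >= lower & x <= upper]     # keep only values within the fences
  range(inside)                            # smallest and largest surviving values
}

# Fences of -1 and 7 are the Tukey fences for this vector with k = 1.5,
# assuming base R's default quantile type.
bounds <- clip_to_observed(c(1, 2, 3, 4, 100), lower = -1, upper = 7)
bounds
# [1] 1 4
```

The result is drawn from the data rather than from the fences, so it is 1 and 4 here, not -1 and 7.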
Examples
get_outlier_bounds(c(1, 2, 3, 4, 100), "iqr")
#> [1] 1 4
get_outlier_bounds(c(1, 2, 3, 4, 100), "mad")
#> [1] 1 4
get_outlier_bounds(c(1, 2, 3, 4, 100), "sd")
#> [1] 1 100
get_outlier_bounds(c(1, 2, 3, 4, 100), "quantile", k = 0.05)
#> [1] 2 4
get_outlier_bounds(c(1, 2, 3, 4, 100), "z_normal")
#> [1] 1 100
get_outlier_bounds(c(1, 2, 3, 4, 100), "z_robust")
#> [1] 1 4
get_outlier_bounds(log10(c(1, 2, 4, 8)), "fold_change") # default 2×
#> [1] 0.00000 0.90309
get_outlier_bounds(log10(c(1, 2, 4, 8)), "fold_change", k = 3) # 3× fold-change
#> [1] 0.00000 0.90309