ST_outlier

Basic outlier tests for normal distributions

Calling Sequence

[outlierfree] = ST_outlier(v)
[outlierfree] = ST_outlier(v, mod)
[outlierfree, outlier] = ST_outlier(v)
[outlierfree, outlier] = ST_outlier(v, mod)

Parameters

v:

n-by-1 or 1-by-m matrix of doubles, numerical values (more than 10 values, preferably more than 25)

mod:

1-by-1 matrix of strings, test mode: "sd", "iqr15" or "iqr30" ("sd" is used when mod is omitted)

outlierfree:

n-by-1 or 1-by-m matrix of doubles, outlier-free data

outlier:

n-by-1 or 1-by-m matrix of doubles, outliers

Description

Performs basic outlier tests.

SD-MODE: If your data follow a normal, symmetric and unimodal distribution, you can use the "sd" mode (population standard deviation, S.D. or sigma). In this mode a value is reported as an outlier when it lies more than 2.5xS.D. from the arithmetic mean in either direction.
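The 2.5xS.D. rule can be sketched in a few lines of Scilab. This is only an illustration of the rule as described, not the toolbox source; the sample values and the variable names (m, s, isOut) are made up for this example.

```scilab
// Illustrative sketch of the "sd" rule; not the ST_outlier implementation.
v = [0.48 0.34 -0.41 0.38 -0.71 -0.25 0.03 0.14 -0.77 1.07 1.09 14.63];

m = mean(v);                       // arithmetic mean
s = sqrt(mean((v - m).^2));        // population standard deviation (divide by n)

isOut = abs(v - m) > 2.5 * s;      // more than 2.5 x S.D. away, either direction
outlier     = v(isOut);
outlierfree = v(~isOut);

disp(outlier);                     // 14.63 is the only value flagged here
```

Note that the sketch computes the population standard deviation by hand, because Scilab's stdev normalizes by n-1 by default.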

IQR-MODES: Testing for outliers by their distance in interquartile ranges (IQR) is recommended primarily for skewed data, but it is also applicable to normally distributed data.

IQR15-MODE: It is common to consider a value an outlier when it lies more than 1.5xIQR (interquartile range) below the lower quartile or above the upper quartile. The "iqr15" mode applies this rule.
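A minimal sketch of such an IQR fence follows. The helper iqr_split and its factor argument k are hypothetical names for this example, and the quartiles are taken from Scilab's quart function, which may differ from the quartile definition the toolbox uses internally.

```scilab
// Illustrative IQR fence; iqr_split is a hypothetical helper, not a toolbox function.
function [inliers, outliers] = iqr_split(v, k)
    q = quart(v);                  // q(1) = lower quartile, q(3) = upper quartile
    w = q(3) - q(1);               // interquartile range
    isOut = (v < q(1) - k * w) | (v > q(3) + k * w);
    inliers  = v(~isOut);
    outliers = v(isOut);
endfunction

v = [1:11, 24, 100];               // made-up sample with two suspicious values
[in15, out15] = iqr_split(v, 1.5); // fence at 1.5 x IQR
disp(out15);
```

With this sample the quartiles lie near 3.5 and 10.5, so the upper fence sits at about 21 and both 24 and 100 fall outside it.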

IQR30-MODE: With a limit of 1.5xIQR, however, about 0.7% of a normal distribution falls outside the limits by chance alone. A sample of 143 or more values can therefore be expected to contain at least one such value even when no true outlier is present. To avoid this, values between 1.5xIQR and 3.0xIQR from the lower or upper quartile are called extreme values or weak outliers, and only values beyond 3.0xIQR are called strong outliers. The SampleSTAT toolbox accounts for this with the "iqr30" mode.
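The 0.7% and 143 figures can be checked for a standard normal distribution with the short calculation below. It is a sketch under the assumption of exactly normal data; z75 (the 75% quantile of N(0,1), about 0.6745) is hard-coded, and the variable names are chosen for this example.

```scilab
// Fraction of a standard normal distribution beyond quartile +/- k*IQR.
z75  = 0.6745;                     // upper quartile of N(0,1), hard-coded constant
iqrN = 2 * z75;                    // IQR of N(0,1), about 1.349
k    = [1.5 3.0];                  // factors of the two IQR rules

fence = z75 + k * iqrN;            // fence distance from the mean, in sigma units
p     = erfc(fence / sqrt(2));     // two-sided tail probability P(|Z| > fence)
n_one = 1 ./ p;                    // sample size at which one flagged value is expected

disp(p);                           // first entry is roughly 0.007, second is far below 1e-5
disp(n_one(1));                    // roughly 143
```

The second entry of p shows why the 3.0xIQR limit rarely flags values of a normal sample by chance.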

Use the ST_outlier "sd" mode ONLY with NORMALLY distributed data and with more than 10, preferably more than 25, values! Use ST_deandixon (or ST_nalimov) for samples with fewer values.

Examples

data = [
0.4827129   0.3431706  -0.4127328    0.3843994 ..
-0.7107495  -0.2547306   0.0290803    0.1386087 ..
-0.7698385   1.0743628   1.0945652    0.4365680 ..
-0.5913411  -0.7426987   1.609719     0.8079680 ..
-2.1700554  -4.7361261   0.0069708    14.626386 ..
-2.5036545  -2.9046385 ..
];
of = ST_outlier(data')      // outlier-free values, default "sd" mode
[of, o] = ST_outlier(data', "sd")  // outlier-free values and outliers, 2.5xS.D. rule
[of15, o15] = ST_outlier(data', "iqr15")  // outlier-free values and outliers, 1.5xIQR rule
[of30, o30] = ST_outlier(data', "iqr30")  // outlier-free values and outliers, 3.0xIQR rule

See also

Authors

Bibliography

Lohninger, H., "Grundlagen der Statistik", October 10, 2012, http://www.statistics4u.info/fundstat_germ/cc_outlier_tests_4sigma.html

