Tutorial - Measures of Variance

Evaluate the quality of univariate data

Purpose

The goal of this document is to illustrate practical uses of the SampleSTAT toolbox routines for measuring variation.

Introduction

If you are collecting data on a process, it is important not only to determine the location of the mean but also to look at the variation within the data. If you are, for example, interpreting the results of a chemical analysis, you can place much more trust in the obtained average value if you know that the individual samples vary very little in comparison to the mean.

In general, the spread of a distribution, both in absolute and in relative terms, is a good measure of the variability (and hence reliability) of the data. There are several ways to specify the variation in the data. SampleSTAT provides routines that give you more information about your data than the mean, median, and standard deviation alone can.

These functions complement the built-in functions mean(), stdev(), max(), min(), and median().

Functions

The SampleSTAT toolbox provides the following functions/macros for measuring variation (for details, see the example below).

ST_strayarea:

Calculates the stray area (range of dispersion of the values) of univariate data for a statistical confidence level (95%, 99%, 99.9%), i.e. a level of significance of (0.05, 0.01, 0.001), respectively. This is similar to the sample standard deviation but covers a higher confidence level than the roughly 68% that the sample standard deviation provides.
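
The tutorial does not spell out the formula behind the stray area. One plausible reading, which is consistent with the example output further down (0.003635 for s = 0.001414, n = 6, 95%), is the sample standard deviation scaled by the two-sided Student factor. The symbols t, n, s and the formula itself are assumptions introduced here for illustration, not taken from the toolbox documentation:

```latex
\[
  \text{stray area} = t_{P,\,n-1} \cdot s,
  \qquad
  s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^{2}}
\]
% Check against the tutorial's data (n = 6, P = 95%):
% t_{95\%,\,5} \approx 2.571, \quad s \approx 0.001414
% \Rightarrow \quad 2.571 \times 0.001414 \approx 0.003635
```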

ST_trustarea:

Calculates the trust area (range of dispersion of the mean) of univariate data for a statistical confidence level (95%, 99%, 99.9%), i.e. a level of significance of (0.05, 0.01, 0.001), respectively. It corresponds to the sample standard deviation of the mean at the chosen confidence level. Specifying the sample standard deviation (s) alone is more or less useless without also specifying the mean (and, of course, the type of distribution): it makes a big difference whether s = 5 belongs to a mean of 100 or to a mean of 3. Relating the sample standard deviation to the mean resolves this problem.
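
The formula behind the trust area is not given in this tutorial. A reading that agrees with the example output further down (0.001484 for s = 0.001414, n = 6, 95%) is the standard deviation of the mean, scaled by the two-sided Student factor. The notation below is an assumption made for illustration; the second line only restates the s = 5 example from the text as a ratio:

```latex
\[
  \text{trust area} = \frac{t_{P,\,n-1} \cdot s}{\sqrt{n}}
\]
% Check against the tutorial's data (n = 6, P = 95%):
% 2.571 \times 0.001414 / \sqrt{6} \approx 0.003635 / 2.449 \approx 0.001484
\[
  \frac{s}{\bar{x}} = \frac{5}{100} = 0.05
  \qquad \text{versus} \qquad
  \frac{s}{\bar{x}} = \frac{5}{3} \approx 1.67
\]
```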

ST_studentfactor:

Determines the Student factor for a given number of values and for a statistical confidence level (95%, 99%, 99.9%), i.e. a level of significance of (0.05, 0.01, 0.001), respectively. This is a service function for ST_strayarea and ST_trustarea.
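
To get a feeling for the size of the Student factor, the following sketch computes two-sided factors with Scilab's built-in cdft() for the three confidence levels named above. It assumes that the factor is the two-sided Student t quantile with n - 1 degrees of freedom; it does not call ST_studentfactor and is not necessarily how the toolbox computes the value. The variable names n, alpha, and t are chosen for illustration only.

```scilab
// Two-sided Student factors for n = 6 values (n - 1 = 5 degrees of freedom).
// Assumption: the factor is the two-sided Student t quantile.
// Uses only the Scilab built-in cdft(), not the SampleSTAT implementation.
n     = 6;
alpha = [0.05 0.01 0.001];      // Significance levels for 95%, 99%, 99.9%
t     = zeros(alpha);           // Result vector, same size as alpha

for k = 1:length(alpha)
    // cdft("T", Df, P, Q) returns the quantile T for P = cdf(T), Q = 1 - P
    t(k) = cdft("T", n - 1, 1 - alpha(k)/2, alpha(k)/2);
    mprintf("alpha = %.3f  ->  t = %.3f\n", alpha(k), t(k));
end
```

For five degrees of freedom the printed factors should be close to the values found in standard t tables (about 2.571, 4.032, and 6.869).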

Example

// Sample data
v = [ ..
9.999; ..
9.998; ..
10.002; ..
10.000; ..
10.001; ..
10.000 ..
];

// Statistical confidence level
p = "95%";

// Calculate statistical results
n  = length(v);           // Number of values
x  = mean(v);             // Arithmetic mean
s  = stdev(v);            // Sample standard deviation
sa = ST_strayarea(v, p);  // Range of dispersion of the values (stray area)
ta = ST_trustarea(v, p);  // Range of dispersion of the mean (trust area)
mi = min(v);              // Minimal value
ma = max(v);              // Maximal value

// Output
clc;
mprintf("\nDemo for range of dispersion (stray- and trustarea)\n\n");
mprintf("Values:\n");
disp(v); // Display data
mprintf("\n"); // blank line
mprintf(..
"Number of Values            : %i\n" + ..
"Arithmetic Mean             : %f\n" + ..
"Sample Standard Deviation   : %f\n" + ..
"Confidence Level            : %s\n" + ..
"Range of Dispersion (values): %f\n" + ..
"Range of Dispersion (mean)  : %f\n" + ..
"Minimum                     : %f\n" + ..
"Maximum                     : %f\n", ..
n, x, s, p, sa, ta, mi, ma);
mprintf("\n"); // blank line

mprintf( ..
"68 percent of the values will stray around %.3f +/- %.3f (sample standard \n" + ..
"deviation). %s of the values will be expected around %.3f +/- %.3f (Range \n" + ..
"of disp. of the values, stray area).\n" + ..
"With a probability of %s the mean of %.3f will stray around %.3f +/- %.3f \n" + ..
"(Range of dispersion of the mean, trust area).\n", x, s, p, x, sa, p, x, x, ta);

Output:

Demo for range of dispersion (stray- and trustarea)

Values:

    9.999   
    9.998   
    10.002  
    10.     
    10.001  
    10.     

Number of Values            : 6
Arithmetic Mean             : 10.000000
Sample Standard Deviation   : 0.001414
Confidence Level            : 95%
Range of Dispersion (values): 0.003635
Range of Dispersion (mean)  : 0.001484
Minimum                     : 9.998000
Maximum                     : 10.002000

68 percent of the values will stray around 10.000 +/- 0.001 (sample standard
deviation). 95% of the values will be expected around 10.000 +/- 0.004 (Range
of disp. of the values, stray area).
With a probability of 95% the mean of 10.000 will stray around 10.000 +/- 0.001
(Range of dispersion of the mean, trust area).
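
The two ranges of dispersion in the output can be cross-checked with Scilab built-ins alone. The sketch below assumes that the stray area is the Student factor times the sample standard deviation and that the trust area is the same quantity divided by the square root of the number of values. These formulas are inferred from the numbers above, not taken from the toolbox source, and the names alpha, sa_check, and ta_check are illustrative.

```scilab
// Cross-check of the example output using only Scilab built-ins.
// Assumption: stray area = t * s and trust area = t * s / sqrt(n),
// with t the two-sided Student factor for n - 1 degrees of freedom.
v = [9.999; 9.998; 10.002; 10.000; 10.001; 10.000];

alpha = 0.05;                                  // Significance level for 95%
n = length(v);                                 // Number of values
s = stdev(v);                                  // Sample standard deviation
t = cdft("T", n - 1, 1 - alpha/2, alpha/2);    // Two-sided Student factor

sa_check = t * s;                              // Candidate stray area
ta_check = t * s / sqrt(n);                    // Candidate trust area

mprintf("Student factor : %f\n", t);
mprintf("Stray area     : %f\n", sa_check);
mprintf("Trust area     : %f\n", ta_check);
```

If the assumption holds, the last two printed values agree with the "Range of Dispersion" lines of the output (0.003635 and 0.001484).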
   

Bibliography

R. Kaiser, G. Gottschalk; "Elementare Tests zur Beurteilung von Meßdaten", BI Hochschultaschenbücher, Bd. 774, Mannheim 1972.

Authors

Hani A. Ibrahim - hani.ibrahim@gmx.de

