An overview of the SampleSTAT toolbox for univariate data.
This toolbox provides elementary tests for the evaluation of univariate measurement data that are typically recorded by scientists and engineers. These data have to be normally distributed for SampleSTAT's routines. SampleSTAT is focused on very small sample sizes (<10), but also offers routines for larger distributions (up to 50 values). It provides functions for calculating the range of values and the mean value with respect to a given statistical confidence level. It also provides tests for outliers and a method to test the data for normal distribution.
![]() | These routines are applicable for normally distributed data ONLY. |
Calculates the stray areas of values and mean of normally distributed data. That gives you more information of the quality of your data as the sample standard deviation (s) of samples can do. s has an statistical confidence of 68%. Stray area (of values) can provide a confidence level of 95%, 99% and 99.9%. The trust area (stray area of the mean) gives you information of the confidence of your mean with a probability of 95%, 99% and 99.9%.
Provides test to determine whether an extreme value is an outlier or not. Test are available for small and bigger distributions.
All routines are applicable for normally distributed data only. To test for normal distribution, tests are provided.
These functions are good to extend the built-in functions mean(), stdev(), max(), min(), median() and can be extended by routines of the toolboxes "Stixbox" and "Distfun".
Calculates the stray area (range of dispersion of the values) of univariate data for a statistical confidence level (95%, 99%, 99.9%) and level of significance (0.5, 0.01, 0.001), resp.
Calculates the trust area (range of dispersion of the mean) of univariate data and for a statistical confidence level (95%, 99%, 99.9%) and level of significance (0.5, 0.01, 0.001), resp. and is the standard deviation of the mean.
Determines the student factor for an amount of numbers and for a statistical confidence level (95%, 99%, 99.9%) and level of significance (0.5, 0.01, 0.001), resp.- service function for ST_strayarea and ST_trustarea
This test for outliers is eminently suitable for small sample sizes; for samples having more than 30 observations the test for significance of Pearson and Hartley can be used as well.
For random samples larger than 30 objects possible outliers may be identified by using the significance thresholds of Pearson and Hartley.
This test on outliers provides a quick hint for all sample sizes and is often used in German publications.
![]() | There is some scientific discussion about this test. It is thus recommended to use other tests instead of the Nalimov test (Dean-Dixon test for small samples, the test of Pearson and Hartley for larger ones). |
Provide two basic tests on outliers based on population standard deviation (S.D.) or inter-quartile range (IQR). It should be applied to sample sizes larger than 10 better more than 25. For small sample sizes the Dean-Dixon test is recommended.
The Shapiro-Wilk test is a statistical significance test that tests the hypothesis that the underlying population of a sample is normally distributed. It tests for normality exhibiting high power, leading to good results even with a small number of observations, especially in comparison to Chi-Square or Kolmogorov-Smirnov tests.
Very basic "Individual Value Plot" function. Graphic test on normal distribution for small sample sizes. Better alternatives to histograms or box-plots for this kind of sample sizes.
![]() | ST_ivplot is experimental and not finished. Some important features are missing, e.g. displaying ties (duplicates). |
Hani A. Ibrahim - hani.ibrahim@gmx.de
R. Kaiser, G. Gottschalk; "Elementare Tests zur Beurteilung von Meßdaten", BI Hochschultaschenbücher, Bd. 774, Mannheim 1972.
Lohringer, H., "Grundlagen der Statistik", Oct, 10th, 2012, http://www.statistics4u.info/
Shapiro, Wilk: "An Analysis of Variance Test for Normality", Biometrika, Vol. 52, No. 3/4. (Dec., 1965), pp. 591-611.