histocvsearch

Searches for the number of bins in a histogram that is optimal in the cross-validation (CV) sense.

Calling Sequence

n=histocvsearch(x)
n=histocvsearch(x,nstep)
n=histocvsearch(x,nstep,nmax)
[n,jhat]=histocvsearch(...)

Parameters

x :

an m-by-1 or 1-by-m matrix of doubles, the data

n :

a 1-by-1 matrix of doubles, the number of bins in the histogram that minimizes the cross-validation score

nstep :

a 1-by-1 matrix of doubles, the increment of the number of bins in the search (default nstep=1). Must be in the range {1,...,m}.

nmax :

a 1-by-1 matrix of doubles, the maximum number of bins in the search (default nmax=m). Must be in the range {1,...,m}.

jhat :

a 1-by-1 matrix of doubles, the minimum leave-one-out cross-validation IMSE achieved during the search

Description

Searches for the number of bins in a histogram that minimizes the leave-one-out cross-validation IMSE, as computed by histocvimse.

The algorithm uses a for loop over the candidate numbers of bins, from 1 to nmax with step nstep, and keeps the candidate with the smallest cross-validation score.
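
The loop described above can be sketched as follows. This is only an illustration under stated assumptions, not the Stixbox source: the names cvsearch_sketch and cvscore_sketch are hypothetical, and the score is the leave-one-out risk estimate for histograms from Wasserman's book, applied to data rescaled to [0,1]. The actual histocvimse may differ in details such as scaling, so the values returned here need not match it.

```scilab
// Hypothetical helper (not part of Stixbox): leave-one-out
// cross-validation score of a histogram with n equal-width bins.
// Assumes x holds at least two distinct values.
function j = cvscore_sketch(x, n)
    x = x(:);
    m = size(x, "*");                       // number of observations
    u = (x - min(x)) / (max(x) - min(x));   // rescale the data to [0,1]
    h = 1 / n;                              // bin width on [0,1]
    k = floor(u / h) + 1;                   // bin index of each point
    k(k > n) = n;                           // put u == 1 in the last bin
    p = zeros(n, 1);
    for i = 1:m
        p(k(i)) = p(k(i)) + 1;              // count the points per bin
    end
    p = p / m;                              // bin proportions
    j = 2 / ((m - 1) * h) - (m + 1) / ((m - 1) * h) * sum(p .^ 2);
endfunction

// Hypothetical search loop: try n = 1, 1+nstep, ... up to nmax
// and keep the number of bins with the smallest score.
function [nbest, jbest] = cvsearch_sketch(x, nstep, nmax)
    nbest = 1;
    jbest = %inf;
    for n = 1:nstep:nmax
        j = cvscore_sketch(x, n);
        if j < jbest then
            jbest = j;
            nbest = n;
        end
    end
endfunction

// Usage: 100 standard normal observations, search up to 50 bins
x = grand(100, 1, "nor", 0, 1);
[nbest, jbest] = cvsearch_sketch(x, 1, 50)
```

Because the comparison is strict, ties are resolved in favor of the smallest number of bins. This is a choice made for the sketch, not a documented property of histocvsearch.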

Examples

m=100; // Number of observations
x=distfun_normrnd(0,1,m,1);
[n,jhat]=histocvsearch(x)
histo(x,n)
xlabel("X")
ylabel("Frequency")
title("Optimal bin width")

// Use increment equal to 2
nstep=2;
[n,jhat]=histocvsearch(x,nstep)

// Search only up to n=50
nmax=50;
[n,jhat]=histocvsearch(x,[],nmax)

scf();
subplot(1,2,1)
// Plot the Cross-Validation IMSE versus the number of bins
m=100; // Number of observations
x=distfun_normrnd(0,1,m,1);
nmax=50; // Largest number of bins to evaluate
jhat=histocvimse(x,1:nmax);
plot((1:nmax)',jhat,"bo")
xlabel("Number of bins");
ylabel("Cross-Validation IMSE");
// Plot the optimal histogram
[n,jhat]=histocvsearch(x)
title(string(m)+" observations")
subplot(1,2,2)
histo(x,n,%t);
xlabel("X")
ylabel("Density")
// Overlay the standard normal density for comparison
xx=linspace(-3,3);
y=distfun_normpdf(xx,0,1);
plot(xx,y,"-")
title("n="+string(n)+" bins")

Bibliography

All of Nonparametric Statistics, L. Wasserman, Springer, 2006
