Searches for the optimal number of bins in a histogram by cross-validation.
n=histocvsearch(x)
n=histocvsearch(x,nstep)
n=histocvsearch(x,nstep,nmax)
[n,jhat]=histocvsearch(...)
x : an m-by-1 or 1-by-m matrix of doubles, the data
n : a 1-by-1 matrix of doubles, the number of bins in the histogram
nstep : a 1-by-1 matrix of doubles, the increment of the number of bins in the search (default nstep=1). Must be in the range {1,...,m}.
nmax : a 1-by-1 matrix of doubles, the maximum number of bins in the search (default nmax=m). Must be in the range {1,...,m}.
jhat : a 1-by-1 matrix of doubles, the minimum achieved leave-one-out cross-validation MSE
Searches for the number of bins in a histogram that minimizes the leave-one-out cross-validation MSE, as computed by histocvimse.
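For orientation, the leave-one-out cross-validation score for a histogram given in Wasserman's book (see the bibliography) can be written as below, with m observations, n bins, bin counts v_j and bin width h. Wasserman states it for data rescaled to [0,1]; taking h over the observed data range is an assumption made here, and whether histocvimse uses exactly this normalization is not confirmed by this page.

```latex
\[
\hat{J}(n) = \frac{2}{(m-1)\,h} - \frac{m+1}{(m-1)\,h} \sum_{j=1}^{n} \hat{p}_j^{\,2},
\qquad
h = \frac{\max(x) - \min(x)}{n},
\qquad
\hat{p}_j = \frac{v_j}{m}
\]
```

For n=1 all observations fall in the single bin, so the sum equals 1 and the score reduces to -1/h.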
The algorithm evaluates the candidate numbers of bins 1, 1+nstep, 1+2*nstep, ..., up to nmax in a for loop, and returns the candidate with the smallest score.
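The search loop described above can be sketched in plain Scilab as follows. This is a minimal illustration, not the toolbox source: cvscore_sketch and cvsearch_sketch are hypothetical names, the score is Wasserman's leave-one-out estimator with equal-width bins over [min(x), max(x)], and the real histocvimse may place bin edges or normalize differently. The sketch assumes m > 1 and max(x) > min(x).

```scilab
// Hypothetical helper (not part of the toolbox): leave-one-out
// cross-validation score of a histogram with n equal-width bins.
function J = cvscore_sketch(x, n)
    x = x(:);                          // work on a column vector
    m = size(x, "*");                  // number of observations
    h = (max(x) - min(x)) / n;         // bin width (assumes max(x) > min(x))
    idx = floor((x - min(x)) / h) + 1; // bin index of each observation
    idx(idx > n) = n;                  // the maximum belongs to the last bin
    v = zeros(n, 1);
    for i = 1:m
        v(idx(i)) = v(idx(i)) + 1;     // bin counts
    end
    p = v / m;                         // bin proportions
    J = 2 / ((m - 1) * h) - (m + 1) / ((m - 1) * h) * sum(p.^2);
endfunction

// Hypothetical sketch of the search: try n = 1, 1+nstep, ... up to nmax
// and keep the number of bins with the smallest score.
function [nbest, jbest] = cvsearch_sketch(x, nstep, nmax)
    nbest = 1;
    jbest = %inf;
    for n = 1:nstep:nmax
        J = cvscore_sketch(x, n);
        if J < jbest then
            jbest = J;
            nbest = n;
        end
    end
endfunction

// Usage on a tiny data set
x = [0; 1; 2; 4];
[n, jhat] = cvsearch_sketch(x, 1, 4)   // n = 1, jhat = -0.25
```

With only four points the single-bin histogram wins; on larger samples, such as the 100 normal observations in the examples, the minimum is expected at a larger n. Note that with nstep greater than 1 the loop skips candidates, so the returned n is the best among those visited rather than among all values up to nmax.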
m=100; // Number of observations
x=distfun_normrnd(0,1,m,1);
[n,jhat]=histocvsearch(x)
histo(x,n)
xlabel("X")
ylabel("Frequency")
title("Optimal bin width")

// Use increment equal to 2
nstep=2;
[n,jhat]=histocvsearch(x,nstep)

// Search only up to n=50
nmax=50;
[n,jhat]=histocvsearch(x,[],nmax)

scf();
subplot(1,2,1)
// Prints the Cross-Validation IMSE versus the number of bins
m=100; // Number of observations
x=distfun_normrnd(0,1,m,1);
xlabel("Number of bins");
ylabel("Cross-Validation IMSE");
jhat=histocvimse(x,1:nmax);
plot((1:nmax)',jhat,"bo")
// Plot the optimal histogram
[n,jhat]=histocvsearch(x)
title(string(m)+" observations")
subplot(1,2,2)
histo(x,n,%t);
xtitle("n="+string(n)+" bins");
xlabel("X")
ylabel("Frequency")
x=linspace(-3,3)
y=distfun_normpdf(x,0,1)
plot(x,y,"-")
title("n="+string(n))
All of Nonparametric Statistics, L. Wasserman, Springer, 2006