Histogram
[theText, rawN, x] = nan_nhist(cellValues, 'parameter', value, ...)
t = nan_nhist(Y)
[t, N, X] = nan_nhist(...)
nan_nhist(Y, 'PropertyName', ...)
nan_nhist(Y, 'PropertyName', PropertyValue, ...)
Affects the number of bins used. A larger number
this will make all the bins align with each other
The minimum number of bins allowed for each graph
The maximum number of bins allowed for each graph
Number of times the standard deviation to set the
crop the axis and histogram on the left. 'xmin'
crop the axis and histogram on the right. 'xmax'
Plot proportion of total points on the y axis
Plot the pdf on the y axis
Plot the raw numbers on the graph. 'number'
Plot a smooth line instead of the step function.
A cell array with strings to put in the legend or
In case you pass a struct, you may force a legend
Outputs all numbers to text, even ones that are
Number of decimal places numbers will be output
this will add (number of points) to the legend or
Label of the lowest X axis
Label of the Y axis, note that the ylabel default
Font size, default 12. 'fontsize'
Sets the location of the legend,
NorthOutside. 'legendlocation'
This will plot a stem plot of the median
This will plot a stem plot of the mode
Will put the mean and 'standard error' bars above
Will remove the mean and standard deviation error
Sets the width of the lines for all the graphs
Sets the colormap to decide the colors of the
Plot each histogram separately, also use normal
Will make a new figure to plot it in. When using
EPS file name of the generated plot to save. It
t = nan_nhist(Y) bins the elements of Y into equally spaced containers and returns a string with information about the distributions. If Y is a cell array or a list, nan_nhist will graph the binned (discrete) probability density function of each data set for comparison on the same graph. It will return a cell array or structure which includes a string for each set of data.
[t, N, X] = nan_nhist(...) also returns the number of items in each bin, N, and the locations of the left edges of each bin, X. If Y is a cell array or structure then the output is in the same form.
__________________________________________________________________________
Summary of what the function does:
1) Automatically sets the number and range of the bins to be appropriate for the data.
2) Compares multiple sets of data elegantly on one or more plots, with legend or titles. It also graphs the mean and standard deviations. It can also plot the median and mode.
3) Outputs text with the useful statistics for each distribution.
4) Allows for changing many more parameters.
Highlighted features (see below for details)
'separate' to plot each set on its own axis, but with the same bounds
'binfactor' change the number of bins used; a larger value means more bins
'samebins' force all bins to be the same for all plots
'legend' add a legend in the graph (default for structs)
'noerror' remove the mean and std plot from the graph
'median' add the median of the data to the graph
'text' return many details about each graph even if not plotted
Optional Properties
Note: Alternative names to call the properties are listed at the end of each entry.
The bin width is defined in the following way.
Disclaimer: this function is specialized to compare data with comparable standard deviations and means, but greatly varying numbers of points.
Scott's choice, used for this function, is a theoretically ideal way of choosing the number of bins. Of course the theory is general and so not rigorous, but I feel it does a good job.
(bin width) = 3.5*std(data points)/(number of points)^(1/3)
I did not follow it exactly though, restricting the bin sizes so that the larger ones are evenly divisible by the smaller ones. In this way the different conditions can be accurately compared to each other.
The bin width is further adjusted by the user parameter 'binfactor':
(new bin width) = (old bin width) / (binfactor)
It allows the user to make the bins larger or smaller to taste. A larger binfactor means more bins. The default is 1.
Source: http://en.wikipedia.org/wiki/Histogram#Number_of_bins_and_width
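The bin width rule above can be sketched in a few lines of Scilab. This is only an illustration: scottBinWidth is a hypothetical helper, not a function of the toolbox, and it leaves out the alignment of bin sizes across data sets as well as the minimum and maximum bin limits that nan_nhist applies.

```scilab
// Sketch only: scottBinWidth is a hypothetical helper, not part of nan_nhist.
function w = scottBinWidth(data, binFactor)
    n = length(data);                 // number of points
    w = 3.5 * stdev(data) / n^(1/3);  // Scott's choice, as stated above
    w = w / binFactor;                // larger binFactor => narrower bins => more bins
endfunction

d = rand(1, 10000, 'normal');         // gaussian random sample
w1 = scottBinWidth(d, 1);             // default binfactor
w4 = scottBinWidth(d, 4);             // four times narrower bins

// Number of bins needed to cover the range of the data with each width
range = max(d) - min(d);
mprintf("bin width %f -> %d bins\n", w1, ceil(range / w1));
mprintf("bin width %f -> %d bins\n", w4, ceil(range / w4));
```

With a binfactor of 4 the width is a quarter of the default, so roughly four times as many bins cover the same range.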
Default function behaviour
If you pass it a structure, the field names will become the legend. All of the data outputted will be in structure form with the same field names. If you pass a cell array, then the output will be in cell form. If you pass an array or vector then the data is outputted as a string and two arrays.
The standard deviation will be plotted by default, unless one puts in the 'serror' parameter, which will plot the standard error = std/sqrt(N)
There are no maximum or minimum X values.
minBins=10; The minimum number of bins for the histogram
maxBins=100; The maximum number of bins for a histogram
AxisFontSize = 12; 'fsize' the fontsize of everything.
The number of data points is not displayed
The lines in the histograms are black
faceColor = [.7 .7 .7]; The face of the histogram is gray.
It will plot inside a figure unless 'newfig' is passed, in which case it will make a new figure. It will take over and refit all axes.
linewidth=2; The width of the lines in the error bars and the histogram
stdTimes=4; The axes will be cut off at a maximum of 4 times the standard deviation from the mean. Different data sets will be plotted with a different number of bins.
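Two of the defaults above involve small calculations: the standard error selected by 'serror', and the axis cutoff controlled by stdTimes. The following Scilab lines sketch both. The variable names are illustrative only, and clipping the limits to the data range is an assumption about what "at a maximum" means here, not something taken from the nan_nhist source.

```scilab
// Sketch only: names are illustrative, not taken from nan_nhist.
d = rand(1, 1000, 'normal') + 1;   // sample data
N = length(d);                     // number of points
m = mean(d);
s = stdev(d);                      // standard deviation (default error bar)
se = s / sqrt(N);                  // standard error, as selected by 'serror'

stdTimes = 4;                      // default listed above
// Assumption: the axis never extends past the data, and never further
// than stdTimes standard deviations from the mean.
xlo = max(min(d), m - stdTimes * s);
xhi = min(max(d), m + stdTimes * s);

mprintf("std = %f, standard error = %f\n", s, se);
mprintf("x axis from %f to %f\n", xlo, xhi);
```

Because the standard error divides by sqrt(N), its bars shrink as the number of points grows, while the standard deviation bars do not.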
Acknowledgments
Thank you to the AP-Lab at Boston University for funding me while I developed this function. Thank you to the AP-Lab, Avi and Eli for help with designing and testing it.
Examples
A=list(rand(1,10^5,'normal'),rand(10^3,1,'normal')+1);
nan_nhist(A);
nan_nhist(A,'legend',['u=0','u=1']);
nan_nhist(A,'legend',['u=0','u=1'],'separate');
nan_nhist(A,'color','summer')
nan_nhist(A,'color',[.3 .8 .3],'separate')
nan_nhist(A,'binfactor',4)
nan_nhist(A,'samebins')
nan_nhist(A,'median','noerror')

// example #1: variations around a histogram of a gaussian random sample
d=rand(1,10000,'normal');
clf();nan_nhist(d,'proportion')
clf();nan_nhist(d)
clf();nan_nhist(d,'legend','rand(1,10000,''normal'')','color',[1 0 0],'proportion')

// example #2: histogram of a binomial (B(6,0.5)) random sample
d = grand(1000,1,"bin", 6, 0.5);
clf()
subplot(2,1,1)
nan_nhist(d,'proportion','legend',"normalized histogram")
subplot(2,1,2)
nan_nhist(d,'legend',"non normalized histogram")

// example #3: histogram of an exponential random sample
lambda = 2;
X = grand(100000,1,"exp", 1/lambda);
Xmax = max(X);
clf()
nan_nhist(X,'pdf','minx',0,'maxx',max(Xmax));
x = linspace(0,max(Xmax),100)';
plot2d(x,lambda*exp(-lambda*x),strf="000",style=5)
legends(["exponential random sample histogram" "exact density curve"],[1,5],opt="ur");