Select a predictor subset for regression
[Q, I, B, BB] = lsselect(y,x,crit,how,pmax,level)
y : dependent variate (column vector)
x : regressor variates (matrix, one column per regressor)
crit : selection criterion (string):
'HT' : Hypothesis Test (default level = 0.05)
'AIC' : Akaike's Information Criterion
'BIC' : Bayesian Information Criterion
'CMV' : Cross Model Validation (inner criterion RSS)
how : search strategy (string), chooses between:
'AS' : All Subsets
'FI' : Forward Inclusion
'BE' : Backward Elimination
pmax : upper limit on the number of included parameters (scalar).
level : optional input argument; reference p-value used for inclusion or deletion (default 0.05 for 'HT').
Q : value of the criterion as a function of the number of parameters; it may be interpreted as an estimate of the prediction standard deviation. For the criterion 'HT', Q instead holds the successive p-values for inclusion or elimination.
I : index numbers of the included columns.
B : vector of coefficients, i.e. the suggested model is Y = X*B.
BB : column p of BB is the best B with p parameters.
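To make the 'AIC' and 'BIC' options more concrete, here is a minimal sketch that scores two candidate column subsets. It assumes the textbook Gaussian forms n*log(RSS/n) + 2*p and n*log(RSS/n) + log(n)*p; the exact scaling used inside lsselect is not stated here and may differ. The data and the helper names rss, aic and bic are made up for illustration.

```matlab
% Sketch only: textbook AIC/BIC for a linear model, NOT necessarily
% the scaling that lsselect uses internally.
n = 20;
t = (1:n)';
x = [t, sin(t), ones(n,1)];        % last column is the intercept
y = 2*t + 1 + 0.1*cos(3*t);        % deterministic toy response

% Residual sum of squares of the least-squares fit on the columns of X
rss = @(X) sum((y - X*(X\y)).^2);
% Assumed criterion forms (p = number of columns, i.e. parameters)
aic = @(X) n*log(rss(X)/n) + 2*size(X,2);
bic = @(X) n*log(rss(X)/n) + log(n)*size(X,2);

small = x(:, [1 3]);               % t and intercept
full  = x;                         % t, sin(t) and intercept

fprintf('AIC: small %.3f, full %.3f\n', aic(small), aic(full));
fprintf('BIC: small %.3f, full %.3f\n', bic(small), bic(full));
```

Under these assumed forms, the subset with the lower value is preferred. BIC penalizes each extra parameter by log(n) instead of 2, so it tends to favor smaller models once n exceeds about 7.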
Selects a good subset of regressors in a multiple linear regression model.
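The 'FI' (Forward Inclusion) strategy can be pictured as a greedy search. The sketch below is one plausible reading, not the routine's actual code: it starts from the intercept alone and repeatedly adds the column that gives the lowest residual sum of squares, up to a cap that plays the role of pmax. Ranking by RSS, the toy data, and all variable names are assumptions made for illustration.

```matlab
% Sketch only: greedy forward inclusion ranked by RSS (an assumption).
n = 30;
t = (1:n)';
x = [t, sin(t), cos(2*t), ones(n,1)];          % intercept is the last column
y = 3*t - 2 + 0.5*sin(t) + 0.05*cos(5*t);      % cos(2*t) carries no signal

k = size(x, 2);
pmax = k;                                      % hypothetical cap on parameters

% RSS of the least-squares fit using the listed columns of x
rss = @(cols) sum((y - x(:,cols)*(x(:,cols)\y)).^2);

included   = k;                                % the intercept is always kept
candidates = 1:k-1;
while numel(included) < pmax && ~isempty(candidates)
    % Score every remaining column when added to the current model
    scores = arrayfun(@(c) rss([included, c]), candidates);
    [~, best] = min(scores);
    included(end+1) = candidates(best);        % add the best column
    candidates(best) = [];                     % and remove it from the pool
end

disp(included)                                 % order in which columns entered
```

Backward elimination ('BE') would run the same loop in reverse, starting from all columns and dropping the one whose removal hurts least. All subsets ('AS') instead evaluates every combination, which grows exponentially with the number of columns.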
The last column of the regressor matrix x must be an intercept column, i.e. all its elements are ones. This column is never excluded in the search for a good model. If it is not present, it is added.
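The intercept rule can be checked before calling the function. The following is a small sketch of that check, assuming that "intercept column" means a last column whose entries are all exactly one; the variable names are illustrative and the routine's own test may differ.

```matlab
% Sketch only: append an intercept column when the last column is not all ones.
x = [1 2; 3 4; 5 6];                       % toy regressors, no intercept yet
n = size(x, 1);

has_intercept = ~isempty(x) && all(x(:, end) == 1);
if ~has_intercept
    x = [x, ones(n, 1)];                   % intercept goes last
end

disp(x)
```

Adding the column yourself keeps the column positions of x predictable, which may make the reported index numbers easier to read.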
This function is optimized for flexibility rather than speed. It would be faster if 'all subsets' were handled in one routine and 'forward' and 'backward' in another, especially for CMV.