Darwin Help

SvdBestBasis

Function SvdBestBasis - Least squares by selecting best basis (subset)

Calling Sequence  SvdBestBasis(AtA,btA,btb,NData,names,k,svmin,try,startset)
Parameters
Name      Type           Description

AtA       matrix(m,m)    the product of A^t * A
btA       vector(m)      the product b^t * A
btb       numeric        the norm squared of b, i.e. b*b
NData     posint         number of data points (dim A is n x m)
names     list(string)   names associated with each column of A
k         posint         number of variables in the solution
svmin     numeric        optional lower limit for using singular values
try       posint         optional, trials after a new local minimum
startset  list(integer)  optional, k column numbers to start
Return Type  SvdResult
Globals SvdBestHash, SvdBest_A, SvdBest_d, SvdGoodBases, SvdGoodPerms, SvdHashSig, Svd_svmin
Synopsis SvdBestBasis finds the best set of k variables for a least squares fit. For k<=2 the result is the global minimum (and the variable "try" is ignored); for k>2 the method is a heuristic, not an exact algorithm, and its precision depends on how many trials are performed. The problem of finding the best set of variables, when done incrementally, one variable at a time, is called stepwise regression. The results of SvdBestBasis are generally much better than those obtained by stepwise regression.

The problem is formally defined as follows: Given a matrix A (dim n x m) and a vector b (dim n), we want to find a vector x (dim m) such that Ax ~ b, where x has k non-zero components and m-k zero components. The approximation is in the least squares sense, i.e. |Ax-b|^2 is minimized.
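
As a minimal sketch of the problem definition above (Python, not Darwin, and not the method SvdBestBasis itself uses), the following code tries every k-subset of columns exhaustively. It relies only on the quantities AtA, btA and btb: for a column subset S, the least squares x solves AtA[S,S] x = btA[S], and the residual is |Ax-b|^2 = btb - btA[S] . x. The function names solve, subset_residual and best_subset are hypothetical, columns are numbered from 0, and the chosen columns are assumed to be linearly independent.

```python
from itertools import combinations

def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for c in range(n):
        # Bring the row with the largest pivot candidate into position c.
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[r][j] -= f * aug[c][j]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def subset_residual(AtA, btA, btb, cols):
    """Return (|Ax-b|^2, x) for the least squares fit using only `cols`."""
    sub = [[AtA[i][j] for j in cols] for i in cols]
    rhs = [btA[i] for i in cols]
    x = solve(sub, rhs)
    return btb - sum(r * xi for r, xi in zip(rhs, x)), x

def best_subset(AtA, btA, btb, k):
    """Exhaustive search over all k-subsets; feasible only for small m."""
    best = None
    for cols in combinations(range(len(btA)), k):
        res, x = subset_residual(AtA, btA, btb, cols)
        if best is None or res < best[0]:
            best = (res, cols, x)
    return best

# Small illustrative data set: b = 2*column0 + 3*column2 exactly.
A = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0],
     [1.0, 1.0, 0.0],
     [1.0, 2.0, 1.0]]
b = [5.0, 3.0, 2.0, 5.0]
n, m = len(A), len(A[0])

# Form the inputs that the calling sequence expects.
AtA = [[sum(A[r][i] * A[r][j] for r in range(n)) for j in range(m)]
       for i in range(m)]
btA = [sum(b[r] * A[r][j] for r in range(n)) for j in range(m)]
btb = sum(v * v for v in b)

res, cols, x = best_subset(AtA, btA, btb, 2)
print(cols, x, res)
```

The number of subsets grows as m choose k, which is why an exhaustive search like this is only practical for small problems and a heuristic is used for k>2.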


The output is a SvdResult data structure. The global variable SvdGoodBases is assigned a list of SvdResult data structures for all the other local minima that are found. The global variable SvdGoodPerms is assigned a list of the permutations of the variables which gave the good bases in SvdGoodBases. SvdBestBasis prints information as it computes. The amount of information printed can be regulated with printlevel.


svmin is an optional positive numeric value. Singular values less than svmin are not used. With svmin=0, all singular values are used, which is equivalent to pure least squares. The selection of singular values applies to the final computation of the SvdResult, not to the search for the best basis.


try is an optional integer. It gives the number of trials that will be done after a new local minimum is found before stopping. If omitted, 15 trials are done after the lowest norm has been found.


startset is an optional list of k integers. SvdBestBasis will start its search for an optimal basis from this set. If try is greater than 1, then other trials, starting at random sets, will also be tried.
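
The interplay of try and startset can be illustrated with a rough Python sketch of a random-restart search. This page does not describe the search SvdBestBasis performs internally, so everything here is an assumption: the single-column swap used as the local move, the way trials are counted, the 0-based column numbers, and the hypothetical names local_min and best_basis_search. Chosen columns are assumed to be linearly independent.

```python
import random

def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[r][j] -= f * aug[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def residual(AtA, btA, btb, cols):
    """|Ax-b|^2 for the least squares fit restricted to `cols`."""
    sub = [[AtA[i][j] for j in cols] for i in cols]
    rhs = [btA[i] for i in cols]
    return btb - sum(r * xi for r, xi in zip(rhs, solve(sub, rhs)))

def local_min(AtA, btA, btb, start):
    """Assumed local move: swap one chosen column for an unused one
    whenever that lowers the residual; stop when no swap helps."""
    cur = sorted(start)
    cur_res = residual(AtA, btA, btb, cur)
    improved = True
    while improved:
        improved = False
        for pos in range(len(cur)):
            for new in range(len(btA)):
                if new in cur:
                    continue
                trial = sorted(cur[:pos] + [new] + cur[pos + 1:])
                res = residual(AtA, btA, btb, trial)
                if res < cur_res - 1e-12:
                    cur, cur_res, improved = trial, res, True
                    break
            if improved:
                break
    return cur_res, cur

def best_basis_search(AtA, btA, btb, k, tries=15, startset=None, seed=0):
    """Restart from random sets until `tries` trials pass with no new best."""
    rng = random.Random(seed)
    m = len(btA)
    start = list(startset) if startset is not None else rng.sample(range(m), k)
    best = local_min(AtA, btA, btb, start)
    since_best = 1  # assumption: the trial that found the best counts as one
    while since_best < tries:
        cand = local_min(AtA, btA, btb, rng.sample(range(m), k))
        if cand[0] < best[0] - 1e-12:
            best, since_best = cand, 1
        else:
            since_best += 1
    return best

# Normal-equation inputs for a 4 x 3 matrix A whose columns are
# (1,0,1,1), (0,1,1,2), (1,1,0,1), with b = 2*column0 + 3*column2.
AtA = [[3.0, 3.0, 2.0], [3.0, 6.0, 3.0], [2.0, 3.0, 3.0]]
btA = [12.0, 15.0, 13.0]
btb = 63.0

best_res, best_cols = best_basis_search(AtA, btA, btb, 2, tries=5, startset=[0, 1])
print(best_cols, best_res)
```

With tries=1 this sketch evaluates only the starting set, which mirrors the statement that random starting sets are tried only when try is greater than 1.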

See also ExpFit, LSBestDelete, LSBestSum, LSBestSumDelete, Stat, SvdAnalysis, SvdReduceGood, SvdResult