| Calling Sequence | LeastSquaresTree(Dist,Var)
LeastSquaresTree(Dist,Var,Labels)
LeastSquaresTree(Dist,Var,Labels,IniTree,Keep) |
| Parameters |
| Return Type | Tree |
| Synopsis | This function computes a binary tree which approximates the given distances Dist by least squares. The distances are assumed to have variances given by the matrix Var. If a list Labels is given, the leaves of the resulting tree are labelled with these values. Each Leaf node produced has 3 fields: (1) the label given (or its integer index if no Labels are given), (2) the height of the Leaf and (3) its integer index.
If the global variable MinLen is assigned a positive value, it determines the minimum branch length. If it is not set, 1/1000th of the average distance between leaves is used.
The quality of the fit is measured by the sum of the squares of the weighted deviations divided by (n-2)(n-3)/2, where n is the number of leaves. This value is stored in the global variable MST_Qual. A dimensionless fitting index is also computed as MST_Qual / variance(Dist) * harmonic_mean(Var). This value is printed and stored in the global variable DimensionlessFit. Trees built over the same set of species, even with radically different methods, can be ranked by the quality of their fit with this index.
If the fourth parameter is a Tree, then this tree is taken as the initial tree and optimized.
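The two fit measures described above can be illustrated outside Darwin. The following Python sketch is not Darwin code and is not taken from the LeastSquaresTree implementation. It assumes that a "weighted deviation" means the squared difference divided by the corresponding variance, that both variance(Dist) and harmonic_mean(Var) are taken over the off-diagonal entries above the diagonal, and that the sample variance is meant. The names `fit_quality`, `dimensionless_fit` and `upper_triangle` are hypothetical.

```python
from statistics import harmonic_mean, variance


def upper_triangle(m):
    """Entries above the diagonal of a square matrix, row by row."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]


def fit_quality(dist, var, tree_dist):
    """Sum of squared deviations, each divided by its variance (assumption),
    divided by (n-2)(n-3)/2 as stated in the synopsis."""
    n = len(dist)
    dof = (n - 2) * (n - 3) / 2
    if dof <= 0:
        raise ValueError("at least 4 leaves are needed")
    d, v, t = upper_triangle(dist), upper_triangle(var), upper_triangle(tree_dist)
    wss = sum((di - ti) ** 2 / vi for di, vi, ti in zip(d, v, t))
    return wss / dof


def dimensionless_fit(dist, var, tree_dist):
    """quality / variance(Dist) * harmonic_mean(Var), evaluated left to right.
    Sample variance over the upper triangle is an assumption."""
    q = fit_quality(dist, var, tree_dist)
    return q / variance(upper_triangle(dist)) * harmonic_mean(upper_triangle(var))


# Distances and variances from the Examples section of this page.
D = [[0, 3, 13, 10], [3, 0, 14, 11], [13, 14, 0, 9], [10, 11, 9, 0]]
V = [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]]

# A tree whose distances reproduce D exactly gives a fit of zero.
print(fit_quality(D, V, D))
print(dimensionless_fit(D, V, D))
```

With four leaves the divisor (n-2)(n-3)/2 equals 1, which is the number of pairwise distances (6) minus the number of branches in an unrooted binary tree (5).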
If the fourth argument is the word "Random", then the optimization is started from a random tree. For large trees it makes sense to try several random trees and choose the one with the best (lowest) MST_Qual. When starting with random trees, the global variable MST_Prob can be set to any numerical value between 0 and 1. Values close to 1 select trees which are very close to the one given by Neighbour Joining. Values close to 0 select completely random trees. Leaving MST_Prob unassigned is equivalent to using NJRandom.
When "NJRandom" is used, a Neighbour-Joining-like tree is made with a variable level of randomness at each step, which may produce better random trees.
When the word KeepTopology is used, the optimization is done only on the branch lengths. This is useful to optimize the branches of a given tree.
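To illustrate what optimizing only the branch lengths involves: once the topology is fixed, every leaf-to-leaf distance is a sum of branch lengths, so the weighted least squares fit becomes a linear problem. The Python sketch below is not Darwin code and makes no claim about how LeastSquaresTree works internally. It solves the normal equations for the four-leaf example from this page, assuming the topology that groups leaves 1 and 2 against leaves 3 and 4, weights of 1/variance, and no minimum branch length (MinLen is ignored). The names `solve` and `fit_branch_lengths` and the branch numbering are hypothetical.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with
    partial pivoting. Singular systems are not handled."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_branch_lengths(paths, dists, variances):
    """Weighted least squares branch lengths for a fixed topology.
    paths[k] lists the branch indices on the path joining leaf pair k."""
    nb = 1 + max(b for p in paths for b in p)
    ata = [[0.0] * nb for _ in range(nb)]
    atb = [0.0] * nb
    for p, d, v in zip(paths, dists, variances):
        w = 1.0 / v  # weight = 1/variance (assumption)
        for i in p:
            atb[i] += w * d
            for j in p:
                ata[i][j] += w
    return solve(ata, atb)


# Branches: 0..3 lead to leaves 1..4, branch 4 is the internal branch.
paths = [[0, 1], [0, 4, 2], [0, 4, 3], [1, 4, 2], [1, 4, 3], [2, 3]]
dists = [3, 13, 10, 14, 11, 9]   # upper triangle of D from the Examples
variances = [1, 1, 1, 1, 1, 1]   # upper triangle of V from the Examples
print(fit_branch_lengths(paths, dists, variances))
```

Because the example distances fit this topology exactly, the residuals are zero and the fitted lengths reproduce the distance matrix.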
The function Tree_matrix extracts the distance matrix from a tree. It is roughly the inverse of LeastSquaresTree. |
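The idea of extracting a distance matrix from a tree can be sketched as summing branch lengths along the path between each pair of leaves. The Python sketch below is not Darwin code and does not reproduce Darwin's Tree or Leaf structures. It represents the tree as a plain list of weighted edges. The function name `leaf_distances`, the internal node names `x` and `y`, and the branch lengths (worked out by hand from the example matrix on this page, not produced by Darwin) are all assumptions made for illustration.

```python
def leaf_distances(edges, leaves):
    """Pairwise path lengths between leaves of a tree given as
    (node, node, branch_length) edges."""
    adj = {}
    for a, b, w in edges:
        adj.setdefault(a, []).append((b, w))
        adj.setdefault(b, []).append((a, w))

    def distances_from(start):
        # Depth-first walk; in a tree each node is reached by one path only.
        dist = {start: 0}
        stack = [start]
        while stack:
            node = stack.pop()
            for nxt, w in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + w
                    stack.append(nxt)
        return dist

    rows = []
    for a in leaves:
        d = distances_from(a)
        rows.append([d[b] for b in leaves])
    return rows


# Hand-derived branch lengths that reproduce the example matrix D.
edges = [("AA", "x", 1), ("BB", "x", 2), ("x", "y", 6),
         ("CC", "y", 6), ("DD", "y", 3)]
for row in leaf_distances(edges, ["AA", "BB", "CC", "DD"]):
    print(row)
```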
| Examples | > D := [[0, 3, 13, 10], [3, 0, 14, 11], [13, 14, 0, 9], [10, 11, 9, 0]];
D := [[0, 3, 13, 10], [3, 0, 14, 11], [13, 14, 0, 9], [10, 11, 9, 0]]
> V := [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]];
V := [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]]
> LeastSquaresTree(D, V);
dimensionless fitting index 0
> t := LeastSquaresTree(D, V, [AA, BB, CC, DD]);
dimensionless fitting index 0
> print(Tree_matrix(t));
  0  3 13 10
  3  0 14 11
 13 14  0  9
 10 11  9  0 |
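The fitting index of 0 in this example is consistent with D being exactly tree-like: Tree_matrix(t) reproduces D without error. A distance matrix can be represented exactly by a tree when it satisfies the four-point condition, that is, for every four leaves the two largest of the three pairwise sums are equal. The Python sketch below (not Darwin code; `is_additive` is a hypothetical helper name) checks this condition for the example matrix.

```python
from itertools import combinations


def is_additive(d, tol=1e-9):
    """Four-point condition: for every quartet of leaves, the two largest
    of the three pairwise distance sums must be equal (within tol)."""
    n = len(d)
    for i, j, k, l in combinations(range(n), 4):
        sums = sorted([d[i][j] + d[k][l],
                       d[i][k] + d[j][l],
                       d[i][l] + d[j][k]])
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True


D = [[0, 3, 13, 10], [3, 0, 14, 11], [13, 14, 0, 9], [10, 11, 9, 0]]
# The three sums are 3+9 = 12, 13+11 = 24 and 10+14 = 24.
print(is_additive(D))  # True
```

For distance matrices that fail this test, a least squares tree can only approximate the input and the fitting index will be greater than 0.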
| See also | BootstrapTree, ComputeDimensionlessFit, DrawTree, GapTree, Leaf, PhylogeneticTree, RBFS_Tree, SignedSynteny, Synteny, Tree, ViewPlot |