Hickey, RJ (2007) Structure and Majority Classes in Decision Tree Learning. Journal of Machine Learning Research, 8. pp. 1747-1768. [Journal article]
Full text not available from this repository.
To provide good classification accuracy on unseen examples, a decision tree, learned by an algorithm such as ID3, must have sufficient structure and also identify the correct majority class in each of its leaves. If there are inadequacies in respect of either of these, the tree will have a percentage classification rate below that of the maximum possible for the domain, namely (100 - Bayes error rate). An error decomposition is introduced which enables the relative contributions of deficiencies in structure and in incorrect determination of majority class to be isolated and quantified. A sub-decomposition of majority class error permits separation of the sampling error at the leaves from the possible bias introduced by the attribute selection method of the induction algorithm. It is shown that sampling error can extend to 25% when there are more than two classes. Decompositions are obtained from experiments on several data sets. For ID3, the effect of selection bias is shown to vary from being statistically non-significant to being quite substantial, with the latter appearing to be associated with a simple underlying model.
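The split between structure error and majority class error described in the abstract can be illustrated on a toy domain whose true distribution is known. The sketch below is only one plausible reading of the abstract, not the paper's formal definitions: the joint distribution `JOINT`, the one-split tree given by `leaf_of`, the leaf labels `LABELS`, and the helper names are all hypothetical, and rates are expressed as fractions rather than percentages. It does not attempt the sub-decomposition into sampling error and selection bias.

```python
from collections import defaultdict

# Hypothetical domain: x = (a, b) with binary attributes, binary class c.
# Keys are (x, c); values are the true joint probabilities P(x, c).
JOINT = {
    ((0, 0), 0): 0.20, ((0, 0), 1): 0.05,
    ((0, 1), 0): 0.10, ((0, 1), 1): 0.15,
    ((1, 0), 0): 0.05, ((1, 0), 1): 0.20,
    ((1, 1), 0): 0.05, ((1, 1), 1): 0.20,
}

def best_rate(joint, leaf_of):
    """Classification rate if every leaf predicted its true majority class."""
    mass = defaultdict(lambda: defaultdict(float))
    for (x, c), p in joint.items():
        mass[leaf_of(x)][c] += p
    return sum(max(classes.values()) for classes in mass.values())

def actual_rate(joint, leaf_of, labels):
    """Classification rate of the tree with the labels it actually carries."""
    return sum(p for (x, c), p in joint.items() if labels[leaf_of(x)] == c)

def decompose(joint, leaf_of, labels):
    # The finest partition (one leaf per description x) gives 1 - Bayes error.
    bayes = best_rate(joint, lambda x: x)
    structural = best_rate(joint, leaf_of)
    actual = actual_rate(joint, leaf_of, labels)
    return {
        "bayes_rate": bayes,
        "structure_error": bayes - structural,        # lost by too little structure
        "majority_class_error": structural - actual,  # lost by wrong leaf labels
        "actual_rate": actual,
    }

# Assumed tree: a single split on attribute a, so attribute b is ignored.
def leaf_of(x):
    return x[0]

# Assumed leaf labels: the leaf a=0 carries class 1, although its true
# majority class is 0 (as might happen with a small training sample).
LABELS = {0: 1, 1: 1}

result = decompose(JOINT, leaf_of, LABELS)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

In this example the maximum possible rate is about 0.75, the tree's actual rate is about 0.60, and the shortfall of 0.15 splits into roughly 0.05 from missing structure (no split on `b`) and 0.10 from the mislabelled leaf.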
|Item Type:|Journal article|
|Faculties and Schools:|Faculty of Computing & Engineering; Faculty of Computing & Engineering > School of Computing and Information Engineering|
|Research Institutes and Groups:|Computer Science Research Institute; Computer Science Research Institute > Information and Communication Engineering|
|Deposited By:|Mr Raymond Hickey|
|Deposited On:|22 Jan 2010 09:38|
|Last Modified:|22 Jan 2010 09:38|