Package weka.classifiers.trees
Class SimpleCart
java.lang.Object
  weka.classifiers.Classifier
    weka.classifiers.RandomizableClassifier
      weka.classifiers.trees.SimpleCart

All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, AdditionalMeasureProducer, CapabilitiesHandler, OptionHandler, Randomizable, RevisionHandler, TechnicalInformationHandler
public class SimpleCart extends RandomizableClassifier implements AdditionalMeasureProducer, TechnicalInformationHandler
Class implementing minimal cost-complexity pruning.
Note: when dealing with missing values, this class uses the "fractional instances" method instead of the surrogate split method.
For more information, see:
Leo Breiman, Jerome H. Friedman, Richard A. Olshen, Charles J. Stone (1984). Classification and Regression Trees. Wadsworth International Group, Belmont, California.
BibTeX:
@book{Breiman1984,
  address = {Belmont, California},
  author = {Leo Breiman and Jerome H. Friedman and Richard A. Olshen and Charles J. Stone},
  publisher = {Wadsworth International Group},
  title = {Classification and Regression Trees},
  year = {1984}
}
Valid options are:
-S <num> Random number seed. (default 1)
-D If set, the classifier is run in debug mode and may output additional info to the console.
-M <min no> The minimal number of instances at the terminal nodes. (default 2)
-N <num folds> The number of folds used in the minimal cost-complexity pruning. (default 5)
-U Don't use minimal cost-complexity pruning. (default: pruning is used)
-H Don't use the heuristic method for binary splits. (default: the heuristic is used)
-A Use the 1 SE rule to make the pruning decision. (default: not used)
-C Percentage of training data size, in the range (0, 1]. (default 1)
Version:
$Revision: 10491 $
Author:
Haijian Shi (hs69@cs.waikato.ac.nz)
See Also:
Serialized Form
Constructor Summary
SimpleCart()
Method Summary
void buildClassifier(Instances data)
    Builds the classifier.
void calculateAlphas()
    Updates the alpha field for all nodes.
double[] distributionForInstance(Instance instance)
    Computes class probabilities for an instance using the decision tree.
java.util.Enumeration enumerateMeasures()
    Returns an enumeration of the measure names.
Capabilities getCapabilities()
    Returns the default capabilities of the classifier.
boolean getHeuristic()
    Gets whether heuristic search is used for nominal attributes in multi-class problems.
double getMeasure(java.lang.String additionalMeasureName)
    Returns the value of the named measure.
double getMinNumObj()
    Gets the minimal number of instances at the terminal nodes.
int getNumFoldsPruning()
    Gets the number of folds in internal cross-validation.
java.lang.String[] getOptions()
    Gets the current settings of the classifier.
java.lang.String getRevision()
    Returns the revision string.
double getSizePer()
    Gets the training set size.
TechnicalInformation getTechnicalInformation()
    Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
boolean getUseOneSE()
    Gets whether the 1SE rule is used to choose the final model.
boolean getUsePrune()
    Gets whether minimal cost-complexity pruning is used.
java.lang.String globalInfo()
    Returns a description suitable for displaying in the explorer/experimenter.
java.lang.String heuristicTipText()
    Returns the tip text for this property.
java.util.Enumeration listOptions()
    Returns an enumeration describing the available options.
static void main(java.lang.String[] args)
    Main method.
double measureTreeSize()
    Returns the size of the tree.
java.lang.String minNumObjTipText()
    Returns the tip text for this property.
void modelErrors()
    Updates the numIncorrectModel field for all nodes, treating each node as the root of a subtree to be pruned.
java.lang.String numFoldsPruningTipText()
    Returns the tip text for this property.
int numInnerNodes()
    Counts the number of inner nodes in the tree.
int numLeaves()
    Computes the number of leaf nodes.
int numNodes()
    Computes the size of the tree.
void prune(double alpha)
    Prunes the original tree using the CART pruning scheme, given a cost-complexity parameter alpha.
int prune(double[] alphas, double[] errors, Instances test)
    Performs one fold in the cross-validation of minimal cost-complexity pruning.
void setHeuristic(boolean value)
    Sets whether heuristic search is used for nominal attributes in multi-class problems.
void setMinNumObj(double value)
    Sets the minimal number of instances at the terminal nodes.
void setNumFoldsPruning(int value)
    Sets the number of folds in internal cross-validation.
void setOptions(java.lang.String[] options)
    Parses a given list of options.
void setSizePer(double value)
    Sets the training set size.
void setUseOneSE(boolean value)
    Sets whether the 1SE rule is used to choose the final model.
void setUsePrune(boolean value)
    Sets whether minimal cost-complexity pruning is used.
java.lang.String sizePerTipText()
    Returns the tip text for this property.
java.lang.String toString()
    Returns a textual description of the decision tree, using the protected toString method.
void treeErrors()
    Updates the numIncorrectTree field for all nodes.
java.lang.String useOneSETipText()
    Returns the tip text for this property.
java.lang.String usePruneTipText()
    Returns the tip text for this property.
Methods inherited from class weka.classifiers.RandomizableClassifier
getSeed, seedTipText, setSeed

Methods inherited from class weka.classifiers.Classifier
classifyInstance, debugTipText, forName, getDebug, makeCopies, makeCopy, setDebug
Method Detail
-
globalInfo
public java.lang.String globalInfo()
Returns a description suitable for displaying in the explorer/experimenter.
Returns:
a description suitable for displaying in the explorer/experimenter
-
getTechnicalInformation
public TechnicalInformation getTechnicalInformation()
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
Specified by:
getTechnicalInformation in interface TechnicalInformationHandler
Returns:
the technical information about this class
-
getCapabilities
public Capabilities getCapabilities()
Returns the default capabilities of the classifier.
Specified by:
getCapabilities in interface CapabilitiesHandler
Overrides:
getCapabilities in class Classifier
Returns:
the capabilities of this classifier
See Also:
Capabilities
-
buildClassifier
public void buildClassifier(Instances data) throws java.lang.Exception
Builds the classifier.
Specified by:
buildClassifier in class Classifier
Parameters:
data - the training instances
Throws:
java.lang.Exception - if something goes wrong
-
prune
public void prune(double alpha) throws java.lang.Exception
Prunes the original tree using the CART pruning scheme, given a cost-complexity parameter alpha.
Parameters:
alpha - the cost-complexity parameter
Throws:
java.lang.Exception - if something goes wrong
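The pruning scheme can be illustrated with a small stand-alone sketch of weakest-link pruning. It is not the Weka implementation: the Node class, its errorsAsLeaf field and the helper names are hypothetical, and the sketch assumes that pruning repeatedly collapses the inner node with the smallest alpha for as long as that value does not exceed the given alpha.

```java
/** Stand-alone sketch of weakest-link pruning; all names are hypothetical. */
final class PruneSketch {
    static final class Node {
        final double errorsAsLeaf; // training errors if this node were a leaf
        Node left, right;          // both null for a leaf

        Node(double errorsAsLeaf, Node left, Node right) {
            this.errorsAsLeaf = errorsAsLeaf;
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() { return left == null && right == null; }

        int leaves() { return isLeaf() ? 1 : left.leaves() + right.leaves(); }

        double subtreeErrors() {
            return isLeaf() ? errorsAsLeaf : left.subtreeErrors() + right.subtreeErrors();
        }

        // Error increase per removed leaf when the subtree is collapsed.
        double alpha() {
            return isLeaf() ? Double.MAX_VALUE
                            : (errorsAsLeaf - subtreeErrors()) / (leaves() - 1);
        }
    }

    /** Returns the inner node with the smallest alpha, or null for a single leaf. */
    static Node weakest(Node n) {
        if (n.isLeaf()) return null;
        Node best = n;
        for (Node child : new Node[] {n.left, n.right}) {
            Node candidate = weakest(child);
            if (candidate != null && candidate.alpha() < best.alpha()) best = candidate;
        }
        return best;
    }

    /** Collapses weakest links while their alpha does not exceed the given alpha. */
    static void prune(Node root, double alpha) {
        Node w = weakest(root);
        while (w != null && w.alpha() <= alpha) {
            w.left = null;  // collapse the subtree into a leaf
            w.right = null;
            w = weakest(root);
        }
    }

    public static void main(String[] args) {
        Node inner = new Node(5, new Node(2, null, null), new Node(2, null, null));
        Node root = new Node(10, new Node(2, null, null), inner);
        prune(root, 1.0);
        System.out.println(root.leaves());
    }
}
```

In this toy tree the inner node has alpha 1 and the root has alpha 2, so pruning with alpha 1.0 collapses only the inner node.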
-
prune
public int prune(double[] alphas, double[] errors, Instances test) throws java.lang.Exception
Method for performing one fold in the cross-validation of minimal cost-complexity pruning. Generates a sequence of alpha-values with error estimates for the corresponding (partially pruned) trees, given the test set of that fold.
Parameters:
alphas - array to hold the generated alpha-values
errors - array to hold the corresponding error estimates
test - test set of that fold (to obtain error estimates)
Returns:
the iteration of the pruning
Throws:
java.lang.Exception - if something goes wrong
-
modelErrors
public void modelErrors() throws java.lang.Exception
Updates the numIncorrectModel field for all nodes, treating each node as the root of a subtree to be pruned. This is needed for calculating the alpha-values.
Throws:
java.lang.Exception - if something goes wrong
-
treeErrors
public void treeErrors() throws java.lang.Exception
Updates the numIncorrectTree field for all nodes. This is needed for calculating the alpha-values.
Throws:
java.lang.Exception - if something goes wrong
-
calculateAlphas
public void calculateAlphas() throws java.lang.Exception
Updates the alpha field for all nodes.
Throws:
java.lang.Exception - if something goes wrong
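As a rough illustration of how the numIncorrectModel and numIncorrectTree values could combine into an alpha-value, the sketch below uses the textbook cost-complexity formula: the error increase caused by collapsing a subtree, divided by the number of leaves removed. The class and parameter names are hypothetical, and the actual implementation may treat special cases (for example a non-positive error difference) differently.

```java
/** Stand-alone sketch of the cost-complexity parameter of one inner node. */
final class AlphaSketch {
    /**
     * @param errorsAsLeaf    errors if the node were collapsed to a leaf
     * @param errorsAsSubtree errors made by the subtree rooted at the node
     * @param leaves          number of leaves in that subtree (at least 2)
     */
    static double alpha(double errorsAsLeaf, double errorsAsSubtree, int leaves) {
        if (leaves < 2) {
            throw new IllegalArgumentException("an inner node has at least two leaves");
        }
        // Error increase per leaf removed by collapsing the subtree.
        return (errorsAsLeaf - errorsAsSubtree) / (leaves - 1);
    }

    public static void main(String[] args) {
        // Collapsing adds 6 errors and removes 3 leaves.
        System.out.println(alpha(10, 4, 4));
    }
}
```

Whether errors are expressed as counts or as rates only rescales alpha; the ordering of nodes by alpha stays the same.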
-
distributionForInstance
public double[] distributionForInstance(Instance instance) throws java.lang.Exception
Computes class probabilities for an instance using the decision tree.
Overrides:
distributionForInstance in class Classifier
Parameters:
instance - the instance for which class probabilities are to be computed
Returns:
the class probabilities for the given instance
Throws:
java.lang.Exception - if something goes wrong
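The class description notes that missing values are handled with the "fractional instances" method. One plausible reading for prediction, sketched below under that assumption, is that an instance with a missing split value is sent down both branches and the branch distributions are mixed according to the branch proportions. The names mix, props and branchDists are hypothetical and not taken from Weka.

```java
import java.util.Arrays;

/** Stand-alone sketch of mixing branch distributions for a missing split value. */
final class FractionalSketch {
    /**
     * @param props       proportion of training weight that went down each branch
     * @param branchDists class distribution predicted by each branch
     */
    static double[] mix(double[] props, double[][] branchDists) {
        double[] result = new double[branchDists[0].length];
        for (int b = 0; b < props.length; b++) {
            for (int c = 0; c < result.length; c++) {
                result[c] += props[b] * branchDists[b][c];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[] props = {0.75, 0.25};
        double[][] branchDists = {{1.0, 0.0}, {0.0, 1.0}};
        System.out.println(Arrays.toString(mix(props, branchDists)));
    }
}
```

When the proportions sum to one and each branch distribution sums to one, the mixed result is again a probability distribution.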
-
toString
public java.lang.String toString()
Returns a textual description of the decision tree, using the protected toString method.
Overrides:
toString in class java.lang.Object
Returns:
a textual description of the classifier
-
numNodes
public int numNodes()
Computes the size of the tree.
Returns:
size of the tree
-
numInnerNodes
public int numInnerNodes()
Counts the number of inner nodes in the tree.
Returns:
the number of inner nodes
-
numLeaves
public int numLeaves()
Computes the number of leaf nodes.
Returns:
number of leaf nodes
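The three size methods are related in a simple way for a binary tree: the tree size is the number of inner nodes plus the number of leaves. A minimal stand-alone sketch of the counting, using a hypothetical Node class rather than the Weka tree structure:

```java
/** Stand-alone sketch of node counting in a binary tree; names are hypothetical. */
final class TreeSizeSketch {
    static final class Node {
        final Node left, right; // both null for a leaf

        Node(Node left, Node right) {
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() { return left == null && right == null; }
    }

    static int numLeaves(Node n) {
        return n.isLeaf() ? 1 : numLeaves(n.left) + numLeaves(n.right);
    }

    static int numInnerNodes(Node n) {
        return n.isLeaf() ? 0 : 1 + numInnerNodes(n.left) + numInnerNodes(n.right);
    }

    static int numNodes(Node n) {
        return numInnerNodes(n) + numLeaves(n);
    }

    public static void main(String[] args) {
        Node leaf = new Node(null, null);
        Node root = new Node(leaf, new Node(new Node(null, null), new Node(null, null)));
        System.out.println(numNodes(root) + " nodes, " + numLeaves(root) + " leaves");
    }
}
```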
-
listOptions
public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.
Specified by:
listOptions in interface OptionHandler
Overrides:
listOptions in class RandomizableClassifier
Returns:
an enumeration of all the available options
-
setOptions
public void setOptions(java.lang.String[] options) throws java.lang.Exception
Parses a given list of options.
Valid options are:
-S <num> Random number seed. (default 1)
-D If set, the classifier is run in debug mode and may output additional info to the console.
-M <min no> The minimal number of instances at the terminal nodes. (default 2)
-N <num folds> The number of folds used in the minimal cost-complexity pruning. (default 5)
-U Don't use minimal cost-complexity pruning. (default: pruning is used)
-H Don't use the heuristic method for binary splits. (default: the heuristic is used)
-A Use the 1 SE rule to make the pruning decision. (default: not used)
-C Percentage of training data size, in the range (0, 1]. (default 1)
Specified by:
setOptions in interface OptionHandler
Overrides:
setOptions in class RandomizableClassifier
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
-
getOptions
public java.lang.String[] getOptions()
Gets the current settings of the classifier.
Specified by:
getOptions in interface OptionHandler
Overrides:
getOptions in class RandomizableClassifier
Returns:
the current settings of the classifier
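Because -U and -H are negated flags, it can help to see how settings might map onto an option array. The sketch below is a hypothetical stand-alone helper, not the Weka code: it assumes -U and -H are emitted only when pruning or the heuristic is switched off, and -A only when the 1 SE rule is switched on.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that builds an option array from settings; not part of Weka. */
final class OptionsSketch {
    static String[] toOptions(int seed, double minNumObj, int numFolds,
                              boolean usePrune, boolean heuristic,
                              boolean useOneSE, double sizePer) {
        List<String> opts = new ArrayList<>();
        opts.add("-S");
        opts.add(String.valueOf(seed));
        opts.add("-M");
        opts.add(String.valueOf(minNumObj));
        opts.add("-N");
        opts.add(String.valueOf(numFolds));
        if (!usePrune) opts.add("-U");   // flag present means pruning is off
        if (!heuristic) opts.add("-H");  // flag present means the heuristic is off
        if (useOneSE) opts.add("-A");    // flag present means the 1 SE rule is on
        opts.add("-C");
        opts.add(String.valueOf(sizePer));
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Documented defaults: no boolean flag appears.
        System.out.println(String.join(" ", toOptions(1, 2.0, 5, true, true, false, 1.0)));
        // Pruning off and 1 SE rule on.
        System.out.println(String.join(" ", toOptions(1, 2.0, 5, false, true, true, 1.0)));
    }
}
```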
-
enumerateMeasures
public java.util.Enumeration enumerateMeasures()
Returns an enumeration of the measure names.
Specified by:
enumerateMeasures in interface AdditionalMeasureProducer
Returns:
an enumeration of the measure names
-
measureTreeSize
public double measureTreeSize()
Returns the size of the tree.
Returns:
the size of the tree
-
getMeasure
public double getMeasure(java.lang.String additionalMeasureName)
Returns the value of the named measure.
Specified by:
getMeasure in interface AdditionalMeasureProducer
Parameters:
additionalMeasureName - the name of the measure to query for its value
Returns:
the value of the named measure
Throws:
java.lang.IllegalArgumentException - if the named measure is not supported
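The named-measure contract can be sketched without Weka as follows. The measure name "measureTreeSize" is an assumption inferred from the measureTreeSize() method above, and the case-insensitive comparison is likewise assumed; the class itself is hypothetical.

```java
import java.util.Collections;
import java.util.Enumeration;

/** Stand-alone sketch of a named-measure lookup; names are assumptions. */
final class MeasureSketch {
    private static final String TREE_SIZE = "measureTreeSize"; // assumed measure name
    private final double treeSize;

    MeasureSketch(double treeSize) {
        this.treeSize = treeSize;
    }

    Enumeration<String> enumerateMeasures() {
        return Collections.enumeration(Collections.singletonList(TREE_SIZE));
    }

    double getMeasure(String name) {
        if (TREE_SIZE.equalsIgnoreCase(name)) {
            return treeSize;
        }
        throw new IllegalArgumentException(name + " not supported");
    }

    public static void main(String[] args) {
        MeasureSketch m = new MeasureSketch(5.0);
        Enumeration<String> names = m.enumerateMeasures();
        while (names.hasMoreElements()) {
            String name = names.nextElement();
            System.out.println(name + " = " + m.getMeasure(name));
        }
    }
}
```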
-
minNumObjTipText
public java.lang.String minNumObjTipText()
Returns the tip text for this property.
Returns:
tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setMinNumObj
public void setMinNumObj(double value)
Sets the minimal number of instances at the terminal nodes.
Parameters:
value - minimal number of instances at the terminal nodes
-
getMinNumObj
public double getMinNumObj()
Gets the minimal number of instances at the terminal nodes.
Returns:
minimal number of instances at the terminal nodes
-
numFoldsPruningTipText
public java.lang.String numFoldsPruningTipText()
Returns the tip text for this property.
Returns:
tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setNumFoldsPruning
public void setNumFoldsPruning(int value)
Sets the number of folds in internal cross-validation.
Parameters:
value - number of folds in internal cross-validation
-
getNumFoldsPruning
public int getNumFoldsPruning()
Gets the number of folds in internal cross-validation.
Returns:
number of folds in internal cross-validation
-
usePruneTipText
public java.lang.String usePruneTipText()
Returns the tip text for this property.
Returns:
tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setUsePrune
public void setUsePrune(boolean value)
Sets whether minimal cost-complexity pruning is used.
Parameters:
value - whether to use minimal cost-complexity pruning
-
getUsePrune
public boolean getUsePrune()
Gets whether minimal cost-complexity pruning is used.
Returns:
whether minimal cost-complexity pruning is used
-
heuristicTipText
public java.lang.String heuristicTipText()
Returns the tip text for this property.
Returns:
tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setHeuristic
public void setHeuristic(boolean value)
Sets whether heuristic search is used for nominal attributes in multi-class problems.
Parameters:
value - whether to use heuristic search for nominal attributes in multi-class problems
-
getHeuristic
public boolean getHeuristic()
Gets whether heuristic search is used for nominal attributes in multi-class problems.
Returns:
whether heuristic search is used for nominal attributes in multi-class problems
-
useOneSETipText
public java.lang.String useOneSETipText()
Returns the tip text for this property.
Returns:
tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setUseOneSE
public void setUseOneSE(boolean value)
Sets whether the 1SE rule is used to choose the final model.
Parameters:
value - whether to use the 1SE rule to choose the final model
-
getUseOneSE
public boolean getUseOneSE()
Gets whether the 1SE rule is used to choose the final model.
Returns:
whether the 1SE rule is used to choose the final model
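For readers unfamiliar with the 1SE rule, the sketch below shows the usual formulation from the CART literature: among the candidate trees, pick the smallest one whose cross-validated error is within one standard error of the minimum. It is a stand-alone illustration with hypothetical names; it assumes candidates are ordered from largest to smallest tree and does not reproduce how this class computes its error estimates.

```java
/** Stand-alone sketch of model selection with and without the 1 SE rule. */
final class OneSeSketch {
    /**
     * @param errors    cross-validated error of each candidate, largest tree first
     * @param stdErrors standard error of each error estimate
     * @param useOneSE  whether to apply the 1 SE rule
     * @return index of the selected candidate
     */
    static int select(double[] errors, double[] stdErrors, boolean useOneSE) {
        int best = 0;
        for (int i = 1; i < errors.length; i++) {
            if (errors[i] <= errors[best]) best = i; // ties go to the smaller tree
        }
        if (!useOneSE) return best;

        double threshold = errors[best] + stdErrors[best];
        int chosen = best;
        for (int i = best + 1; i < errors.length; i++) {
            if (errors[i] <= threshold) chosen = i; // smaller tree, still within 1 SE
        }
        return chosen;
    }

    public static void main(String[] args) {
        double[] errors = {0.30, 0.22, 0.20, 0.21, 0.35};
        double[] stdErrors = {0.02, 0.02, 0.02, 0.02, 0.02};
        System.out.println(select(errors, stdErrors, false));
        System.out.println(select(errors, stdErrors, true));
    }
}
```

With these numbers the minimum-error candidate is index 2, while the 1 SE rule moves the choice to the smaller tree at index 3.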
-
sizePerTipText
public java.lang.String sizePerTipText()
Returns the tip text for this property.
Returns:
tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setSizePer
public void setSizePer(double value)
Sets the training set size, given as a proportion of the training data in the range (0, 1].
Parameters:
value - training set size
-
getSizePer
public double getSizePer()
Gets the training set size, given as a proportion of the training data in the range (0, 1].
Returns:
training set size
-
getRevision
public java.lang.String getRevision()
Returns the revision string.
Specified by:
getRevision in interface RevisionHandler
Overrides:
getRevision in class Classifier
Returns:
the revision
-
main
public static void main(java.lang.String[] args)
Main method.
Parameters:
args - the options for the classifier
-
-