Class RegressionByDiscretization

  • All Implemented Interfaces:
    java.io.Serializable, java.lang.Cloneable, CapabilitiesHandler, OptionHandler, RevisionHandler

    public class RegressionByDiscretization
    extends SingleClassifierEnhancer
    A regression scheme that employs any classifier on a copy of the data in which the class attribute has been discretized (equal-width by default, equal-frequency with -F). The predicted value is the expected class value: the mean class value of each discretized interval, weighted by the probability the classifier predicts for that interval.
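
    The prediction rule described above can be illustrated with a small stand-alone sketch. It is not the Weka source: the class name ExpectedValueSketch and the method predict are hypothetical, and dividing by the probability sum is an assumption made so that the result stays a weighted average even when the probabilities do not sum exactly to one.

```java
// Illustrative sketch only; names are hypothetical and this is not the Weka implementation.
final class ExpectedValueSketch {

    /**
     * Combines per-interval probabilities with per-interval mean class values.
     *
     * @param binProbabilities probability the base classifier assigns to each interval
     * @param binMeans mean class value of the training instances in each interval
     * @return the probability-weighted average of the interval means
     */
    static double predict(double[] binProbabilities, double[] binMeans) {
        if (binProbabilities.length != binMeans.length) {
            throw new IllegalArgumentException("one probability per interval is required");
        }
        double weightedSum = 0.0;
        double probabilitySum = 0.0;
        for (int i = 0; i < binMeans.length; i++) {
            weightedSum += binProbabilities[i] * binMeans[i];
            probabilitySum += binProbabilities[i];
        }
        // Assumption: normalize in case the probabilities do not sum exactly to one.
        return weightedSum / probabilitySum;
    }

    public static void main(String[] args) {
        double[] probabilities = {0.2, 0.5, 0.3};
        double[] means = {1.0, 4.0, 9.0};
        System.out.println(predict(probabilities, means));
    }
}
```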

    Valid options are:

     -B <int>
      Number of bins for equal-width discretization
      (default 10).
     
     -E
      Whether to delete empty bins after discretization
      (default false).
     
     -F
      Use equal-frequency instead of equal-width discretization.
     -D
      If set, classifier is run in debug mode and
      may output additional info to the console
     -W
      Full name of base classifier.
      (default: weka.classifiers.trees.J48)
     
     Options specific to classifier weka.classifiers.trees.J48:
     
     -U
      Use unpruned tree.
     -C <pruning confidence>
      Set confidence threshold for pruning.
      (default 0.25)
     -M <minimum number of instances>
      Set minimum number of instances per leaf.
      (default 2)
     -R
      Use reduced error pruning.
     -N <number of folds>
      Set number of folds for reduced error
      pruning. One fold is used as pruning set.
      (default 3)
     -B
      Use binary splits only.
     -S
      Don't perform subtree raising.
     -L
      Do not clean up after the tree has been built.
     -A
      Laplace smoothing for predicted probabilities.
     -Q <seed>
      Seed for random data shuffling (default 1).
    Version:
    $Revision: 4746 $
    Author:
    Len Trigg (trigg@cs.waikato.ac.nz), Eibe Frank (eibe@cs.waikato.ac.nz)
    See Also:
    Serialized Form
    • Constructor Detail

      • RegressionByDiscretization

        public RegressionByDiscretization()
        Default constructor.
    • Method Detail

      • globalInfo

        public java.lang.String globalInfo()
        Returns a string describing this classifier.
        Returns:
        a description suitable for displaying in the explorer/experimenter gui
      • buildClassifier

        public void buildClassifier(Instances instances)
                             throws java.lang.Exception
        Generates the classifier.
        Specified by:
        buildClassifier in class Classifier
        Parameters:
        instances - set of instances serving as training data
        Throws:
        java.lang.Exception - if the classifier has not been generated successfully
      • classifyInstance

        public double classifyInstance(Instance instance)
                                throws java.lang.Exception
        Returns the predicted class value for the given test instance.
        Overrides:
        classifyInstance in class Classifier
        Parameters:
        instance - the instance to be classified
        Returns:
        predicted class value
        Throws:
        java.lang.Exception - if the prediction couldn't be made
      • setOptions

        public void setOptions(java.lang.String[] options)
                        throws java.lang.Exception
        Parses a given list of options.

        Valid options are:

         -B <int>
          Number of bins for equal-width discretization
          (default 10).
         
         -E
          Whether to delete empty bins after discretization
          (default false).
         
         -F
          Use equal-frequency instead of equal-width discretization.
         -D
          If set, classifier is run in debug mode and
          may output additional info to the console
         -W
          Full name of base classifier.
          (default: weka.classifiers.trees.J48)
         
         Options specific to classifier weka.classifiers.trees.J48:
         
         -U
          Use unpruned tree.
         -C <pruning confidence>
          Set confidence threshold for pruning.
          (default 0.25)
         -M <minimum number of instances>
          Set minimum number of instances per leaf.
          (default 2)
         -R
          Use reduced error pruning.
         -N <number of folds>
          Set number of folds for reduced error
          pruning. One fold is used as pruning set.
          (default 3)
         -B
          Use binary splits only.
         -S
          Don't perform subtree raising.
         -L
          Do not clean up after the tree has been built.
         -A
          Laplace smoothing for predicted probabilities.
         -Q <seed>
          Seed for random data shuffling (default 1).
        Specified by:
        setOptions in interface OptionHandler
        Overrides:
        setOptions in class SingleClassifierEnhancer
        Parameters:
        options - the list of options as an array of strings
        Throws:
        java.lang.Exception - if an option is not supported
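
        As a hedged illustration of the array this method expects, the sketch below builds an option list and separates out the part meant for the base classifier. It assumes the usual Weka command-line convention that options following "--" are handed to the classifier named by -W. The class OptionArraySketch and the helper baseClassifierOptions are hypothetical and use only the standard library.

```java
// Illustrative sketch only; the helper is hypothetical and not part of Weka.
final class OptionArraySketch {

    /** Returns the options that follow the first "--", or an empty array if there is none. */
    static String[] baseClassifierOptions(String[] options) {
        for (int i = 0; i < options.length; i++) {
            if (options[i].equals("--")) {
                return java.util.Arrays.copyOfRange(options, i + 1, options.length);
            }
        }
        return new String[0];
    }

    public static void main(String[] args) {
        // 20 bins, delete empty bins, J48 as base classifier with its own -C and -M settings.
        String[] options = {
            "-B", "20", "-E",
            "-W", "weka.classifiers.trees.J48",
            "--", "-C", "0.25", "-M", "2"
        };
        System.out.println(String.join(" ", baseClassifierOptions(options)));
    }
}
```

        Flags and their values are separate array elements, so "-B 20" is passed as "-B" followed by "20".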
      • numBinsTipText

        public java.lang.String numBinsTipText()
        Returns the tip text for this property
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • getNumBins

        public int getNumBins()
        Gets the number of bins the numeric class attribute will be divided into.
        Returns:
        the number of bins.
      • setNumBins

        public void setNumBins(int numBins)
        Sets the number of bins the numeric class attribute is divided into.
        Parameters:
        numBins - the number of bins
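
        To make the effect of the bin count concrete, here is a minimal sketch of equal-width binning. It is not the Weka code: the class EqualWidthSketch and the method binIndex are hypothetical, and the boundary convention (a value on a cut point falls into the upper bin, with the maximum clamped into the last bin) is an assumption.

```java
// Illustrative sketch only; boundary handling is an assumption, not Weka's behaviour.
final class EqualWidthSketch {

    /**
     * Maps a value to one of numBins intervals of equal width spanning [min, max].
     *
     * @return a bin index between 0 and numBins - 1
     */
    static int binIndex(double value, double min, double max, int numBins) {
        if (numBins < 1 || max <= min) {
            throw new IllegalArgumentException("need numBins >= 1 and max > min");
        }
        double width = (max - min) / numBins;
        int index = (int) Math.floor((value - min) / width);
        // Clamp so that value == max (or values outside the range) stay in a valid bin.
        return Math.max(0, Math.min(numBins - 1, index));
    }

    public static void main(String[] args) {
        // With the default of 10 bins over [0, 10], each bin is one unit wide.
        System.out.println(binIndex(2.5, 0.0, 10.0, 10));
    }
}
```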
      • deleteEmptyBinsTipText

        public java.lang.String deleteEmptyBinsTipText()
        Returns the tip text for this property
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • getDeleteEmptyBins

        public boolean getDeleteEmptyBins()
        Gets whether empty bins are deleted after discretization.
        Returns:
        true if empty bins are deleted.
      • setDeleteEmptyBins

        public void setDeleteEmptyBins(boolean b)
        Sets whether empty bins are deleted after discretization.
        Parameters:
        b - true if empty bins should be deleted
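
        A bin that receives no training instances has no mean class value to contribute to a prediction, which is one reason to drop it. The sketch below shows one way to find the bins worth keeping. The class EmptyBinSketch and the method occupiedBins are hypothetical, and how Weka itself removes empty bins is not shown here.

```java
// Illustrative sketch only; not the Weka implementation.
final class EmptyBinSketch {

    /**
     * Returns the indices of bins that contain at least one training instance.
     *
     * @param binOfEachInstance bin index assigned to each training instance
     * @param numBins total number of bins produced by discretization
     */
    static int[] occupiedBins(int[] binOfEachInstance, int numBins) {
        int[] counts = new int[numBins];
        for (int bin : binOfEachInstance) {
            counts[bin]++;
        }
        int occupied = 0;
        for (int count : counts) {
            if (count > 0) {
                occupied++;
            }
        }
        int[] kept = new int[occupied];
        int next = 0;
        for (int bin = 0; bin < numBins; bin++) {
            if (counts[bin] > 0) {
                kept[next++] = bin;
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Bins 2 and 4 receive no instances and are therefore not listed.
        int[] kept = occupiedBins(new int[] {0, 0, 3, 1}, 5);
        System.out.println(java.util.Arrays.toString(kept));
    }
}
```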
      • useEqualFrequencyTipText

        public java.lang.String useEqualFrequencyTipText()
        Returns the tip text for this property
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • getUseEqualFrequency

        public boolean getUseEqualFrequency()
        Gets whether equal-frequency discretization is used instead of equal-width discretization.
        Returns:
        true if equal-frequency discretization is used.
      • setUseEqualFrequency

        public void setUseEqualFrequency(boolean newUseEqualFrequency)
        Sets whether equal-frequency discretization is used instead of equal-width discretization.
        Parameters:
        newUseEqualFrequency - true to use equal-frequency discretization.
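
        For comparison with equal-width binning, the following sketch computes cut points so that each bin holds roughly the same number of values. It is a simplified illustration: the class EqualFrequencySketch and the method cutPoints are hypothetical, ties between equal values are ignored, placing each cut midway between neighbouring values is an assumption, and at least as many values as bins are required.

```java
// Illustrative sketch only; tie handling and cut placement are assumptions.
final class EqualFrequencySketch {

    /**
     * Computes numBins - 1 cut points that split the values into bins of
     * (nearly) equal size.
     */
    static double[] cutPoints(double[] values, int numBins) {
        if (numBins < 1 || values.length < numBins) {
            throw new IllegalArgumentException("need at least one value per bin");
        }
        double[] sorted = values.clone();
        java.util.Arrays.sort(sorted);
        double[] cuts = new double[numBins - 1];
        for (int i = 1; i < numBins; i++) {
            // Index of the first sorted value that belongs to bin i.
            int boundary = i * sorted.length / numBins;
            // Place the cut midway between the neighbouring values.
            cuts[i - 1] = (sorted[boundary - 1] + sorted[boundary]) / 2.0;
        }
        return cuts;
    }

    public static void main(String[] args) {
        double[] cuts = cutPoints(new double[] {5, 1, 3, 2, 4, 6}, 3);
        System.out.println(java.util.Arrays.toString(cuts));
    }
}
```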
      • toString

        public java.lang.String toString()
        Returns a description of the classifier.
        Overrides:
        toString in class java.lang.Object
        Returns:
        a description of the classifier as a string.
      • main

        public static void main(java.lang.String[] argv)
        Main method for testing this class.
        Parameters:
        argv - the options