Package weka.core.stemmers
Class SnowballStemmer
java.lang.Object
    weka.core.stemmers.SnowballStemmer
- All Implemented Interfaces:
java.io.Serializable, OptionHandler, RevisionHandler, Stemmer
public class SnowballStemmer extends java.lang.Object implements Stemmer, OptionHandler
A wrapper class for the Snowball stemmers. Only available if the Snowball classes are in the classpath.
If class discovery is not dynamic, i.e., the property 'UseDynamic' in the props file 'weka/gui/GenericPropertiesCreator.props' is set to 'false', then the property 'org.tartarus.snowball.SnowballProgram' in the 'weka/gui/GenericObjectEditor.props' file has to be uncommented as well. If necessary, you have to discover and fill in the Snowball stemmers manually. You can use 'weka.core.ClassDiscovery' for this:
java weka.core.ClassDiscovery org.tartarus.snowball.SnowballProgram org.tartarus.snowball.ext
For more information visit these web sites:
http://weka.wikispaces.com/Stemmers
http://snowball.tartarus.org/
Valid options are:
-S <name>
The name of the Snowball stemmer (default 'porter'). Available stemmers: danish, dutch, english, finnish, french, german, italian, norwegian, porter, portuguese, russian, spanish, swedish
- Version:
- $Revision: 5836 $
- Author:
- FracPete (fracpete at waikato dot ac dot nz)
- See Also:
- Serialized Form
-
-
Field Summary
Modifier and Type          Field          Description
static java.lang.String    PACKAGE        The package name for Snowball.
static java.lang.String    PACKAGE_EXT    The package name where the stemmers are located.
-
Constructor Summary
Constructor                               Description
SnowballStemmer()                         Initializes the stemmer ("porter").
SnowballStemmer(java.lang.String name)    Initializes the stemmer with the given stemmer name.
-
Method Summary
Modifier and Type             Method                                  Description
java.lang.String[]            getOptions()                            Gets the current settings of the stemmer.
java.lang.String              getRevision()                           Returns the revision string.
java.lang.String              getStemmer()                            Returns the name of the current stemmer, null if none is set.
java.lang.String              globalInfo()                            Returns a string describing the stemmer.
static boolean                isPresent()                             Returns whether Snowball is present, i.e., whether its classes are in the classpath.
java.util.Enumeration         listOptions()                           Returns an enumeration describing the available options.
static java.util.Enumeration  listStemmers()                          Returns an enumeration over all currently stored stemmer names.
static void                   main(java.lang.String[] args)           Runs the stemmer with the given options.
void                          setOptions(java.lang.String[] options)  Parses the options.
void                          setStemmer(java.lang.String name)       Sets the stemmer with the given name, e.g., "porter".
java.lang.String              stem(java.lang.String word)             Returns the word in its stemmed form.
java.lang.String              stemmerTipText()                        Returns the tip text for this property.
java.lang.String              toString()                              Returns a string representation of the stemmer.
-
-
-
Field Detail
-
PACKAGE
public static final java.lang.String PACKAGE
The package name for Snowball.
- See Also:
- Constant Field Values
-
PACKAGE_EXT
public static final java.lang.String PACKAGE_EXT
The package name where the stemmers are located.
- See Also:
- Constant Field Values
-
-
Method Detail
-
globalInfo
public java.lang.String globalInfo()
Returns a string describing the stemmer.
- Returns:
- a description suitable for displaying in the explorer/experimenter gui
-
listOptions
public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.
- Specified by:
listOptions in interface OptionHandler
- Returns:
- an enumeration of all the available options.
-
setOptions
public void setOptions(java.lang.String[] options) throws java.lang.Exception
Parses the options.
Valid options are:
-S <name>
The name of the Snowball stemmer (default 'porter'). Available stemmers: danish, dutch, english, finnish, french, german, italian, norwegian, porter, portuguese, russian, spanish, swedish
- Specified by:
setOptions in interface OptionHandler
- Parameters:
options - the options to parse
- Throws:
java.lang.Exception - if parsing fails
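The -S handling described for setOptions, together with the getOptions contract of returning an array suitable for passing back in, can be sketched as follows. This is a minimal, standard-library-only illustration and not Weka's implementation: the class StemmerOptionSketch and its methods parseStemmerName and toOptions are hypothetical names, and silently falling back to 'porter' when -S has no value is an assumption of the sketch.

```java
// Hypothetical helper, not part of Weka: illustrates "-S <name>" handling only.
final class StemmerOptionSketch {

    /** Default stemmer name, as stated in the option description. */
    static final String DEFAULT_STEMMER = "porter";

    /**
     * Returns the value following the first "-S" flag, or the default
     * if the flag is absent or has no value after it (sketch assumption).
     */
    static String parseStemmerName(String[] options) {
        for (int i = 0; i < options.length - 1; i++) {
            if ("-S".equals(options[i])) {
                return options[i + 1];
            }
        }
        return DEFAULT_STEMMER;
    }

    /** Builds an option array that parseStemmerName can read back. */
    static String[] toOptions(String stemmerName) {
        return new String[] {"-S", stemmerName};
    }

    public static void main(String[] args) {
        // Round trip: whatever toOptions emits, parseStemmerName recovers.
        String name = parseStemmerName(args);
        System.out.println(String.join(" ", toOptions(name)));
    }
}
```

The round trip in main mirrors how the two OptionHandler methods are meant to pair up: the output of one is valid input for the other.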
-
getOptions
public java.lang.String[] getOptions()
Gets the current settings of the stemmer.
- Specified by:
getOptions in interface OptionHandler
- Returns:
- an array of strings suitable for passing to setOptions
-
isPresent
public static boolean isPresent()
Returns whether Snowball is present, i.e., whether its classes are in the classpath.
- Returns:
- whether Snowball is available
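A classpath check of this kind is commonly done by trying to load a known class by name. The sketch below shows that general technique using only the standard library; it is an assumption about how such a check could work, not a description of what isPresent() actually does. The class SnowballPresenceSketch and the method isClassAvailable are hypothetical; the probed class name is the one given in the class description.

```java
// Hypothetical helper, not part of Weka: probes the classpath by class name.
final class SnowballPresenceSketch {

    /** Returns true if the named class can be located, without initializing it. */
    static boolean isClassAvailable(String className) {
        try {
            Class.forName(className, false, SnowballPresenceSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Not on the classpath, or present but not loadable.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Snowball on classpath: "
            + isClassAvailable("org.tartarus.snowball.SnowballProgram"));
    }
}
```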
-
listStemmers
public static java.util.Enumeration listStemmers()
Returns an enumeration over all currently stored stemmer names.
- Returns:
- all available stemmers
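Because the return type is a raw java.util.Enumeration, callers usually copy the names into a typed collection before working with them. The following sketch shows one way to do that; EnumerationListingSketch and toNameList are hypothetical names, and the Vector in main is stand-in data so the example runs without Weka or Snowball on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Enumeration;
import java.util.List;
import java.util.Vector;

// Hypothetical helper, not part of Weka: turns an Enumeration into a List<String>.
final class EnumerationListingSketch {

    /** Copies every element of the enumeration into a list, using its string form. */
    static List<String> toNameList(Enumeration<?> names) {
        List<String> result = new ArrayList<>();
        while (names.hasMoreElements()) {
            result.add(String.valueOf(names.nextElement()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in data; a real caller would pass SnowballStemmer.listStemmers().
        Vector<String> standIn = new Vector<>(Arrays.asList("danish", "porter"));
        for (String name : toNameList(standIn.elements())) {
            System.out.println(name);
        }
    }
}
```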
-
getStemmer
public java.lang.String getStemmer()
Returns the name of the current stemmer, null if none is set.
- Returns:
- the name of the stemmer
-
setStemmer
public void setStemmer(java.lang.String name)
Sets the stemmer with the given name, e.g., "porter".
- Parameters:
name - the name of the stemmer, e.g., "porter"
-
stemmerTipText
public java.lang.String stemmerTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
stem
public java.lang.String stem(java.lang.String word)
Returns the word in its stemmed form.
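As a usage illustration, the sketch below calls the SnowballStemmer(String) constructor and stem(String) through reflection, so that it compiles and runs even when Weka is not on the classpath. With Weka available, direct calls would be the natural choice; reflection is used here only to keep the example self-contained. ReflectiveStemSketch and stemOrUnchanged are hypothetical names, and returning the input word when the class cannot be used is an assumption of the sketch.

```java
import java.lang.reflect.Method;

// Hypothetical helper, not part of Weka: calls the wrapper reflectively.
final class ReflectiveStemSketch {

    /**
     * Stems the word with the named Snowball stemmer if the Weka wrapper
     * can be loaded and invoked; otherwise returns the word unchanged.
     */
    static String stemOrUnchanged(String word, String stemmerName) {
        try {
            Class<?> cls = Class.forName("weka.core.stemmers.SnowballStemmer");
            Object stemmer = cls.getConstructor(String.class).newInstance(stemmerName);
            Method stem = cls.getMethod("stem", String.class);
            return (String) stem.invoke(stemmer, word);
        } catch (ReflectiveOperationException | LinkageError e) {
            // Wrapper missing or not usable: fall back to the input (sketch assumption).
            return word;
        }
    }

    public static void main(String[] args) {
        System.out.println(stemOrUnchanged("running", "porter"));
    }
}
```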
-
toString
public java.lang.String toString()
Returns a string representation of the stemmer.
- Overrides:
toString in class java.lang.Object
- Returns:
- a string representation of the stemmer
-
getRevision
public java.lang.String getRevision()
Returns the revision string.
- Specified by:
getRevision in interface RevisionHandler
- Returns:
- the revision
-
main
public static void main(java.lang.String[] args)
Runs the stemmer with the given options.
- Parameters:
args - the options
-
-