Package com.swabunga.spell.event
Class AbstractWordTokenizer
- java.lang.Object
-
- com.swabunga.spell.event.AbstractWordTokenizer
-
- All Implemented Interfaces:
WordTokenizer
- Direct Known Subclasses:
FileWordTokenizer, StringWordTokenizer
public abstract class AbstractWordTokenizer extends java.lang.Object implements WordTokenizer
This class tokenizes an input string. It also allows the string to be mutated. Once spell checking is complete, the resulting text is available by calling getFinalText.
- Author:
- Jason Height (jheight@chariot.net.au), Anthony Roy (ajr@antroy.co.uk)
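The tokenizer is normally driven by a loop that guards each nextWord() call with hasMoreWords(). The following is a minimal, self-contained sketch of that contract only. LoopSketchTokenizer and allWords are hypothetical names invented for illustration, the word splitting via java.text.BreakIterator is an assumption rather than this class's actual WordFinder logic, and NoSuchElementException stands in for WordNotFoundException.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.NoSuchElementException;

// Hypothetical stand-in, NOT the library class: it only mirrors the
// hasMoreWords()/nextWord()/getCurrentWordCount() contract.
class LoopSketchTokenizer {
    private final String text;
    private final BreakIterator words = BreakIterator.getWordInstance(Locale.ENGLISH);
    private int start;      // start of the pending word, or DONE when exhausted
    private int end;        // end (exclusive) of the pending word
    private int wordCount;  // number of words handed out so far

    LoopSketchTokenizer(String text) {
        this.text = text;
        words.setText(text);
        end = words.first();
        advance();
    }

    // Skips segments that do not start with a letter or digit (spaces, punctuation).
    private void advance() {
        start = end;
        end = words.next();
        while (end != BreakIterator.DONE && !Character.isLetterOrDigit(text.charAt(start))) {
            start = end;
            end = words.next();
        }
        if (end == BreakIterator.DONE) {
            start = BreakIterator.DONE;
        }
    }

    boolean hasMoreWords() {
        return start != BreakIterator.DONE;
    }

    String nextWord() {
        if (!hasMoreWords()) {
            // The documented class reports this case with WordNotFoundException.
            throw new NoSuchElementException("search string contains no more words");
        }
        String word = text.substring(start, end);
        wordCount++;
        advance();
        return word;
    }

    int getCurrentWordCount() {
        return wordCount;
    }

    // Typical driver loop: guard every nextWord() with hasMoreWords().
    static List<String> allWords(String text) {
        LoopSketchTokenizer t = new LoopSketchTokenizer(text);
        List<String> out = new ArrayList<>();
        while (t.hasMoreWords()) {
            out.add(t.nextWord());
        }
        return out;
    }
}
```

With a concrete subclass such as StringWordTokenizer the driver loop would presumably have the same shape, with the spell checker inspecting each returned word before moving on.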
-
-
Field Summary
Fields (Modifier and Type, Field: Description)
- protected Word currentWord: The word being analyzed.
- protected WordFinder finder: The word finder used to filter out words that are not pertinent to spell checking.
- protected java.text.BreakIterator sentenceIterator: An iterator to work through the sentence.
- protected int wordCount: The cumulative count of words that have been processed.
-
Constructor Summary
Constructors (Constructor: Description)
- AbstractWordTokenizer(WordFinder wf): Creates a new AbstractWordTokenizer object.
- AbstractWordTokenizer(java.lang.String text): Creates a new AbstractWordTokenizer object.
-
Method Summary
Methods (Modifier and Type, Method: Description)
- java.lang.String getContext(): Returns the current text that is being tokenized, including any changes that have been made.
- int getCurrentWordCount(): Returns the current number of words that have been processed.
- int getCurrentWordEnd(): Returns the end of the current word in the text.
- int getCurrentWordPosition(): Returns the index of the start of the current word in the text.
- boolean hasMoreWords(): Returns true if there are more words that can be processed in the string.
- boolean isNewSentence(): Returns true if the current word is at the start of a sentence.
- java.lang.String nextWord(): Searches for the next word in the text and returns that word.
- abstract void replaceWord(java.lang.String newWord): Replaces the current word token.
-
-
-
Field Detail
-
currentWord
protected Word currentWord
The word being analyzed
-
finder
protected WordFinder finder
The word finder used to filter out words that are not pertinent to spell checking
-
sentenceIterator
protected java.text.BreakIterator sentenceIterator
An iterator to work through the sentence
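As a rough illustration of what a sentence-level java.text.BreakIterator provides, the sketch below splits a string into sentences with the standard JDK API. How this class actually uses sentenceIterator is not stated here, so the helper name sentences, the class name SentenceIteratorSketch, and the fixed Locale.ENGLISH are assumptions made for the example.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative only: shows the JDK sentence iterator, not this class's internals.
class SentenceIteratorSketch {
    static List<String> sentences(String text) {
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        // Each consecutive pair of boundaries delimits one sentence.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            out.add(text.substring(start, end).trim());
        }
        return out;
    }
}
```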
-
wordCount
protected int wordCount
The cumulative count of words that have been processed
-
-
Constructor Detail
-
AbstractWordTokenizer
public AbstractWordTokenizer(java.lang.String text)
Creates a new AbstractWordTokenizer object.
- Parameters:
text - the text to process.
-
AbstractWordTokenizer
public AbstractWordTokenizer(WordFinder wf)
Creates a new AbstractWordTokenizer object.
- Parameters:
wf - the custom WordFinder to use in searching for words.
-
-
Method Detail
-
getCurrentWordCount
public int getCurrentWordCount()
Returns the current number of words that have been processed.
- Specified by:
getCurrentWordCount in interface WordTokenizer
- Returns:
- number of words iterated so far.
-
getCurrentWordEnd
public int getCurrentWordEnd()
Returns the end of the current word in the text.
- Specified by:
getCurrentWordEnd in interface WordTokenizer
- Returns:
- index in string of the end of the current word.
- Throws:
WordNotFoundException - current word has not yet been set.
-
getCurrentWordPosition
public int getCurrentWordPosition()
Returns the index of the start of the current word in the text.
- Specified by:
getCurrentWordPosition in interface WordTokenizer
- Returns:
- index in string of the start of the current word.
- Throws:
WordNotFoundException - current word has not yet been set.
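The start and end offsets together identify the current word inside the text. The sketch below uses java.util.regex.Matcher as an analogy, since its start() is inclusive and its end() is exclusive. Whether getCurrentWordEnd() is likewise exclusive is an assumption that the text above does not confirm. WordOffsetsSketch and firstWordSpan are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Analogy only: Matcher offsets, not the tokenizer's own bookkeeping.
class WordOffsetsSketch {
    // Returns {start, end} of the first run of letters at or after 'from',
    // or null when no word is left. 'end' is exclusive, as with Matcher.end().
    static int[] firstWordSpan(String text, int from) {
        Matcher m = Pattern.compile("\\p{L}+").matcher(text);
        if (!m.find(from)) {
            return null;
        }
        return new int[] { m.start(), m.end() };
    }
}
```

Under that assumption, text.substring(start, end) recovers the word itself, and end minus start is its length.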
-
hasMoreWords
public boolean hasMoreWords()
Returns true if there are more words that can be processed in the string.
- Specified by:
hasMoreWords in interface WordTokenizer
- Returns:
- true if there are further words in the text.
-
nextWord
public java.lang.String nextWord()
Searches for the next word in the text and returns that word.
- Specified by:
nextWord in interface WordTokenizer
- Returns:
- the string representing the current word.
- Throws:
WordNotFoundException - search string contains no more words.
-
replaceWord
public abstract void replaceWord(java.lang.String newWord)
Replaces the current word token.
- Specified by:
replaceWord in interface WordTokenizer
- Parameters:
newWord - replacement word.
- Throws:
WordNotFoundException - current word has not yet been set.
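Because replaceWord is abstract, each subclass decides how the underlying text is mutated. One plausible shape for a string-backed implementation is sketched below. It is not taken from FileWordTokenizer or StringWordTokenizer. ReplaceWordSketch and its select helper are hypothetical, the end offset is assumed to be exclusive, and IllegalStateException stands in for WordNotFoundException.

```java
// Hypothetical string-backed sketch, NOT an actual subclass of this class.
class ReplaceWordSketch {
    private final StringBuilder text;
    private int wordStart = -1;  // -1 means no current word has been set
    private int wordEnd = -1;    // exclusive end offset (assumption)

    ReplaceWordSketch(String text) {
        this.text = new StringBuilder(text);
    }

    // Hypothetical helper: marks [start, end) as the current word.
    void select(int start, int end) {
        wordStart = start;
        wordEnd = end;
    }

    void replaceWord(String newWord) {
        if (wordStart < 0) {
            // The documented class reports this case with WordNotFoundException.
            throw new IllegalStateException("current word has not yet been set");
        }
        text.replace(wordStart, wordEnd, newWord);
        // The replacement may differ in length, so the end offset moves with it.
        wordEnd = wordStart + newWord.length();
    }

    // Mirrors getContext(): the text including any replacements made so far.
    String getContext() {
        return text.toString();
    }

    int getCurrentWordEnd() {
        return wordEnd;
    }
}
```

Updating the end offset after the replacement keeps later offsets consistent when the new word is longer or shorter than the one it replaces.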
-
getContext
public java.lang.String getContext()
Returns the current text that is being tokenized, including any changes that have been made.
- Specified by:
getContext in interface WordTokenizer
- Returns:
- the text being tokenized.
-
isNewSentence
public boolean isNewSentence()
Returns true if the current word is at the start of a sentence.
- Specified by:
isNewSentence in interface WordTokenizer
- Returns:
- true if the current word starts a sentence.
- Throws:
WordNotFoundException - current word has not yet been set.
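One way such a check could be expressed with the JDK is to ask a sentence BreakIterator whether the word's start offset is a sentence boundary. This is a sketch under that assumption and may not match how this class computes the result. NewSentenceSketch and startsSentence are hypothetical names, and Locale.ENGLISH is fixed only to keep the example deterministic.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Illustrative only: a boundary test with the JDK sentence iterator.
class NewSentenceSketch {
    // Returns true when 'wordStart' coincides with a sentence boundary in 'text'.
    static boolean startsSentence(String text, int wordStart) {
        BreakIterator sentences = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        sentences.setText(text);
        return sentences.isBoundary(wordStart);
    }
}
```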
-
-