Class PrototypicalNodeFactory
- java.lang.Object
-
- org.htmlparser.PrototypicalNodeFactory
-
- All Implemented Interfaces:
java.io.Serializable, NodeFactory
public class PrototypicalNodeFactory extends java.lang.Object implements java.io.Serializable, NodeFactory
A node factory based on the prototype pattern. This factory uses the prototype pattern to generate new nodes. Prototypes are cloned as needed to form new Text, Remark and Tag nodes.
Text and remark nodes are generated from prototypes accessed via the textPrototype and remarkPrototype properties respectively. Tag nodes are generated as follows:
Prototype tags, in the form of undifferentiated tags, are held in a hash table. On a request for a tag, the attributes are examined for the name of the tag to be created. If a prototype of that name has been registered (exists in the hash table), it is cloned and the clone is given the characteristics (Attributes, start and end position) of the requested tag.
If no tag has been registered under that name, a generic tag is created from the prototype accessed via the tagPrototype property.
The hash table of registered tags can be automatically populated with all the known tags from the org.htmlparser.tags package when the factory is constructed, or it can start out empty and be populated explicitly.
Here is an example of how to override all text issued from Text.toPlainTextString(), in this case decoding (converting character references), which illustrates the use of setting the text prototype:
    PrototypicalNodeFactory factory = new PrototypicalNodeFactory ();
    factory.setTextPrototype (
        // create an inner class that is a subclass of TextNode
        new TextNode () {
            public String toPlainTextString() {
                String original = super.toPlainTextString ();
                return (org.htmlparser.util.Translate.decode (original));
            }
        });
    Parser parser = new Parser ();
    parser.setNodeFactory (factory);
Here is an example of using a custom link tag, in this case just printing the URL, which illustrates registering a tag:
    class PrintingLinkTag extends LinkTag {
        public void doSemanticAction () throws ParserException {
            System.out.println (getLink ());
        }
    }
    PrototypicalNodeFactory factory = new PrototypicalNodeFactory ();
    factory.registerTag (new PrintingLinkTag ());
    Parser parser = new Parser ();
    parser.setNodeFactory (factory);
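The lookup, clone and fallback steps described above can be illustrated with a small standalone sketch. It does not use the library and is not its implementation: PrototypeSketch, SketchTag and SketchFactory are hypothetical names, and uppercasing the name on lookup is an assumption drawn from the notes on put and registerTag.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class PrototypeSketch {
    // Hypothetical stand-in for a tag node; not the library's Tag interface.
    static class SketchTag implements Cloneable {
        final String kind; // identifies the prototype this node was cloned from
        String name;       // characteristics given to the clone
        int start;
        int end;

        SketchTag(String kind) {
            this.kind = kind;
        }

        @Override
        public SketchTag clone() {
            try {
                return (SketchTag) super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e); // unreachable: the class is Cloneable
            }
        }
    }

    // Hypothetical miniature factory: registry lookup, clone, generic fallback.
    static class SketchFactory {
        final Map<String, SketchTag> registry = new HashMap<>();
        final SketchTag generic = new SketchTag("generic");

        void register(String id, SketchTag prototype) {
            registry.put(id.toUpperCase(Locale.ENGLISH), prototype);
        }

        SketchTag create(String name, int start, int end) {
            SketchTag prototype = registry.get(name.toUpperCase(Locale.ENGLISH));
            if (prototype == null) {
                prototype = generic; // nothing registered under that name
            }
            SketchTag node = prototype.clone();
            node.name = name;
            node.start = start;
            node.end = end;
            return node;
        }
    }

    public static void main(String[] args) {
        SketchFactory factory = new SketchFactory();
        factory.register("a", new SketchTag("link"));
        System.out.println(factory.create("A", 0, 10).kind);    // prints "link"
        System.out.println(factory.create("xyz", 10, 15).kind); // prints "generic"
    }
}
```

Each call to create returns a fresh clone, so the registered prototype itself is never handed out or modified.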
- See Also:
- Serialized Form
-
-
Constructor Summary
Constructors
PrototypicalNodeFactory()
    Create a new factory with all tags registered.
PrototypicalNodeFactory(boolean empty)
    Create a new factory.
PrototypicalNodeFactory(Tag tag)
    Create a new factory with the given tag as the only registered tag.
PrototypicalNodeFactory(Tag[] tags)
    Create a new factory with the given tags registered.
-
Method Summary
Methods (modifier and type, method, description)
void clear()
    Clean out the registry.
Remark createRemarkNode(Page page, int start, int end)
    Create a new remark node.
Text createStringNode(Page page, int start, int end)
    Create a new string node.
Tag createTagNode(Page page, int start, int end, java.util.Vector attributes)
    Create a new tag node.
Tag get(java.lang.String id)
    Gets a tag from the registry.
Remark getRemarkPrototype()
    Get the object that is cloned to generate remark nodes.
java.util.Set getTagNames()
    Get the list of tag names.
Tag getTagPrototype()
    Get the object that is cloned to generate tag nodes.
Text getTextPrototype()
    Get the object that is cloned to generate text nodes.
Tag put(java.lang.String id, Tag tag)
    Adds a tag to the registry.
void registerTag(Tag tag)
    Register a tag.
PrototypicalNodeFactory registerTags()
    Register all known tags in the tag package.
Tag remove(java.lang.String id)
    Remove a tag from the registry.
void setRemarkPrototype(Remark remark)
    Set the object to be used to generate remark nodes.
void setTagPrototype(Tag tag)
    Set the object to be used to generate tag nodes.
void setTextPrototype(Text text)
    Set the object to be used to generate text nodes.
void unregisterTag(Tag tag)
    Unregister a tag.
-
-
-
Constructor Detail
-
PrototypicalNodeFactory
public PrototypicalNodeFactory()
Create a new factory with all tags registered. Equivalent to PrototypicalNodeFactory(false).
-
PrototypicalNodeFactory
public PrototypicalNodeFactory(boolean empty)
Create a new factory.
- Parameters:
empty
- If true, creates an empty factory; otherwise creates a new factory with all tags registered.
-
PrototypicalNodeFactory
public PrototypicalNodeFactory(Tag tag)
Create a new factory with the given tag as the only registered tag.
- Parameters:
tag
- The single tag to register in the otherwise empty factory.
-
PrototypicalNodeFactory
public PrototypicalNodeFactory(Tag[] tags)
Create a new factory with the given tags registered.
- Parameters:
tags
- The tags to register in the otherwise empty factory.
-
-
Method Detail
-
put
public Tag put(java.lang.String id, Tag tag)
Adds a tag to the registry.
- Parameters:
id
- The name under which to register the tag. For proper operation, the id should be uppercase so it will be matched by a Map lookup.
tag
- The tag to be returned from a createTagNode(org.htmlparser.lexer.Page, int, int, java.util.Vector) call.
- Returns:
- The tag previously registered with that id, if any, or null if none.
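The return value and the uppercase requirement can be illustrated with a standalone sketch that assumes the registry behaves like a java.util.Map and that names are uppercased before lookup. RegistrySketch and find are hypothetical names, and plain strings stand in for Tag prototypes.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class RegistrySketch {
    // Stand-in registry: values are plain strings instead of Tag prototypes.
    final Map<String, String> tags = new HashMap<>();

    // Mirrors the documented contract: returns the previous entry, or null.
    String put(String id, String tag) {
        return tags.put(id, tag);
    }

    // Assumed lookup path: the requested name is uppercased before the lookup.
    String find(String name) {
        return tags.get(name.toUpperCase(Locale.ENGLISH));
    }

    public static void main(String[] args) {
        RegistrySketch registry = new RegistrySketch();
        System.out.println(registry.put("IMG", "first"));  // prints "null"
        System.out.println(registry.put("IMG", "second")); // prints "first"
        registry.put("table", "lowercase key");
        System.out.println(registry.find("img"));          // prints "second"
        System.out.println(registry.find("table"));        // prints "null"
    }
}
```

Under these assumptions, an entry stored under a lowercase id is never found, which is why the id passed to put should already be uppercase.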
-
get
public Tag get(java.lang.String id)
Gets a tag from the registry.
- Parameters:
id
- The name of the tag to return.
- Returns:
- The tag registered under the id name, or null if none.
-
remove
public Tag remove(java.lang.String id)
Remove a tag from the registry.
- Parameters:
id
- The name of the tag to remove.
- Returns:
- The tag that was registered with that id, or null if none.
-
clear
public void clear()
Clean out the registry.
-
getTagNames
public java.util.Set getTagNames()
Get the list of tag names.
- Returns:
- The names of the tags currently registered.
-
registerTag
public void registerTag(Tag tag)
Register a tag. Registers the given tag under every id that the tag has (i.e. all names returned by tag.getIds()).
For proper operation, the ids are converted to uppercase so they will be matched by a Map lookup.
- Parameters:
tag
- The tag to register.
-
unregisterTag
public void unregisterTag(Tag tag)
Unregister a tag. Unregisters the given tag from every id the tag has.
The ids are converted to uppercase to undo the operation of registerTag.
- Parameters:
tag
- The tag to unregister.
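A minimal standalone sketch of registering and unregistering one prototype under several ids follows. MultiIdSketch and Proto are hypothetical stand-ins for the factory and for a tag with multiple ids; the real classes are not used here.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class MultiIdSketch {
    // Hypothetical prototype that answers to several names.
    static class Proto {
        final String[] ids;

        Proto(String... ids) {
            this.ids = ids;
        }

        String[] getIds() {
            return ids;
        }
    }

    final Map<String, Proto> registry = new HashMap<>();

    // Store the same prototype under every id, uppercased.
    void registerTag(Proto tag) {
        for (String id : tag.getIds()) {
            registry.put(id.toUpperCase(Locale.ENGLISH), tag);
        }
    }

    // Undo registerTag by removing the same uppercased ids.
    void unregisterTag(Proto tag) {
        for (String id : tag.getIds()) {
            registry.remove(id.toUpperCase(Locale.ENGLISH));
        }
    }

    public static void main(String[] args) {
        MultiIdSketch factory = new MultiIdSketch();
        Proto heading = new Proto("h1", "h2", "h3");
        factory.registerTag(heading);
        System.out.println(factory.registry.size());    // prints "3"
        factory.unregisterTag(heading);
        System.out.println(factory.registry.isEmpty()); // prints "true"
    }
}
```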
-
registerTags
public PrototypicalNodeFactory registerTags()
Register all known tags in the tag package. Registers tags from the tag package by calling registerTag().
- Returns:
- This node factory ('this'), as a convenience.
-
getTextPrototype
public Text getTextPrototype()
Get the object that is cloned to generate text nodes.
- Returns:
- The prototype for Text nodes.
- See Also:
setTextPrototype(org.htmlparser.Text)
-
setTextPrototype
public void setTextPrototype(Text text)
Set the object to be used to generate text nodes.
- Parameters:
text
- The prototype for Text nodes. If null, the prototype is set to the default (TextNode).
- See Also:
getTextPrototype()
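The null handling of the prototype setters can be sketched in isolation. This is a simplified illustration, not the library's code: PrototypeSetterSketch is a hypothetical class, and a string stands in for the default TextNode prototype.

```java
class PrototypeSetterSketch {
    // Hypothetical default; the documentation names TextNode as the real default.
    static final String DEFAULT_TEXT_PROTOTYPE = "default TextNode stand-in";

    private String textPrototype = DEFAULT_TEXT_PROTOTYPE;

    // A null argument resets the prototype to the default instead of storing null.
    void setTextPrototype(String text) {
        textPrototype = (text == null) ? DEFAULT_TEXT_PROTOTYPE : text;
    }

    String getTextPrototype() {
        return textPrototype;
    }

    public static void main(String[] args) {
        PrototypeSetterSketch factory = new PrototypeSetterSketch();
        factory.setTextPrototype("custom prototype");
        System.out.println(factory.getTextPrototype()); // prints "custom prototype"
        factory.setTextPrototype(null);
        System.out.println(factory.getTextPrototype()); // prints "default TextNode stand-in"
    }
}
```

With this behavior the getter never returns null, so node creation always has a prototype to clone.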
-
getRemarkPrototype
public Remark getRemarkPrototype()
Get the object that is cloned to generate remark nodes.
- Returns:
- The prototype for Remark nodes.
- See Also:
setRemarkPrototype(org.htmlparser.Remark)
-
setRemarkPrototype
public void setRemarkPrototype(Remark remark)
Set the object to be used to generate remark nodes.
- Parameters:
remark
- The prototype for Remark nodes. If null, the prototype is set to the default (RemarkNode).
- See Also:
getRemarkPrototype()
-
getTagPrototype
public Tag getTagPrototype()
Get the object that is cloned to generate tag nodes. Clones of this object are returned from createTagNode(org.htmlparser.lexer.Page, int, int, java.util.Vector) when no specific tag is found in the list of registered tags.
- Returns:
- The prototype for Tag nodes.
- See Also:
setTagPrototype(org.htmlparser.Tag)
-
setTagPrototype
public void setTagPrototype(Tag tag)
Set the object to be used to generate tag nodes. Clones of this object are returned from createTagNode(org.htmlparser.lexer.Page, int, int, java.util.Vector) when no specific tag is found in the list of registered tags.
- Parameters:
tag
- The prototype for Tag nodes. If null, the prototype is set to the default (TagNode).
- See Also:
getTagPrototype()
-
createStringNode
public Text createStringNode(Page page, int start, int end)
Create a new string node.
- Specified by:
createStringNode in interface NodeFactory
- Parameters:
page
- The page the node is on.
start
- The beginning position of the string.
end
- The ending position of the string.
- Returns:
- A text node comprising the indicated characters from the page.
-
createRemarkNode
public Remark createRemarkNode(Page page, int start, int end)
Create a new remark node.
- Specified by:
createRemarkNode in interface NodeFactory
- Parameters:
page
- The page the node is on.
start
- The beginning position of the remark.
end
- The ending position of the remark.
- Returns:
- A remark node comprising the indicated characters from the page.
-
createTagNode
public Tag createTagNode(Page page, int start, int end, java.util.Vector attributes)
Create a new tag node. Note that the attributes vector contains at least one element, which is the tag name (standalone attribute) at position zero. This can be used to decide which type of node to create, or to gate other processing that may be appropriate.
- Specified by:
createTagNode in interface NodeFactory
- Parameters:
page
- The page the node is on.
start
- The beginning position of the tag.
end
- The ending position of the tag.
attributes
- The attributes contained in this tag.
- Returns:
- A tag node comprising the indicated characters from the page.
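The note about position zero can be illustrated with a standalone sketch that reads the tag name from the first element and uses it to choose a node type. It makes a simplifying assumption: attributes are modelled as plain strings in a Vector, whereas the library passes its own attribute objects. AttributeSketch, tagName and chooseNodeType are hypothetical names.

```java
import java.util.Vector;

class AttributeSketch {
    // Assumption: each attribute is a plain string; element zero is the tag name.
    static String tagName(Vector<String> attributes) {
        return attributes.elementAt(0);
    }

    // Decide which kind of node to create from the name at position zero.
    static String chooseNodeType(Vector<String> attributes) {
        String name = tagName(attributes);
        return "A".equalsIgnoreCase(name) ? "link node" : "generic node";
    }

    public static void main(String[] args) {
        Vector<String> attributes = new Vector<>();
        attributes.add("a");
        attributes.add("href=\"index.html\"");
        System.out.println(chooseNodeType(attributes)); // prints "link node"
    }
}
```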
-
-