org.htmlparser.lexer.nodes
Class TagNode

java.lang.Object
  extended by org.htmlparser.AbstractNode
      extended by org.htmlparser.lexer.nodes.TagNode
All Implemented Interfaces:
Node, Serializable
Direct Known Subclasses:
Tag

public class TagNode
extends AbstractNode

TagNode represents a generic tag.

See Also:
Serialized Form

Field Summary
protected static Hashtable breakTags
          Set of tags that breaks the flow.
protected  Vector mAttributes
          The tag attributes.
 
Fields inherited from class org.htmlparser.AbstractNode
children, mPage, nodeBegin, nodeEnd, parent
 
Constructor Summary
TagNode()
          Create an empty tag.
TagNode(Page page, int start, int end, Vector attributes)
          Create a tag with the location and attributes provided.
 
Method Summary
 void accept(Object visitor)
          Apply the visitor object (of type NodeVisitor) to this node.
 boolean breaksFlow()
          Determines if the given tag breaks the flow of text.
 String getAttribute(String name)
          Returns the value of an attribute.
 Attribute getAttributeEx(String name)
          Returns the attribute with the given name.
 Hashtable getAttributes()
          Gets the attributes in the tag.
 Vector getAttributesEx()
          Gets the attributes in the tag.
 int getEndingLineNumber()
          Get the line number where this tag ends.
 String getParameter(String name)
          Deprecated. Use getAttribute instead.
 Hashtable getParsed()
          Deprecated. Use getAttributes() instead.
 String getRawTagName()
          Return the name of this tag.
 int getStartingLineNumber()
          Get the line number where this tag starts.
 int getTagBegin()
          Gets the nodeBegin.
 int getTagEnd()
          Gets the nodeEnd.
 String getTagName()
          Return the name of this tag.
 String getText()
          Return the text contained in this tag.
 boolean isEmptyXmlTag()
          Is this an empty xml tag of the form <tag/>.
 boolean isEndTag()
          Predicate to determine if this tag is an end tag (i.e. </HTML>).
 void removeAttribute(String key)
          Remove the attribute with the given key, if it exists.
 void setAttribute(Attribute attribute)
          Set an attribute.
 void setAttribute(String key, String value)
          Set attribute with given key, value pair.
 void setAttribute(String key, String value, char quote)
          Set attribute with given key, value pair where the value is quoted by quote.
 void setAttributes(Hashtable attributes)
          Sets the attributes.
 void setAttributesEx(Vector attribs)
          Sets the attributes.
 void setEmptyXmlTag(boolean emptyXmlTag)
          Set this tag to be an empty xml node, or not.
 void setTagBegin(int tagBegin)
          Sets the nodeBegin.
 void setTagEnd(int tagEnd)
          Sets the nodeEnd.
 void setTagName(String name)
          Set the name of this tag.
 void setText(String text)
          Sets the string contents of the node.
 String toHtml()
          Render the tag as HTML.
 String toPlainTextString()
          Get the plain text from this node.
 String toString()
          Print the contents of the tag.
 
Methods inherited from class org.htmlparser.AbstractNode
collectInto, doSemanticAction, elementBegin, elementEnd, getChildren, getEndPosition, getPage, getParent, getStartPosition, setChildren, setEndPosition, setPage, setParent, setStartPosition, toHTML
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

mAttributes

protected Vector mAttributes
The tag attributes. Objects of type Attribute.


breakTags

protected static Hashtable breakTags
Set of tags that breaks the flow.

Constructor Detail

TagNode

public TagNode()
Create an empty tag.


TagNode

public TagNode(Page page,
               int start,
               int end,
               Vector attributes)
Create a tag with the location and attributes provided.

Parameters:
page - The page this tag was read from.
start - The starting offset of this node within the page.
end - The ending offset of this node within the page.
attributes - The list of attributes that were parsed in this tag.
See Also:
Attribute
Method Detail

getAttribute

public String getAttribute(String name)
Returns the value of an attribute.

Parameters:
name - Name of attribute, case insensitive.
Returns:
The value associated with the attribute or null if it does not exist, or is a stand-alone or
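
The case-insensitive lookup described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the class AttributeLookupSketch and the use of String[] name/value pairs in place of Attribute objects are assumptions made for the example.

```java
import java.util.Vector;

// Illustrative only: AttributeLookupSketch and the String[] {name, value} pairs are
// hypothetical stand-ins, not the library's Attribute class or TagNode internals.
final class AttributeLookupSketch {

    /** Returns the value of the first pair whose name matches, ignoring case, or null. */
    static String lookup(Vector<String[]> pairs, String name) {
        for (String[] pair : pairs) {
            if (pair[0] != null && pair[0].equalsIgnoreCase(name)) {
                return pair[1]; // may itself be null for a stand-alone attribute
            }
        }
        return null; // no attribute with that name
    }

    public static void main(String[] args) {
        Vector<String[]> pairs = new Vector<>();
        pairs.add(new String[] {"href", "index.html"});
        pairs.add(new String[] {"disabled", null});
        System.out.println(lookup(pairs, "HREF"));
        System.out.println(lookup(pairs, "title"));
    }
}
```

Because null covers both a missing attribute and a valueless one in this sketch, a caller that needs to tell them apart would have to look at the attribute entry itself, which is what getAttributeEx offers.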

setAttribute

public void setAttribute(String key,
                         String value)
Set attribute with given key, value pair. Figures out a quote character to use if necessary.

Parameters:
key - The name of the attribute.
value - The value of the attribute.
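
This page does not say how the quote character is chosen. One plausible heuristic, shown purely as an assumption, is to leave simple values unquoted and otherwise prefer whichever quote character the value does not contain. The class QuoteChoiceSketch and its rules are hypothetical.

```java
// Hypothetical heuristic for picking a quote character; the rules used by the
// library itself are not documented on this page.
final class QuoteChoiceSketch {

    /** Returns '"', '\'' or 0, where 0 means the value can stay unquoted. */
    static char chooseQuote(String value) {
        boolean needsQuote = value.isEmpty();
        boolean hasDouble = false;
        boolean hasSingle = false;
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (Character.isWhitespace(c) || c == '>') {
                needsQuote = true;
            } else if (c == '"') {
                hasDouble = true;
                needsQuote = true;
            } else if (c == '\'') {
                hasSingle = true;
                needsQuote = true;
            }
        }
        if (!needsQuote) {
            return 0;
        }
        if (!hasDouble) {
            return '"';
        }
        if (!hasSingle) {
            return '\'';
        }
        return '"'; // both kinds present: the caller would have to escape the value
    }

    public static void main(String[] args) {
        System.out.println((int) chooseQuote("plain"));
        System.out.println(chooseQuote("two words"));
    }
}
```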

removeAttribute

public void removeAttribute(String key)
Remove the attribute with the given key, if it exists.

Parameters:
key - The name of the attribute.

setAttribute

public void setAttribute(String key,
                         String value,
                         char quote)
Set attribute with given key, value pair where the value is quoted by quote.

Parameters:
key - The name of the attribute.
value - The value of the attribute.
quote - The quote character to be used around value. If zero, it is an unquoted value.
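
The effect of the quote parameter can be pictured with a small rendering helper. This is a sketch under the assumption that the attribute is written as key=value with the quote character on both sides of the value; QuotedAttributeSketch is a hypothetical name and is not part of the library.

```java
// Hypothetical helper showing how a quote character of zero yields an unquoted value.
final class QuotedAttributeSketch {

    /** Renders key=value, wrapping the value in quote unless quote is zero. */
    static String render(String key, String value, char quote) {
        if (quote == 0) {
            return key + "=" + value;
        }
        return key + "=" + quote + value + quote;
    }

    public static void main(String[] args) {
        System.out.println(render("class", "nav", '\0'));
        System.out.println(render("alt", "site logo", '"'));
    }
}
```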

getAttributeEx

public Attribute getAttributeEx(String name)
Returns the attribute with the given name.

Parameters:
name - Name of attribute, case insensitive.
Returns:
The attribute or null if it does not exist.

setAttribute

public void setAttribute(Attribute attribute)
Set an attribute. This replaces an attribute of the same name. To set the zeroth attribute (the tag name), use setTagName().

Parameters:
attribute - The attribute to set.
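
The replace-or-append behaviour can be sketched as below. The sketch assumes the attribute vector already holds the tag name at index 0 and that names are matched case-insensitively (the page states case insensitivity only for the getters). ReplaceAttributeSketch and the String[] pairs are hypothetical stand-ins.

```java
import java.util.Vector;

// Hypothetical sketch: String[] {name, value} pairs stand in for Attribute objects,
// and index 0 is assumed to hold the tag name.
final class ReplaceAttributeSketch {

    /** Replaces the attribute with the same name, or appends it if none exists. */
    static void set(Vector<String[]> attributes, String name, String value) {
        // Start at 1 so the tag name in slot 0 is never overwritten.
        for (int i = 1; i < attributes.size(); i++) {
            String existing = attributes.get(i)[0];
            if (existing != null && existing.equalsIgnoreCase(name)) {
                attributes.set(i, new String[] {name, value});
                return;
            }
        }
        attributes.add(new String[] {name, value});
    }

    public static void main(String[] args) {
        Vector<String[]> attributes = new Vector<>();
        attributes.add(new String[] {"a", null});      // slot 0: the tag name
        attributes.add(new String[] {"href", "old"});
        set(attributes, "HREF", "new");                // replaces the existing entry
        set(attributes, "title", "Home");              // appended as a new entry
        System.out.println(attributes.size());
    }
}
```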

getParameter

public String getParameter(String name)
Deprecated. Use getAttribute instead.

Equivalent to getAttribute(name).

Parameters:
name - Name of attribute.

getAttributesEx

public Vector getAttributesEx()
Gets the attributes in the tag.

Returns:
Returns the list of Attributes in the tag.

getAttributes

public Hashtable getAttributes()
Gets the attributes in the tag. This is not the preferred way to get attributes; see getAttributesEx, which returns a list of Attribute objects that offer more information than the simple String objects available from this Hashtable.

Returns:
A table of name/value pairs representing the attributes. The entries are not in order, the keys (names) are converted to uppercase, and the values are not quoted, even if they need to be. The table returns null for an attribute that has no value (no equals sign, or nothing to the right of the equals sign). A special entry with the key SpecialHashtable.TAGNAME ("$$") holds the tag name. The conversion to uppercase is performed with an ENGLISH locale.
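
The shape of the returned table can be sketched with plain java.util classes. This is an illustration only: AttributeTableSketch is hypothetical, and because java.util.Hashtable cannot store null, valueless attributes are simply left out here so that get() returns null for them. The library's SpecialHashtable may handle that case differently.

```java
import java.util.Hashtable;
import java.util.Locale;

// Hypothetical sketch of the uppercase-keyed table; not the library's SpecialHashtable.
final class AttributeTableSketch {

    static final String TAGNAME = "$$"; // key documented for SpecialHashtable.TAGNAME

    /** Builds a table from String[] {name, value} pairs, uppercasing the keys. */
    static Hashtable<String, String> toTable(String tagName, String[][] attributes) {
        Hashtable<String, String> table = new Hashtable<>();
        table.put(TAGNAME, tagName); // stored as passed in; casing here is an assumption
        for (String[] pair : attributes) {
            // A fixed locale keeps the result independent of the platform default;
            // under a Turkish locale, for example, "i" does not uppercase to "I".
            String key = pair[0].toUpperCase(Locale.ENGLISH);
            if (pair[1] != null) {
                table.put(key, pair[1]);
            }
        }
        return table;
    }

    public static void main(String[] args) {
        String[][] attributes = {{"title", "Home"}, {"disabled", null}};
        Hashtable<String, String> table = toTable("a", attributes);
        System.out.println(table.get("TITLE"));
        System.out.println(table.get("DISABLED"));
    }
}
```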

getTagName

public String getTagName()
Return the name of this tag.

Note: This value is converted to uppercase and does not begin with "/" if it is an end tag. Nor does it end with a slash in the case of an XML type tag. To get at the original text of the tag name use getRawTagName(). The conversion to uppercase is performed with an ENGLISH locale.

Returns:
The tag name.
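
The normalization described in the note can be sketched as a small function over the raw name. TagNameSketch is a hypothetical helper written for illustration; it only mirrors the three documented rules (no leading slash, no trailing slash, uppercase with an ENGLISH locale).

```java
import java.util.Locale;

// Hypothetical helper mirroring the documented relationship between the raw
// tag name and the normalized one.
final class TagNameSketch {

    /** Strips a leading or trailing slash and uppercases with Locale.ENGLISH. */
    static String normalize(String rawName) {
        String name = rawName;
        if (name.startsWith("/")) {
            name = name.substring(1); // end tag such as "/html"
        }
        if (name.endsWith("/")) {
            name = name.substring(0, name.length() - 1); // XML-style tag such as "br/"
        }
        return name.toUpperCase(Locale.ENGLISH);
    }

    public static void main(String[] args) {
        System.out.println(normalize("/html")); // prints HTML
        System.out.println(normalize("br/"));   // prints BR
    }
}
```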

getRawTagName

public String getRawTagName()
Return the name of this tag.

Returns:
The tag name or null if this tag contains nothing or only whitespace.

setTagName

public void setTagName(String name)
Set the name of this tag. This creates or replaces the first attribute of the tag (the zeroth element of the attribute vector).

Parameters:
name - The tag name.
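
The "creates or replaces" wording can be read as the following sketch, which uses a Vector of plain strings in place of Attribute objects. TagNameSlotSketch is a hypothetical name and the code is an assumption about the behaviour, not the library source.

```java
import java.util.Vector;

// Hypothetical sketch: slot 0 of the attribute vector is treated as the tag name.
final class TagNameSlotSketch {

    /** Creates slot 0 if the vector is empty, otherwise overwrites it. */
    static void setTagName(Vector<String> attributes, String name) {
        if (attributes.isEmpty()) {
            attributes.add(name);
        } else {
            attributes.set(0, name);
        }
    }

    public static void main(String[] args) {
        Vector<String> attributes = new Vector<>();
        setTagName(attributes, "div");  // creates the zeroth element
        attributes.add("class=nav");
        setTagName(attributes, "span"); // replaces it, leaving other entries alone
        System.out.println(attributes);
    }
}
```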

getText

public String getText()
Return the text contained in this tag.

Specified by:
getText in interface Node
Overrides:
getText in class AbstractNode
Returns:
The complete contents of the tag (within the angle brackets).

setAttributes

public void setAttributes(Hashtable attributes)
Sets the attributes.

Parameters:
attributes - The attribute collection to set.

setAttributesEx

public void setAttributesEx(Vector attribs)
Sets the attributes. NOTE: Values of the extended hashtable are two element arrays of String, with the first element being the original name (not uppercased), and the second element being the value.

Parameters:
attribs - The attribute collection to set.

setTagBegin

public void setTagBegin(int tagBegin)
Sets the nodeBegin.

Parameters:
tagBegin - The nodeBegin to set

getTagBegin

public int getTagBegin()
Gets the nodeBegin.

Returns:
The nodeBegin value.

setTagEnd

public void setTagEnd(int tagEnd)
Sets the nodeEnd.

Parameters:
tagEnd - The nodeEnd to set

getTagEnd

public int getTagEnd()
Gets the nodeEnd.

Returns:
The nodeEnd value.

setText

public void setText(String text)
Description copied from class: AbstractNode
Sets the string contents of the node.

Specified by:
setText in interface Node
Overrides:
setText in class AbstractNode
Parameters:
text - The new text for the node.

toPlainTextString

public String toPlainTextString()
Get the plain text from this node.

Specified by:
toPlainTextString in interface Node
Specified by:
toPlainTextString in class AbstractNode
Returns:
An empty string (tag contents do not display in a browser). If you want this tag's HTML equivalent, use toHtml().

toHtml

public String toHtml()
Render the tag as HTML. A call to a tag's toHtml() method will render it in HTML.

Specified by:
toHtml in interface Node
Specified by:
toHtml in class AbstractNode
Returns:
The tag as an HTML fragment.
See Also:
Node.toHtml()

toString

public String toString()
Print the contents of the tag.

Specified by:
toString in interface Node
Specified by:
toString in class AbstractNode
Returns:
java.lang.String

breaksFlow

public boolean breaksFlow()
Determines if the given tag breaks the flow of text.

Returns:
true if following text would start on a new line, false otherwise.
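
Since breakTags is a static Hashtable, the check is presumably a membership test on the tag name. The sketch below illustrates that idea; BreaksFlowSketch is hypothetical, and the five tags listed are an illustrative subset chosen for the example, since the actual contents of breakTags are not given on this page.

```java
import java.util.Hashtable;
import java.util.Locale;

// Hypothetical sketch of a membership test against a static table of tag names.
final class BreaksFlowSketch {

    // Illustrative subset only; the real contents of breakTags are not listed here.
    static final Hashtable<String, Boolean> BREAK_TAGS = new Hashtable<>();

    static {
        for (String tag : new String[] {"P", "BR", "DIV", "LI", "H1"}) {
            BREAK_TAGS.put(tag, Boolean.TRUE);
        }
    }

    /** Returns true if the uppercased tag name is in the table. */
    static boolean breaksFlow(String tagName) {
        return BREAK_TAGS.containsKey(tagName.toUpperCase(Locale.ENGLISH));
    }

    public static void main(String[] args) {
        System.out.println(breaksFlow("p"));    // prints true
        System.out.println(breaksFlow("span")); // prints false
    }
}
```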

getParsed

public Hashtable getParsed()
Deprecated. Use getAttributes() instead.

Returns the table of attributes in the tag.

Returns:
Hashtable

accept

public void accept(Object visitor)
Description copied from interface: Node
Apply the visitor object (of type NodeVisitor) to this node.

Specified by:
accept in interface Node
Specified by:
accept in class AbstractNode

isEmptyXmlTag

public boolean isEmptyXmlTag()
Is this an empty xml tag of the form <tag/>.

Returns:
true if the last character of the last attribute is a '/'.

setEmptyXmlTag

public void setEmptyXmlTag(boolean emptyXmlTag)
Set this tag to be an empty xml node, or not. Adds or removes an ending slash on the tag.

Parameters:
emptyXmlTag - If true, ensures there is an ending slash in the node, i.e. <tag/>, otherwise removes it.
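
The pairing of isEmptyXmlTag and setEmptyXmlTag can be sketched over a Vector of raw attribute strings. This is a simplified model under stated assumptions: EmptyXmlTagSketch is hypothetical, each attribute is represented by its raw text, and the ending slash is added as a separate trailing entry.

```java
import java.util.Vector;

// Hypothetical model: each element is the raw text of one attribute, and the
// tag counts as an empty XML tag when the last element ends with '/'.
final class EmptyXmlTagSketch {

    static boolean isEmptyXmlTag(Vector<String> attributes) {
        if (attributes.isEmpty()) {
            return false;
        }
        String last = attributes.lastElement();
        return last != null && last.endsWith("/");
    }

    static void setEmptyXmlTag(Vector<String> attributes, boolean empty) {
        boolean already = isEmptyXmlTag(attributes);
        if (empty && !already) {
            attributes.add("/"); // append the ending slash as its own entry
        } else if (!empty && already) {
            int lastIndex = attributes.size() - 1;
            String last = attributes.get(lastIndex);
            if (last.equals("/")) {
                attributes.remove(lastIndex); // the slash was a separate entry
            } else {
                attributes.set(lastIndex, last.substring(0, last.length() - 1));
            }
        }
    }

    public static void main(String[] args) {
        Vector<String> attributes = new Vector<>();
        attributes.add("br");
        setEmptyXmlTag(attributes, true);
        System.out.println(isEmptyXmlTag(attributes)); // prints true
    }
}
```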

isEndTag

public boolean isEndTag()
Predicate to determine if this tag is an end tag (i.e. </HTML>).

Returns:
true if this tag is an end tag.

getStartingLineNumber

public int getStartingLineNumber()
Get the line number where this tag starts.

Returns:
The (zero based) line number in the page where this tag starts.

getEndingLineNumber

public int getEndingLineNumber()
Get the line number where this tag ends.

Returns:
The (zero based) line number in the page where this tag ends.
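
The zero-based convention means a tag on the first line of the page reports line 0. As a sketch of how a character offset maps to such a line number, the helper below counts the newlines that precede the offset. LineNumberSketch is hypothetical and assumes lines are separated by '\n' only; how the library's Page class tracks lines is not described here.

```java
// Hypothetical helper: converts a character offset into a zero-based line number
// by counting '\n' characters before the offset.
final class LineNumberSketch {

    static int lineOf(String pageText, int offset) {
        int line = 0;
        for (int i = 0; i < offset && i < pageText.length(); i++) {
            if (pageText.charAt(i) == '\n') {
                line++;
            }
        }
        return line;
    }

    public static void main(String[] args) {
        String page = "<html>\n<body>\n</body>";
        System.out.println(lineOf(page, 0));  // prints 0: <html> is on the first line
        System.out.println(lineOf(page, 7));  // prints 1: <body> starts on the second line
    }
}
```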

© 2004 Somik Raha
Mar 14, 2004

HTML Parser is an open source library released under LGPL.