org.htmlparser.nodeDecorators
Class EscapeCharacterRemovingNode

java.lang.Object
  extended by org.htmlparser.nodeDecorators.AbstractNodeDecorator
      extended by org.htmlparser.nodeDecorators.EscapeCharacterRemovingNode
All Implemented Interfaces:
Node

public class EscapeCharacterRemovingNode
extends AbstractNodeDecorator


Field Summary
 
Fields inherited from class org.htmlparser.nodeDecorators.AbstractNodeDecorator
delegate
 
Constructor Summary
EscapeCharacterRemovingNode(Node newDelegate)
           
 
Method Summary
 String toPlainTextString()
          Returns a string representation of the node.
 
Methods inherited from class org.htmlparser.nodeDecorators.AbstractNodeDecorator
accept, collectInto, doSemanticAction, elementBegin, elementEnd, equals, getChildren, getEndPosition, getParent, getStartPosition, getText, setChildren, setEndPosition, setParent, setStartPosition, setText, toHtml, toString
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

EscapeCharacterRemovingNode

public EscapeCharacterRemovingNode(Node newDelegate)
Method Detail

toPlainTextString

public String toPlainTextString()
Description copied from interface: Node
Returns a string representation of the node. This is an important method: it allows a simple string transformation of a web page, regardless of the type of node.
Typical application code (for extracting only the text from a web page) then simplifies to:
 import java.util.Enumeration;

 // parser is assumed to be an already-initialized org.htmlparser.Parser
 for (Enumeration e = parser.elements(); e.hasMoreElements();) {
     Node node = (Node) e.nextElement();
     // Print the plain text, or do whatever processing you wish with it
     System.out.println(node.toPlainTextString());
 }
 

Specified by:
toPlainTextString in interface Node
Overrides:
toPlainTextString in class AbstractNodeDecorator
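
The page above does not spell out what this override does beyond the description inherited from Node. The following is a minimal, self-contained sketch of the decorator idea the class name suggests: a wrapper forwards toPlainTextString() to its delegate and strips escape characters from the result. It does not use the library itself. TextNode and EscapeStrippingNode are hypothetical stand-ins for Node and EscapeCharacterRemovingNode, and the choice of carriage return, line feed, and tab as the "escape characters" is an assumption, not something this page confirms.

```java
public class EscapeRemovalSketch {

    /** Hypothetical stand-in for the single Node method this sketch needs. */
    public interface TextNode {
        String toPlainTextString();
    }

    /** Hypothetical decorator shaped like EscapeCharacterRemovingNode. */
    public static class EscapeStrippingNode implements TextNode {
        // Mirrors the inherited "delegate" field listed in the Field Summary.
        private final TextNode delegate;

        public EscapeStrippingNode(TextNode newDelegate) {
            this.delegate = newDelegate;
        }

        @Override
        public String toPlainTextString() {
            // Assumption: "escape characters" means \r, \n and \t.
            return delegate.toPlainTextString().replaceAll("[\\r\\n\\t]", "");
        }
    }

    public static void main(String[] args) {
        // A delegate whose plain text contains line breaks and a tab.
        TextNode raw = () -> "Hello,\r\n\tworld";
        TextNode cleaned = new EscapeStrippingNode(raw);
        System.out.println(cleaned.toPlainTextString()); // prints "Hello,world"
    }
}
```

Because the wrapper implements the same interface as the node it wraps, calling code such as the loop in the method description would not need to change; only the construction of the node differs.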

© 2004 Somik Raha
Mar 14, 2004

HTML Parser is an open source library released under LGPL.