
org.htmlparser.visitors
Class NodeVisitor

java.lang.Object
  extended by org.htmlparser.visitors.NodeVisitor
Direct Known Subclasses:
HtmlPage, LinkFindingVisitor, ObjectFindingVisitor, StringBean, StringFindingVisitor, TagFindingVisitor, TextExtractingVisitor, UrlModifyingVisitor

public abstract class NodeVisitor
extends Object

The base class for the 'Visitor' pattern. Classes that wish to use visitAllNodesWith() subclass this class and override the methods for the node types they are interested in processing.

visitAllNodesWith() calls beginParsing(), then the visitXXX() method matching the type of each node encountered in depth-first order, and finally finishedParsing().
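The call order described above can be illustrated with a minimal, self-contained sketch. It does not use the library: Node, Visitor, walk and this visitAllNodesWith are hypothetical stand-ins written only to show the sequence of calls, and the real traversal may differ in detail.

```java
import java.util.ArrayList;
import java.util.List;

public class VisitOrderSketch {
    // Hypothetical stand-in for a parsed node; not an org.htmlparser class.
    static class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name, Node... kids) {
            this.name = name;
            for (Node kid : kids) {
                children.add(kid);
            }
        }
    }

    // Stand-in visitor that records every callback it receives.
    static class Visitor {
        final List<String> log = new ArrayList<>();

        void beginParsing() { log.add("beginParsing"); }
        void visitTag(Node node) { log.add("visitTag " + node.name); }
        void finishedParsing() { log.add("finishedParsing"); }
    }

    // Depth-first: visit the node itself, then each child in document order.
    static void walk(Node node, Visitor visitor) {
        visitor.visitTag(node);
        for (Node child : node.children) {
            walk(child, visitor);
        }
    }

    // Assumed shape of the driver: begin hook, traversal, finish hook.
    static List<String> visitAllNodesWith(Node root, Visitor visitor) {
        visitor.beginParsing();
        walk(root, visitor);
        visitor.finishedParsing();
        return visitor.log;
    }

    public static void main(String[] args) {
        Node root = new Node("html",
                new Node("head", new Node("title")),
                new Node("body", new Node("a")));
        for (String entry : visitAllNodesWith(root, new Visitor())) {
            System.out.println(entry);
        }
    }
}
```

In this sketch the title node is reported before body, because the whole head subtree is finished before the traversal moves to the next sibling.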

There are currently three specialized visitXXX() calls, for titles, images and links. These call their specialized visit method and then perform the generic processing. Typical code to print all the link tags:

 import org.htmlparser.Parser;
 import org.htmlparser.tags.LinkTag;
 import org.htmlparser.util.ParserException;
 import org.htmlparser.visitors.NodeVisitor;
 
 // Prints every link tag found on a page.
 public class Visitor extends NodeVisitor
 {
     public Visitor ()
     {
     }
     // Called for each link tag encountered during the traversal.
     public void visitLinkTag (LinkTag linkTag)
     {
         System.out.println (linkTag);
     }
     public static void main (String[] args) throws ParserException
     {
         // Parse the page at the given URL and walk its nodes with this visitor.
         Parser parser = new Parser ("http://cbc.ca");
         Visitor visitor = new Visitor ();
         parser.visitAllNodesWith (visitor);
     }
 }
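
The "specialized visit, then generic processing" behaviour mentioned above can be sketched without the library. The Tag, LinkTag and Visitor classes below, and the accept method, are hypothetical stand-ins that only mirror the names used on this page; how the real tags dispatch internally is an assumption here.

```java
import java.util.ArrayList;
import java.util.List;

public class DispatchSketch {
    // Hypothetical stand-in for a generic tag; the real class lives in org.htmlparser.
    static class Tag {
        final String name;

        Tag(String name) { this.name = name; }

        // Generic processing: report the tag to visitTag.
        void accept(Visitor visitor) { visitor.visitTag(this); }
    }

    // Hypothetical stand-in for a link tag with its own specialized callback.
    static class LinkTag extends Tag {
        final String href;

        LinkTag(String href) {
            super("a");
            this.href = href;
        }

        @Override
        void accept(Visitor visitor) {
            visitor.visitLinkTag(this); // specialized visit first
            super.accept(visitor);      // then the generic processing
        }
    }

    // Stand-in visitor that records which callbacks fire, in order.
    static class Visitor {
        final List<String> log = new ArrayList<>();

        void visitTag(Tag tag) { log.add("tag:" + tag.name); }
        void visitLinkTag(LinkTag link) { log.add("link:" + link.href); }
    }

    static List<String> visit(Tag... tags) {
        Visitor visitor = new Visitor();
        for (Tag tag : tags) {
            tag.accept(visitor);
        }
        return visitor.log;
    }

    public static void main(String[] args) {
        // prints [tag:p, link:http://cbc.ca, tag:a]
        System.out.println(visit(new Tag("p"), new LinkTag("http://cbc.ca")));
    }
}
```

Under this assumption, a visitor that overrides both visitLinkTag and visitTag sees each link twice, once through each method, so counting code should pick one of the two.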
 


Constructor Summary
NodeVisitor()
           
NodeVisitor(boolean recurseChildren)
           
NodeVisitor(boolean recurseChildren, boolean recurseSelf)
           
 
Method Summary
 void beginParsing()
          Override this method if you wish to do special processing prior to the start of parsing.
 void finishedParsing()
          Override this method if you wish to do special processing upon completion of parsing.
 boolean shouldRecurseChildren()
           
 boolean shouldRecurseSelf()
           
 void visitEndTag(Tag tag)
           
 void visitImageTag(ImageTag imageTag)
           
 void visitLinkTag(LinkTag linkTag)
           
 void visitRemarkNode(RemarkNode remarkNode)
           
 void visitStringNode(StringNode stringNode)
           
 void visitTag(Tag tag)
           
 void visitTitleTag(TitleTag titleTag)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

NodeVisitor

public NodeVisitor()

NodeVisitor

public NodeVisitor(boolean recurseChildren)

NodeVisitor

public NodeVisitor(boolean recurseChildren,
                   boolean recurseSelf)
Method Detail

beginParsing

public void beginParsing()
Override this method if you wish to do special processing prior to the start of parsing.


visitTag

public void visitTag(Tag tag)

visitEndTag

public void visitEndTag(Tag tag)

visitStringNode

public void visitStringNode(StringNode stringNode)

visitRemarkNode

public void visitRemarkNode(RemarkNode remarkNode)

finishedParsing

public void finishedParsing()
Override this method if you wish to do special processing upon completion of parsing.
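
One common use of the beginParsing() and finishedParsing() hooks is to reset state before a traversal and summarize it afterwards. The sketch below shows that idea with stand-in classes only: BaseVisitor, LinkCounter, visitLink and drive are hypothetical names and are not part of the library.

```java
import java.util.Arrays;
import java.util.List;

public class LinkCountSketch {
    // Hypothetical stand-in base class whose hooks do nothing by default.
    static class BaseVisitor {
        void beginParsing() {}
        void visitLink(String href) {}
        void finishedParsing() {}
    }

    // Counts links, resetting at the start and summarizing at the end.
    static class LinkCounter extends BaseVisitor {
        int count;
        String summary;

        @Override
        void beginParsing() {
            count = 0;      // clear state left over from an earlier traversal
            summary = null;
        }

        @Override
        void visitLink(String href) {
            count++;
        }

        @Override
        void finishedParsing() {
            summary = count + " link(s) found";
        }
    }

    // Hypothetical driver mirroring the begin / visit / finish sequence.
    static void drive(List<String> links, BaseVisitor visitor) {
        visitor.beginParsing();
        for (String link : links) {
            visitor.visitLink(link);
        }
        visitor.finishedParsing();
    }

    public static void main(String[] args) {
        LinkCounter counter = new LinkCounter();
        drive(Arrays.asList("http://cbc.ca", "http://sourceforge.net"), counter);
        System.out.println(counter.summary);
        // Reusing the same visitor: beginParsing() resets the count.
        drive(Arrays.asList("http://cbc.ca"), counter);
        System.out.println(counter.summary);
    }
}
```

Resetting in beginParsing() rather than in the constructor lets the same visitor instance be reused across several traversals without carrying counts over.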


visitLinkTag

public void visitLinkTag(LinkTag linkTag)

visitImageTag

public void visitImageTag(ImageTag imageTag)

visitTitleTag

public void visitTitleTag(TitleTag titleTag)

shouldRecurseChildren

public boolean shouldRecurseChildren()

shouldRecurseSelf

public boolean shouldRecurseSelf()

© 2004 Somik Raha
Mar 14, 2004

HTML Parser is an open source library released under LGPL.