Filters in HtmlParser (1)

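All filters implement the NodeFilter interface, which has a single method, boolean accept(Node node), that determines whether a given node falls within the scope of the filter. HtmlParser defines 16 different filters in the org.htmlparser.filters package, and they can be grouped into a few categories.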

Predicate filters: TagNameFilter

                  HasAttributeFilter

                  HasChildFilter

                  HasParentFilter

                  HasSiblingFilter

                  IsEqualFilter

Logical operation filters:

                  AndFilter

                  NotFilter

                  OrFilter

                  XorFilter

Other filters:

                 NodeClassFilter

                 StringFilter

                 LinkStringFilter

                 LinkRegexFilter

                 RegexFilter

                 CssSelectorNodeFilter
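
The logical operation filters listed above build a new filter out of existing ones. A minimal sketch of that idea follows. It uses a simplified stand-in interface (TagFilter, a hypothetical name that looks only at a tag name) rather than the library's own classes, and the xor rule shown (accept when an odd number of operands accept) is an assumption about the intended semantics.

```java
// Simplified stand-ins for illustration only; NOT the org.htmlparser classes.
class LogicFilterSketch {
    /** Stand-in filter that inspects only a tag name. */
    interface TagFilter {
        boolean accept(String tagName);
    }

    /** Accepts only if every operand accepts. */
    static TagFilter and(TagFilter... filters) {
        return tag -> {
            for (TagFilter f : filters) {
                if (!f.accept(tag)) return false;
            }
            return true;
        };
    }

    /** Accepts if at least one operand accepts. */
    static TagFilter or(TagFilter... filters) {
        return tag -> {
            for (TagFilter f : filters) {
                if (f.accept(tag)) return true;
            }
            return false;
        };
    }

    /** Inverts the decision of the wrapped filter. */
    static TagFilter not(TagFilter filter) {
        return tag -> !filter.accept(tag);
    }

    /** Assumption: accepts when an odd number of operands accept. */
    static TagFilter xor(TagFilter... filters) {
        return tag -> {
            int hits = 0;
            for (TagFilter f : filters) {
                if (f.accept(tag)) hits++;
            }
            return hits % 2 == 1;
        };
    }

    public static void main(String[] args) {
        TagFilter startsWithT = tag -> tag.startsWith("T");
        TagFilter isTable = tag -> tag.equals("TABLE");
        TagFilter tButNotTable = and(startsWithT, not(isTable));

        System.out.println(tButNotTable.accept("TD"));                 // prints true
        System.out.println(tButNotTable.accept("TABLE"));              // prints false
        System.out.println(xor(startsWithT, isTable).accept("TABLE")); // prints false
    }
}
```

With the real library, the same composition would presumably be written by passing filter instances to AndFilter, OrFilter, NotFilter and XorFilter; check the Javadoc for the exact constructors.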

In addition, you can define your own filters to handle special filtering requirements.
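
Since a filter is just an object with one accept method, a custom filter amounts to a small class. The sketch below shows the shape under stated assumptions: Node and NodeFilter here are simplified stand-ins defined inside the example, not the library's types, and LinkSuffixFilter and collect are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Simplified stand-ins for illustration only; NOT the org.htmlparser classes.
class CustomFilterSketch {
    /** Stand-in node: a tag name, its attributes, and child nodes. */
    static final class Node {
        final String tagName;
        final Map<String, String> attributes;
        final List<Node> children = new ArrayList<>();

        Node(String tagName, Map<String, String> attributes, Node... kids) {
            this.tagName = tagName;
            this.attributes = attributes;
            Collections.addAll(children, kids);
        }
    }

    /** Same single-method shape as the interface described in the text. */
    interface NodeFilter {
        boolean accept(Node node);
    }

    /** Hypothetical custom filter: accepts A tags whose href ends with a suffix. */
    static final class LinkSuffixFilter implements NodeFilter {
        private final String suffix;

        LinkSuffixFilter(String suffix) {
            this.suffix = suffix.toLowerCase();
        }

        @Override
        public boolean accept(Node node) {
            if (!"A".equalsIgnoreCase(node.tagName)) return false;
            String href = node.attributes.get("href");
            return href != null && href.toLowerCase().endsWith(suffix);
        }
    }

    /** Depth-first walk that keeps every node the filter accepts. */
    static List<Node> collect(Node root, NodeFilter filter) {
        List<Node> out = new ArrayList<>();
        if (filter.accept(root)) out.add(root);
        for (Node child : root.children) out.addAll(collect(child, filter));
        return out;
    }

    /** Helper that builds a stand-in link node with only an href attribute. */
    static Node link(String href) {
        return new Node("A", Collections.singletonMap("href", href));
    }

    public static void main(String[] args) {
        Map<String, String> none = Collections.emptyMap();
        Node page = new Node("BODY", none,
                link("manual.pdf"),
                new Node("DIV", none, link("index.html"), link("Spec.PDF")));
        System.out.println(collect(page, new LinkSuffixFilter(".pdf")).size()); // prints 2
    }
}
```

In real code the class would implement org.htmlparser.NodeFilter and inspect the library's own Node objects; the traversal would normally be left to the parser rather than written by hand.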

 

Tag classes

  Mainly used together with NodeClassFilter

         Remark: comment

         AppletTag:

         BaseHrefTag:

         BodyTag: "BODY"; // getBody() internally calls toPlainTextString()

         Bullet: "LI"

         BulletList: "UL", "OL"

         CompositeTag:

         DefinitionList: "DL"

         DefinitionListBullet: "DD", "DT"

         Div: "DIV"

         DoctypeTag: "!DOCTYPE"

         FormTag:

         FrameSetTag:

         FrameTag:

         HeadingTag: "H1", "H2", "H3", "H4", "H5", "H6"

         HeadTag: "HEAD"

         Html: "HTML"

         ImageTag:

         InputTag: "INPUT"

         JspTag: "%", "%=", "%@"

         LabelTag: "LABEL"

         LinkTag:

         MetaTag:

         ObjectTag:

         OptionTag:

         ParagraphTag: "P"

         ProcessingInstructionTag: "?"

         ScriptTag:

         SelectTag: "SELECT"

         Span: "SPAN"

         StyleTag: "STYLE"

         TableColumn: "TD"

         TableHeader: "TH"

         TableRow: "TR"

         TableTag: "TABLE"

         TagNode:

         TextareaTag: "TEXTAREA"

         TitleTag: "TITLE"

         TextNode:
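
Pairing NodeClassFilter with one of the tag classes above means selecting nodes by their Java class. The sketch below illustrates that idea only. The small class hierarchy and ClassFilter are stand-ins defined inside the example, and the matching rule (a node is accepted when it is an instance of the given class, subclasses included) is an assumption about how such a filter behaves.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-ins for illustration only; NOT the org.htmlparser classes.
class ClassFilterSketch {
    static class Node {}
    static class TextNode extends Node {}
    static class TagNode extends Node {}
    static class LinkTag extends TagNode {}
    static class Div extends TagNode {}

    /** Accepts nodes that are instances of the given class (assumed rule). */
    static final class ClassFilter {
        private final Class<? extends Node> type;

        ClassFilter(Class<? extends Node> type) {
            this.type = type;
        }

        boolean accept(Node node) {
            return type.isInstance(node);
        }
    }

    /** Keeps the nodes that the filter accepts, in their original order. */
    static List<Node> select(List<Node> nodes, ClassFilter filter) {
        List<Node> out = new ArrayList<>();
        for (Node n : nodes) {
            if (filter.accept(n)) out.add(n);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Node> nodes = Arrays.asList(
                new LinkTag(), new Div(), new TextNode(), new LinkTag());

        // Filtering by a concrete tag class keeps only that kind of tag.
        System.out.println(select(nodes, new ClassFilter(LinkTag.class)).size()); // prints 2
        // Filtering by a base class also keeps its subclasses.
        System.out.println(select(nodes, new ClassFilter(TagNode.class)).size()); // prints 3
    }
}
```

With the library itself, the equivalent would presumably be a NodeClassFilter constructed with a tag class such as LinkTag.class.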
