I: What is RSS
RSS (Really Simple Syndication) is a format for syndicating web content. An RSS document is XML and must conform to the XML 1.0 specification.
What RSS is used for: subscribing to blogs and subscribing to news.

II: RSS version history
http://blogs.law.harvard.edu/tech/rssVersionHistory
RSS exists in many versions: 0.90, 0.91, 0.92, 0.93, 0.94, 1.0 and 2.0. Its counterpart is Atom. RSS 2.0 is the main format in China, while Atom 0.3 is mainly used abroad. RSS development split into two camps, which led to a confusing situation. The RSS 2.0 specification is defined and frozen by Harvard University.
Address: http://blogs.law.harvard.edu/tech/rss

III: RSS file format
1: Example:
<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>The channel's name goes here</title>
    <link>http://www.urlofthechannel.com/</link>
    <description>This channel is an example channel for an article.</description>
    <language>en-us</language>
    <image>
      <title>The image title goes here</title>
      <url>http://www.urlofthechannel.com/images/logo.gif</url>
      <link>http://www.urlofthechannel.com/</link>
    </image>
    <item>
      <title>The Future of content</title>
      <link>http://www.itworld.com/nl/ecom_in_act/11122003/</link>
      <description>The issue of people distributing and reusing digital media is a problem for many businesses. It may also be a hidden opportunity. Just as open source licensing has opened up new possibilities in the world of technology, it promises to do the same in the area of creative content.</description>
    </item>
    <item>
      <title>Online Music Services - Better than free?</title>
      <link>http://www.itworld.com/nl/ecom_in_act/08202003/</link>
      <description>More people than ever are downloading music from the Internet. Many use person-to-person file sharing programs like Kazaa to share and download music in MP3 format, paying nothing. This has made it difficult for companies to setup online music businesses.
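How can companies compete against free?</description>
    </item>
  </channel>
</rss>

2: An RSS file consists of a <channel> element and its child elements. Besides the channel content itself, which takes the form of items, <channel> also contains elements that carry the channel's metadata, such as <title>, <link> and <description>. Items are usually the main part of a channel and hold the content that changes often.

3: A channel is represented by <channel>
A channel normally has three elements that give information about the channel itself:
  <title>: the name of the channel or feed.
  <link>: the URL of the web site, or area of a site, associated with the channel.
  <description>: a short description of what the channel is about.
Many channel child elements are optional. The commonly used <image> element has three required child elements:
  <url>: the URL of a GIF, JPEG or PNG image that represents the channel.
  <title>: a description of the image. When the channel is rendered as HTML, it is used as the ALT attribute of the HTML <img> tag.
  <link>: the URL of the site. When the channel is rendered as HTML, the image acts as a link to this site.
<image> also has three optional child elements:
  <width>: a number, the width of the image in pixels. The maximum is 144 and the default is 88.
  <height>: a number, the height of the image in pixels. The maximum is 400 and the default is 31.
  <description>: text that can be used as the title attribute of the link formed around the image when it is rendered.
Many other optional channel elements are available. Most are self-explanatory:
  <language>: en-us
  <copyright>: Copyright 2003, James Lewin
  <managingEditor>: dan@spam_me.com (Dan Deletekey)
  <webMaster>: dan@spam_me.com (Dan Deletekey)
  <pubDate>: Sat, 15 Nov 2003 0:00:01 GMT
  <lastBuildDate>: Sat, 15 Nov 2003 0:00:01 GMT
  <category>: ebusiness
  <generator>: Your CMS 2.0
  <docs>: http://blogs.law.harvard.edu/tech/rss
  <cloud>: lets processes register with a "cloud" to be notified when the channel is updated, which gives RSS feeds a lightweight publish-subscribe protocol.
  <ttl>: time to live, a number giving how many minutes the feed may be cached before it is refreshed.
  <rating>: the PICS rating of the channel.
  <textInput>: defines an input box that can be displayed with the channel.
  <skipHours>: tells aggregators which hours they can skip.
  <skipDays>: tells aggregators which days they can skip.

4: An item (entry) is represented by <item>. The format of <item> is as follows:
Each item usually contains three elements:
  <title>: the name of the item; typical applications render it as a heading in HTML.
  <link>: the URL of the item. The title usually acts as a link pointing to the URL in the <link> element.
  <description>: usually a summary of, or a supplement to, the page that the link points to.
All elements are optional, but an item must contain at least a <title> or a <description>.
An item also has several other optional elements:
  <author>: the author's e-mail address.
  <category>: supports organizing items into categories.
  <comments>: the URL of a page for comments about the item.
  <enclosure>: a media object related to the item.
  <guid>: a permanent link that is uniquely associated with the item.
  <pubDate>: when the item was published.
  <source>: the RSS channel the item came from, when…

IV: Mainstream Java RSS libraries and a review
The main ones are:
1: Rome: http://wiki.java.net/bin/view/Javawsxml/Rome
Rome is an open source project on java.net; its current version is 0.5. Why the name Rome? According to its own introduction it alludes to "all roads lead to Rome", which has something of the RSS spirit. Rome was probably extracted by Sun from one of its own subprojects: the package and class naming feels as tidy as the J2SDK's. Functionally it supports all versions of RSS as well as Atom 0.3 (Atom is a content syndication format similar to RSS). Rome itself provides both the API and its implementation.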
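2: rssutils: http://gceclub.sun.com.cn/staticcontent/html/2004-04-22/rss.html
rssutils is a toolkit. Sun's developer site has an article, "RSS Utilities: A Tutorial", that covers displaying RSS content with a taglib, and the toolkit can be downloaded along with it. I could not find where it comes from by searching the web, so I could not read its source. Judging from the decompiled code, it was also written by skilled engineers inside Sun: the design is elegant and the code is concise. It implements a handler that parses the XML content with SAX; inside the handler, reflection and the JavaBean mechanism are used to construct the RSS element objects and assign their values.
3: rsslib4j: http://sourceforge.net/projects/rsslib4j
rsslib4j is a project on SourceForge. It also supports all RSS versions.
4: rsslibj: http://enigmastation.com/rsslibj/
5: Summary
Rome:
  Pros: 1) Extensible and promising. 2) Powerful: besides parsing RSS, it can also aggregate and build RSS.
  Cons: 1) Compatibility needs work. 2) Tied to JDOM.
rssutils:
  Pros: 1) Elegant code design, worth studying. 2) Ships with a taglib implementation that can be used directly in JSP.
  Cons: 1) No source code. 2) Compatibility needs work. 3) Limited functionality: it can only parse RSS and cannot aggregate or build it.
rsslib4j:
  Pros: 1) Simple, effective and small. 2) Good compatibility.
  Cons: 1) Has minor bugs. 2) Limited functionality: it can only parse RSS and cannot aggregate or build it.
rsslibj:
  Pros: 1) Simple, effective and small, only 25K. 2) Can both parse and generate RSS (dynamic and static).
  Cons: 1) Has minor bugs. 2) Has not been updated for a long time and is dated.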
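The rssutils approach described above (a SAX handler that uses reflection and JavaBean setters to populate element objects) can be sketched with the JDK alone. This is only an illustration of the technique, not rssutils code: the class names ReflectiveItemHandler and ItemBean are hypothetical, and the sketch handles nothing beyond the simple text children of <item>.

```java
import java.io.StringReader;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** SAX handler that fills bean setters by reflection (illustrative, not rssutils code). */
class ReflectiveItemHandler extends DefaultHandler {
    private final List<ItemBean> items = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private ItemBean current; // non-null only while inside an <item>

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0); // start collecting the text of this element
        if ("item".equals(qName)) {
            current = new ItemBean();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("item".equals(qName)) {
            items.add(current);
            current = null;
            return;
        }
        if (current == null) {
            return; // channel-level element, ignored in this sketch
        }
        // Map <title> to setTitle(String), <link> to setLink(String), and so on.
        String setter = "set" + Character.toUpperCase(qName.charAt(0)) + qName.substring(1);
        try {
            Method method = ItemBean.class.getMethod(setter, String.class);
            method.invoke(current, text.toString().trim());
        } catch (NoSuchMethodException e) {
            // The bean has no matching property, so the element is skipped.
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Parses an RSS string and returns one bean per <item>. */
    static List<ItemBean> parse(String xml) {
        ReflectiveItemHandler handler = new ReflectiveItemHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse feed", e);
        }
        return handler.items;
    }

    public static void main(String[] args) {
        String xml = "<rss version=\"2.0\"><channel><title>Example</title>"
                + "<item><title>First</title><link>http://example.com/1</link>"
                + "<description>Hello</description></item></channel></rss>";
        for (ItemBean item : parse(xml)) {
            System.out.println(item.getTitle() + " -> " + item.getLink());
        }
    }
}

/** Hypothetical JavaBean for one <item>; each setter corresponds to a child element name. */
class ItemBean {
    private String title;
    private String link;
    private String description;

    public String getTitle() { return title; }
    public void setTitle(String title) { this.title = title; }
    public String getLink() { return link; }
    public void setLink(String link) { this.link = link; }
    public String getDescription() { return description; }
    public void setDescription(String description) { this.description = description; }
}
```

Because the setter name is derived from the element name, supporting another simple element only requires adding a property to the bean; elements without a matching setter are silently skipped.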
V: Choosing ROME as the RSS implementation tool
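Download rome-0.8.jar from the official site, http://wiki.java.net/bin/view/Javawsxml/Rome. Rome uses JDOM 1.0, which can be downloaded from http://www.jdom.org
Rome supports: rss_0.9 rss_0.91 rss_0.92 rss_0.93 rss_0.94 rss_1.0 rss_2.0 atom_0.3 atom_1.0
The type of feed to generate has to be specified in the program, for example rss_2.0.

VI: Package structure
  com.sun.syndication.feed: parent classes of the RSS and Atom beans
  com.sun.syndication.feed.atom: beans that implement the core elements of Atom feeds
  com.sun.syndication.feed.module: beans for handling syndication modules
  com.sun.syndication.feed.rss: beans that implement the core elements of RSS feeds
  com.sun.syndication.feed.synd: the package we mainly use (SyndFeed and SyndEntryImpl)
  com.sun.syndication.io: input and output classes for reading and parsing feeds

VII: Examples: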
1: Read the RSS feed at a remote URL and print it to the console:
/**
 * Key code:
 * SyndFeedInput input = new SyndFeedInput();
 * SyndFeed feed = input.build(new XmlReader(feedUrl));
 */
package com.sun.syndication.samples;

import com.sun.syndication.feed.synd.SyndFeed;
import com.sun.syndication.io.SyndFeedInput;
import com.sun.syndication.io.XmlReader;
import java.net.URL;

/**
 * It Reads and prints any RSS/Atom feed type.
 */
public class FeedReader {

    public static void main(String[] args) {
        boolean ok = false;
        // The feed URL is hard-coded in this version, so the program runs without arguments
        if (args.length == 0) {
            try {
                URL feedUrl = new URL("http://seu.org.cn/bbs/rss.php");
                // SyndFeedInput turns the XML read from the remote URL into a SyndFeedImpl instance
                SyndFeedInput input = new SyndFeedInput();
                // Rome builds both RSS and Atom feeds as SyndFeed instances;
                // SyndFeed is the interface implemented by SyndFeedImpl
                SyndFeed feed = input.build(new XmlReader(feedUrl));
                // Print to the console
                System.out.println(feed);
                ok = true;
            } catch (Exception ex) {
                ex.printStackTrace();
                System.out.println("ERROR: " + ex.getMessage());
            }
        }

        if (!ok) {
            System.out.println();
            System.out.println("FeedReader reads and prints any RSS/Atom feed type.");
            System.out.println("The first parameter must be the URL of the feed to read.");
            System.out.println();
        }
    }
}

2: Aggregate several remote RSS feeds into a single local feed

package com.sun.syndication.samples;
import java.net.URL;
import java.io.PrintWriter;
import java.util.List;
import java.util.ArrayList;

import com.sun.syndication.feed.synd.SyndFeed;
import com.sun.syndication.feed.synd.SyndFeedImpl;
import com.sun.syndication.io.SyndFeedOutput;
import com.sun.syndication.io.SyndFeedInput;
import com.sun.syndication.io.XmlReader;

/**
 * It aggregates a list of RSS/Atom feeds (they can be of different types)
 * into a single feed of the specified type.
 * <p>
 * @author Alejandro Abdelnur
 *
 */
public class FeedAggregator {

    public static void main(String[] args) {
        boolean ok = false;
        if (args.length >= 2) {
            try {
                // args[0] is the output feed type, for example rss_2.0
                String outputType = args[0];

                SyndFeed feed = new SyndFeedImpl();
                feed.setFeedType(outputType);

                feed.setTitle("Aggregated Feed");
                feed.setDescription("Anonymous Aggregated Feed");
                feed.setAuthor("anonymous");
                feed.setLink("http://www.anonymous.com");

                List entries = new ArrayList();
                feed.setEntries(entries);

                // args[1..n] are the URLs of the feeds to merge
                for (int i = 1; i < args.length; i++) {
                    URL inputUrl = new URL(args[i]);

                    SyndFeedInput input = new SyndFeedInput();
                    SyndFeed inFeed = input.build(new XmlReader(inputUrl));

                    // Collect the entries of every input feed into one list
                    entries.addAll(inFeed.getEntries());
                }

                SyndFeedOutput output = new SyndFeedOutput();
                output.output(feed, new PrintWriter(System.out));

                ok = true;
            } catch (Exception ex) {
                System.out.println("ERROR: " + ex.getMessage());
            }
        }

        if (!ok) {
            System.out.println();
            System.out.println("FeedAggregator aggregates different feeds into a single one.");
            System.out.println("The first parameter must be the feed type for the aggregated feed.");
            System.out.println(" [valid values are: rss_0.9, rss_0.91U, rss_0.91N, rss_0.92, rss_0.93, ]");
            System.out.println(" [ rss_0.94, rss_1.0, rss_2.0 & atom_0.3 ]");
            System.out.println("The second to last parameters are the URLs of feeds to aggregate.");
            System.out.println();
        }
    }
}
3: Save a dynamically generated feed to disk as a static RSS file

package com.sun.syndication.samples;

import com.sun.syndication.feed.synd.*;
import com.sun.syndication.io.SyndFeedOutput;

import java.io.FileWriter;
import java.io.Writer;
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.List;

/**
 * It creates a feed and writes it to a file.
 * <p>
 * @author Alejandro Abdelnur
 *
 */
public class FeedWriter {

    private static final String DATE_FORMAT = "yyyy-MM-dd";

    public static void main(String[] args) {
        boolean ok = false;
        // Feed type and file name are hard-coded in this version, so no arguments are expected
        if (args.length == 0) {
            try {
                String feedType = "rss_2.0";     // the RSS type to generate
                String fileName = "F:\\ss.xml";  // where the static RSS file is stored

                DateFormat dateParser = new SimpleDateFormat(DATE_FORMAT);

                // The feed is an instance of SyndFeedImpl
                SyndFeed feed = new SyndFeedImpl();
                feed.setFeedType(feedType);

                feed.setTitle("Sample Feed (created with Rome)");
                feed.setLink("http://rome.dev.java.net");
                feed.setDescription("This feed has been created using Rome (Java syndication utilities)");

                // entries is the collection of items
                List entries = new ArrayList();
                // Each entry is one item
                SyndEntry entry;
                SyndContent description;

                // First item
                entry = new SyndEntryImpl();
                entry.setTitle("Rome v1.0");
                entry.setLink("http://wiki.java.net/bin/view/Javawsxml/Rome01");
                entry.setPublishedDate(dateParser.parse("2004-06-08"));
                description = new SyndContentImpl();
                description.setType("text/plain");
                description.setValue("Initial release of Rome");
                entry.setDescription(description);
                entries.add(entry);

                // Second item
                entry = new SyndEntryImpl();
                entry.setTitle("Rome v2.0");
                entry.setLink("http://wiki.java.net/bin/view/Javawsxml/Rome02");
                entry.setPublishedDate(dateParser.parse("2004-06-16"));
                description = new SyndContentImpl();
                description.setType("text/xml");
                description.setValue("Bug fixes, <xml>XML</xml> minor API changes and some new features");
                entry.setDescription(description);
                entries.add(entry);

                // Third item
                entry = new SyndEntryImpl();
                entry.setTitle("Rome v3.0");
                entry.setLink("http://wiki.java.net/bin/view/Javawsxml/Rome03");
                entry.setPublishedDate(dateParser.parse("2004-07-27"));
                description = new SyndContentImpl();
                description.setType("text/html");
                description.setValue("<p>More Bug fixes, more API changes, some new features and some Unit testing</p>" +
                        "<p>For details check the <a href=\"http://wiki.java.net/bin/view/Javawsxml/RomeChangesLog#RomeV03\">Changes Log</a></p>");
                entry.setDescription(description);
                entries.add(entry);

                // Attach all items to the channel
                feed.setEntries(entries);

                Writer writer = new FileWriter(fileName);
                SyndFeedOutput output = new SyndFeedOutput();
                // Write to disk, producing the static RSS file
                output.output(feed, writer);
                writer.close();

                System.out.println("The feed has been written to the file [" + fileName + "]");
                System.out.println(feed);

                ok = true;
            } catch (Exception ex) {
                ex.printStackTrace();
                System.out.println("ERROR: " + ex.getMessage());
            }
        }

        if (!ok) {
            System.out.println();
            System.out.println("FeedWriter creates a RSS/Atom feed and writes it to a file.");
            System.out.println("The first parameter must be the syndication format for the feed");
            System.out.println(" (rss_0.90, rss_0.91, rss_0.92, rss_0.93, rss_0.94, rss_1.0 rss_2.0 or atom_0.3)");
            System.out.println("The second parameter must be the file name for the feed");
            System.out.println();
        }
    }
}
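The example above parses its dates with the yyyy-MM-dd pattern and hands Rome a java.util.Date. In the RSS 2.0 file itself, <pubDate> uses the RFC 822 style shown in section III (for example Sat, 15 Nov 2003 0:00:01 GMT). The following is a minimal JDK-only sketch of that format for anyone writing the element by hand; it does not use Rome, the class name PubDateSketch is hypothetical, and it assumes dates are published in GMT.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

/** Formats dates in the style used by the RSS 2.0 pubDate element (illustrative only). */
class PubDateSketch {

    /** Converts any zoned date-time to GMT and renders it in RFC 1123 form. */
    static String toPubDate(ZonedDateTime when) {
        return when.withZoneSameInstant(ZoneOffset.UTC)
                .format(DateTimeFormatter.RFC_1123_DATE_TIME);
    }

    /** Treats a yyyy-MM-dd string as midnight GMT of that day. */
    static String dayToPubDate(String isoDay) {
        return toPubDate(LocalDate.parse(isoDay).atStartOfDay(ZoneOffset.UTC));
    }

    public static void main(String[] args) {
        // The channel date used as the example in section III
        System.out.println(toPubDate(ZonedDateTime.of(2003, 11, 15, 0, 0, 1, 0, ZoneOffset.UTC)));
        // The publication day of the first item in the FeedWriter example
        System.out.println(dayToPubDate("2004-06-08"));
    }
}
```

The formatter always writes a two-digit hour, so the section III example comes out with 00:00:01 rather than 0:00:01.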
4: Generate RSS dynamically: producing the feed of a blog site on request
package com.vaga.rss.web.admin;

import java.io.IOException;
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.springframework.web.servlet.ModelAndView;
import org.springframework.web.servlet.mvc.ParameterizableViewController;

import com.sun.syndication.feed.synd.SyndContent;
import com.sun.syndication.feed.synd.SyndContentImpl;
import com.sun.syndication.feed.synd.SyndEntry;
import com.sun.syndication.feed.synd.SyndEntryImpl;
import com.sun.syndication.feed.synd.SyndFeed;
import com.sun.syndication.feed.synd.SyndFeedImpl;
import com.sun.syndication.io.FeedException;
import com.sun.syndication.io.SyndFeedOutput;
import com.totsp.xml.syndication.content.ContentModule;
import com.vaga.blog.model.WeblogEntry;
import com.vaga.blog.model.Website;
import com.vaga.blog.service.WeblogEntryManager;
import com.vaga.blog.service.WebsiteManager;

public class SiteRssViewController extends ParameterizableViewController {

    // Constants
    /** Namespace URI for content:encoded elements */
    private static final String CONTENT_NS = "http://purl.org/rss/1.0/modules/content/";
    private static final String FEED_TYPE = "type";
    private static final String MIME_TYPE = "application/xml; charset=UTF-8";
    private static final String COULD_NOT_GENERATE_FEED_ERROR = "Could not generate feed";
    private static final String _defaultFeedType = "rss_2.0";
    private static final String DATE_FORMAT = "yyyy-MM-dd";

    // controller starts
    private WeblogEntryManager weblogEntryManager; // injected by Spring
    private WebsiteManager websiteManager;         // injected by Spring

    // Setter for Spring dependency injection
    public void setWeblogEntryManager(WeblogEntryManager weblogEntryManager) {
        this.weblogEntryManager = weblogEntryManager;
    }

    // Setter for Spring dependency injection
    public void setWebsiteManager(WebsiteManager websiteManager) {
        this.websiteManager = websiteManager;
    }

    protected ModelAndView handleRequestInternal(HttpServletRequest request, HttpServletResponse response) throws Exception {
        try {
            SyndFeed feed = getFeed(request);
            String feedType = request.getParameter(FEED_TYPE); // may be null
            feedType = (feedType != null) ? feedType : _defaultFeedType;
            feed.setFeedType(feedType); // rss_2.0 by default
            response.setContentType(MIME_TYPE);
            SyndFeedOutput output = new SyndFeedOutput();
            // Send the RSS (as XML) to the user who made the request
            output.output(feed, response.getWriter());
        } catch (FeedException ex) {
            String msg = COULD_NOT_GENERATE_FEED_ERROR;
            // logger is the commons-logging Log inherited from the Spring base classes
            logger.error(msg, ex);
            response.sendError(HttpServletResponse.SC_INTERNAL_SERVER_ERROR, msg);
        }
        return null;
    }
    /**
     * The supported request forms are:
     * siteRss.htm?websiteId=21                 | the 20 most recent posts of the personal site with ID 21
     * siteRss.htm?websiteId=21&entryType=hot   | the 20 hottest posts of the personal site with ID 21
     *
     * @param request
     */
    protected SyndFeed getFeed(HttpServletRequest request) throws IOException, FeedException {
        DateFormat dateParser = new SimpleDateFormat(DATE_FORMAT);
        // The feed is the channel
        SyndFeed feed = new SyndFeedImpl();
        // The collection of items
        List entries = new ArrayList();
        // Each entry represents one item
        SyndEntry entry;
        SyndContent description;
        setFeed(request, feed);
        Iterator iterator = setIterator(request);
        // Turn each post record into an item
        while (iterator.hasNext()) {
            entry = new SyndEntryImpl();
            WeblogEntry weblogEntry = (WeblogEntry) iterator.next();
            entry.setTitle(weblogEntry.getTitle());
            entry.setLink(feed.getLink() + "?weblogEntryId=" + weblogEntry.getId());
            try {
                entry.setPublishedDate(dateParser.parse(weblogEntry.getPubTime().toString()));
            } catch (ParseException ex) {
                ex.printStackTrace();
            }
            // The description of this item: at most the first 500 characters of the post
            description = new SyndContentImpl();
            description.setType("text/plain");
            String text = null;
            if (weblogEntry.getText().length() > 500) {
                text = weblogEntry.getText().substring(0, 500);
            } else {
                text = weblogEntry.getText();
            }
            description.setValue(text);
            entry.setDescription(description);
            addFooter(entry);
            entries.add(entry);
        }
        // Attach all items to the channel
        feed.setEntries(entries);
        return feed;
    }

    private SyndFeed setFeed(HttpServletRequest request, SyndFeed feed) {
        // The website object of the blog
        Website website = websiteManager.getWebsite(request.getParameter("websiteId"));
        // Set the channel properties from the current website
        feed.setTitle(website.getName());
        feed.setAuthor(website.getCreator());
        feed.setCopyright(website.getEmailAddress());
        feed.setLink("http://wxz.vaga.com.cn:8080/blog/weblog/" + website.getHandle());
        feed.setDescription(website.getDescription());
        return feed;
    }

    // Fetch this website's posts from the database (most recent by default, hottest if entryType is given)
    private Iterator setIterator(HttpServletRequest request) {
        if (request.getParameter("entryType") == null) {
            return weblogEntryManager.getRecentWeblogEntriesForRss(request.getParameter("websiteId"), null, "PUBLISHED", 21).iterator();
        } else {
            return weblogEntryManager.getHotWeblogEntriesForRss(request.getParameter("websiteId"), null, 21).iterator();
        }
    }

    /**
     * Add footer to an entry, that is, append a footer to each post summary.
     * @param entry
     */
    public static void addFooter(SyndEntry entry) {
        // Prep variables used in loops
        String title = entry.getTitle();
        String link = entry.getLink();

        // Use the add-on ContentModule to handle
        // <content:encoded/> elements within the feed
        ContentModule module = ((ContentModule) entry.getModule(CONTENT_NS));

        // If content:encoded is found, use that.
        if (module != null) {
            // Container for footer-appended HTML strings
            List newStringList = new ArrayList();

            // Iterate through encoded HTML, creating footers
            Iterator oldStringIter = module.getEncodeds().iterator();
            while (oldStringIter.hasNext()) {
                String original = (String) oldStringIter.next();
                newStringList.add(createFooter(original, link, title));
            }

            // Set new encoded HTML strings on entry
            module.setEncodeds(newStringList);
        } else {
            // Fall back to adding footer in <description/>
            // This results in escaped HTML. Ugly, but common.
            // Target the description node
            SyndContent content = entry.getDescription();
            // Create and set a footer-appended description
            String original = content.getValue();
            content.setValue(createFooter(original, link, title));
        }
    }
    /**
     * Create a feed item footer of immediate actions
     * by using information from the feed item itself
     * @param original The original text of the feed item
     * @param link The link for the feed item
     * @param title The title of the feed item
     * @return the original text with the footer appended
     */
    private static String createFooter(String original, String link, String title) {
        // Use a StringBuffer to build the footer
        StringBuffer sb;
        if (original == null) {
            sb = new StringBuffer("<br />");
        } else {
            sb = new StringBuffer(original);
        }
        sb.append("\n\n<div class='feedwarmer'><hr/>");
        sb.append("<i>Related actions:</i> ");

        // Add email link using title and item link
        sb.append("<a href='mailto:?body=Check this out: ");
        sb.append(link).append("'>Recommend this link</a> | ");

        // Add delicious link using item title link
        sb.append("<a href='http://del.icio.us/post/?url=");
        sb.append(link).append("&title=").append(title);
        sb.append("'>Add to delicious</a> | ");

        // Add Google Blogs Search link using item title
        sb.append("<a href='http://blogsearch.google.com/");
        sb.append("blogsearch?hl=en&q=").append(title);
        sb.append("'>Search for related content</a>");

        // Finish and return the footer
        sb.append("</div>\n");
        return sb.toString();
    }
}
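createFooter above concatenates the raw link and title into query strings. A title containing spaces, & or quotes would then produce a broken link. The following is a minimal sketch of one way to guard against that, using only the JDK: each parameter is URL-encoded and the finished URL is HTML-escaped before it goes into the href attribute. The class and method names (FooterLinkSketch, encodeParam, escapeAttr, deliciousHref, deliciousAnchor) are hypothetical and are not part of the controller or of Rome.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Builds a footer link with encoded query parameters (illustrative only). */
class FooterLinkSketch {

    /** URL-encodes one query parameter value as UTF-8. */
    static String encodeParam(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM
            throw new IllegalStateException(e);
        }
    }

    /** Escapes the characters that are unsafe inside an HTML attribute value. */
    static String escapeAttr(String value) {
        return value.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&#39;");
    }

    /** Builds the del.icio.us post URL with both parameters encoded. */
    static String deliciousHref(String link, String title) {
        return "http://del.icio.us/post/?url=" + encodeParam(link)
                + "&title=" + encodeParam(title);
    }

    /** Wraps the URL in an anchor, escaping it for use inside the href attribute. */
    static String deliciousAnchor(String link, String title) {
        return "<a href='" + escapeAttr(deliciousHref(link, title)) + "'>Add to delicious</a>";
    }

    public static void main(String[] args) {
        System.out.println(deliciousAnchor("http://example.com/a?b=1", "Rome & RSS"));
    }
}
```

The same two steps would apply to the mailto and blog search links in the footer.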