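I have previously written three articles on collecting data with Java and storing it in a database:
Java-based data collection and storage (1): http://www.cnblogs.com/lichenwei/p/3904715.html
Java-based data collection and storage (2): http://www.cnblogs.com/lichenwei/p/3905370.html
Java-based data collection and storage (3): http://www.cnblogs.com/lichenwei/p/3907007.html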
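Between them they covered:
① Fetching page content and displaying it
② Simple collection and storage in a database
③ Querying the local database
④ Operating through remote calls (not implemented)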
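All of these features run locally. Sometimes we need to reach this kind of data remotely, and for that we can use the RMI mechanism that Java provides.
The same thing can of course be done with WebServices (PHP version; I will write a Java version when I have time): http://www.cnblogs.com/lichenwei/p/3891297.html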
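What is RMI?
RMI stands for Remote Method Invocation. It is a mechanism that lets an object in one Java virtual machine call methods on an object in another Java virtual machine. Any object that can be called this way must implement a remote interface. When such an object is invoked, its arguments are marshalled and sent from the local virtual machine to the remote one, where they are unmarshalled. When the method completes, the result is marshalled on the remote machine and sent back to the caller's virtual machine. If the invocation throws an exception, the exception is reported to the caller.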
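To make "marshalled" and "unmarshalled" a little more concrete: for ordinary (non-remote) arguments, RMI builds on Java object serialization. The sketch below is not RMI itself; it only round-trips a value through a byte array, roughly the way an argument is flattened before it crosses the wire. The class name MarshalSketch and its two helper methods are made up for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class MarshalSketch {

    // "Marshal": flatten a serializable value into bytes.
    static byte[] marshal(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new IllegalStateException("could not serialize value", e);
        }
        return bytes.toByteArray();
    }

    // "Unmarshal": rebuild an equal copy of the value from those bytes.
    static Object unmarshal(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("could not deserialize value", e);
        }
    }

    public static void main(String[] args) {
        String argument = "Rabbit";
        Object copy = unmarshal(marshal(argument));
        // The receiver works on a copy that carries the same value.
        System.out.println(copy.equals(argument));
    }
}
```

One consequence worth keeping in mind: a remote method receives copies of its non-remote arguments, so changes it makes to them are not visible to the caller.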
With that brief overview of RMI, let's look at a simple implementation.
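1. Define the remote interface
First we write a remote interface, IHello, which extends the Remote interface.
IHello declares a hello method that greets the client once it has connected.
Because IHello extends Remote, each of its methods must declare RemoteException.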
    import java.rmi.Remote;
    import java.rmi.RemoteException;

    public interface IHello extends Remote {

        // Every remote method must declare RemoteException.
        String hello(String name) throws RemoteException;
    }
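2. Implement the interface
Next we implement the interface's method; the implementation lives on the server side.
The HelloImpl class implements the method declared in IHello.
Note: HelloImpl also extends UnicastRemoteObject. This is required here: the code compiles without it, but the object is then never exported as a remote object (unless it is exported some other way), and the server fails with a confusing error at runtime.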
    import java.rmi.RemoteException;
    import java.rmi.server.UnicastRemoteObject;

    /**
     * Extending UnicastRemoteObject is required: the code compiles without it,
     * but the server then fails with a confusing error at runtime.
     *
     * @author Balla_兔子
     */
    public class HelloImpl extends UnicastRemoteObject implements IHello {

        protected HelloImpl() throws RemoteException {
            super();
        }

        @Override
        public String hello(String name) {
            String strHello = "Hello! " + name + " is accessing the server";
            System.out.println(name + " is accessing the server");
            return strHello;
        }
    }
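Extending UnicastRemoteObject is one way to get a remote object exported; the standard library also provides the static method UnicastRemoteObject.exportObject(Remote, int) for classes that cannot or should not extend it. The following is a minimal sketch of that alternative, not the approach used in this article. Greeter, PlainGreeter and callThroughStub are hypothetical names, and pinning java.rmi.server.hostname to 127.0.0.1 is an assumption that only suits a single-machine demo.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

class ExplicitExportSketch {

    // Hypothetical remote interface, standing in for IHello.
    public interface Greeter extends Remote {
        String hello(String name) throws RemoteException;
    }

    // Note: this class does NOT extend UnicastRemoteObject.
    public static class PlainGreeter implements Greeter {
        @Override
        public String hello(String name) {
            return "Hello, " + name;
        }
    }

    static String callThroughStub(String name) {
        // Assumption for a local demo: make stubs point at the loopback address.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        PlainGreeter impl = new PlainGreeter();
        try {
            // Port 0 lets RMI pick an anonymous port; the return value is the stub.
            Greeter stub = (Greeter) UnicastRemoteObject.exportObject(impl, 0);
            try {
                return stub.hello(name); // invoked via the stub, not on impl directly
            } finally {
                // Unexport so the JVM is free to exit afterwards.
                UnicastRemoteObject.unexportObject(impl, true);
            }
        } catch (RemoteException e) {
            throw new IllegalStateException("export or remote call failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callThroughStub("Rabbit"));
    }
}
```

Either way, the point is the same: an object has to be exported before clients can reach it, and extending UnicastRemoteObject simply does that in the constructor.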
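On the server side: RMI's remote access works by having the client look up the remote object's address (the server address) in the RMI registry.
So the server needs to create a registry and bind the remote object to a server address in it, so that clients can find the server later.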
    import java.rmi.Naming;
    import java.rmi.registry.LocateRegistry;

    public class Server {

        public static void main(String[] args) {
            try {
                IHello hello = new HelloImpl();
                int port = 6666;
                // Start a registry inside this JVM, listening on the given port.
                LocateRegistry.createRegistry(port);
                String address = "rmi://localhost:" + port + "/tuzi";
                // Register the remote object under the name "tuzi".
                Naming.bind(address, hello);
                System.out.println(">>>Server started successfully");
                System.out.println(">>>Please start the client to connect..");
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
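The client likewise needs the remote address, that is, the server address.
It then looks that address up in the RMI registry; if the lookup succeeds, the connection is established.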
    import java.net.MalformedURLException;
    import java.rmi.Naming;
    import java.rmi.NotBoundException;
    import java.rmi.RemoteException;
    import java.util.Scanner;

    public class Client {

        public static void main(String[] args) {
            int port = 6666;
            String address = "rmi://localhost:" + port + "/tuzi";
            try {
                IHello hello = (IHello) Naming.lookup(address);
                System.out.println("<<<Client connected successfully!");
                // Call the remote interface's hello method and print the result.
                System.out.println(hello.hello("Rabbit"));
                // Block until something is typed on the console.
                Scanner scanner = new Scanner(System.in);
                scanner.next();
            } catch (MalformedURLException | RemoteException | NotBoundException e) {
                e.printStackTrace();
            }
        }
    }
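For experimenting, the server and client halves can also be squeezed into a single process. The sketch below does that under several assumptions: Greeting and GreetingImpl are hypothetical stand-ins for IHello and HelloImpl, the port is whatever the operating system reports as free rather than 6666, the Registry API is used directly instead of Naming, and java.rmi.server.hostname is pinned to the loopback address. It is meant as a way to see the whole create-registry, bind, lookup, call sequence in one place, not as a replacement for the two programs above.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

class RmiRoundTrip {

    // Hypothetical stand-in for IHello.
    public interface Greeting extends Remote {
        String hello(String name) throws RemoteException;
    }

    // Hypothetical stand-in for HelloImpl.
    public static class GreetingImpl extends UnicastRemoteObject implements Greeting {
        public GreetingImpl() throws RemoteException {
            super(); // exports this object on an anonymous port
        }

        @Override
        public String hello(String name) {
            return "Hello, " + name;
        }
    }

    // Ask the OS for a free port (small race: it could be taken before it is reused).
    static int freePort() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        }
    }

    static String roundTrip(String name) {
        // Assumption for a local demo: make stubs point at the loopback address.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        try {
            int port = freePort();
            // "Server" half: start a registry and register the remote object.
            Registry serverSide = LocateRegistry.createRegistry(port);
            GreetingImpl impl = new GreetingImpl();
            try {
                serverSide.rebind("tuzi", impl);
                // "Client" half: find the registry, look the name up, call the stub.
                Registry clientSide = LocateRegistry.getRegistry("127.0.0.1", port);
                Greeting stub = (Greeting) clientSide.lookup("tuzi");
                return stub.hello(name);
            } finally {
                // Unexport both objects so the JVM is free to exit.
                UnicastRemoteObject.unexportObject(impl, true);
                UnicastRemoteObject.unexportObject(serverSide, true);
            }
        } catch (Exception e) {
            throw new IllegalStateException("RMI round trip failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("Rabbit"));
    }
}
```

The sketch uses rebind rather than bind so that registering the same name twice replaces the old entry instead of raising AlreadyBoundException.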
Screenshot of the run:
~ a fancy dividing line ~
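Now let's turn to our program. For a change of flavor, today we collect the "2013-2014 regular season standings".
The data is at: http://nbadata.sports.qq.com/teams_stat.aspx
First, a screenshot of the result:
The rest is code; see the comments in the code for details.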
IdoAction.java (remote interface exposing the operations)
    package com.lcw.rmi.collection;

    import java.rmi.Remote;
    import java.rmi.RemoteException;
    import java.util.List;

    public interface IdoAction extends Remote {

        // Clear the data table.
        void initData() throws RemoteException;

        // Scrape the page and store every row.
        void getAllDatas() throws RemoteException;

        // Names of all teams.
        List<String> getAllTeams() throws RemoteException;

        // The stored values for one team.
        List<String> getTeamInfo(String team) throws RemoteException;

        // The stored values for every team, flattened into one list.
        List<String> getAllInfo() throws RemoteException;
    }
doActionImpl.java (interface implementation)
    package com.lcw.rmi.collection;

    import java.rmi.RemoteException;
    import java.rmi.server.UnicastRemoteObject;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.ArrayList;
    import java.util.List;

    public class doActionImpl extends UnicastRemoteObject implements IdoAction {

        private static final long serialVersionUID = 1L;
        private final Mysql mysql;

        public doActionImpl() throws RemoteException {
            mysql = new Mysql();
        }

        @Override
        public void getAllDatas() throws RemoteException {
            // Call the collector class to fetch all data
            CollectData data = new CollectData();
            data.getAllDatas();
            System.out.println("Data collected successfully!");
        }

        @Override
        public List<String> getAllInfo() throws RemoteException {
            // Query all rows
            System.out.println("Running command 5: fetching all NBA (2013-2014) regular season team data..");
            // Kept local: remote calls may arrive on several threads
            ResultSet resultSet = mysql.querySQL("select * from data");
            List<String> list = new ArrayList<String>();
            try {
                while (resultSet.next()) {
                    // Read columns 2 through 16 (15 values per row)
                    for (int i = 2; i < 17; i++) {
                        list.add(resultSet.getString(i));
                    }
                }
                System.out.println("Fetched; the result is shown on the client..");
            } catch (SQLException e) {
                e.printStackTrace();
            }
            return list;
        }

        @Override
        public List<String> getAllTeams() throws RemoteException {
            // Query all team names
            System.out.println("Running command 3: fetching NBA (2013-2014) regular season teams..");
            ResultSet resultSet = mysql.querySQL("select team from data");
            List<String> list = new ArrayList<String>();
            try {
                while (resultSet.next()) {
                    list.add(resultSet.getString("team"));
                }
                System.out.println("Fetched; the result is shown on the client..");
            } catch (SQLException e) {
                System.out.println("No data in the database yet; please run the automatic collection command");
                e.printStackTrace();
            }
            return list;
        }

        @Override
        public List<String> getTeamInfo(String team) throws RemoteException {
            // Query one team's data by team name.
            // Caution: team is concatenated straight into the SQL text, so it must be
            // trusted input; a parameterized query would be the safer choice.
            System.out.println("Running command 4: fetching the team requested by the user..");
            ResultSet resultSet = mysql.querySQL("select * from data where team='" + team + "'");
            List<String> list = new ArrayList<String>();
            try {
                if (resultSet.next()) {
                    for (int i = 2; i < 17; i++) {
                        list.add(resultSet.getString(i));
                    }
                }
                System.out.println("Fetched; the result is shown on the client..");
            } catch (SQLException e) {
                System.out.println("No data in the database yet; please run the automatic collection command");
                e.printStackTrace();
            }
            return list;
        }

        @Override
        public void initData() throws RemoteException {
            // Reset the database by deleting every row
            String sql = "delete from data";
            try {
                mysql.updateSQL(sql);
                System.out.println("Database initialized successfully!");
            } catch (Exception e) {
                System.out.println("Database initialization failed!");
            }
        }
    }
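getTeamInfo builds its SQL by gluing the team name into the statement text. Because that name arrives from a remote client, a crafted value can change what the statement means. The sketch below first shows what concatenation produces for such a value, then a parameterized variant using the standard JDBC PreparedStatement API. TeamQuerySketch and both method names are made up for illustration, and teamInfo assumes the caller supplies an already-open Connection, which is not how the article's Mysql class is organized.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class TeamQuerySketch {

    // What string concatenation produces: the input becomes part of the SQL text.
    static String concatenated(String team) {
        return "select * from data where team='" + team + "'";
    }

    // Parameterized variant: the value is bound as data and cannot alter
    // the structure of the statement. Columns 2..16 mirror the article's loop.
    static List<String> teamInfo(Connection conn, String team) throws SQLException {
        List<String> row = new ArrayList<>();
        String sql = "select * from data where team = ?";
        try (PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setString(1, team);
            try (ResultSet rs = stmt.executeQuery()) {
                if (rs.next()) {
                    for (int i = 2; i < 17; i++) {
                        row.add(rs.getString(i));
                    }
                }
            }
        }
        return row;
    }

    public static void main(String[] args) {
        // A hostile "team name" turns the filter into a condition that is always true.
        System.out.println(concatenated("x' or '1'='1"));
    }
}
```

The try-with-resources blocks also close the statement and result set, which the list-returning design makes possible because the rows are copied out before the method returns.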
CollectData.java (main collector class)
    package com.lcw.rmi.collection;

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.MalformedURLException;
    import java.net.URL;
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    public class CollectData {

        /**
         * Collector: fetches the page, extracts every value and stores the rows.
         */
        public void getAllDatas() {
            String address = "http://nbadata.sports.qq.com/teams_stat.aspx"; // URL to collect from
            String rankRegEx = ">\\d{1,2}</td>"; // rank
            String teamRegEx = ">[^<>]*</a>"; // team name
            String dataRegEx = ">\\d{1,3}(\\.)\\d{0,2}</td>"; // plain numeric data
            String percentRegEX = ">\\d{1,2}(\\.)*(\\d)*%</span></td>"; // percentage data
            GetRegExData regExData = new GetRegExData();
            List<String> list = new ArrayList<String>();
            Mysql mysql = new Mysql();
            try {
                URL url = new URL(address);
                // Open the URL as a byte stream, decode it as GBK, and buffer the reader
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(url.openStream(), "gbk"))) {
                    String temp; // the line currently being read
                    int flag = 0;
                    while ((temp = reader.readLine()) != null) {
                        // Match the rank
                        String tempRank = regExData.getData(rankRegEx, temp);
                        if (!tempRank.isEmpty()) {
                            list.add(tempRank.substring(1, tempRank.indexOf("</td>")));
                            flag++;
                        }
                        // Match the team. This pattern also matches elsewhere on the page,
                        // so it is only accepted right after a rank has been found.
                        String tempTeam = regExData.getData(teamRegEx, temp);
                        if (!tempTeam.isEmpty() && flag == 1) {
                            list.add(tempTeam.substring(1, tempTeam.indexOf("</a>")));
                            flag = 0;
                        }
                        // Match plain numeric data
                        String tempData = regExData.getData(dataRegEx, temp);
                        if (!tempData.isEmpty()) {
                            list.add(tempData.substring(1, tempData.indexOf("</td>")));
                        }
                        // Match percentage data
                        String tempPercent = regExData.getData(percentRegEX, temp);
                        if (!tempPercent.isEmpty()) {
                            list.add(tempPercent.substring(1, tempPercent.indexOf("</span></td>")));
                        }
                    }
                }
                Object[] arr = list.toArray(); // convert the list to an array
                // rank is a reserved word in MySQL 8.0, hence the backticks
                String sql = "insert into data(`rank`,team,chushou1,mingzhong1,chushou2,mingzhong2,chushou3,mingzhong3,qianchang,houchang,zong,zhugong,shiwu,fangui,defen) values(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)";
                // Insert at most 30 rows of 15 values each, skipping an incomplete trailing row
                for (int i = 0; i < 30 && (i + 1) * 15 <= arr.length; i++) {
                    Object[] arr1 = Arrays.copyOfRange(arr, i * 15, (i + 1) * 15);
                    mysql.insertNewData(sql, arr1);
                    System.out.println("Collecting data.. rows collected so far: " + (i + 1));
                }
            } catch (MalformedURLException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
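The collector ends with a flat list of values that has to be cut into groups of 15, one group per database row. Here is that slicing step on its own, as a small sketch: RowChunkSketch and toRows are hypothetical names, the sample values are invented, and dropping an incomplete trailing group is a choice made for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RowChunkSketch {

    // Cut a flat list into consecutive rows of `width` values.
    // A trailing group shorter than `width` is dropped rather than padded.
    static List<Object[]> toRows(List<String> values, int width) {
        Object[] all = values.toArray();
        List<Object[]> rows = new ArrayList<>();
        for (int start = 0; start + width <= all.length; start += width) {
            rows.add(Arrays.copyOfRange(all, start, start + width));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Invented sample: two complete (rank, team) pairs and one stray value.
        List<String> flat = Arrays.asList("1", "TeamA", "2", "TeamB", "3");
        for (Object[] row : toRows(flat, 2)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Bounding the loop by the array length matters because Arrays.copyOfRange pads with null when the end index runs past the array, and a null value would then fail when it is bound to the insert statement.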
GetRegExData.java (regex filtering helper)
    package com.lcw.rmi.collection;

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class GetRegExData {

        // Return the first match of regex in content, or "" when there is none.
        public String getData(String regex, String content) {
            Pattern pattern = Pattern.compile(regex);
            Matcher matcher = pattern.matcher(content);
            if (matcher.find()) {
                return matcher.group();
            } else {
                return "";
            }
        }
    }
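getData returns the whole match, surrounding markup included, so the collector has to trim it afterwards with substring and indexOf. A capture group can do the trimming inside the pattern. The sketch below shows that variant; CaptureGroupSketch and firstGroup are hypothetical names, the patterns are the article's rank and percentage patterns with a group added, and the two HTML lines are made up to resemble the markup those patterns target (they are not taken from the real page).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaptureGroupSketch {

    // Return the first capture group of the first match, or "" when nothing matches.
    static String firstGroup(String regex, String line) {
        Matcher matcher = Pattern.compile(regex).matcher(line);
        return matcher.find() ? matcher.group(1) : "";
    }

    public static void main(String[] args) {
        // Made-up lines shaped like the markup the patterns are aimed at.
        String rankLine = "<td class=\"rank\">12</td>";
        String percentLine = "<td><span>45.6%</span></td>";

        // The parentheses mark the part to keep; the tags only anchor the match.
        System.out.println(firstGroup(">(\\d{1,2})</td>", rankLine));
        System.out.println(firstGroup(">(\\d{1,2}(?:\\.\\d+)?%)</span></td>", percentLine));
    }
}
```

Regex scraping of this kind stays tied to the exact markup of the page, so any pattern here should be rechecked whenever the page layout changes.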
Mysql.java (database access class)
    package com.lcw.rmi.collection;

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    public class Mysql {

        private String driver = "com.mysql.jdbc.Driver";
        private String url = "jdbc:mysql://localhost:3306/nba";
        private String user = "root";
        private String password = "";

        private PreparedStatement stmt = null;
        private Connection conn = null;
        private ResultSet resultSet = null;

        /**
         * Used by the collector to insert one row.
         *
         * @param insertSql insert statement with one placeholder per value
         * @param arr values to bind, in placeholder order
         */
        public void insertNewData(String insertSql, Object[] arr) {
            try {
                Class.forName(driver); // load the JDBC driver
                conn = DriverManager.getConnection(url, user, password);
                stmt = conn.prepareStatement(insertSql);
                // Bind each value to its placeholder (JDBC parameters are 1-based)
                for (int i = 0; i < arr.length; i++) {
                    stmt.setString(i + 1, arr[i].toString());
                }
                stmt.executeUpdate();
                stmt.close();
                conn.close();
            } catch (SQLException e) {
                e.printStackTrace();
            } catch (ClassNotFoundException e) {
                e.printStackTrace();
            }
        }

        /**
         * Runs a statement that modifies the database.
         *
         * @param updateSql the update statement
         */
        public void updateSQL(String updateSql) {
            try {
                Class.forName(driver);
                conn = DriverManager.getConnection(url, user, password);
                stmt = conn.prepareStatement(updateSql);
                // The SQL was already given to prepareStatement, so execute without arguments
                stmt.executeUpdate();
                stmt.close();
                conn.close();
            } catch (SQLException e) {
                e.printStackTrace();
            } catch (ClassNotFoundException e) {
                e.printStackTrace();
            }
        }

        /**
         * Runs a general query. The connection is left open because the
         * returned ResultSet still needs it.
         *
         * @param searchSql the query statement
         */
        public ResultSet querySQL(String searchSql) {
            try {
                Class.forName(driver);
                conn = DriverManager.getConnection(url, user, password);
                stmt = conn.prepareStatement(searchSql);
                resultSet = stmt.executeQuery();
            } catch (SQLException e) {
                e.printStackTrace();
            } catch (ClassNotFoundException e) {
                e.printStackTrace();
            }
            return resultSet;
        }
    }
Server.java (server class)
    package com.lcw.rmi.collection;

    import java.net.MalformedURLException;
    import java.rmi.AlreadyBoundException;
    import java.rmi.Naming;
    import java.rmi.RemoteException;
    import java.rmi.registry.LocateRegistry;

    public class Server {

        public static void main(String[] args) {
            try {
                int port = 9797;
                String address = "rmi://localhost:" + port + "/nba";
                IdoAction action = new doActionImpl();
                // Start the registry in this JVM, then register the remote object as "nba"
                LocateRegistry.createRegistry(port);
                try {
                    Naming.bind(address, action);
                    System.out.println(">>>Starting server..");
                    System.out.println(">>>Server started successfully!");
                    System.out.println(">>>Waiting for client connections...");
                    System.out.println(">>>Client Balla_兔子 connected.");
                } catch (MalformedURLException e) {
                    e.printStackTrace();
                } catch (AlreadyBoundException e) {
                    e.printStackTrace();
                }
            } catch (RemoteException e) {
                e.printStackTrace();
            }
        }
    }
Client.java (client class)
    package com.lcw.rmi.collection;

    import java.net.MalformedURLException;
    import java.rmi.Naming;
    import java.rmi.NotBoundException;
    import java.rmi.RemoteException;
    import java.util.List;
    import java.util.Scanner;

    public class Client {

        private static final String LINE = "---------------------------------------------------------";
        private static final String HEADER = "Rank\tTeam\tAtt\tPct\tAtt\tPct\tAtt\tPct\tFront\tBack\tTotal\tAssists\tTurnovers\tFouls\tPoints";

        public static void main(String[] args) {
            int port = 9797;
            String address = "rmi://localhost:" + port + "/nba";

            try {
                System.out.println("Starting client..");
                System.out.println("Client started, connecting to server..");
                IdoAction action = (IdoAction) Naming.lookup(address);
                System.out.println("Connected...");
                System.out.println("---------------------------");

                Scanner scanner = new Scanner(System.in); // one Scanner for the whole session
                while (true) {
                    System.out.println("① Initialize the database: press (1)");
                    System.out.println();
                    System.out.println("② Collect the NBA (2013-2014) regular season standings automatically: press (2)");
                    System.out.println();
                    System.out.println("③ List all teams in the NBA (2013-2014) regular season standings: press (3)");
                    System.out.println();
                    System.out.println("④ Look up one team's (2013-2014) regular season standing: press (4)");
                    System.out.println();
                    System.out.println("⑤ Show full details: press (5)");
                    System.out.println();

                    String input = scanner.next();

                    if (input.equals("1")) {
                        System.out.println(LINE);
                        action.initData();
                        System.out.println("Server data initialized; press 2 to collect data automatically..");
                        System.out.println(LINE);
                    }
                    if (input.equals("2")) {
                        System.out.println(LINE);
                        System.out.println("Collecting data automatically, please wait..");
                        // The remote call returns only after collection has finished
                        action.getAllDatas();
                        System.out.println("Data collected.. press 3, 4 or 5 to continue");
                        System.out.println(LINE);
                    }
                    if (input.equals("3")) {
                        System.out.println(LINE);
                        System.out.println("Fetching NBA (2013-2014) regular season teams, please wait..");
                        System.out.println();
                        printRows(action.getAllTeams(), 5);
                        System.out.println(LINE);
                    }
                    if (input.equals("4")) {
                        System.out.println(LINE);
                        System.out.println("Enter the name of the team to look up (e.g. 76人)");
                        String team = scanner.next();
                        System.out.println(HEADER);
                        printRows(action.getTeamInfo(team), 15);
                        System.out.println(LINE);
                    }
                    if (input.equals("5")) {
                        System.out.println(LINE);
                        System.out.println("Fetching data, please wait...");
                        System.out.println();
                        System.out.println(HEADER);
                        printRows(action.getAllInfo(), 15);
                        System.out.println(LINE);
                    }
                }
            } catch (MalformedURLException e) {
                e.printStackTrace();
            } catch (RemoteException e) {
                e.printStackTrace();
            } catch (NotBoundException e) {
                e.printStackTrace();
            }
        }

        // Print a flat list as tab-separated rows of the given width.
        // Bounded by the list size, so an empty or short result does not throw.
        private static void printRows(List<String> list, int width) {
            for (int i = 0; i < list.size(); i++) {
                if (i % width == 0 && i != 0) {
                    System.out.println();
                }
                System.out.print(list.get(i) + "\t");
            }
            System.out.println();
        }
    }
That wraps up this series of articles on collecting data with Java. I'm off!