Back to Basics Series - Java Network Programming - Using the Network Connection Component (URLConnection)

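The URL class is mainly used to read data from a server or target address, but it cannot inspect the metadata of a request or response. This is where URLConnection comes in. It can read not only the data but also the metadata, including the headers (headers matter a great deal; in web development the data we need is often in the headers). Through its HTTP subclass, URLConnection can also send data to the server with the various HTTP methods (POST/GET/OPTIONS/PUT/DELETE). I will keep this chapter as concise as possible.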

1. Construction and reading

The basic steps for using URLConnection are as follows:

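  1. Construct a URL object
  2. Call openConnection() on that URL to get the corresponding URLConnection object
  3. Configure the URLConnection
  4. Read the header fields
  5. Get the input stream and read data
  6. Get the output stream and write data (for HTTP, the request body has to be written before the response is read)
  7. Close the connection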

A basic code snippet:

try{
    URL u = new URL("http://www.baidu.com");
    URLConnection conn = u.openConnection();
    // read from the URL
} catch(MalformedURLException ex){
    System.err.println(ex);
} catch(IOException ex){
    System.err.println(ex);
}

(1) Some simple internals

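  • URLConnection is an abstract class with a single abstract method: public abstract void connect() throws IOException
  • Some common implementation classes:
    • sun.net.www.protocol.file.FileURLConnection: for file URLs
    • sun.net.www.protocol.http.HttpURLConnection: for HTTP URLs
  • After a URLConnection is created, connect() is not called right away. It is called implicitly the first time data has to be exchanged, for example by getInputStream, getContent or getHeaderField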
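
The deferred connect can be observed without any network access. The sketch below is only an illustration: the class name LazyConnectDemo and the helper failsOnlyOnFirstRead are made up for this example, and it assumes a file: URL that points at a file which does not exist. openConnection() succeeds anyway, and the problem only surfaces when getInputStream() connects implicitly.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.URLConnection;

public class LazyConnectDemo {

    /**
     * Returns true when openConnection() succeeds for a missing file and the
     * failure only shows up on the first read attempt.
     */
    public static boolean failsOnlyOnFirstRead(File missingFile) {
        URLConnection conn;
        try {
            // No I/O happens here, so the missing target is not detected yet.
            conn = missingFile.toURI().toURL().openConnection();
        } catch (IOException e) {
            return false; // openConnection() itself failed
        }
        try (InputStream in = conn.getInputStream()) {
            return false; // the file unexpectedly exists
        } catch (IOException e) {
            // getInputStream() connected implicitly and hit the missing file.
            return true;
        }
    }

    public static void main(String[] args) {
        File missing = new File(System.getProperty("java.io.tmpdir"),
                "urlconnection-demo-missing-" + System.nanoTime() + ".txt");
        System.out.println(failsOnlyOnFirstRead(missing));
    }
}
```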

(2) Reading data from the server

public class NetworkMain {

    public static void main(String[] args) {
        try {
            URL url = new URL("https://www.baidu.com");
            URLConnection urlConnection = url.openConnection();

            try (InputStream inputStream = urlConnection.getInputStream();){
                InputStream buffer = new BufferedInputStream(inputStream);
                InputStreamReader reader = new InputStreamReader(buffer);
                int c;
                while ((c=reader.read()) != -1){
                    System.out.print((char) c);
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

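Differences between URL and URLConnection:

  • URLConnection gives access to the HTTP headers
  • URLConnection can configure the request sent to the server
  • Besides reading, URLConnection can also write data to the server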

(3) Headers and reading them

Below are the headers returned by the Baidu home page:

Accept-Ranges:[bytes]
null:[HTTP/1.1 200 OK]
Server:[bfe/1.0.8.18]
Etag:["58860402-98b"]
Cache-Control:[private, no-cache, no-store, proxy-revalidate, no-transform]
Connection:[Keep-Alive]
Set-Cookie:[BDORZ=27315; max-age=86400; domain=.baidu.com; path=/]
Pragma:[no-cache]
Last-Modified:[Mon, 23 Jan 2017 13:24:18 GMT]
Content-Length:[2443]
Date:[Thu, 13 Sep 2018 09:51:05 GMT]
Content-Type:[text/html]

The code that retrieves them:

public class NetworkMain {

    public static void main(String[] args) {
        try {
            URL url = new URL("https://www.baidu.com");
            URLConnection urlConnection = url.openConnection();
            Map<String, List<String>> headerFields = urlConnection.getHeaderFields();
            for(Map.Entry<String,List<String>> entry : headerFields.entrySet()){
                System.out.println(entry.getKey()+":"+entry.getValue().toString());
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

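a. Content-type

Returns the MIME (Multipurpose Internet Mail Extensions) type of the response body. In effect this is the type of the content and, optionally, its character encoding.

  • If the header is missing no exception is thrown; getContentType() simply returns null
  • If text/html does not specify an encoding, ISO-8859-1 is assumed. This was the HTTP default under RFC 2616; the newer RFC 7231 dropped that default
  • Other common types: text/plain, image/gif, application/xml, image/jpeg
  • getContentEncoding() returns the Content-Encoding header (for example gzip), or null if it is absent. Note that this is the transfer encoding of the body, not the character set, which is carried in the charset parameter of Content-type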
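
Since the character set travels inside the Content-type value, it has to be parsed out by hand. The following is a minimal sketch of one way to do that; the class ContentTypeCharset and the method charsetOf are hypothetical helpers written for this article, not part of the JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ContentTypeCharset {

    /** Extracts the charset parameter from a Content-Type value, or returns the fallback. */
    public static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback; // header missing
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            // Parameter names are case-insensitive, e.g. "Charset=utf-8".
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String name = p.substring(8).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    return fallback; // unknown or malformed charset name
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/html; charset=UTF-8", StandardCharsets.ISO_8859_1));
        System.out.println(charsetOf("text/html", StandardCharsets.ISO_8859_1));
    }
}
```

A typical use would be new InputStreamReader(conn.getInputStream(), charsetOf(conn.getContentType(), StandardCharsets.ISO_8859_1)), so that the reader does not silently fall back to the platform default encoding.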

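b. Content-length

The total size of the content in bytes. If there is no Content-length header, getContentLength() returns -1.

  • Java 7 added getContentLengthLong(), which returns a long so that sizes above the int maximum are not lost
  • When downloading a binary file over HTTP, it is best to use getContentLength() to decide when to stop reading the InputStream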
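
Reading a body of known length can be sketched as follows. The class ContentLengthReader and the method readExactly are names invented for this example. The loop matters because a single read() call may return fewer bytes than requested.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

public class ContentLengthReader {

    /** Reads exactly 'length' bytes, looping because read() may return fewer. */
    public static byte[] readExactly(InputStream in, int length) throws IOException {
        byte[] data = new byte[length];
        int offset = 0;
        while (offset < length) {
            int n = in.read(data, offset, length - offset);
            if (n == -1) {
                // The stream ended before the announced length was reached.
                throw new EOFException("Expected " + length + " bytes, got " + offset);
            }
            offset += n;
        }
        return data;
    }

    public static void main(String[] args) throws IOException {
        // A byte array stands in for the connection's input stream here.
        InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3, 4, 5});
        System.out.println(readExactly(in, 5).length);
    }
}
```

With a real connection the length would come from conn.getContentLength(); when that returns -1 the only option is to read until the stream reports end of data.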

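c. Date

Indicates when the document was sent

d. Expires

Indicates when the document should be removed from the cache. If the header is missing, getExpiration() returns 0, i.e. no expiration time is known, which is usually treated as never expiring

e. Last-Modified

The time the document was last modified. If the header is missing, getLastModified() returns 0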

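2. Caching

Caching is an eternal topic. Web browser caching is a dragon-slaying feature in its own right. This section covers how to use web caching and the classes Java provides for setting up a cache.

(1) How to set headers so that a response can be cached

Generally, HTTP GET requests are cached, and should be, whereas POST requests should not be. All of this can be adjusted through headers: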

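  • Expires header (HTTP 1.0): the resource may be cached until the specified time
  • Cache-control header (HTTP 1.1): fine-grained cache control. If both this header and Expires are present, this one takes precedence. Multiple Cache-control directives are allowed:
    • max-age=[seconds]: the number of seconds from now until the cached entry expires
    • s-maxage=[seconds]: the number of seconds from now until the cached entry expires in a shared cache. A private cache may keep the entry longer
    • public: an authenticated response may be cached; otherwise authenticated responses are not cached
    • private: only a single-user cache may store the response; shared caches should not
    • no-cache: the entry may still be cached, but the client has to revalidate it with an Etag or Last-modified header on every access
    • no-store: never cache, no matter what
  • Last-modified: the date of the last modification. The client can use a HEAD request to check this date and only issue a real GET when its locally cached date is earlier than this value
  • Etag: a unique identifier for the resource. A HEAD request fetches the server's current Etag; only if it differs from the local Etag, meaning the cache is stale, is a GET issued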
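
The precedence rule between Cache-control and Expires can be written down as a small pure function. This is a sketch under stated assumptions rather than a full HTTP freshness calculation: the class Freshness, the method isFresh and its parameters are hypothetical, max-age is passed as -1 when absent, and the Expires value uses the convention of URLConnection.getExpiration(), where 0 means the header is absent.

```java
public class Freshness {

    /**
     * @param maxAgeSeconds  Cache-Control max-age in seconds, or -1 if absent
     * @param expiresMillis  Expires as epoch milliseconds, or 0 if absent
     * @param storedAtMillis when the response was stored in the cache
     * @param nowMillis      the current time
     */
    public static boolean isFresh(long maxAgeSeconds, long expiresMillis,
                                  long storedAtMillis, long nowMillis) {
        if (maxAgeSeconds >= 0) {
            // Cache-Control wins over Expires when both are present.
            return nowMillis < storedAtMillis + maxAgeSeconds * 1000L;
        }
        if (expiresMillis > 0) {
            return nowMillis < expiresMillis;
        }
        // No explicit lifetime: this sketch treats the entry as stale.
        return false;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        System.out.println(isFresh(60, 0, now, now + 1_000));
        System.out.println(isFresh(-1, 0, now, now + 1_000));
    }
}
```

Treating an entry without any lifetime information as stale is a conservative choice made for this example; a cache could equally decide to apply a heuristic lifetime.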

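(2) Web caching in Java

By default, Java does not cache anything when a resource is requested directly through URL. To add caching for web requests, a few classes have to be implemented:

ResponseCache // the cache itself; install it with ResponseCache.setDefault to set the default caching policy
CacheRequest // the handle used to write a response body into the cache
CacheResponse // the handle used to read a cached response back

Here is a simple implementation. It is a little long but not difficult, and it also builds an object that parses the cache-control header, so it makes a decent starting example: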

import java.io.*;
import java.net.*;
import java.util.*;
import java.util.concurrent.ConcurrentHashMap;

public class NetworkMain {

    public static class CacheControl {

        private Date maxAge = null;
        private Date sMaxAge = null;
        private boolean mustRevalidate = false;
        private boolean noCache = false;
        private boolean noStore = false;
        private boolean proxyRevalidate = false;
        private boolean publicCache = false;
        private boolean privateCache = false;

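        public CacheControl(String s) {
            if (s == null || s.trim().isEmpty()) {
                return; // default policy
            }

            // getHeaderField("Cache-Control") returns only the value, but
            // tolerate a full "Cache-Control: ..." line as well.
            String value = s.trim();
            if (value.toLowerCase(Locale.US).startsWith("cache-control:")) {
                value = value.substring("cache-control:".length()).trim();
            }
            String[] components = value.split(",");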

            Date now = new Date();
            for (String component : components) {
                try {
                    component = component.trim().toLowerCase(Locale.US);
                    if (component.startsWith("max-age=")) {
                        int secondsInTheFuture = Integer.parseInt(component.substring(8));
                        // 1000L keeps the multiplication from overflowing int
                        maxAge = new Date(now.getTime() + 1000L * secondsInTheFuture);
                    } else if (component.startsWith("s-maxage=")) {
                        // "s-maxage=" is 9 characters long
                        int secondsInTheFuture = Integer.parseInt(component.substring(9));
                        sMaxAge = new Date(now.getTime() + 1000L * secondsInTheFuture);
                    } else if (component.equals("must-revalidate")) {
                        mustRevalidate = true;
                    } else if (component.equals("proxy-revalidate")) {
                        proxyRevalidate = true;
                    } else if (component.equals("no-cache")) {
                        noCache = true;
                    } else if (component.equals("no-store")) {
                        noStore = true;
                    } else if (component.equals("public")) {
                        publicCache = true;
                    } else if (component.equals("private")) {
                        privateCache = true;
                    }
                } catch (RuntimeException ex) {
                    continue;
                }
            }
        }

        public Date getMaxAge() {
            return maxAge;
        }

        public Date getSharedMaxAge() {
            return sMaxAge;
        }

        public boolean mustRevalidate() {
            return mustRevalidate;
        }

        public boolean proxyRevalidate() {
            return proxyRevalidate;
        }

        public boolean noStore() {
            return noStore;
        }

        public boolean noCache() {
            return noCache;
        }

        public boolean publicCache() {
            return publicCache;
        }

        public boolean privateCache() {
            return privateCache;
        }
    }

    public static class SimpleCacheRequest extends CacheRequest {
        private ByteArrayOutputStream out = new ByteArrayOutputStream();


        @Override
        public OutputStream getBody() throws IOException {

            return out;
        }

        @Override
        public void abort() {
            out.reset();
        }

        public byte[] getData() {
            if (out.size() == 0) {
                return null;
            } else {
                return out.toByteArray();
            }
        }
    }

    public static class SimleCacheResponse extends CacheResponse {
        private final Map<String, List<String>> headers;
        private final SimpleCacheRequest request;
        private final Date expires;
        private final CacheControl control;

        public SimleCacheResponse(SimpleCacheRequest request, URLConnection uc, CacheControl control) throws IOException {
            this.request = request;
            this.control = control;
            this.expires = new Date(uc.getExpiration());
            this.headers = Collections.unmodifiableMap(uc.getHeaderFields());
        }

        @Override
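        public InputStream getBody() {
            byte[] data = request.getData();
            // getData() returns null when nothing has been written yet
            return new ByteArrayInputStream(data == null ? new byte[0] : data);
        }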

        @Override
        public Map<String, List<String>> getHeaders()
                throws IOException {
            return headers;
        }

        public CacheControl getControl() {
            return control;
        }

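        public boolean isExpired() {
            Date now = new Date();
            Date maxAge = control.getMaxAge();
            if (maxAge != null) {
                // Cache-Control max-age takes precedence over Expires
                return maxAge.before(now);
            }
            // getExpiration() returns 0 when there is no Expires header
            if (expires != null && expires.getTime() != 0) {
                return expires.before(now);
            }
            return false;
        }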
    }
    public static class MemoryCache extends ResponseCache {

        private final Map<URI, SimleCacheResponse> responses
                = new ConcurrentHashMap<URI, SimleCacheResponse>();
        private final int maxEntries;

        public MemoryCache() {
            this(100);
        }

        public MemoryCache(int maxEntries) {
            this.maxEntries = maxEntries;
        }

        @Override
        public CacheRequest put(URI uri, URLConnection conn)
                throws IOException {

            if (responses.size() >= maxEntries) return null;

            CacheControl control = new CacheControl(conn.getHeaderField("Cache-Control"));
            if (control.noStore()) {
                return null;
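            } else if (!(conn instanceof HttpURLConnection)
                    || !"GET".equals(((HttpURLConnection) conn).getRequestMethod())) {
                // only cache GET; header field 0 is the response status line,
                // so the request method has to come from the connection itself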
                return null;
            }

            SimpleCacheRequest request = new SimpleCacheRequest();
            SimleCacheResponse response = new SimleCacheResponse(request, conn, control);

            responses.put(uri, response);
            return request;
        }

        @Override
        public CacheResponse get(URI uri, String requestMethod,
                                 Map<String, List<String>> requestHeaders)
                throws IOException {

            if ("GET".equals(requestMethod)) {
                SimleCacheResponse response = responses.get(uri);
                // check expiration date
                if (response != null && response.isExpired()) {
                    responses.remove(uri); // the map is keyed by URI
                    response = null;
                }
                return response;
            } else {
                return null;
            }
        }
    }

    public static void main(String[] args) {
        ResponseCache.setDefault(new MemoryCache());
        try {
            URL url = new URL("https://www.baidu.com");
            URLConnection urlConnection = url.openConnection();
            Map<String, List<String>> headerFields = urlConnection.getHeaderFields();
            for (Map.Entry<String, List<String>> entry : headerFields.entrySet()) {
                System.out.println(entry.getKey() + ":" + entry.getValue().toString());
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

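3. Connection configuration options

URLConnection has seven protected fields that define how the client makes its request to the server. The JDK source documents these options well and the English is easy to read, so I will not add much: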

/**
     * The URL represents the remote object on the World Wide Web to
     * which this connection is opened.
     * <p>
     * The value of this field can be accessed by the
     * {@code getURL} method.
     * <p>
     * The default value of this variable is the value of the URL
     * argument in the {@code URLConnection} constructor.
     *
     * @see     java.net.URLConnection#getURL()
     * @see     java.net.URLConnection#url
     */
	protected URL url;

   	/**
     * This variable is set by the {@code setDoInput} method. Its
     * value is returned by the {@code getDoInput} method.
     * <p>
     * A URL connection can be used for input and/or output. Setting the
     * {@code doInput} flag to {@code true} indicates that
     * the application intends to read data from the URL connection.
     * <p>
     * The default value of this field is {@code true}.
     *
     * @see     java.net.URLConnection#getDoInput()
     * @see     java.net.URLConnection#setDoInput(boolean)
     */
    protected boolean doInput = true;

   /**
     * This variable is set by the {@code setDoOutput} method. Its
     * value is returned by the {@code getDoOutput} method.
     * <p>
     * A URL connection can be used for input and/or output. Setting the
     * {@code doOutput} flag to {@code true} indicates
     * that the application intends to write data to the URL connection.
     * <p>
     * The default value of this field is {@code false}.
     *
     * @see     java.net.URLConnection#getDoOutput()
     * @see     java.net.URLConnection#setDoOutput(boolean)
     */
    protected boolean doOutput = false;
   	/**
     * If {@code true}, this {@code URL} is being examined in
     * a context in which it makes sense to allow user interactions such
     * as popping up an authentication dialog. If {@code false},
     * then no user interaction is allowed.
     * <p>
     * The value of this field can be set by the
     * {@code setAllowUserInteraction} method.
     * Its value is returned by the
     * {@code getAllowUserInteraction} method.
     * Its default value is the value of the argument in the last invocation
     * of the {@code setDefaultAllowUserInteraction} method.
     *
     * @see     java.net.URLConnection#getAllowUserInteraction()
     * @see     java.net.URLConnection#setAllowUserInteraction(boolean)
     * @see     java.net.URLConnection#setDefaultAllowUserInteraction(boolean)
     */
    protected boolean allowUserInteraction = defaultAllowUserInteraction;
	/**
     * If {@code true}, the protocol is allowed to use caching
     * whenever it can. If {@code false}, the protocol must always
     * try to get a fresh copy of the object.
     * <p>
     * This field is set by the {@code setUseCaches} method. Its
     * value is returned by the {@code getUseCaches} method.
     * <p>
     * Its default value is the value given in the last invocation of the
     * {@code setDefaultUseCaches} method.
     *
     * @see     java.net.URLConnection#setUseCaches(boolean)
     * @see     java.net.URLConnection#getUseCaches()
     * @see     java.net.URLConnection#setDefaultUseCaches(boolean)
     */
	protected boolean useCaches = defaultUseCaches;

   	/**
     * Some protocols support skipping the fetching of the object unless
     * the object has been modified more recently than a certain time.
     * <p>
     * A nonzero value gives a time as the number of milliseconds since
     * January 1, 1970, GMT. The object is fetched only if it has been
     * modified more recently than that time.
     * <p>
     * This variable is set by the {@code setIfModifiedSince}
     * method. Its value is returned by the
     * {@code getIfModifiedSince} method.
     * <p>
     * The default value of this field is {@code 0}, indicating
     * that the fetching must always occur.
     *
     * @see     java.net.URLConnection#getIfModifiedSince()
     * @see     java.net.URLConnection#setIfModifiedSince(long)
     */
    protected long ifModifiedSince = 0;

   	/**
     * If {@code false}, this connection object has not created a
     * communications link to the specified URL. If {@code true},
     * the communications link has been established.
     */
    protected boolean connected = false;

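The class has matching set and get methods for these fields. In general, calling a setter after the connection has been established (by an explicit connect() or by the first read, not merely by openConnection()) throws IllegalStateException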
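
The timing of that exception can be checked locally. The sketch below assumes a temporary file reached through a file: URL so that no network is needed; ConfigAfterConnectDemo and setterRejectedAfterConnect are names invented for this example.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

public class ConfigAfterConnectDemo {

    /** Returns true if a setter is rejected once the connection is established. */
    public static boolean setterRejectedAfterConnect() {
        try {
            File tmp = File.createTempFile("urlconnection-demo", ".txt");
            tmp.deleteOnExit();
            Files.write(tmp.toPath(), "hello".getBytes(StandardCharsets.UTF_8));

            URLConnection conn = tmp.toURI().toURL().openConnection();
            conn.setUseCaches(false);     // allowed: not connected yet
            conn.connect();               // establishes the connection
            try {
                conn.setUseCaches(true);  // too late, the connection is open
                return false;
            } catch (IllegalStateException expected) {
                return true;
            } finally {
                conn.getInputStream().close(); // release the file handle
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(setterRejectedAfterConnect());
    }
}
```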

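4. Writing data to the server

This part has two pieces: writing headers and writing content

(1) Setting request headers

Setting headers here is different from what came before. Earlier we read the headers of the data the server sent back; here we add headers to the Request that is sent to the server, mainly with these methods: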

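public void setRequestProperty(String key, String value);// sets the value for a key, replacing any existing value; several values can be given comma-separated
public void addRequestProperty(String key, String value);// adds another value for a key without replacing existing ones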
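
The difference between the two calls can be seen by inspecting getRequestProperties() before connecting. This is a small sketch: RequestPropertyDemo and acceptValuesAfterSetAndAdd are hypothetical names, and http://example.invalid/ is only a placeholder host that is never contacted because openConnection() does not connect.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;
import java.util.List;

public class RequestPropertyDemo {

    /** Returns the values recorded for "Accept" after two sets and one add. */
    public static List<String> acceptValuesAfterSetAndAdd() {
        try {
            // openConnection() does not contact the host; it only creates the object.
            URLConnection conn = new URL("http://example.invalid/").openConnection();
            conn.setRequestProperty("Accept", "text/plain");
            conn.setRequestProperty("Accept", "text/html");        // overwrites text/plain
            conn.addRequestProperty("Accept", "application/json"); // appended as a second value
            return conn.getRequestProperties().get("Accept");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(acceptValuesAfterSetAndAdd());
    }
}
```

The order of the values in the returned list is an implementation detail, so code should not rely on it.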

Interestingly, the source of setRequestProperty is not hard to follow. It is worth a look to get more familiar with the JDK source:

public abstract class URLConnection {
    ...
	public void setRequestProperty(String key, String value) {
        if (connected)
            throw new IllegalStateException("Already connected");
        if (key == null)
            throw new NullPointerException ("key is null");

        if (requests == null)
            requests = new MessageHeader();

        requests.set(key, value);
    }
    
    ...
}
public class MessageHeader {
    private String[] keys;
    private String[] values;
    private int nkeys;

	public synchronized void set(String var1, String var2) {
        int var3 = this.nkeys;

        do {
            --var3;
            if (var3 < 0) {
                this.add(var1, var2);
                return;
            }
        } while(!var1.equalsIgnoreCase(this.keys[var3]));

        this.values[var3] = var2;
    }
    public synchronized void add(String var1, String var2) {
        this.grow();
        this.keys[this.nkeys] = var1;
        this.values[this.nkeys] = var2;
        ++this.nkeys;
    }
    private void grow() {
        if (this.keys == null || this.nkeys >= this.keys.length) {
            String[] var1 = new String[this.nkeys + 4];
            String[] var2 = new String[this.nkeys + 4];
            if (this.keys != null) {
                // the JDK internals copy arrays with System.arraycopy because it is fast
                System.arraycopy(this.keys, 0, var1, 0, this.nkeys);
            }

            if (this.values != null) {
                System.arraycopy(this.values, 0, var2, 0, this.nkeys);
            }

            this.keys = var1;
            this.values = var2;
        }

    }
}

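(2) Writing data with POST

When it comes to choosing between GET and POST, Java's URLConnection does something like automatic detection:

  • The default is GET
  • If doOutput is set to true and data is written through the OutputStream, the request becomes a POST and the headers are set automatically
  • There are, of course, other ways to set the request method explicitly

Here is a small example that submits a POST request: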

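public static void main(String[] args) {
    try {
        URL url = new URL("https://www.baidu.com");
        URLConnection urlConnection = url.openConnection();
        urlConnection.setDoOutput(true); // turns the request into a POST
        // try-with-resources flushes and closes the writer
        try (BufferedWriter bw = new BufferedWriter(
                new OutputStreamWriter(urlConnection.getOutputStream(), "8859_1"))) {
            bw.write("lalalalallala");
        }
        // In the default buffered mode the request is only sent
        // once the response is asked for
        try (InputStream in = urlConnection.getInputStream()) {
            System.out.println(urlConnection.getHeaderField(0)); // status line
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}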
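
To watch the GET-to-POST switch without depending on an external site, the request can be sent to a throwaway local server. The sketch below uses the JDK's built-in com.sun.net.httpserver.HttpServer as an assumption for the server side; PostEchoDemo, postAndEcho and the /echo path are names made up for this example. The handler answers with the request method followed by the body it received.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class PostEchoDemo {

    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    /** Posts 'body' to a local echo server and returns "<METHOD> <echoed body>". */
    public static String postAndEcho(String body) {
        try {
            // Port 0 lets the operating system pick a free port.
            HttpServer server = HttpServer.create(
                    new InetSocketAddress(InetAddress.getByName("127.0.0.1"), 0), 0);
            server.createContext("/echo", exchange -> {
                byte[] received = readAll(exchange.getRequestBody());
                byte[] reply = (exchange.getRequestMethod() + " "
                        + new String(received, StandardCharsets.UTF_8))
                        .getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, reply.length);
                try (OutputStream os = exchange.getResponseBody()) {
                    os.write(reply);
                }
            });
            server.start();
            try {
                URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/echo");
                HttpURLConnection conn = (HttpURLConnection) url.openConnection(Proxy.NO_PROXY);
                conn.setDoOutput(true); // the method is never set explicitly
                conn.setRequestProperty("Content-Type", "text/plain; charset=UTF-8");
                try (OutputStream os = conn.getOutputStream()) {
                    os.write(body.getBytes(StandardCharsets.UTF_8));
                }
                // Asking for the response completes the exchange.
                try (InputStream in = conn.getInputStream()) {
                    return new String(readAll(in), StandardCharsets.UTF_8);
                }
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(postAndEcho("lalala"));
    }
}
```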

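5. HttpURLConnection

If the URL uses the http protocol, openConnection() returns an HttpURLConnection by default, which is an abstract subclass of URLConnection. Use public void setRequestMethod(String method) throws ProtocolException to choose the HTTP request method. The commonly used methods are:

  • GET
  • POST
  • HEAD: often used to fetch the last-modified time in order to invalidate a cache
  • PUT
  • DELETE
  • OPTIONS: used in cross-origin requests (important); asks the server which HTTP methods it supports
  • TRACE: shows what changes the proxies between client and server made to the request; useful when checking a proxy configuration such as nginx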
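
setRequestMethod only accepts the method names listed above, in upper case, and rejects anything else with a ProtocolException. A short sketch of that behaviour follows; RequestMethodDemo and accepts are hypothetical names, and http://example.invalid/ is a placeholder that is never contacted.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.ProtocolException;
import java.net.URL;

public class RequestMethodDemo {

    /** Returns true if HttpURLConnection accepts the given method name. */
    public static boolean accepts(String method) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) new URL("http://example.invalid/").openConnection();
            try {
                conn.setRequestMethod(method);
                return method.equals(conn.getRequestMethod());
            } catch (ProtocolException e) {
                return false; // not one of the supported method names
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("HEAD accepted: " + accepts("HEAD"));
        System.out.println("PATCH accepted: " + accepts("PATCH"));
    }
}
```

PATCH is the usual surprise here: it is not in the accepted list, so code that needs it has to use a different HTTP client.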