URL and URI

Introduction

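URL stands for Uniform Resource Locator; URI stands for Uniform Resource Identifier. URI is the newer and broader concept and includes URL (for example, an ID card number is a URI, while a home address is both a URI and a URL). It is also more capable, since it can encode characters automatically.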

Components of a URL

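protocol : host (port) : file path : query parameters : reference (fragment)
A URL is either relative or absolute. A relative URL borrows the protocol and host (port) from a base URL. Below, a base URL is used to express the two URLs page1.html and page2.html. The relative name is resolved against the folder that contains the current file.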

import java.net.URL;

// Target URLs:
//   http://example.com/pages/page1.html
//   http://example.com/pages/page2.html
URL myURL = new URL("http://example.com/pages/");   // base URL
URL page1URL = new URL(myURL, "page1.html");        // resolved against the base
URL page2URL = new URL(myURL, "page2.html");        // resolved against the base


A URL cannot contain certain special characters, and the URL class does not encode or decode them automatically; for that you need URI.
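As a minimal sketch of this point, a multi-argument URI constructor can quote the illegal characters first, and the result can then be converted to a URL. The class name EncodeViaUri, the helper toEncodedUrl, and the host and path are made up for illustration.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

class EncodeViaUri {
    // Hypothetical helper: builds an http URL whose path may contain spaces
    // by letting URI quote the path before converting to URL.
    static URL toEncodedUrl(String host, String path) {
        try {
            // The multi-argument constructor quotes illegal characters such as spaces.
            URI uri = new URI("http", host, path, null);
            return uri.toURL();
        } catch (URISyntaxException | MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URL url = toEncodedUrl("example.com", "/pages/my page.html");
        System.out.println(url); // the space is written as %20
    }
}
```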

Overview of the URL API

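  1. Constructors. Construction also sets up the underlying URLStreamHandler, the low-level handler for the protocol; if none is specified, the system uses the default one.
  2. The regular URL parsing functions, which return the individual parts of the URL.
  3. Opening a stream and opening a connection.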
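The first two groups can be sketched as follows. The class name UrlPartsDemo, its helper methods and the example address are assumptions for illustration. The openStream() and openConnection() calls from the third group are left out so that the sketch runs without network access.

```java
import java.net.MalformedURLException;
import java.net.URL;

class UrlPartsDemo {
    // Wraps the checked exception so callers can pass a literal without try/catch.
    @SuppressWarnings("deprecation") // URL constructors are deprecated in recent JDKs
    static URL parse(String spec) {
        try {
            return new URL(spec);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Joins the parts returned by the accessor methods, in URL order.
    static String describe(URL url) {
        return url.getProtocol() + " | " + url.getHost() + " | " + url.getPort()
                + " | " + url.getPath() + " | " + url.getQuery() + " | " + url.getRef();
    }

    public static void main(String[] args) {
        URL url = parse("http://example.com:8080/pages/page1.html?id=1#top");
        System.out.println(describe(url));
        // prints: http | example.com | 8080 | /pages/page1.html | id=1 | top
    }
}
```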

Introduction to URI

Components and classification

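A URI generally has three parts:
[scheme:]scheme-specific-part[#fragment]
A URI is either relative or absolute, and either hierarchical or opaque. An opaque URI can only be absolute, and its scheme-specific part (the text after the scheme) does not begin with /. A hierarchical URI, such as an http URI, can be absolute or relative. A relative URI is one that does not specify a scheme.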
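The classification can be checked with URI.isOpaque() and URI.isAbsolute(). The following is a small sketch; the class name UriKinds, the helper classify and the sample URIs are assumptions for illustration.

```java
import java.net.URI;

class UriKinds {
    // Hypothetical helper: names the category of a URI string.
    static String classify(String text) {
        URI uri = URI.create(text);
        if (uri.isOpaque()) {
            // Absolute, and the scheme-specific part does not start with '/'.
            return "opaque";
        }
        return uri.isAbsolute() ? "hierarchical, absolute" : "hierarchical, relative";
    }

    public static void main(String[] args) {
        System.out.println(classify("mailto:user@example.com"));             // opaque
        System.out.println(classify("http://example.com/pages/page1.html")); // hierarchical, absolute
        System.out.println(classify("pages/page1.html"));                    // hierarchical, relative
    }
}
```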


Operations on a URI:

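  1. Normalization: removes the . and .. segments from the path.
  2. Resolution: combines an absolute address with a relative one.
  3. Relativization: computes the relative address of one URI against another.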
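The three operations above map to URI.normalize(), URI.resolve() and URI.relativize(). A minimal sketch, with the class name UriOpsDemo, its helpers and the example addresses chosen for illustration:

```java
import java.net.URI;

class UriOpsDemo {
    // Removes "." and ".." segments from the path.
    static String normalize(String text) {
        return URI.create(text).normalize().toString();
    }

    // Combines a base URI with a relative reference.
    static String resolve(String base, String relative) {
        return URI.create(base).resolve(relative).toString();
    }

    // Expresses target relative to base (the inverse of resolve).
    static String relativize(String base, String target) {
        return URI.create(base).relativize(URI.create(target)).toString();
    }

    public static void main(String[] args) {
        String base = "http://example.com/pages/";
        System.out.println(normalize("http://example.com/pages/./a/../page1.html"));
        // prints: http://example.com/pages/page1.html
        System.out.println(resolve(base, "page2.html"));
        // prints: http://example.com/pages/page2.html
        System.out.println(relativize(base, "http://example.com/pages/sub/page3.html"));
        // prints: sub/page3.html
    }
}
```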

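URI character categories

The categories are as follows:

  1. Letters (alpha)
  2. Digits
  3. Alphanumerics: letters and digits
  4. Unreserved characters: alphanumerics plus "_-!.~'()*"
  5. Punctuation: ",;:$&+="
  6. Reserved characters: punctuation plus "?/[]@"
  7. Other: Unicode characters outside US-ASCII that are neither control characters nor space characters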
RFC 2396 specifies precisely which characters are permitted in the various components of a URI reference. The following categories, most of which are taken from that specification, are used below to describe these constraints:
    alpha	The US-ASCII alphabetic characters, 'A' through 'Z' and 'a' through 'z'
    digit	The US-ASCII decimal digit characters, '0' through '9'
    alphanum	All alpha and digit characters
    unreserved    	All alphanum characters together with those in the string "_-!.~'()*"
    punct	The characters in the string ",;:$&+="
    reserved	All punct characters together with those in the string "?/[]@"
    escaped	Escaped octets, that is, triplets consisting of the percent character ('%') followed by two hexadecimal digits ('0'-'9', 'A'-'F', and 'a'-'f')
    other	The Unicode characters that are not in the US-ASCII character set, are not control characters (according to the Character.isISOControl method), and are not space characters (according to the Character.isSpaceChar method)  (Deviation from RFC 2396, which is limited to US-ASCII)
The set of all legal URI characters consists of the unreserved, reserved, escaped, and other characters.

URI encoding

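Why encoding is needed: a URI may contain characters outside US-ASCII, and illegal characters such as spaces.

  1. With the single-argument URI constructor, illegal characters must already be quoted, that is, written as escape sequences.
  2. With the multi-argument constructors, the components are passed in their original, unescaped form.
  3. The get* methods decode the escape sequences in the component they return.
  4. The getRaw* methods return the component as is, without decoding escape sequences.
  5. The toString method returns a URI string with all necessary quotation but which may contain other characters.
  6. The toASCIIString method returns a fully quoted and encoded URI string that does not contain any other characters.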
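The differences between these methods can be seen in a small sketch. The class name UriEncodingDemo, its helpers, and the host and path are assumptions for illustration; the path contains a space and one accented letter (written as \u00e9 in the source) to stand in for an "other" character.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriEncodingDemo {
    // Multi-argument constructor: pass the raw path, illegal characters get quoted.
    static URI build(String host, String path) {
        try {
            return new URI("http", host, path, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Single-argument constructor: the text must already be quoted.
    static boolean singleArgAccepts(String text) {
        try {
            new URI(text);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        URI uri = build("example.com", "/caf\u00e9/my page.html");
        System.out.println(uri.getPath());       // decoded: the space appears literally
        System.out.println(uri.getRawPath());    // space quoted as %20, accented letter kept
        System.out.println(uri.toString());      // quoted, may still contain the accented letter
        System.out.println(uri.toASCIIString()); // accented letter also encoded, as %C3%A9
        System.out.println(singleArgAccepts("http://example.com/my page.html"));   // false
        System.out.println(singleArgAccepts("http://example.com/my%20page.html")); // true
    }
}
```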


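URI complements URL. URI has no underlying protocol implementation and no handler interface. For example, if you write the protocol as htp in a URL, construction fails with an error, but a URI accepts it. Conversely, URI performs some syntax checks, for example for spaces, that URL does not.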
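Both differences can be checked with a short sketch. The class name UrlVsUriChecks, the two helpers and the sample strings are assumptions for illustration.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

class UrlVsUriChecks {
    // True if the URL constructor accepts the text (a protocol handler must exist).
    @SuppressWarnings("deprecation") // URL constructors are deprecated in recent JDKs
    static boolean urlAccepts(String spec) {
        try {
            new URL(spec);
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    // True if the text is syntactically valid for the single-argument URI constructor.
    static boolean uriAccepts(String text) {
        try {
            new URI(text);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String typo = "htp://example.com/";
        String spaced = "http://example.com/my page.html";
        System.out.println(urlAccepts(typo));   // false: no handler for "htp"
        System.out.println(uriAccepts(typo));   // true: only the syntax is checked
        System.out.println(urlAccepts(spaced)); // true: URL does not check for spaces
        System.out.println(uriAccepts(spaced)); // false: the space is an illegal character
    }
}
```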
