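Today I solved a problem with giving a downloaded file a Chinese name: putting Chinese characters directly into Content-Disposition produces garbled text. Following the approach found online (Content-Disposition + UTF-8) fixed it. Still, to understand where the problem really comes from, I went and read the official documents to learn about the Content-Disposition field and what it means. Content-Disposition can set a file name, but a Chinese name has to be encoded, because RFC 822 specifies that a message may only contain ASCII; that is where the problem lies.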
Definition of Content-Disposition
Description in Hypertext Transfer Protocol – HTTP/1.1 (RFC 2616)
Content-Disposition is not part of the HTTP standard, but since it is widely implemented, we are documenting its use and risks for implementors.
The Content-Disposition response-header field has been proposed as a means for the origin server to suggest a default filename if the user requests that the content is saved to a file. This usage is derived from the definition of Content-Disposition in RFC 1806.
Description in RFC 1806
the Content-Disposition header field is defined as follows:
disposition := "Content-Disposition" ":"
               disposition-type
               *(";" disposition-parm)
disposition-type := "inline"
                  / "attachment"
                  / extension-token
                  ; values are not case-sensitive
disposition-parm := filename-parm / parameter
filename-parm := "filename" "=" value;
'extension-token', 'parameter' and 'value' are defined according to [RFC 822] and [RFC 1521].
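The first thing to note is disposition-type. From the Content-Disposition header field definition given above, we can see that Content-Disposition has two named types, inline and attachment (extension-token leaves room for others). According to the document, the inline type displays the attachment's content automatically, for example showing an image, whereas attachment is not displayed automatically: in mail it may appear as an attachment with an icon, and in a browser it may prompt a download.
Next is disposition-parm. Its main purpose is to supply a suggested file name (filename-parm), and the client (browser, mail system) saves the file under that name whenever possible. "Whenever possible" because there are exceptions, such as an illegal file name or an existing file with the same name, in which case the client takes some measure such as changing the file name.
Finally, look at the value of filename-parm. This value is the file name (the goal of this article is to set a Chinese value here, and this is where the main pitfall lies).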
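To make the three parts of the grammar concrete, here is a minimal sketch of a parser for it. The function name parseContentDisposition is made up for illustration, and the sketch assumes parameter values contain no ";" and no escaped quotes, so it is not a complete implementation of the RFC.

```javascript
// Naive parser for the grammar above: a type, then ";"-separated parameters.
// Assumption: parameter values contain no ";" and no escaped quotes.
function parseContentDisposition(header) {
  const [rawType, ...rawParams] = header.split(";").map((part) => part.trim());
  const params = {};
  for (const rawParam of rawParams) {
    const eq = rawParam.indexOf("=");
    if (eq === -1) continue; // skip anything that is not name=value
    const name = rawParam.slice(0, eq).trim().toLowerCase();
    // Strip one pair of surrounding double quotes, if present.
    const value = rawParam.slice(eq + 1).trim().replace(/^"(.*)"$/, "$1");
    params[name] = value;
  }
  // disposition-type values are not case-sensitive, so normalise to lower case.
  return { type: rawType.toLowerCase(), params };
}

const parsed = parseContentDisposition('Attachment; filename="report.txt"');
console.log(parsed.type, parsed.params.filename);
```

The type is lower-cased because the grammar says its values are not case-sensitive; the filename is returned untouched, since what to do with an illegal or clashing name is left to the client.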
From the above, to send a file to the front end with a Chinese file name, one might return an HTTP response header like this before sending the file:
Content-Disposition: attachment; filename=文件.txt
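Note that RFC 822 (Standard for ARPA Internet Text Messages) specifies that text messages may only contain ASCII, so this Content-Disposition is illegal. RFC 1521 (Multipurpose Internet Mail Extensions) builds on RFC 822 and extends the available encodings using four mechanisms: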
MIME-Version header
Content-Type header
Content-Transfer-Encoding header
Content-ID and Content-Description header
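The illegal header from earlier can be checked directly in Node. The sketch below assumes Node 14.3 or later, where http.validateHeaderValue is available; the helper name headerValueError is made up for illustration. Node's own check is slightly looser than RFC 822 (it also lets Latin-1 bytes through), but Chinese characters fall outside it either way.

```javascript
const http = require("http");

// Returns null when Node accepts the header value,
// otherwise the error code of the exception it throws.
function headerValueError(name, value) {
  try {
    http.validateHeaderValue(name, value);
    return null;
  } catch (err) {
    return err.code;
  }
}

// Pure ASCII value: accepted, prints null.
console.log(headerValueError("Content-Disposition", "attachment; filename=file.txt"));
// Raw Chinese characters: rejected, prints ERR_INVALID_CHAR.
console.log(headerValueError("Content-Disposition", "attachment; filename=文件.txt"));
```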
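Unfortunately, these MIME mechanisms have nothing to do with putting Chinese into an HTTP response header. After some googling, I found via Stack Overflow a document called RFC 5987 - Character Set and Language Encoding for Hypertext Transfer Protocol (HTTP) Header Field Parameters, and the fog lifted at once: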
By default, message header field parameters in HTTP ([RFC2616]) messages cannot carry characters outside the ISO-8859-1 character set. RFC 2231 defines an encoding mechanism for use in MIME headers. This document specifies an encoding suitable for use in HTTP header fields that is compatible with a profile of the encoding defined in RFC 2231.
Its section Guidelines for Usage in HTTP Header Field Definitions gives a general form:
foo-header = "foo" LWSP ":" LWSP token ";" LWSP title-param
title-param = "title" LWSP "=" LWSP value
            / "title*" LWSP "=" LWSP ext-value
ext-value = charset "'" [ language ] "'" value-chars
charset = "UTF-8" / "ISO-8859-1" / mime-charset
value-chars = *( pct-encoded / attr-char )
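The ext-value rule reads as three fields separated by single quotes: a charset, an optional language, and the percent-encoded characters. A minimal sketch of splitting one apart follows; the name parseExtValue is made up for illustration, and the sketch only handles UTF-8.

```javascript
// Split an ext-value (charset "'" [ language ] "'" value-chars) into its parts.
// Assumption: only UTF-8 is decoded here; other charsets are rejected.
function parseExtValue(extValue) {
  const match = /^([^']+)'([^']*)'(.*)$/.exec(extValue);
  if (!match) throw new Error("not an ext-value: " + extValue);
  const [, charset, language, valueChars] = match;
  if (charset.toLowerCase() !== "utf-8") {
    throw new Error("unsupported charset in this sketch: " + charset);
  }
  // value-chars are percent-encoded UTF-8 octets, which decodeURIComponent understands.
  return { charset, language, value: decodeURIComponent(valueChars) };
}

console.log(parseExtValue("UTF-8''%E6%96%87%E4%BB%B6.txt"));
```

The language field is empty in this example, which is why two single quotes appear back to back.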
Converting the illegal Content-Disposition from earlier gives:
Content-Disposition: attachment; filename*=UTF-8''%E6%96%87%E4%BB%B6.txt
Here "文件.txt" is encoded in two steps: first UTF-8 encoding, then pct-encoded (percent) encoding. In effect this is just URL encoding.
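The two steps can be sketched with the built-in encodeURIComponent, which encodes to UTF-8 and percent-encodes in one call. One caveat: it leaves ' ( ) * unescaped, and those four are not in RFC 5987's attr-char set, so the sketch escapes them as well. The name encodeExtValue is made up for illustration.

```javascript
// Characters that encodeURIComponent leaves alone but attr-char does not allow.
const NOT_ATTR_CHAR = /['()*]/g;

// Build an ext-value: UTF-8 charset, empty language, percent-encoded characters.
function encodeExtValue(filename) {
  const valueChars = encodeURIComponent(filename).replace(
    NOT_ATTR_CHAR,
    (ch) => "%" + ch.charCodeAt(0).toString(16).toUpperCase()
  );
  return "UTF-8''" + valueChars;
}

console.log(encodeExtValue("文件.txt"));     // UTF-8''%E6%96%87%E4%BB%B6.txt
console.log(encodeExtValue("文件 (1).txt")); // UTF-8''%E6%96%87%E4%BB%B6%20%281%29.txt
```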
Usage in Node
Besides using the filename*=UTF-8''value form, the Chinese name itself has to be encoded. In Node it can be written like this:
// urlencode is assumed to come from the npm package of the same name
let name = urlencode("文件.txt", "utf-8");
res.setHeader("Content-Disposition", "attachment; filename*=UTF-8''" + name);
Tested and working in Chrome, Edge, and IE 11.
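For clients that do not understand filename*, RFC 6266 suggests also sending a plain filename parameter first as a fallback. The sketch below is one way to do that without a third-party package; the name attachmentHeader is made up for illustration, and replacing every non-ASCII character in the fallback with "_" is simply an assumption of this sketch, not something the RFC prescribes.

```javascript
// Build a Content-Disposition value carrying both filename and filename*.
// Assumption: the ASCII fallback replaces each non-ASCII character,
// double quote and backslash with "_".
function attachmentHeader(filename) {
  const fallback = filename.replace(/[^\x20-\x7e]|["\\]/g, "_");
  // UTF-8 + percent-encoding, plus the four characters attr-char does not allow.
  const encoded = encodeURIComponent(filename).replace(
    /['()*]/g,
    (ch) => "%" + ch.charCodeAt(0).toString(16).toUpperCase()
  );
  return `attachment; filename="${fallback}"; filename*=UTF-8''${encoded}`;
}

console.log(attachmentHeader("文件.txt"));
// In a request handler this would be used as:
// res.setHeader("Content-Disposition", attachmentHeader("文件.txt"));
```

Clients that support filename* are expected to prefer it over filename, so the underscore name only shows up where the encoded form is not understood.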