A Detail of PHP's get_headers Function for Fetching HTTP Response Headers

Date: 2019-11-16

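Background

In web backend development, you often need to check whether a remote file exists, or to find out how large it is.

Downloading the file would let you measure its size locally, but that wastes a lot of time and bandwidth.

Is there a way to get just the file size without downloading the file?

Yes, there is.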

The HTTP Protocol

An HTTP response message consists of two main parts:

Response headers
Response body
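
To make the two-part structure concrete, here is a minimal sketch that splits a raw HTTP response at the first empty line. The function name splitHttpResponse and the sample response text are assumptions made for illustration; they do not come from the original post.

```php
<?php
// Hypothetical helper: split a raw HTTP response into header lines and body.
function splitHttpResponse(string $raw): array
{
    // The headers end at the first empty line (CRLF followed by CRLF).
    $parts = explode("\r\n\r\n", $raw, 2);
    $headerLines = explode("\r\n", $parts[0]);
    $body = $parts[1] ?? '';

    return ['headers' => $headerLines, 'body' => $body];
}

// Made-up sample response, so no network access is needed.
$raw = "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nContent-Length: 5\r\n\r\nhello";
$response = splitHttpResponse($raw);

print_r($response['headers']);
echo $response['body'], PHP_EOL;
```

Everything before the empty line is metadata about the resource; only the part after it is the file content itself.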

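Take a browser requesting an image as an example. The response headers include a field that gives the content length:

Content-Length: 5567

So once you have the HTTP response headers, you know the size of the remote file.

HTTP defines the HEAD request method specifically for this purpose: it tells the server to return only the response headers, without the response body.

Example request:

curl -v -I 'http://upload.jianshu.io/users/upload_avatars/19687/a0ac666907b7?imageMogr2/auto-orient/strip|imageView2/1/w/240/h/240'

Request and response: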

> HEAD /users/upload_avatars/19687/a0ac666907b7?imageMogr2/auto-orient/strip|imageView2/1/w/240/h/240 HTTP/1.1
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.13.1.0 zlib/1.2.3 libidn/1.18 libssh2/1.2.2
> Host: upload.jianshu.io
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Tue, 23 May 2017 12:21:54 GMT
< Server: openresty
< Content-Type: image/jpeg
< Content-Length: 5567
< Accept-Ranges: bytes

PHP

Happily, PHP has a built-in function for fetching the response headers of a network resource:

get_headers();

Run the command:

php -r "print_r(get_headers('http://upload.jianshu.io/users/upload_avatars/19687/a0ac666907b7?imageMogr2/auto-orient/strip|imageView2/1/w/240/h/240'));"

Output:

Array
(
    [0] => HTTP/1.1 200 OK
    [1] => Date: Tue, 23 May 2017 13:35:38 GMT
    [2] => Server: openresty
    [3] => Content-Type: image/jpeg
    [4] => Content-Length: 5567
    [5] => Accept-Ranges: bytes
)
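
Since get_headers returns each header as a plain "Name: value" string, the file size still has to be picked out of the list. The following is a minimal sketch of one way to do that; contentLengthFromHeaders is a hypothetical helper name, and the sample array is copied from the output above so that the snippet runs without network access.

```php
<?php
// Hypothetical helper: find Content-Length in a get_headers() style list.
// Returns null when no valid Content-Length line is present.
function contentLengthFromHeaders(array $headers): ?int
{
    $length = null;
    foreach ($headers as $line) {
        // Header names are case-insensitive, so compare with stripos().
        if (is_string($line) && stripos($line, 'Content-Length:') === 0) {
            $value = trim(substr($line, strlen('Content-Length:')));
            if (ctype_digit($value)) {
                // If the list holds several responses, the last value wins.
                $length = (int) $value;
            }
        }
    }

    return $length;
}

$headers = [
    'HTTP/1.1 200 OK',
    'Date: Tue, 23 May 2017 13:35:38 GMT',
    'Server: openresty',
    'Content-Type: image/jpeg',
    'Content-Length: 5567',
    'Accept-Ranges: bytes',
];

var_dump(contentLengthFromHeaders($headers)); // int(5567)
```

A server is not obliged to send Content-Length, which is why the sketch returns null instead of assuming the header is always there.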

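The Problem

It looks as though PHP's get_headers follows the HTTP protocol nicely and fetches only the header information.

However, the Nginx log shows that get_headers sends a GET request, not HEAD:

127.0.0.1 - - [23/May/2017:21:51:22 +0800] GET /test.html HTTP/1.0 "200" ......

In fact, before calling get_headers, you have to specify the HEAD method yourself: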

<?php
// By default get_headers uses a GET request to fetch the headers.
// To send a HEAD request instead, set the method through a stream context:
stream_context_set_default(
    array(
        'http' => array(
            'method' => 'HEAD'
        )
    )
);
$headers = get_headers('http://example.com/test.html');

Nginx log:

127.0.0.1 - - [23/May/2017:21:51:22 +0800] HEAD /test.html HTTP/1.0 "200" ......
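
One thing to keep in mind is that stream_context_set_default changes the default context, so later HTTP stream calls in the same script that rely on the default would also use HEAD. A possible alternative, sketched below, is to build a separate context and pass it only to the call that needs it. This assumes PHP 7.1 or later, where get_headers accepts a context as its third argument; headRequestContext is a hypothetical helper name.

```php
<?php
// Hypothetical helper: build a stream context whose HTTP method is HEAD,
// without touching the default context.
function headRequestContext()
{
    return stream_context_create([
        'http' => [
            'method' => 'HEAD',
        ],
    ]);
}

$context = headRequestContext();

// Usage (needs network access and PHP 7.1+):
// $headers = get_headers('http://example.com/test.html', false, $context);

// Show the options stored in the context.
print_r(stream_context_get_options($context));
```

Because the context is passed per call, other requests made by the script keep their usual method.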

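Summary

When get_headers fetches response headers, it uses the GET method by default, which means the server returns the response body as well as the headers. Isn't that just downloading the file?

In fact, once it has received the headers and a small amount of the response body, get_headers closes the TCP connection itself, so the whole file is not downloaded into memory.

Still, all we want is the response headers, so it is best to specify the HEAD request method explicitly when using get_headers.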
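
The other need mentioned in the background, checking whether a remote file exists, can be met from the same header list by reading the status line. Here is a minimal sketch, assuming a status of 200 is what counts as "exists"; finalStatusCode is a hypothetical helper name and the sample data is abbreviated from the output shown earlier.

```php
<?php
// Hypothetical helper: return the status code of the last status line in a
// get_headers() style list, or null if the list contains no status line.
function finalStatusCode(array $headers): ?int
{
    $code = null;
    foreach ($headers as $line) {
        // Status lines look like "HTTP/1.1 200 OK".
        if (is_string($line) && preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            // Keep overwriting so that the last response in the list wins.
            $code = (int) $m[1];
        }
    }

    return $code;
}

$headers = [
    'HTTP/1.1 200 OK',
    'Content-Type: image/jpeg',
    'Content-Length: 5567',
];

$exists = finalStatusCode($headers) === 200;
var_dump($exists); // bool(true)
```

The last status line is used because get_headers follows redirects by default, in which case the list contains the headers of every response along the way.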

More questions? Contact the author on WeChat/Weibo: @Ceelog
