Recently our production PHP module has been intermittently failing with 'read error on connection'. The error log looks like this:
Uncaught exception 'RedisException' with message 'read error on connection'
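After some analysis and reading, I found two causes that can make phpredis return 'read error on connection': a timeout while a command is executing, and the use of a connection that has already been dropped.
Each of the two cases is analysed in detail below.
Timeouts in turn come in two flavours: either the client sets a timeout that is too short, or the client sets no timeout at all and the server takes longer to respond than the default timeout allows.
In the test environment a get takes on the order of 0.1 ms, so the client's read timeout is set to 0.1 ms (0.0001 s) to provoke the error. The test script is as follows: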
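<?php
$rds = new Redis();
try {
    $ret = $rds->pconnect("127.0.0.1", 6390);
    if ($ret == false) {
        echo "Connect return false";
        exit;
    }
    // Set the read timeout to 0.1 ms. The value is given in seconds;
    // Redis::OPT_READ_TIMEOUT is the named constant for option 3.
    $rds->setOption(Redis::OPT_READ_TIMEOUT, 0.0001);
    $rds->get("aa");
} catch (Exception $e) {
    var_dump($e);
}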
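Running the script by hand catches the 'read error on connection' exception.
In the second flavour the client sets no timeout, but while the command is executing the wait reaches PHP's default value (default_socket_timeout, 60 seconds unless php.ini changes it); see the analysis in "phpredis subscribe超時問題及解決" (phpredis subscribe timeout: problem and fix).
Tracing the run with strace shows that after the get aa command is sent, poll waits for a POLLIN event and times out.
PHP connects to redis through the phpredis extension. A full-text search of the phpredis source for 'read error on connection' shows that the error is raised in the redis_sock_gets function in phpredis/library.c; see phpredis.
The redis_sock_gets function in phpredis's library.c: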
/*
 * Processing for variant reply types (think EVAL)
 */
PHP_REDIS_API int
redis_sock_gets(RedisSock *redis_sock, char *buf, int buf_size, size_t *line_size)
{
    // Handle EOF
    if (-1 == redis_check_eof(redis_sock, 0)) {
        return -1;
    }

    if (php_stream_get_line(redis_sock->stream, buf, buf_size, line_size) == NULL) {
        char *errmsg = NULL;

        if (redis_sock->port < 0) {
            spprintf(&errmsg, 0, "read error on connection to %s",
                     ZSTR_VAL(redis_sock->host));
        } else {
            spprintf(&errmsg, 0, "read error on connection to %s:%d",
                     ZSTR_VAL(redis_sock->host), redis_sock->port);
        }

        // Close our socket
        redis_sock_disconnect(redis_sock, 1);

        // Throw a read error exception
        REDIS_THROW_EXCEPTION(errmsg, 0);
        efree(errmsg);
        return -1;
    }

    /* We don't need \r\n */
    *line_size -= 2;
    buf[*line_size] = '\0';

    /* Success! */
    return 0;
}
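Note: this message includes the host and port, which the message in our production logs does not, because of a recently merged branch (see figure).
The source shows that 'read error on connection' is thrown whenever php_stream_get_line returns NULL while reading from the stream. So when does php_stream_get_line return NULL? The implementation lives in php-src/main/streams/streams.c in the PHP source; see php-src.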
/* If buf == NULL, the buffer will be allocated automatically and will be of an
 * appropriate length to hold the line, regardless of the line length, memory
 * permitting */
PHPAPI char *_php_stream_get_line(php_stream *stream, char *buf, size_t maxlen,
        size_t *returned_len)
{
    size_t avail = 0;
    size_t current_buf_size = 0;
    size_t total_copied = 0;
    int grow_mode = 0;
    char *bufstart = buf;

    if (buf == NULL) {
        grow_mode = 1;
    } else if (maxlen == 0) {
        return NULL;
    }

    /*
     * If the underlying stream operations block when no new data is readable,
     * we need to take extra precautions.
     *
     * If there is buffered data available, we check for a EOL. If it exists,
     * we pass the data immediately back to the caller. This saves a call
     * to the read implementation and will not block where blocking
     * is not necessary at all.
     *
     * If the stream buffer contains more data than the caller requested,
     * we can also avoid that costly step and simply return that data.
     */

    for (;;) {
        avail = stream->writepos - stream->readpos;

        if (avail > 0) {
            size_t cpysz = 0;
            char *readptr;
            const char *eol;
            int done = 0;

            readptr = (char*)stream->readbuf + stream->readpos;
            eol = php_stream_locate_eol(stream, NULL);

            if (eol) {
                cpysz = eol - readptr + 1;
                done = 1;
            } else {
                cpysz = avail;
            }

            if (grow_mode) {
                /* allow room for a NUL. If this realloc is really a realloc
                 * (ie: second time around), we get an extra byte. In most
                 * cases, with the default chunk size of 8K, we will only
                 * incur that overhead once. When people have lines longer
                 * than 8K, we waste 1 byte per additional 8K or so.
                 * That seems acceptable to me, to avoid making this code
                 * hard to follow */
                bufstart = erealloc(bufstart, current_buf_size + cpysz + 1);
                current_buf_size += cpysz + 1;
                buf = bufstart + total_copied;
            } else {
                if (cpysz >= maxlen - 1) {
                    cpysz = maxlen - 1;
                    done = 1;
                }
            }

            memcpy(buf, readptr, cpysz);

            stream->position += cpysz;
            stream->readpos += cpysz;
            buf += cpysz;
            maxlen -= cpysz;
            total_copied += cpysz;

            if (done) {
                break;
            }
        } else if (stream->eof) {
            break;
        } else {
            /* XXX: Should be fine to always read chunk_size */
            size_t toread;

            if (grow_mode) {
                toread = stream->chunk_size;
            } else {
                toread = maxlen - 1;
                if (toread > stream->chunk_size) {
                    toread = stream->chunk_size;
                }
            }

            php_stream_fill_read_buffer(stream, toread);

            if (stream->writepos - stream->readpos == 0) {
                break;
            }
        }
    }

    if (total_copied == 0) {
        if (grow_mode) {
            assert(bufstart == NULL);
        }
        return NULL;
    }

    buf[0] = '\0';
    if (returned_len) {
        *returned_len = total_copied;
    }

    return bufstart;
}
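Leaving aside the maxlen == 0 guard at the top, php_stream_get_line returns NULL only when total_copied is 0 (in grow mode bufstart is then still NULL). In other words, nothing at all, not even a line terminator, could be obtained from the read buffer or from the underlying stream, which happens when the read times out or the stream hits EOF before any data arrives.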
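The "no bytes copied" case can be observed from userland without redis at all. The following is a minimal sketch, assuming a Unix-like system (STREAM_PF_UNIX) and using a local socket pair as a stand-in for the redis connection; fgets goes through the same stream line-reading code, so a read that times out with nothing buffered comes back as false.

```php
<?php
// A connected pair of local sockets; $server plays a redis server that
// never answers, $client plays the phpredis side.
[$client, $server] = stream_socket_pair(
    STREAM_PF_UNIX,
    STREAM_SOCK_STREAM,
    STREAM_IPPROTO_IP
);

// Read timeout of 0 s + 100000 microseconds = 0.1 s on the client side.
stream_set_timeout($client, 0, 100000);

// Nothing was ever written by $server, so the read waits for the timeout
// and then gives up without having copied a single byte.
$line = fgets($client);
$meta = stream_get_meta_data($client);

var_dump($line);               // result of the failed line read
var_dump($meta['timed_out']);  // whether the stream reports a timeout

fclose($client);
fclose($server);
```

In phpredis the same condition is what turns into the 'read error on connection' exception, because redis_sock_gets treats a NULL line as a read error.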
The fix is to set a reasonable timeout on the client. There are two ways to do so:
ini_set('default_socket_timeout', -1);
$redis->setOption(Redis::OPT_READ_TIMEOUT, -1);
Note: -1 means "never time out" in both cases. You can also set the timeout to whatever value you want; the reproduction above set it to 0.1 ms.
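If waiting forever is not acceptable, the same two settings can carry a finite value instead of -1. The sketch below assumes a 5 second limit purely for illustration, and applyReadTimeout is a hypothetical helper name, not part of phpredis; both settings are expressed in seconds.

```php
<?php
// Way 1: process-wide default for socket streams, in seconds.
// ini_set() returns the previous value so it can be restored later.
$previousDefault = ini_set('default_socket_timeout', '5');

// Way 2: per-connection read timeout on an already connected client.
// applyReadTimeout() is a hypothetical helper; Redis::OPT_READ_TIMEOUT
// takes a value in seconds and accepts fractions such as 2.5.
function applyReadTimeout(Redis $redis, float $seconds): bool
{
    return $redis->setOption(Redis::OPT_READ_TIMEOUT, $seconds);
}

echo ini_get('default_socket_timeout'), "\n";
```

The per-connection option is usually the less surprising choice, since default_socket_timeout also affects every other socket stream in the process.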
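Using a connection that has already been dropped can also cause 'read error on connection'. This case needs to be distinguished from 'Connection closed' and 'Connection lost'.
In the test script below the client closes the connection itself and then keeps using the closed connection, which throws a 'Connection closed' exception: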
<?php
$rds = new Redis();
try {
    $ret = $rds->pconnect("127.0.0.1", 6390);
    if ($ret == false) {
        echo "Connect return false";
        exit;
    }
    // Close the connection, then keep using it
    $rds->close();
    var_dump($rds->get("aa"));
} catch (Exception $e) {
    var_dump($e);
}
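The exception caught by the script carries the message 'Connection closed'.
Following "Work around PHP bug of liveness checking", write the test script test.php below. It connects to redis, and the redis process is killed before the command is executed: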
<?php
$rds = new Redis();
try {
    $ret = $rds->pconnect("127.0.0.1", 6390);
    if ($ret == false) {
        echo "Connect return false";
        exit;
    }
    // Pause here so the redis process can be killed before the command runs
    echo "Press any key to continue ...";
    fgetc(STDIN);
    var_dump($rds->get("aa"));
} catch (Exception $e) {
    var_dump($e);
}
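The steps are: run test.php, kill the redis process while the script is waiting at the prompt, then press a key to continue. The get call then fails with 'Connection lost'.
If the connection is dropped while commands are being executed continuously after connecting to redis, 'read error on connection' is returned. The test script is as follows: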
<?php
$rds = new Redis();
try {
    $ret = $rds->pconnect("127.0.0.1", 6390);
    if ($ret == false) {
        echo "Connect return false";
        exit;
    }
    // Keep issuing commands until the connection breaks
    while (1) {
        $rds->get("aa");
    }
} catch (Exception $e) {
    var_dump($e);
}
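The steps are: start the script and, while the loop is running, break the connection from the server side. The script then throws the 'read error on connection' exception.
Alternatively, open a new terminal, connect to the redis server and run client kill against the script's connection. The running PHP script catches the same 'read error on connection' exception.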
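The difference from the timeout case is visible at the stream level: when the peer goes away, the read fails immediately with EOF rather than after a wait. This is a minimal sketch with a local socket pair standing in for the redis connection (assuming a Unix-like system); closing one end plays the role of the server being killed or of client kill.

```php
<?php
// A connected pair of local sockets; $server stands in for redis.
[$client, $server] = stream_socket_pair(
    STREAM_PF_UNIX,
    STREAM_SOCK_STREAM,
    STREAM_IPPROTO_IP
);

// Generous read timeout, so a failure below cannot be blamed on it.
stream_set_timeout($client, 1);

// Simulate the server side dropping the connection.
fclose($server);

// The read sees EOF straight away and returns without any data.
$line = fgets($client);
$meta = stream_get_meta_data($client);
$eof  = feof($client);

var_dump($line);               // result of the failed line read
var_dump($meta['timed_out']);  // whether a timeout was involved
var_dump($eof);                // whether the stream is at EOF

fclose($client);
```

Checking timed_out against EOF in this way is one option for telling the two causes apart when debugging at the stream level.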
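Under php-fpm, when PHP connects to the redis server with pconnect, the connection is not actually torn down even if the business code explicitly calls close. fpm keeps the connection to redis alive, and the next request that calls pconnect does not really ask redis for a new connection. This brings a problem of its own: if that connection has been dropped in the meantime, the next request may directly use the dead connection left behind by the previous one. phpredis has a comment about this in its source; see php-src.
So php-fpm reusing a dropped connection can lead to this kind of error.
The simplest fix in this case is to switch from persistent connections to short-lived ones.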
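A related option, which is not something the analysis above tested, is to open a fresh short-lived connection for each attempt and retry once when the read fails. The sketch below is only an illustration: withFreshConnectionRetry is a hypothetical helper name, and the connection factory and operation are passed in as callables so the retry logic stays independent of phpredis.

```php
<?php
/**
 * Hypothetical helper: run $operation on a freshly created client and
 * retry with a new client if the attempt throws.
 *
 * @param callable $connect   returns a new, connected client
 * @param callable $operation receives the client, returns the result
 */
function withFreshConnectionRetry(callable $connect, callable $operation, int $maxAttempts = 2)
{
    $maxAttempts = max(1, $maxAttempts);
    $lastError = null;

    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        try {
            // A new client per attempt, so a dead connection is never reused.
            $client = $connect();
            return $operation($client);
        } catch (Exception $e) {
            // Remember the failure and try again with a fresh connection.
            $lastError = $e;
        }
    }

    // Every attempt failed: surface the last error to the caller.
    throw $lastError;
}

// Usage with phpredis would look roughly like this (not executed here):
// $value = withFreshConnectionRetry(
//     function () {
//         $rds = new Redis();
//         $rds->connect("127.0.0.1", 6390);   // connect(), not pconnect()
//         return $rds;
//     },
//     function ($rds) {
//         return $rds->get("aa");
//     }
// );
```

Retrying is only safe for commands that can be repeated without side effects, such as get; for writes the caller has to decide whether a second attempt is acceptable.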
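There are plenty of write-ups online about command timeouts and how to fix them, but few about reusing a dropped connection, hence this analysis. It serves partly as a record and partly, I hope, as a bit of help for anyone facing the same problem.
[1] Troubleshooting the redis 'read error on connection' and 'Redis server went away' errors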
[2] Work around PHP bug of liveness checking
[4] php-src
[5] phpredis